The Third Workshop for NExT++
CHUA Tat-Seng / Sun Maosong / Wendy Hall
Singapore, 16 May 2018
NExT Search Centre (下一代搜索技术联合研究中心): a NUS-Tsinghua joint centre on extreme search
• NExT++: NUS-Tsinghua Centre on Extreme Search
o Research on big unstructured data analytics with applications in Wellness, Fintech and Smart Nation
o We were among the first to look into this topic, in 2010
o Phase I: May 2010 to Sep 2016, with a grant of S$11 million. Emphasis: technology for unstructured data analytics
o Phase II: Oct 2016 to Sep 2021, with a grant of S$12 million. Emphasis: deep unstructured data analytics
o Additional collaborator: University of Southampton
o Active participation of over 15 professors, over 30 PhD students, and 10 full-time researchers
• We focus on unstructured data analytics
o Two key challenges: big data and paradigm change
• Big Data Challenges:
1) Big Data Wellness Analytics
2) Multimodal KG & Chatbot
3) Fintech
4) Rich Media Analytics
5) Recommendation
• Paradigm Change Challenges:
1) From Video to 3D and VR
2) From Recommendation to Influence
• Applications:
o Wellness, Fintech & Smart Nation
• Deliverables:
o Software infrastructures to help nurture and incubate new enterprises
o Work with industries
o New enterprises
[Figure: wellness analytics overview. Components shown: Users; User Data (food intake and habits; physical and cyber activities; others: sensors for vital signs, tests, environment data); Wellness Knowledge (pregnancy, diabetes, depression, etc.); Knowledge; Analytics; Personalization; Apps.]
• Other related research done at Tsinghua:
– Depression modeling and monitoring
– Sleep disorder
– Intelligent remote medical examination system and deployment
• Southampton:
– Lifestyle interventions for asthma patients:
2) Multi-modal Knowledge Graph (MMKG)
Research on fundamental building blocks towards MMKG
• Basic research on triplet extraction from text and video
• Research on bridging text and knowledge for joint representation of words and entities
Building MMKG in fashion and food/wellness domains
Conversation systems: task-oriented chatbot and chatbot with emotion
Multimodal chatbot: research on capturing multimodal semantics, generating responses based on conversation history and domain knowledge, and reinforcement learning to further optimize the model
• Leverage alternative big data for futures and commodity price forecasting
• Related research done at Tsinghua:
– Fundamental and technical analysis of the futures markets
– Network analysis for Fintech
How to augment data to improve classifiers?
• Traditional: data crawling, data cropping, flipping
• New: domain transferring, data generation
Our research: conditional data generation
• Generate data to improve model training, esp. for fine-grained classification tasks
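The traditional augmentations named above (cropping and flipping) can be sketched in a few lines. This is a minimal illustration only, not the conditional data generation method from our research: the image is assumed to be a small grayscale grid of integers, and the names `horizontal_flip`, `random_crop` and `augment` are hypothetical.

```python
import random
from typing import List

Image = List[List[int]]  # assumed format: rows of grayscale pixel values


def horizontal_flip(image: Image) -> Image:
    """Mirror the image left to right; the class label stays the same."""
    return [list(reversed(row)) for row in image]


def random_crop(image: Image, height: int, width: int, rng: random.Random) -> Image:
    """Cut a height x width window at a random position inside the image."""
    rows, cols = len(image), len(image[0])
    if height > rows or width > cols:
        raise ValueError("crop window is larger than the image")
    top = rng.randint(0, rows - height)    # randint is inclusive on both ends
    left = rng.randint(0, cols - width)
    return [row[left:left + width] for row in image[top:top + height]]


def augment(image: Image, crop_size: int, copies: int, seed: int = 0) -> List[Image]:
    """Produce extra training samples by random cropping plus an optional flip."""
    rng = random.Random(seed)
    samples = []
    for _ in range(copies):
        sample = random_crop(image, crop_size, crop_size, rng)
        if rng.random() < 0.5:             # flip roughly half of the crops
            sample = horizontal_flip(sample)
        samples.append(sample)
    return samples


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
extra = augment(image, crop_size=2, copies=4)
```

Such transformations only re-use pixels that are already in the dataset, which is the limitation that domain transfer and data generation aim to overcome.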
[Figure: pipeline for end-to-end DL development cycles for data scientists. Components shown: dataset management, labelling management, training task management, model management, experiment management.]
• User behaviors are affiliated with rich side information:
– User demographics; item attributes
– Textual reviews; various contexts
– Shallow methods like MF (matrix factorization) can’t incorporate them well.
• 1st Enhancement:
– Neural Factorization Machine (NFM), which can automatically learn:
1) Higher-order feature interactions implicitly
2) Second-order feature interactions explicitly
3) Feature representations (no manual effort)
– Used by Alibaba for search ads ranking.
[Zhang et al., KDD’16]
[He and Chua, SIGIR’17]
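The explicit second-order interactions in NFM come from its bi-interaction pooling layer [He and Chua, SIGIR’17], which sums the element-wise products of all pairs of feature embeddings and can be computed in linear time as 0.5 * ((sum of x_i v_i)^2 - sum of (x_i v_i)^2). The sketch below shows only that pooling step in plain Python; the feature names and embedding values are made up for illustration, and the hidden layers that learn higher-order interactions are omitted.

```python
from typing import Dict, List


def bi_interaction(features: Dict[str, float],
                   embeddings: Dict[str, List[float]]) -> List[float]:
    """Pool all pairwise embedding products into one vector in linear time."""
    dim = len(next(iter(embeddings.values())))
    sum_vec = [0.0] * dim   # running sum of x_i * v_i
    sum_sq = [0.0] * dim    # running sum of (x_i * v_i)^2
    for name, value in features.items():
        if value == 0.0:    # absent features contribute nothing
            continue
        for k, weight in enumerate(embeddings[name]):
            scaled = value * weight
            sum_vec[k] += scaled
            sum_sq[k] += scaled * scaled
    return [0.5 * (s * s - q) for s, q in zip(sum_vec, sum_sq)]


def pairwise_reference(features: Dict[str, float],
                       embeddings: Dict[str, List[float]]) -> List[float]:
    """Quadratic-time definition, kept only to check the fast version."""
    names = [n for n, v in features.items() if v != 0.0]
    dim = len(next(iter(embeddings.values())))
    out = [0.0] * dim
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            a, b = names[i], names[j]
            for k in range(dim):
                out[k] += (features[a] * embeddings[a][k]
                           * features[b] * embeddings[b][k])
    return out


# Hypothetical one-hot features mixing user, item and side information.
embeddings = {
    "user=alice": [1.0, 2.0],
    "item=hotel": [3.0, 0.5],
    "city=florida": [-1.0, 4.0],
}
features = {"user=alice": 1.0, "item=hotel": 1.0, "city=florida": 1.0}
pooled = bi_interaction(features, embeddings)
```

Because side information such as demographics or context enters simply as more entries in `features`, the same layer handles it without manual feature crossing, which plain MF over user and item IDs does not offer.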
• 2nd Enhancement: Tree-enhanced Embedding Model
– NFM and others are “black boxes” in learning higher-order interactions.
– We learn explainable rules with trees and integrate them into the neural model.
Embed the explainable rules on certain features from decision trees
Provide sound and explainable reasons (rules) for why the product is a suitable choice for the user.
Recommendation reason: a user being <User-City: Florida, User-Style: Nightlife Seeker LuxuryTraveler, User-Age: 50-64> would like to visit an item being <Item-Price: $$$$, Item-Attr: Foie Gras, Lobster>
[Wang et al., WWW’18]
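One way to see how a decision tree yields an explainable rule is to record the tests along the path from the root to the leaf that a user-item record falls into. The sketch below uses a tiny hand-built tree; the `Node` class, the tree itself and the feature values are hypothetical, and the embedding of the resulting rules into the neural model is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    feature: Optional[str] = None    # None marks a leaf
    value: Optional[str] = None      # the test is: record[feature] == value
    match: Optional["Node"] = None   # branch taken when the test holds
    other: Optional["Node"] = None   # branch taken otherwise
    leaf_id: Optional[int] = None    # identifier of the rule at a leaf


def extract_rule(tree: Node, record: Dict[str, str]) -> Tuple[int, List[str]]:
    """Walk the tree for one record and collect the conditions on its path."""
    conditions: List[str] = []
    node = tree
    while node.feature is not None:
        if record.get(node.feature) == node.value:
            conditions.append(f"{node.feature}: {node.value}")
            node = node.match
        else:
            conditions.append(f"{node.feature}: not {node.value}")
            node = node.other
    return node.leaf_id, conditions


# Hypothetical tree: split on the user's age group, then on the item's price.
tree = Node(
    feature="User-Age", value="50-64",
    match=Node(
        feature="Item-Price", value="$$$$",
        match=Node(leaf_id=0),
        other=Node(leaf_id=1),
    ),
    other=Node(leaf_id=2),
)

record = {"User-Age": "50-64", "Item-Price": "$$$$"}
leaf, rule = extract_rule(tree, record)
reason = "User-item pair matches <" + ", ".join(rule) + ">"
```

Each leaf identifier stands for one combination of feature tests, so a model that assigns importance to leaves can report the matching path as the recommendation reason.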
This phase focuses on bringing insights and intelligence to the systems:
o Let machines handle big data, while humans do creative tasks
o Wide range of real-time unstructured big data analytics
o Applications in wellness, Fintech, and smart nation ....
o Towards large-scale systems research, with deployment and collaboration with industries
We have initiated several collaborations with industries on research in Fintech, wellness and e-commerce
We look forward to suggestions and collaboration on the next phases of research