Selected Applications of Transfer Learning
Qiang Yang (杨强)
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology
Hong Kong
http://www.cse.ust.hk/~qyang
Case 1: Target Class Changes → Target-Transfer Learning (目标变化 → 目标迁移)
- Training: a 2-class problem. Testing: a 10-class problem. Traditional methods fail.
- Solution: find out what is not changed between training and testing.
Our Work
Cross-Domain Learning
- TrAdaBoost (ICML 2007)
- Co-Clustering based Classification (SIGKDD 2007)
- TPLSA (SIGIR 2008)
- NBTC (AAAI 2007)
Translated Learning
- Cross-lingual classification (WWW 2008)
- Cross-media classification (NIPS 2008)
Unsupervised Transfer Learning
- Self-taught clustering (ICML 2008)
Our Work (cont.)
- Wenyuan Dai, Yuqiang Chen, Gui-Rong Xue, Qiang Yang, and Yong Yu. Translated Learning. In Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS 2008), Vancouver, British Columbia, Canada, December 8, 2008.
- Xiao Ling, Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. Cross-Domain Spectral Learning. In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2008), Las Vegas, Nevada, USA, August 24-27, 2008, pp. 488-496.
- Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu. Self-taught Clustering. In Proceedings of the 25th International Conference on Machine Learning (ICML 2008), Helsinki, Finland, July 5-9, 2008, pp. 200-207.
- Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu. Boosting for Transfer Learning. In Proceedings of the 24th International Conference on Machine Learning (ICML 2007), Corvallis, Oregon, USA, June 20-24, 2007, pp. 193-200.
- Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. Co-clustering based Classification for Out-of-domain Documents. In Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2007), San Jose, California, USA, August 12-15, 2007, pp. 210-219.
- Dou Shen, Jian-Tao Sun, Qiang Yang, and Zheng Chen. Building Bridges for Web Query Classification. In Proceedings of the 29th ACM International Conference on Research and Development in Information Retrieval (SIGIR 2006), Seattle, USA, August 6-11, 2006, pp. 131-138.
Query Classification and Online Advertisement
- ACM KDDCUP 2005 Winner
- SIGIR 2006
- ACM Transactions on Information Systems Journal 2006
- Joint work with Dou Shen, Jian-Tao Sun, and Zheng Chen
QC as Machine Learning
- Inspired by the KDDCUP'05 competition
- Classify a query into a ranked list of categories
- Queries are collected from real search engines
- Target categories are organized in a tree, with each node being a category
Related Work
Document/Query Expansion
- Borrow text from extra data sources:
  - Using hyperlinks [Glover 2002]
  - Using implicit links from query logs [Shen 2006]
  - Using existing taxonomies [Gabrilovich 2005]
  - Query expansion [Manning 2007]
- Global methods: independent of the queries
- Local methods: use relevance feedback or pseudo-relevance feedback
Query Classification/Clustering
- Classify Web queries by geographical locality [Gravano 2003]
- Classify queries according to their functional types [Kang 2003]
- Beitzel et al. studied topical classification as we do, but with manually classified data [Beitzel 2005]
- Beeferman and Wen each worked on query clustering using clickthrough data [Beeferman 2000; Wen 2001]
Target-Transfer Learning in QC
- A classifier, once trained, stays constant
- Target classes before: Sports, Politics (European, US, China)
- Target classes now: Sports (Olympics, Football, NBA), Stock Market (Asian, Dow, Nasdaq), History (Chinese, World)
- How can we allow the target to change?
- Application: advertisements come and go, but our query-to-target mapping need not be retrained! We call this the target-transfer learning problem.
Solutions: Query Enrichment + Staged Classification
[Architecture diagram ("The Architecture of Our Approach"): Phase I, the training phase, sends each query to a search engine and uses the labels and the text of the returned pages to construct a synonym-based classifier and a statistical classifier. Phase II, the testing phase, combines their classified results through the bridging classifier to map queries to the target categories and produce the final results.]
Step 1: Query Enrichment
[Diagram: each query is enriched with textual information (title, snippet, full text) and category information from the pages returned by a search engine.]
Step 2: Bridging Classifier
- Wish to avoid: when the target changes, training has to be repeated!
- Solution: connect the target taxonomy and queries by using an intermediate taxonomy as a bridge
Bridging Classifier (cont.)
How to connect? Score each target category C_i^T for a query q through the intermediate categories C_j^I:

p(C_i^T | q) ∝ Σ_j p(C_i^T | C_j^I) · p(q | C_j^I) · p(C_j^I)

- p(C_j^I): the prior probability of C_j^I
- p(C_i^T | C_j^I): the relation between C_i^T and C_j^I
- p(q | C_j^I): the relation between q and C_j^I
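The bridging classifier's scoring rule can be sketched in a few lines. All taxonomies, category names, and probability values below are made up for illustration; only the scoring formula follows the slide.

```python
# Score each target category C_T for a query q by summing over the
# intermediate categories C_I:
#   score(C_T, q) = sum_j p(C_T | C_I_j) * p(q | C_I_j) * p(C_I_j)

# Hypothetical intermediate taxonomy with two categories.
p_inter = {"Computers": 0.6, "Sports": 0.4}              # p(C_I_j)
p_target_given_inter = {                                 # p(C_T | C_I_j)
    "Hardware": {"Computers": 0.8, "Sports": 0.1},
    "Football": {"Computers": 0.2, "Sports": 0.9},
}
p_query_given_inter = {"Computers": 0.7, "Sports": 0.2}  # p(q | C_I_j) for one query

def bridge_score(target, p_tgt_given_int, p_q_given_int, p_int):
    # Sum the product of the three estimated distributions over all
    # intermediate categories.
    return sum(
        p_tgt_given_int[target][cj] * p_q_given_int[cj] * p_int[cj]
        for cj in p_int
    )

scores = {t: bridge_score(t, p_target_given_inter, p_query_given_inter, p_inter)
          for t in p_target_given_inter}
best = max(scores, key=scores.get)   # highest-scoring target category
```

Because only the p(C_i^T | C_j^I) table depends on the target taxonomy, changing the targets means re-estimating that one table, not retraining against the queries.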
Category Selection for Intermediate Taxonomy
Category selection for reducing complexity:
- Total Probability (TP)
- Mutual Information (MI)
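As an illustration of score-and-prune category selection, here is a minimal sketch. Assumption: intermediate categories are ranked by a total-probability-style score Σ_i p(C_j^I | C_i^T) and the top k are kept; the exact TP and MI criteria are defined in the paper, and the table below is invented.

```python
# Hypothetical p(C_I_j | C_T_i) table: rows are target categories,
# columns are intermediate categories.
p_inter_given_target = {
    "Hardware": {"Computers": 0.7, "Sports": 0.1, "Arts": 0.2},
    "Football": {"Computers": 0.1, "Sports": 0.8, "Arts": 0.1},
}

def select_top_k(p_int_given_tgt, k):
    # Score each intermediate category by summing its column over all
    # target categories, then keep the k highest-scoring ones.
    inter_cats = next(iter(p_int_given_tgt.values())).keys()
    tp = {cj: sum(row[cj] for row in p_int_given_tgt.values())
          for cj in inter_cats}
    return sorted(tp, key=tp.get, reverse=True)[:k]

selected = select_top_k(p_inter_given_target, k=2)
```

Pruning low-scoring intermediate categories shrinks the sum inside the bridging classifier, which is the complexity reduction the slide refers to.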
Experiment: Data Sets & Evaluation
ACM KDDCUP
- Started in 1997, the ACM KDDCup is the leading data mining and knowledge discovery competition in the world, organized by ACM SIGKDD.
ACM KDDCUP 2005
- Task: categorize 800K search queries into 67 categories
- Three awards: (1) Performance Award; (2) Precision Award; (3) Creativity Award
- Participation: 142 registered groups; 37 solutions submitted from 32 teams
- Evaluation data: 800 queries randomly selected from the 800K query set; 3 human labelers labeled the entire evaluation query set
- Evaluation measurements: Precision and Performance (F1). We won all three awards.

Overall F1 = (1/3) Σ_{i=1}^{3} F1 (against human labeler i)
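The KDDCUP'05 Overall F1 is simply the mean of the F1 scores measured against each of the three human labelers. A one-line sketch, with made-up F1 values:

```python
# F1 of one submitted solution against human labelers 1-3 (hypothetical values).
f1_per_labeler = [0.42, 0.45, 0.39]

# Overall F1 = (1/3) * sum of the three per-labeler F1 scores.
overall_f1 = sum(f1_per_labeler) / len(f1_per_labeler)
```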
Result of Bridging Classifiers
- Using the bridging classifier allows the target classes to change freely: no need to retrain the classifier!
[Figure: performance of the bridging classifier with different granularities of the intermediate taxonomy]
Summary: Target-Transfer Learning
[Diagram: a query is matched by similarity to an intermediate class, which is then classified to the target class.]
Cross-Domain Learning
Case 1
Source:
- Many labeled instances
Target:
- Few labeled instances
Target and source domains:
- Same feature representation
- Same classes Y (binary classes)
- Different P(X, Y) distributions
TrAdaBoost = Transfer AdaBoost
Given:
- Insufficient labeled data from the target domain (primary data)
- Labeled data following a different distribution (auxiliary data)
- The auxiliary data are weaker evidence for building the classifier
- Training set: source + target data, initialized with uniform weights
TrAdaBoost = Transfer AdaBoost (cont.)
For misclassified examples:
- increase the weights of the misclassified target data
- decrease the weights of the misclassified source data
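The asymmetric weight update above can be sketched as follows. This is a minimal per-iteration update following the TrAdaBoost scheme (Dai et al., ICML'07): misclassified target examples are up-weighted as in standard AdaBoost, while misclassified source examples are down-weighted by a fixed factor, so auxiliary data that conflict with the target distribution gradually fade out. The numeric inputs are made up.

```python
import math

def tradaboost_update(w_src, w_tgt, miss_src, miss_tgt, eps_t, n_src, n_iters):
    """One TrAdaBoost weight update.
    w_src/w_tgt: current example weights; miss_*: True where misclassified;
    eps_t: weighted error on the target data at this iteration."""
    # Fixed down-weighting factor for source data (depends only on data size
    # and the number of boosting iterations).
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_src) / n_iters))
    # AdaBoost-style factor from the target-domain error.
    beta_t = eps_t / (1.0 - eps_t)
    # Misclassified source examples shrink; misclassified target examples grow.
    new_src = [w * (beta_src if m else 1.0) for w, m in zip(w_src, miss_src)]
    new_tgt = [w * ((1.0 / beta_t) if m else 1.0) for w, m in zip(w_tgt, miss_tgt)]
    return new_src, new_tgt

w_src, w_tgt = tradaboost_update(
    [0.25, 0.25], [0.25, 0.25],
    miss_src=[True, False], miss_tgt=[True, False],
    eps_t=0.2, n_src=2, n_iters=10)
```

Correctly classified examples keep their weights; only the misclassified ones move, in opposite directions for source and target.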
TrAdaBoost = Transfer AdaBoost (cont.)
[Figure: performance results]
Transfer Learning in Sensor Network Tracking
- Received-Signal-Strength (RSS) based localization in an indoor WiFi environment.
- Where is the mobile device?
[Diagram: a mobile device at (location_x, location_y) receives signals from Access Points 1-3 at strengths such as -30dBm, -40dBm, and -70dBm.]
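RSS-based localization can be sketched as fingerprinting: an offline phase stores (RSS vector → location) pairs, and the online phase maps a fresh reading to the nearest stored fingerprint. This nearest-neighbor variant is one common baseline, not the slide's specific model, and all dBm values and grid coordinates are invented.

```python
# Offline calibration: RSS readings from (AP1, AP2, AP3) in dBm,
# each paired with the (x, y) grid location where it was measured.
fingerprints = [
    ((-30.0, -70.0, -40.0), (1.0, 2.0)),
    ((-65.0, -35.0, -50.0), (4.0, 2.0)),
    ((-55.0, -60.0, -30.0), (2.0, 5.0)),
]

def localize(rss):
    # Online phase: return the location of the closest stored fingerprint
    # in squared-Euclidean signal space.
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(fingerprints, key=lambda fp: dist2(fp[0], rss))[1]

loc = localize((-32.0, -68.0, -41.0))  # a reading near the first fingerprint
```

The mapping function f mentioned on the next slide is exactly this reading-to-location function, whatever model implements it.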
Distribution Changes
- The mapping function f learned in the offline phase can become out of date.
- Recollecting the WiFi data is very expensive.
- How can we adapt the model?
[Timeline: night time period up to t0, day time period from t0 to t1.]
Transfer Learning in Wireless Sensor Networks
- Transfer across time
- Transfer across space
- Transfer across devices
Latent Space based Transfer Learning (Spatial Transfer)
Transfer localization models across space [Pan, Yang et al. AAAI 08]
- Some labeled data collected in Area A, and unlabeled data in Area B
- Only a few labeled data collected in Area B
- Goal: construct a localization model for the whole area (Area A and Area B)
Transfer across time
- Area: 30 x 40 (81 grids)
- Six time periods: 12:30am-01:30am, 08:30am-09:30am, 12:30pm-01:30pm, 04:30pm-05:30pm, 08:30pm-09:30pm, 10:30pm-11:30pm
- LeMan: static mapping function learned from offline data
- LeMan2: relearn the mapping function from a few online data
- LeMan3: combine offline and online data into a single training set to learn the mapping function
Transfer knowledge via latent manifold learning
[Diagram: labeled WiFi data from different periods are connected through a shared latent manifold, along which knowledge is propagated.]
VIP Recommendation in Tencent Weibo
- Knowledge transfer from friendship relations in Tencent QQ, the largest instant messaging network
Properties:
1. Data sparsity: limited neighbors for most users
2. Heterogeneous links: symmetric friendship vs. asymmetric following
3. Large data: 1 billion users and tens of billions of links
VIP recommendation based on:
1. X: friendship on QQ
2. S1: user following relations on Tencent Weibo
3. S2: VIP following relations on Tencent Weibo
Method: Social Relation based Transfer (SORT)
Social App Recommendation in Tencent Qzone
- Qzone (http://qzone.qq.com) is the largest social network in China.
Other Applications
Video Recommendation in Tencent Video
- Rating prediction with four types of auxiliary data:
  1. binary ratings
  2. social networks
  3. context
  4. video content
Activity Recognition
- With sensor data collected on mobile devices
- Location: GPS, WiFi, RFID
- Context: location, weather, etc., from GPS, RFID, Bluetooth, etc.
- Various models can be used:
  - Non-sequential models: Naïve Bayes, SVM, ...
  - Sequential models: HMM, CRF, ...
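As a toy example of the non-sequential route, here is a Naïve Bayes activity recognizer over discrete sensor events with Laplace smoothing. The sensor names and activity labels are invented; only the model family is taken from the slide.

```python
import math
from collections import Counter, defaultdict

# Training data: lists of triggered sensors paired with activity labels
# (all hypothetical).
train = [
    (["tap", "cup", "kettle"], "MakeTea"),
    (["kettle", "cup"], "MakeTea"),
    (["toothbrush", "tap"], "BrushTeeth"),
    (["tap", "toothbrush"], "BrushTeeth"),
]

# Count labels and per-label sensor occurrences.
label_counts = Counter(lbl for _, lbl in train)
feat_counts = defaultdict(Counter)
vocab = set()
for feats, lbl in train:
    for f in feats:
        feat_counts[lbl][f] += 1
        vocab.add(f)

def predict(feats):
    # Pick the label maximizing log p(label) + sum_f log p(f | label),
    # with add-one (Laplace) smoothing over the sensor vocabulary.
    def logp(lbl):
        total = sum(feat_counts[lbl].values())
        lp = math.log(label_counts[lbl] / len(train))
        for f in feats:
            lp += math.log((feat_counts[lbl][f] + 1) / (total + len(vocab)))
        return lp
    return max(label_counts, key=logp)

pred = predict(["kettle", "cup", "tap"])
```

A sequential model (HMM, CRF) would additionally condition on the previous activity; this sketch treats each reading window independently.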
Activity Recognition: Input & Output (Vincent Zheng, A*STAR, Singapore)
Input:
- Context and locations: time, history, current/previous locations, duration, speed
- Object usage information
- Trained AR model: training data from calibration (calibration tool: VTrack)
Output: predicted activity labels
- Running? Walking? Tooth brushing? Having lunch?
http://www.cse.ust.hk/~vincentz/Vtrack.html
Datasets: MIT PlaceLab
- http://architecture.mit.edu/house_n/placelab.html
- MIT PlaceLab Dataset (PLIA2) [Intille et al. Pervasive 2005]
- Activities: common household activities
Cross-Domain Activity Recognition [Zheng, Hu, Yang, Ubicomp 2009]
- Challenge: a new domain of activities without labeled data
- Cross-domain activity recognition: transfer available labeled data from source activities to help train the recognizer for the target activities.
[Example activity domains: indoor cleaning, laundry, dishwashing]
How can we use the similarities between them?
[Pipeline diagram:
- Source-domain labeled data: <sensor reading, activity name> pairs, e.g. <SS, "Make Coffee">
- Similarity measure mined from the Web, e.g. sim("Make Coffee", "Make Tea") = 0.6
- Pseudo training data, e.g. <SS, "Make Tea", 0.6>
- Target-domain pseudo-labeled data
- Weighted SVM classifier]
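The pseudo-labeling step of this pipeline can be sketched directly: each source example is mapped onto similar target activities, carrying the mined similarity as a confidence weight for a downstream weighted classifier (a weighted SVM in the paper). The similarity values, sensor-reading IDs, and threshold below are invented.

```python
# Mined similarities sim(source activity, target activity), e.g. from the Web.
similarity = {
    ("Make Coffee", "Make Tea"): 0.6,
    ("Make Coffee", "Do Laundry"): 0.1,
}

# Source-domain labeled data: (sensor reading, activity name).
source_data = [("SS_1", "Make Coffee"), ("SS_2", "Make Coffee")]
target_activities = ["Make Tea", "Do Laundry"]

def pseudo_label(source, targets, sim, threshold=0.5):
    # Map each source example onto every sufficiently similar target
    # activity, keeping the similarity as the example's training weight.
    pseudo = []
    for reading, src_act in source:
        for tgt_act in targets:
            w = sim.get((src_act, tgt_act), 0.0)
            if w >= threshold:
                pseudo.append((reading, tgt_act, w))  # weighted triple
    return pseudo

pseudo = pseudo_label(source_data, target_activities, similarity)
```

The resulting triples are exactly the <SS, "Make Tea", 0.6>-style pseudo training data shown in the diagram.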
Calculating Activity Similarities
How similar are two activities?
- Use Web search results
- TF-IDF: traditional IR similarity metric (cosine similarity)
- Example: mined similarity between the activity "sweeping" and "vacuuming", "making the bed", "gardening"
[Figure: calculated similarity with the activity "Sweeping"]
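The TF-IDF cosine similarity used here can be sketched end to end: represent each activity by the text of its Web search results, build TF-IDF vectors, and take the cosine between them. The "search result" snippets below are stand-ins, not real mined text.

```python
import math
from collections import Counter

# Stand-in "search result" text per activity.
docs = {
    "sweeping": "broom floor dust sweep clean floor",
    "vacuuming": "vacuum floor dust carpet clean",
    "gardening": "plants soil water flowers garden",
}

def tfidf_vectors(docs):
    # Term frequency per document, document frequency across the collection,
    # then weight each term by tf * log(N / df).
    n = len(docs)
    df = Counter()
    tfs = {}
    for name, text in docs.items():
        tf = Counter(text.split())
        tfs[name] = tf
        for term in tf:
            df[term] += 1
    return {name: {t: c * math.log(n / df[t]) for t, c in tf.items()}
            for name, tf in tfs.items()}

def cosine(u, v):
    # Cosine similarity between two sparse term-weight vectors.
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = tfidf_vectors(docs)
sim_vac = cosine(vecs["sweeping"], vecs["vacuuming"])  # shares floor/dust/clean
sim_gar = cosine(vecs["sweeping"], vecs["gardening"])  # shares nothing
```

Activities whose search results share vocabulary ("sweeping" and "vacuuming") score high; unrelated ones ("gardening") score near zero, matching the bar chart the slide refers to.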