Data Augmentation Method for the Image Sentiment Analysis
Alexander Rakovsky1, Arseny Moskvichev2, Andrey Filchenkov1
1 ITMO University, Saint Petersburg, Russia
2 Saint Petersburg State University, Saint Petersburg, Russia
Image sentiment analysis
Positiveness: 0.9
Positiveness: 0.01
Why is it important?
Two words:
Social networks
How do we approach it?
1. Collect lots of labeled images
2. Train a convolutional neural network
3. ???
4. Profit
Problem!
Solution
Data augmentation.
1. Get a few manually labeled images with corresponding hashtags
2. Learn to reconstruct labels from hashtags
3. Collect as much labeled data as you need!
Details
• Collecting data through the Flickr API (using keywords)
• Assessors evaluate the emotional colouring (positiveness) of each image
• Converting hashtags to vector representations (word2vec) and averaging them
• Using machine learning to predict the assessors’ ratings from the averaged vectors
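The last two steps can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the two-dimensional `TOY_VECTORS` table is a hypothetical stand-in for pretrained word2vec vectors, the labeled examples are invented, and the function names `average_hashtags` and `knn_predict` are assumptions made for this sketch.

```python
import math

# Hypothetical toy embeddings standing in for pretrained word2vec vectors.
TOY_VECTORS = {
    "sunset": [0.9, 0.1],
    "beach": [0.8, 0.2],
    "happy": [1.0, 0.0],
    "funeral": [0.0, 1.0],
    "rain": [0.2, 0.8],
    "sad": [0.1, 0.9],
}


def average_hashtags(hashtags, vectors):
    """Return the mean vector of the known hashtags, or None if none is known."""
    known = [vectors[tag] for tag in hashtags if tag in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def knn_predict(query, train_points, train_scores, k=3):
    """kNN regression: average the scores of the k nearest training points."""
    ranked = sorted(
        zip(train_points, train_scores),
        key=lambda pair: math.dist(query, pair[0]),  # Euclidean distance
    )
    nearest = ranked[:k]
    return sum(score for _, score in nearest) / len(nearest)


# Invented examples: (hashtags, positiveness assigned by assessors).
labeled = [
    (["sunset", "beach"], 0.90),
    (["happy"], 0.95),
    (["happy", "beach"], 0.85),
    (["funeral"], 0.05),
    (["rain", "sad"], 0.10),
    (["sad"], 0.15),
]
train_points = [average_hashtags(tags, TOY_VECTORS) for tags, _ in labeled]
train_scores = [score for _, score in labeled]

# Predict positiveness for an unlabeled image from its hashtags alone.
query = average_hashtags(["sunset", "happy"], TOY_VECTORS)
predicted = knn_predict(query, train_points, train_scores, k=3)
print(f"predicted positiveness: {predicted:.2f}")
```

Images whose hashtags are all out of vocabulary yield `None` here and would have to be skipped; how the original work handles that case is not stated on the slides.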
(Preliminary!) Results
• kNN accuracy on classification task: 0.95
• Average correlation between assessors: 0.86
• Correlation between the kNN regression and assessors: 0.83
• Using this algorithm is almost as good as hiring one more assessor!
• Suspiciously good...
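One way such agreement figures can be computed is sketched below. The slides do not say which correlation coefficient was used or how it was averaged, so this assumes Pearson correlation, averaged over all assessor pairs and over all model/assessor pairs. All ratings here are invented for illustration, and the names `pearson`, `mean_pairwise_correlation`, and `mean_correlation_with` are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def mean_pairwise_correlation(ratings):
    """Average Pearson correlation over all pairs of raters."""
    pairs = list(combinations(ratings, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def mean_correlation_with(predictions, ratings):
    """Average Pearson correlation between the predictions and each rater."""
    return sum(pearson(predictions, r) for r in ratings) / len(ratings)


# Invented positiveness ratings: three assessors, six images.
assessors = [
    [0.9, 0.8, 0.1, 0.2, 0.6, 0.4],
    [0.8, 0.9, 0.2, 0.1, 0.5, 0.5],
    [1.0, 0.7, 0.0, 0.3, 0.7, 0.3],
]
# Invented model outputs for the same six images.
model = [0.85, 0.8, 0.15, 0.25, 0.55, 0.45]

between_assessors = mean_pairwise_correlation(assessors)
model_vs_assessors = mean_correlation_with(model, assessors)
print(f"assessor vs assessor: {between_assessors:.2f}")
print(f"model vs assessors:   {model_vs_assessors:.2f}")
```

Comparing the two averages is what supports a claim like "almost as good as one more assessor": the model is treated as an additional rater and its agreement with the humans is set against the humans' agreement with each other.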
Nonrepresentative sample!
Pros
• Easy to use (no word preprocessing)
• Good results* (compared to dictionary-based solutions)
Cons
• Needs pre-training and an initial manually labeled sample
Conclusions
• The proposed method affords a simple and efficient hashtag-based data augmentation solution for image sentiment analysis.
• More work is to be done to estimate the method’s performance on a general set of images.
Thank you!