1. Video AI for Media and Entertainment Industry Albert Y. C.
Chen, Ph.D. Vice President, R&D Viscovery
2. Albert Y. C. Chen, Ph.D. Experience: 2017-present: Vice
President of R&D @ Viscovery; 2016-2017: Chief Scientist @
Viscovery; 2015: Principal Scientist @ Nervve Technologies;
2013-2014: Computer Vision Scientist @ Tandent; 2011-2012: @ GE
Global Research. Education: Ph.D. in Computer Science, SUNY-Buffalo;
M.S. in Computer Science, NTNU; B.S. in Computer Science, NTHU
3. Viscovery = Video Discovery Optical Character Recognition
Offline Recognition 2013 2014 Product Recognition 2015 Video
Content related Advertisements 2017 Wearable Devices Video Content
Discovery & Interaction 2016 Leading provider of Video AI
analytic products
4. Current AI does not solve it all. [Diagram: AI industry stack.
Layers: appl. layer, tech layer, infra layer. Components: solution,
platform, libraries, modules, data, machine computing power. Players:
data accumulation via open API, AI/DNN library, gen. purpose
platforms, app-specific platforms, apps, HW co., Vertical AI Startups
(agri., manu., med., fin., retail, trans.).] E.g., 1: Google, Amazon,
FB, 2: IBM, 3: Walmart, 5: NVidia
5. Vertical AI: Solving industry-specific problems by combining AI
and Subject Matter Expertise. Full Stack Products; Subject Matter
Expertise; Proprietary Data; AI delivers core value (Bradford Cross,
2017/06/14)
6. Vertical AI Example
7. Media & Entertainment Industry's challenge. Internet Era: Make
content free, maximize traffic, ad revenue waiting at the end of the
rainbow? It worked for nearly 20 years, with Google and Facebook
being the only beneficiaries; they control 75% of digital ad revenue
and 99% of future growth. Is this business model still working? Does
it work for others? The latest unicorns from Silicon Valley are
suggesting otherwise.
8. Content Farms: maximizing traffic, killing the Internet along
the way.
9. The NY Times is saying no. WSJ and many others are following.
Source: https://www.nytimes.com/projects/2020-report/
10. People are willing to pay for good content
11. The curveball: App Stores and News Syndicators! News
Republic (acquired for 57M USD, Aug 2016): 12.5 million daily active
users, 60k USD annual revenue. Toutiao (toutiao.com): 80 million
daily active users, 1B USD annual revenue.
12. Pay source, or pay platform? Platform: More focus, less
distraction: news outlets focus on content instead of customer
service, software development, etc. Potential Problem: Facebook and
Google control 75% of all traffic and 99% of expected future growth?
13. Netflix. Netflix spends $250M USD yearly on personalization and
content recommendation. 104M subscribers worldwide; 52M in US (75%
market penetration, #1 in US, YouTube #2 at 53%). Netflix subscribers
watch 19 days per month, for 28 H/month (#2, less than Dish's 47
H/month)
15. Netflix net income (2000-2016)
https://www.statista.com/statistics/272561/netflix-net-income/
16. People are willing to pay, for good content, good
service.
17. The evolution of methods for monetizing text/video content.
[Timeline: 2000, 2005, 2010, now. Struggling Traditional Media; Free
Content, Ad Revenue; Subscription Revenue.] Do nothing? Sitting Duck.
Improve Ad Revenue? Ad Tech; Video Content-related ads. Own platform?
Shared platform, licensed content? Tailored recommendations (improve
UX & stickiness) (user & video content related recommendations).
Video Data Mining.
18. If we already have such precise indexing of video content:
Jay Chao singing A, dancing B, wearing C, with items D, in front of
E, at time F? We will disrupt: advertisement, e-commerce, online
video platform ecosystem, screenwriting, film production and film
editing.
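The kind of index described on this slide (who, doing what, wearing what, in front of what, at what time) can be pictured as an inverted index from tags to timestamps. The following is only a minimal sketch under assumed names: `Annotation`, `VideoIndex`, and the tag kinds are hypothetical and not taken from any Viscovery product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One recognized item at one moment of the video (hypothetical schema)."""
    time_sec: float
    kind: str   # e.g. "person", "action", "object", "scene"
    value: str


class VideoIndex:
    """Inverted index: (kind, value) tag -> set of timestamps."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, ann):
        self._by_tag[(ann.kind, ann.value)].add(ann.time_sec)

    def find(self, *tags):
        """Return sorted timestamps at which all given tags co-occur."""
        sets = [self._by_tag.get(tag, set()) for tag in tags]
        return sorted(set.intersection(*sets)) if sets else []


index = VideoIndex()
for ann in [
    Annotation(12.0, "person", "singer"),
    Annotation(12.0, "action", "singing"),
    Annotation(12.0, "scene", "stage"),
    Annotation(47.0, "person", "singer"),
    Annotation(47.0, "action", "dancing"),
]:
    index.add(ann)

# When is the singer singing?
hits = index.find(("person", "singer"), ("action", "singing"))
print(hits)
```

Queries that combine several tags reduce to set intersections, which is what makes "A doing B in front of E at time F" cheap to answer once the recognition results exist.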
19. Video content-related advertisements. Food Delivery Service
Ad: Previous moment: dining scene; insert Food Delivery Service ad;
next moment: dining scene. Restaurant Ad: Previous moment: dining
scene; insert KFC ad; next second: dining scene.
20. Video content-related advertisements. Automobile Ad: Previous
moment: driving scene; insert Automobile ad; next moment: driving
scene. Consumer Electronics Ad:
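The pattern on the two slides above (same scene before and after, matching ad in between) can be sketched as a lookup over recognized scene labels. This is an illustrative sketch only: the `SCENE_TO_AD` mapping, the `find_ad_slots` helper, and the segment format are assumptions, not a description of how Viscovery places ads.

```python
# Hypothetical mapping from recognized scene labels to ad categories.
SCENE_TO_AD = {
    "dining": "food delivery service",
    "driving": "automobile",
}


def find_ad_slots(segments):
    """Return (time_sec, ad_category) for boundaries inside a matching scene.

    `segments` is a list of (start_sec, scene_label) tuples sorted by time,
    an assumed output format of some scene recognizer.
    """
    slots = []
    for (_, prev_label), (start, next_label) in zip(segments, segments[1:]):
        # Insert only where the previous and the next moment share a scene
        # that has an ad category mapped to it.
        if prev_label == next_label and prev_label in SCENE_TO_AD:
            slots.append((start, SCENE_TO_AD[prev_label]))
    return slots


segments = [
    (0, "street"),
    (30, "dining"),
    (45, "dining"),
    (60, "driving"),
    (75, "driving"),
]
slots = find_ad_slots(segments)
print(slots)
```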
21. Video content-related interactive shopping
22. Video content-related Recommendations
23. Video-content insights (for producers, writers, editors):
Viscovery's video insight publication on Ode to Joy 2
24. Mining Video Content with Computer Vision. 85% of data are
unstructured, e.g., videos. Previously, videos needed manual tagging
before their content could be indexed and further utilized. Computer
Vision is the AI subfield that focuses on recognizing and
understanding visual content.
25. What algorithms do we need? Face Motion Image scene Text
Audio Object Semantics
26. Where are we now? Face Object Scene Logos Text Audio Motion
Semantics
27. Where are we now? Face Recognition: 1 to 1: 99%+; 1 to 100:
90%; 1 to 10,000: 50%-70%; 1 to 1M: 30%. LFW dataset; common FN,
FP
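The difference between 1-to-1 and 1-to-N recognition on this slide can be sketched with embedding vectors and cosine similarity. This is a minimal sketch under assumptions: the three-dimensional vectors, the `0.8` threshold, and the function names are made up for illustration; real systems use learned embeddings with hundreds of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(a, b, threshold=0.8):
    """1-to-1: decide whether two embeddings belong to the same person."""
    return cosine(a, b) >= threshold


def identify(probe, gallery):
    """1-to-N: return the gallery identity most similar to the probe."""
    return max(gallery, key=lambda name: cosine(probe, gallery[name]))


# Toy gallery with made-up embeddings.
gallery = {
    "person_a": [0.9, 0.1, 0.0],
    "person_b": [0.1, 0.9, 0.1],
    "person_c": [0.0, 0.2, 0.9],
}
probe = [0.8, 0.2, 0.1]

best = identify(probe, gallery)
print(best, verify(probe, gallery[best]))
```

In 1-to-N mode every extra gallery entry is one more chance for an impostor to outscore the true match, which is consistent with the accuracy falling as N grows on the slide.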
28. Where are we now? Image Scene Classification: MIT Places 365
dataset, top-5 accuracy rates >85%.
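For readers unfamiliar with the metric, top-5 accuracy counts a prediction as correct when the true class is among the five highest-scoring classes. A small sketch with made-up scores (the `top_k_accuracy` helper and the numbers are illustrative, not from the Places 365 benchmark):

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Per-class scores for three samples over seven classes.
scores = [
    [0.05, 0.10, 0.40, 0.20, 0.15, 0.06, 0.04],  # true class 2 is ranked 1st
    [0.30, 0.25, 0.20, 0.10, 0.08, 0.05, 0.02],  # true class 2 is ranked 3rd
    [0.30, 0.25, 0.20, 0.10, 0.08, 0.05, 0.02],  # true class 6 is ranked last
]
labels = [2, 2, 6]

top1 = top_k_accuracy(scores, labels, k=1)
top5 = top_k_accuracy(scores, labels, k=5)
print(top1, top5)
```

The second sample shows why top-5 numbers are always at least as high as top-1: a near miss still counts.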
29. Where are we now? Object Detection & Classification:
ImageNet Large Scale Visual Recognition Challenge (ILSVRC), 1000+
classes, 1.2M images. [Chart: classification error and classification
+ localization error by year, '11 to '14; y-axis 0 to 0.5]
30. Putting things together is not trivial and often very
messy. Classical Workflow: 1. Data collection 2. Feature Extraction
3. Dimension Reduction 4. Classifier (re)Design 5. Classifier
Verification 6. Deploy. Modern Brute-force workflow: 1. Data
collection 2. Throw everything into a Deep Neural Network 3. Mommy,
why doesn't it work???
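The classical workflow listed above can be walked through end to end on toy data. This is a sketch under stated assumptions, not a recommended pipeline: the synthetic signals, the hand-picked statistics, the variance-based feature selection standing in for dimension reduction, and the nearest-centroid classifier are all illustrative choices.

```python
import random
import statistics


def extract_features(sample):
    # Step 2 (feature extraction): hand-crafted summary statistics.
    return [statistics.mean(sample), statistics.pstdev(sample),
            max(sample), min(sample)]


def select_top_variance(rows, keep):
    # Step 3 (dimension reduction): keep the highest-variance features.
    columns = list(zip(*rows))
    ranked = sorted(range(len(columns)),
                    key=lambda j: statistics.pvariance(columns[j]),
                    reverse=True)
    return sorted(ranked[:keep])


def fit_centroids(rows, labels):
    # Step 4 (classifier design): nearest-centroid classifier.
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [statistics.mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(row, centroids[y])))


# Step 1 (data collection): synthetic signals; class 1 has a higher mean.
rng = random.Random(0)
labels = [label for label in (0, 1) for _ in range(50)]
data = [[rng.gauss(5.0 * label, 1.0) for _ in range(20)] for label in labels]

order = list(range(len(data)))
rng.shuffle(order)
train, test = order[:70], order[70:]

features = [extract_features(sample) for sample in data]
kept = select_top_variance([features[i] for i in train], keep=2)
reduced = [[row[j] for j in kept] for row in features]
centroids = fit_centroids([reduced[i] for i in train], [labels[i] for i in train])

# Step 5 (classifier verification): accuracy on held-out samples only.
correct = sum(predict(centroids, reduced[i]) == labels[i] for i in test)
accuracy = correct / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Each step is small enough to inspect on its own, which is the main thing the brute-force workflow gives up.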
31. Classical Problem #1: Curse of Dimensionality. [Figure labels:
ze, sit, sentarse] Number of Variables vs. Number of Samples. Q. Who
would make such naive mistakes? A. Many newbies repeatedly do so.
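A quick way to see the "number of variables vs. number of samples" problem: split each variable into a few bins and count how many cells the samples would have to cover. The sketch below uses assumed numbers (200 samples, 10 bins per variable) and hypothetical helper names purely for illustration.

```python
def cells(bins_per_axis, dims):
    """Number of grid cells when every variable is split into equal bins."""
    return bins_per_axis ** dims


def coverage(n_samples, bins_per_axis, dims):
    """Upper bound on the fraction of cells that can contain a sample."""
    return min(1.0, n_samples / cells(bins_per_axis, dims))


for dims in (2, 3, 10):
    print(dims, cells(10, dims), coverage(200, 10, dims))
```

With a fixed sample budget, the reachable fraction of the input space shrinks exponentially as variables are added, so most of the space is never observed during training.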
32. Example 1-1: illegal parking detection. Legal parking
samples x100, illegal parking samples x100. Let's train a 150-layer
Res-Net!!! What could possibly go wrong?
33. Example 1-1: illegal parking detection. Data: try cleaner
data. Feature: fine-tune with pre-trained model; don't train from
scratch. Classifier overfitting: beware of statistical
coincidences.
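The warning about statistical coincidences can be made concrete: with only 200 labeled samples, a model that has enough candidate features to choose from will find some that match the labels by pure chance. The sketch below uses random binary features as a stand-in; the sample counts mirror the slide, everything else (function names, feature counts) is an assumption.

```python
import random

rng = random.Random(0)
n_samples = 200
labels = [i % 2 for i in range(n_samples)]  # 100 "legal", 100 "illegal"


def agreement(feature, labels):
    """Share of samples where a binary feature matches the label."""
    match = sum(f == y for f, y in zip(feature, labels)) / len(labels)
    return max(match, 1 - match)  # a flipped feature is equally usable


def best_random_feature(n_features):
    """Best agreement found among purely random binary features."""
    best = 0.0
    for _ in range(n_features):
        feature = [rng.randint(0, 1) for _ in range(n_samples)]
        best = max(best, agreement(feature, labels))
    return best


few = best_random_feature(10)
many = best_random_feature(2000)
print(f"best of 10 random features:   {few:.3f}")
print(f"best of 2000 random features: {many:.3f}")
```

None of these features carries any information, yet the best of a large pool looks noticeably better than a coin flip on the training set. A deep network trained from scratch on 200 images has far more than 2000 ways to find such coincidences.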
34. Example 1-2: Smart Photo Album with Google Cloud
Vision
35. Example 1-2: Smart Photo Album with Google Cloud Vision. No
effective distance measure for thousands, if not millions, of
dimensions (tags); the overlap between two photos' tags would be
approximately zero most of the time.
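The point about tag-based distances can be checked with a small simulation. This sketch assumes tags are drawn uniformly at random from a large vocabulary, which real tagging services do not do, so it overstates the effect somewhat; the vocabulary size and tags per photo are made-up parameters.

```python
import random


def jaccard(a, b):
    """Overlap between two tag sets: |intersection| / |union|."""
    return len(a & b) / len(a | b)


rng = random.Random(0)
vocabulary_size = 100_000  # assumed number of distinct tags
tags_per_photo = 10        # assumed number of tags returned per photo

photos = [set(rng.sample(range(vocabulary_size), tags_per_photo))
          for _ in range(200)]
pairs = [(i, j) for i in range(len(photos)) for j in range(i + 1, len(photos))]
zero_fraction = sum(jaccard(photos[i], photos[j]) == 0
                    for i, j in pairs) / len(pairs)
print(f"pairs with no shared tag: {zero_fraction:.3%}")
```

When almost every pair scores exactly zero, the measure cannot rank photos by similarity, which is why raw tags usually need to be mapped into a lower-dimensional space before clustering an album.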
36. Classical Problem #2: Overfitting Data. Make sure your deep
learning algorithm is learning better features for data, not
overfitting the data with complex classifiers.
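A common check for this problem is to compare accuracy on the training data with accuracy on held-out data. In the sketch below the labels are random by construction, so nothing can be learned; a 1-nearest-neighbor classifier stands in for "a complex classifier" because it memorizes the training set. All names and sizes are illustrative assumptions.

```python
import random

rng = random.Random(0)


def make_set(n, dims=20):
    # Features are pure noise, so the labels cannot be learned.
    return [([rng.random() for _ in range(dims)], rng.randint(0, 1))
            for _ in range(n)]


def predict_1nn(train, x):
    """Label of the training point closest to x (squared Euclidean distance)."""
    nearest = min(train,
                  key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)))
    return nearest[1]


def accuracy(train, data):
    return sum(predict_1nn(train, x) == y for x, y in data) / len(data)


train, held_out = make_set(200), make_set(200)
train_acc = accuracy(train, train)
held_out_acc = accuracy(train, held_out)
print(f"training accuracy: {train_acc:.2f}")
print(f"held-out accuracy: {held_out_acc:.2f}")
```

Perfect training accuracy next to chance-level held-out accuracy is the signature of a classifier fitting the data rather than learning features that generalize.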
37. Luckily, we're in an AI startup boom! (BCG AI Report, 2016/10)
[Diagram: AI industry stack. Layers: appl. layer, tech layer, infra
layer. Components: solution, platform, libraries, modules, data,
machine computing power. Players: data accumulation via open API,
AI/DNN library, gen. purpose platforms, app-specific platforms, apps,
HW co., Vertical AI Startups (agri., manu., med., fin., retail,
trans.).] E.g., 1: Google, Amazon, FB, 2: IBM, 3: Walmart, 5: NVidia
38. Vertical AI Startups: Solving industry-specific problems by
combining AI and Subject Matter Expertise. Full Stack Products;
Subject Matter Expertise; Proprietary Data; AI delivers core value
(Bradford Cross, 2017/06/14)
39. Examples of Vertical AI beating General Purpose AI
40. Examples of Vertical AI beating General Purpose AI.
Competitive Analysis: Baidu vs. Viscovery.
TOP 5 TAGS COMPARISON (FIRST LOVE DRAMA SERIES SCENE)
TAG | AD PLACEMENT VALUE | TAG | AD PLACEMENT VALUE
Person | Low | Coulee Nazha (actress) | High
Anime | Low | Sean Sun (actor) | High
Screenshot | Low | Back of smartphone | High
Cartoon | Low | Female | Medium
Adult | Medium | Young | Medium
TOP 5 TAGS COMPARISON
TAG (Man's Face) | AD PLACEMENT VALUE | TAG | AD PLACEMENT VALUE
Age: 32 | Medium | Necklace | High
Asian | Medium | Baseball cap | High
Male | Medium | Bracelet | High
Not smiling | Low (inaccurate) | Ziwen Wang | High
41. Use AI to turn unstructured video data into a gold mine!
[Figure: Smartphone Ad placed along a video timeline, 0 mins to 60
mins, three cases.] Using only physical tags for recommendation: CTR
0.2%. Physical plus abstract and emotional tags: CTR 0.9%. Physical,
abstract and emotional tags plus feedback: CTR 2.0%.