Optimizing Profit for the Movie Theatre
김가영 박수현 박현도 성훈 이재완 Keynote by : 진겸
> Which variables are needed (can be used)
> Where to find databases of actually usable variables
> How to make use of the obtained variables
> What kind of analysis techniques we should use
> How to partition the screens in the cinema
> What kind of algorithm to use for movie allocation
> What kind of data structure to use for storing movies
> What kind of data structure to use for scheduling movies
1 Setting Goals
2 Fixing (reinitializing) variables
3 Analyzing and predicting the box office
4 Scheduling the movies
Goals
Profit to variety
Discard immeasurable states
Predicting the Box-office
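Variables
Director
Weather
Film rating (age classification)
Public holidays
Political climate
Region
Number of seats
Review score
Actors
Time
Search volume
Distributor
Marketing
Promotion
Genre
Country of origin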
Variables: usable variables
Movie’s Inherent Specification
Quantitative Value
Lead actor recognition
Director recognition
Distributor
Supporting actor recognition
Social Buzz
Audience rating
Franchise
Pre-release rating
Need to modify variables for actual analysis!
* variables will be continuously modified and changed
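First-week audience count after release (share of revenue)
Actor recognition
- Assign a discrete score based on the audience numbers (2 million) of the works the actor appeared in over the last 5 years.
- Assign a score by search-term ranking on portals, search engines, and SNS.
- Assign a score based on meaningful film awards.
Director recognition
- Assign a discrete score based on the audience numbers of the director's works over the last 5 years.
- Assign a score based on meaningful film awards.
Distributor
- Assign a discrete score based on the audience numbers of the works it released.
- More search and discussion needed
Franchise
- More search and discussion needed
Social Buzz
- Google/Naver Search Results
- Twitter API/Exclusive sites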
Variables need correct scope, standard, interval
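The discrete recognition score described above could be sketched as follows. This is only one possible reading: it assumes one point per film released in the last 5 years that reached the 2 million audience mark. The function name, the `(release_year, audience)` input format, and the sample numbers are hypothetical.

```python
def recognition_score(films, current_year, threshold=2_000_000, window=5):
    """Count films inside the time window whose audience reached the threshold.

    films: list of (release_year, audience) tuples (hypothetical format).
    The one-point-per-film rule is an assumption, not the deck's final rule.
    """
    return sum(
        1
        for year, audience in films
        if current_year - window < year <= current_year and audience >= threshold
    )


# Hypothetical filmography for one actor.
films = [(2016, 3_500_000), (2014, 1_200_000), (2013, 2_000_000), (2010, 9_000_000)]
score = recognition_score(films, current_year=2016)
```

Changing `threshold` or `window` is one way to experiment with the scope, standard, and interval of the variable.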
Where to get?
Audience share: http://www.kofic.or.kr/kofic/business/infm/introBoxOffice.do
Ratings: http://movie.naver.com/
Director recognition: http://www.kobis.or.kr/kobis/business/mast/peop/searchPeopleList.do
Actor recognition: http://www.kobis.or.kr/kobis/business/mast/peop/searchPeopleList.do and http://www.kobis.or.kr/kobis/business/mast/mvie/searchMovieList.do
Distributor information: http://www.kobis.or.kr/kobis/business/mast/mvie/searchMovieList.do
Buzz volume: http://snsbuzz.com/m_index.php and https://www.tibuzz.co.kr/
Cinema admission ticket computer network (KOBIS)
How can we get meaningful results?
Regression Analysis
- Linear regression
- Logistic regression
- Bass diffusion
- Multiple regression
2D data: linear, with a positive correlation
Linear Regression
Dependent variable (output): predicted value
Parameter vector (what we want to find): Coefficient 1, Coefficient 2, …
Independent variables (data input): Variable 1, …, Variable p
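The slide's own equation did not survive extraction. As a sketch, the standard linear model that these labels describe can be written as below; the symbols $\hat{y}$, $x_j$, and $w_j$ are assumed notation, not taken from the deck.

```latex
\hat{y} = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_p x_p = \mathbf{w}^{\top}\mathbf{x},
\qquad \mathbf{x} = (1, x_1, \dots, x_p)^{\top},
\quad \mathbf{w} = (w_0, w_1, \dots, w_p)^{\top}
```

Here $\hat{y}$ is the predicted value, $x_1, \dots, x_p$ are the independent variables, and $\mathbf{w}$ is the parameter vector to be found.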
Multiple Linear Regression: when there are many variables…
Mean Squared Error (prediction vs. real value)
Optimization by differentiating with respect to w
OLS regression
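A minimal sketch of an OLS fit, assuming NumPy is available. `np.linalg.lstsq` returns the least-squares solution, which is the `w` that minimizes the mean squared error. The function names and the toy data are hypothetical.

```python
import numpy as np


def fit_ols(X, y):
    """Return w = (w0, w1, ..., wp) minimizing the mean squared error."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that w[0] acts as the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def predict(w, X):
    """Apply the fitted linear model to new rows of X."""
    X = np.asarray(X, dtype=float)
    return w[0] + X @ w[1:]


def mse(y_true, y_pred):
    """Mean squared error between real values and predictions."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(diff ** 2))


# Toy data generated from y = 1 + 2*x1 + 3*x2 (hypothetical, noise-free).
X = [[1, 2], [2, 1], [3, 5], [4, 3]]
y = [1 + 2 * a + 3 * b for a, b in X]
w = fit_ols(X, y)
error = mse(y, predict(w, X))
```

With real box-office data the error will not be zero, and the columns of `X` would be the scored variables such as actor and director recognition.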
What if the data is not linear?
Nonlinear data can be linearized by raising the variables to powers.
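As a sketch of that idea, assuming NumPy: the powers of a variable are added as extra columns, and the same least-squares fit is applied, because the model stays linear in the coefficients. The degree, function name, and toy data are hypothetical.

```python
import numpy as np


def fit_polynomial(x, y, degree):
    """Fit y ~ w0 + w1*x + ... + wd*x**d with ordinary least squares."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Columns are x**0, x**1, ..., x**degree: nonlinear in x, linear in w.
    A = np.column_stack([x ** k for k in range(degree + 1)])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


# Toy data from y = 1 + 2*x**2 (hypothetical, noise-free).
x = [0, 1, 2, 3, 4]
y = [1 + 2 * v ** 2 for v in x]
w = fit_polynomial(x, y, degree=2)
```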
Regression Results >
Movie 4 : 1.56
Movie 1 : 1.17
Movie 2 : 1.03
…
Movie 3 : 0.91
The results need to be sorted in order.
Tournament Tree
Choosing the winner: total sort time of O(n log n)
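A minimal sketch of sorting with a tournament (winner) tree. Each extraction replays only one leaf-to-root path, so n extractions cost O(n log n) in total. The function name and the dictionary input are hypothetical; the scores are the regression results shown earlier.

```python
def tournament_sort_desc(scores):
    """Sort (title, score) pairs by score, highest first, using a winner tree."""
    items = list(scores.items())
    n = len(items)
    if n == 0:
        return []
    size = 1
    while size < n:
        size *= 2
    # tree[node] stores the index (into items) of the winner of that subtree.
    tree = [None] * (2 * size)
    for i in range(n):
        tree[size + i] = i

    def winner_of(a, b):
        # None marks an empty slot; otherwise the higher score wins.
        if a is None:
            return b
        if b is None:
            return a
        return a if items[a][1] >= items[b][1] else b

    # Play the initial tournament bottom-up.
    for node in range(size - 1, 0, -1):
        tree[node] = winner_of(tree[2 * node], tree[2 * node + 1])

    result = []
    for _ in range(n):
        champion = tree[1]
        result.append(items[champion])
        # Remove the champion's leaf and replay only its path to the root.
        node = size + champion
        tree[node] = None
        node //= 2
        while node >= 1:
            tree[node] = winner_of(tree[2 * node], tree[2 * node + 1])
            node //= 2
    return result


results = {"Movie 1": 1.17, "Movie 2": 1.03, "Movie 3": 0.91, "Movie 4": 1.56}
ranking = tournament_sort_desc(results)
```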
Movie 1, Movie 2, Movie 3, Movie 4, …, Movie n
31% 24% 20% 15%
90% cutting point
Regression Results >
Movie 1 : 1.56
Movie 2 : 1.17
Movie 3 : 1.03
…
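One possible reading of the 90% cutting point is sketched below: movies are kept in descending order of share until the cumulative share reaches the cutoff. This interpretation, the function name, and the entries for Movie 5 and Movie 6 are assumptions added for illustration.

```python
def cut_at_share(shares, cutoff=90):
    """Keep the top movies until their cumulative share (in %) reaches the cutoff."""
    kept = []
    total = 0
    for title, percent in sorted(shares, key=lambda s: s[1], reverse=True):
        if total >= cutoff:
            break
        kept.append(title)
        total += percent
    return kept


# The first four shares come from the slide; Movie 5 and Movie 6 are hypothetical.
shares = [
    ("Movie 1", 31), ("Movie 2", 24), ("Movie 3", 20),
    ("Movie 4", 15), ("Movie 5", 6), ("Movie 6", 4),
]
selected = cut_at_share(shares)
```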
Problems we might face
Data modification is too hard for analysis
R^2 not at the right precision (lower than 0.65)
We simply fail to implement it
-> Mechanical Turk: configure parameters on our own
Not enough training data
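For the R^2 concern above, a minimal sketch of how the value could be computed and compared with the 0.65 threshold. The function name and the sample numbers are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all real values are identical")
    return 1 - ss_res / ss_tot


# Hypothetical real values and model predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y_true, y_pred)
acceptable = r2 >= 0.65  # threshold mentioned on the slide
```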
Requisite Skillsets
TensorFlow
SciPy
Time Scheduling
Movie 1, Movie 2, Movie 3, Movie 4
31% 24% 20% 15%
Screen1 Screen2 Screen3 Screen4 Screen5 Screen6
Hypothesis
> People are typical and like to follow trends (people in Gangnam)
> No special cases
> All days are the same
Division by Time Zone
A different weight is put on each time zone, and a shelf algorithm is used to fit movies into each time zone. Also, more popular movies are placed more often at peak times.
Screen1 Screen2 Screen3 Screen4 Screen5 Screen6
Time zones: morning, forenoon, afternoon, peak time, late night
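A rough sketch of the fitting step for a single screen. Shelf algorithms are usually described for 2D packing; here each time zone is treated as a one-dimensional shelf with a fixed length in minutes, and showings are placed first-fit, most popular first, into the heaviest zone that still has room. The zone weights, zone lengths, runtimes, and function name are all hypothetical.

```python
def shelf_schedule(zones, showings):
    """Place showings into time zones, first-fit.

    zones: list of (name, weight, minutes); heavier zones are filled first.
    showings: list of (title, popularity, runtime_minutes).
    Showings that fit in no zone are left out of the plan.
    """
    ordered_zones = sorted(zones, key=lambda z: z[1], reverse=True)
    remaining = {name: minutes for name, _, minutes in ordered_zones}
    plan = {name: [] for name, _, _ in ordered_zones}
    for title, _, runtime in sorted(showings, key=lambda s: s[1], reverse=True):
        for name, _, _ in ordered_zones:
            if runtime <= remaining[name]:
                plan[name].append(title)
                remaining[name] -= runtime
                break
    return plan


# Hypothetical weights, zone lengths, and runtimes.
zones = [("morning", 1, 180), ("peak time", 3, 240), ("afternoon", 2, 240)]
showings = [
    ("Movie 1", 0.31, 120), ("Movie 2", 0.24, 130),
    ("Movie 3", 0.20, 100), ("Movie 4", 0.15, 110),
]
plan = shelf_schedule(zones, showings)
```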
Division by Time Zone
Too complex: the ratio of each movie needs to differ per time zone.
[Figure: Movie1 to Movie4 placed across Screen1 to Screen6, with peak time marked]
Division by Screens (with a variety of seat counts)
A different weight is put on each screen by varying the number of seats. Then the movie with the most weight is assigned to the screen with the most seats.
Seats per screen: 600 400 300 300 200 150
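A minimal sketch of this assignment. Because there are more screens than movies, it assumes a greedy rule that is not stated in the deck: each movie gets a seat target proportional to its weight, and the largest remaining screen goes to the movie furthest below its target. The function name is hypothetical; the seat counts and weights are the figures shown on the slides.

```python
def assign_screens(seats, weights):
    """Assign each screen (largest first) to the movie with the largest seat deficit.

    seats: list of seat counts per screen.
    weights: dict of movie -> weight (for example, predicted share).
    Returns a list of (seat_count, movie) pairs.
    """
    total_seats = sum(seats)
    total_weight = sum(weights.values())
    # Target number of seats per movie, proportional to its weight (assumed rule).
    deficit = {m: w / total_weight * total_seats for m, w in weights.items()}
    assignment = []
    for capacity in sorted(seats, reverse=True):
        movie = max(deficit, key=deficit.get)
        assignment.append((capacity, movie))
        deficit[movie] -= capacity
    return assignment


seats = [600, 400, 300, 300, 200, 150]
weights = {"Movie1": 31, "Movie2": 24, "Movie3": 20, "Movie4": 15}
assignment = assign_screens(seats, weights)
```

The movie with the most weight still lands on the screen with the most seats, as the slide describes.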
[Figure: Movie1 to Movie4 assigned to the six screens by seat count]
The problem with this approach is that “time” is not considered at all.
Combining Time Zones with Screen Division
On top of the partition by screens, we can create another layer of time zones. Depending on the time zone, we can switch movies based on fixed algorithms.
Seats per screen: 600 400 300 300 200 150
[Figure: Movie1 to Movie4 assigned to the six screens, with movies switched by time zone]
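A sketch of the two-layer idea: a base assignment per screen, plus one fixed switching rule per time zone. The rule used here (at peak time, screens showing the least popular movie switch to the most popular one) is purely hypothetical, as are the function name and the sample data. The deck leaves the actual rule open.

```python
def layered_schedule(base, zones, peak_zone, ranking):
    """Build a zone -> {screen: movie} grid from a per-screen base assignment.

    base: dict of screen -> movie (the screen-division layer).
    zones: list of time zone names (the time layer).
    ranking: movies ordered from most to least popular.
    """
    top, bottom = ranking[0], ranking[-1]
    grid = {}
    for zone in zones:
        if zone == peak_zone:
            # Hypothetical rule: replace the least popular movie at peak time.
            grid[zone] = {s: (top if m == bottom else m) for s, m in base.items()}
        else:
            grid[zone] = dict(base)
    return grid


base = {"Screen1": "Movie1", "Screen2": "Movie2", "Screen3": "Movie3", "Screen4": "Movie4"}
grid = layered_schedule(
    base,
    zones=["morning", "peak time"],
    peak_zone="peak time",
    ranking=["Movie1", "Movie2", "Movie3", "Movie4"],
)
```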
More discussion is needed on
> How to split the partitions of screens
> What variables should be considered
> Which algorithm should be used on the structure
> What we do with leftover time (how to use it effectively)
Reference sites
Thank you