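Date: 2015/06/04. Presenter: @_kobacky. Venue: ALBERT Inc. seminar room. Reading group for 「トピックモデルによる統計的潜在意味解析」 (Statistical Latent Semantic Analysis with Topic Models), Chapter 2 "Latent Dirichlet Allocation" (first half). [Note] This material records what @_kobacky understood from reading the book. Some parts contain statements that do not appear in the book, or wording that differs from the book. Amazon: http://www.amazon.co.jp/dp/4339027588/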

Reading group for 「トピックモデルによる統計的潜在意味解析」 (Statistical Latent Semantic Analysis with Topic Models): Chapter 2, first half


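1. Title slide. 2015/06/04, @_kobacky, ALBERT. Chapter 2, "Latent Dirichlet Allocation" (first half). Amazon: http://www.amazon.co.jp/dp/4339027588/

2. Chapter 2, Latent Dirichlet Allocation: contents. 2.1 Overview; 2.2 The multinomial and Dirichlet distributions; 2.3 The generative process of LDA.

3. 2.1 Overview. The input to LDA is a Bag of Words (BoW) representation. LDA also applies to other bag representations, such as a Bag of Items or, more generally, a Bag of XXX.

4. 2.2 Multinomial and Dirichlet distributions. A multinomial distribution over $K$ outcomes has the parameter $\boldsymbol{\theta} = (\theta_1, \theta_2, \ldots, \theta_K)$ with $\sum_{k=1}^{K} \theta_k = 1$. The Dirichlet distribution is a distribution that generates such a $\boldsymbol{\theta}$.

5. 2.2 Overview diagram for $K = 3$, built up step by step on slides 6 to 11. Dirichlet parameter $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$, e.g. (4,3,3) or (1,1,10). Multinomial parameter $\boldsymbol{\theta} = (\theta_1, \theta_2, \theta_3)$, e.g. (0.7,0.1,0.2) or (0.1,0.1,0.8), a point on the simplex with vertices (1,0,0), (0,1,0), (0,0,1). Single outcome $x_i \in \{1, 2, 3\}$. Counts $\{n_k\}_{k=1}^{K} = \{n_1, n_2, n_3\}$, e.g. {2,1,1}, {4,0,0}, {3,1,0} or {2,2,0}.

6. 2.2 Diagram, step 1: the Dirichlet parameter $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$, e.g. (4,3,3) or (1,1,10) ($K = 3$).

7. 2.2 Diagram, step 2: the simplex with vertices (1,0,0), (0,1,0), (0,0,1). The Dirichlet distribution generates three-dimensional points on this simplex, whose components sum to 1.

8. 2.2 Diagram, step 3: $\boldsymbol{\theta} = (\theta_1, \theta_2, \theta_3)$ generated from the Dirichlet distribution, e.g. (0.7,0.1,0.2) or (0.1,0.1,0.8).

9. 2.2 Diagram, step 4: the outcome $x_i \in \{1, 2, 3\}$ of one trial under $\boldsymbol{\theta}$.

10. 2.2 Diagram, step 5: the counts $\{n_k\}_{k=1}^{K} = \{n_1, n_2, n_3\}$ from 4 trials ($n = 4$), e.g. {2,1,1}, {4,0,0}, {3,1,0} or {2,2,0}.

11. 2.2 Diagram, complete. The Dirichlet distribution generates $\boldsymbol{\theta}$; $x_i$ is the outcome of a single trial ($n = 1$); $\{n_k\}$ are the counts of each outcome over $n$ trials (multinomial distribution).

12. 2.2 Equation (2.1): probability of an observed sequence $\boldsymbol{x} = (x_1, x_2, x_3, \ldots, x_n)$.
$$p(\boldsymbol{x} \mid \boldsymbol{\theta}) = \prod_{i=1}^{n} p(x_i \mid \boldsymbol{\theta}) = \prod_{k=1}^{K} \theta_k^{n_k} \quad (2.1)$$
Example with $\boldsymbol{\theta} = (0.7, 0.1, 0.2)$ and $\boldsymbol{x} = (1, 2, 1, 3)$:
$$p(\boldsymbol{x} = (1,2,1,3) \mid \boldsymbol{\theta}) = 0.7 \times 0.1 \times 0.7 \times 0.2 = 0.7^{2} \times 0.1^{1} \times 0.2^{1}$$

13. 2.2 Equation (2.2): the multinomial distribution over counts.
$$p(\{n_k\}_{k=1}^{K} \mid \boldsymbol{\theta}, n) = \mathrm{Multi}(\{n_k\}_{k=1}^{K} \mid \boldsymbol{\theta}, n) \equiv \frac{n!}{\prod_{k=1}^{K} n_k!} \prod_{k=1}^{K} \theta_k^{n_k} \quad (2.2)$$
Example with $\boldsymbol{\theta} = (0.7, 0.1, 0.2)$, where outcome 1 occurs twice and outcomes 2 and 3 occur once each:
$$p(\{2,1,1\} \mid \boldsymbol{\theta}, 4) = {}_4C_2 \times {}_2C_1 \times {}_1C_1 \times (0.7^{2} \times 0.1^{1} \times 0.2^{1}) = \frac{4!}{2!\,1!\,1!} (0.7^{2} \times 0.1^{1} \times 0.2^{1})$$

14. 2.2 Equation (2.3): the distribution of a single outcome, $x_i \sim \mathrm{Multi}(x_i \mid \boldsymbol{\theta})$. For a single trial, $n_k = 1$ for the observed outcome $k$ and $n_{k'} = 0$ for every other outcome, so
$$p(x_i = k \mid \boldsymbol{\theta}) = \mathrm{Multi}(n_k = 1 \mid \boldsymbol{\theta}, 1) = \frac{1!}{\prod_{k=1}^{K} n_k!} \prod_{k=1}^{K} \theta_k^{n_k} = \theta_k$$
$$p(x_i \mid \boldsymbol{\theta}) = \mathrm{Multi}(x_i \mid \boldsymbol{\theta}) \quad (2.3)$$

15. 2.2 Equations (2.4) to (2.6): the Dirichlet distribution.
$$p(\boldsymbol{\theta} \mid \boldsymbol{\alpha}) = \mathrm{Dir}(\boldsymbol{\theta} \mid \boldsymbol{\alpha}) \equiv \frac{\Gamma\left(\sum_{k=1}^{K} \alpha_k\right)}{\prod_{k=1}^{K} \Gamma(\alpha_k)} \prod_{k=1}^{K} \theta_k^{\alpha_k - 1} \quad (2.4)$$
Gamma function properties: $\Gamma(1) = 1$, $\Gamma(n) = (n-1)\,\Gamma(n-1)$, $\Gamma(\alpha + n) = (\alpha + n - 1)\,\Gamma(\alpha + n - 1)$.
Expectation and variance, equations (2.5) and (2.6):
$$E[\theta_k] = \frac{\alpha_k}{\alpha_0}, \qquad V[\theta_k] = \frac{\alpha_k (\alpha_0 - \alpha_k)}{\alpha_0^{2} (1 + \alpha_0)}, \qquad \alpha_0 = \sum_{k=1}^{K} \alpha_k$$

16. 2.2 Conjugate prior. With prior $p(x)$ and likelihood $p(y \mid x)$, the posterior is $p(x \mid y) \propto p(y \mid x)\,p(x)$. When the posterior has the same form as the prior $p(x)$, that prior is conjugate to $p(y \mid x)$. Here $x_i \sim \mathrm{Multi}(x_i \mid \boldsymbol{\theta})$ with the Dirichlet prior (2.4), and the posterior is
$$p(\boldsymbol{\theta} \mid \boldsymbol{x}, \boldsymbol{\alpha}) = \frac{\Gamma\left(\sum_{k=1}^{K} (n_k + \alpha_k)\right)}{\prod_{k=1}^{K} \Gamma(n_k + \alpha_k)} \prod_{k=1}^{K} \theta_k^{n_k + \alpha_k - 1} \quad (2.9)$$
which is again a Dirichlet distribution, with $\alpha_k$ replaced by $n_k + \alpha_k$.

17. 2.2 Equation (2.7) (derivation of the posterior).
$$p(\boldsymbol{\theta} \mid \boldsymbol{x}, \boldsymbol{\alpha}) = \frac{p(\boldsymbol{x}, \boldsymbol{\theta} \mid \boldsymbol{\alpha})}{p(\boldsymbol{x} \mid \boldsymbol{\alpha})}$$
$$p(\boldsymbol{x}, \boldsymbol{\theta} \mid \boldsymbol{\alpha}) = p(\boldsymbol{x} \mid \boldsymbol{\theta})\,p(\boldsymbol{\theta} \mid \boldsymbol{\alpha}) = \prod_{i=1}^{n} p(x_i \mid \boldsymbol{\theta})\,p(\boldsymbol{\theta} \mid \boldsymbol{\alpha}) = \prod_{i=1}^{n} \prod_{k=1}^{K} \theta_k^{\delta(x_i = k)}\,p(\boldsymbol{\theta} \mid \boldsymbol{\alpha})$$
$$= \prod_{k=1}^{K} \theta_k^{\sum_{i=1}^{n} \delta(x_i = k)}\,p(\boldsymbol{\theta} \mid \boldsymbol{\alpha}) = \prod_{k=1}^{K} \theta_k^{n_k}\,p(\boldsymbol{\theta} \mid \boldsymbol{\alpha}) \propto \prod_{k=1}^{K} \theta_k^{n_k} \prod_{k=1}^{K} \theta_k^{\alpha_k - 1} = \prod_{k=1}^{K} \theta_k^{n_k + \alpha_k - 1}$$
Here $p(\boldsymbol{\theta} \mid \boldsymbol{\alpha})$ is the prior, and $p(\boldsymbol{x} \mid \boldsymbol{\alpha})$ does not depend on $\boldsymbol{\theta}$.

18. 2.2 Equations (2.8) and (2.9) (derivation, continued). From $p(\boldsymbol{\theta} \mid \boldsymbol{x}, \boldsymbol{\alpha}) \propto \prod_{k=1}^{K} \theta_k^{n_k + \alpha_k - 1}$ and $\int p(\boldsymbol{\theta} \mid \boldsymbol{x}, \boldsymbol{\alpha})\,d\boldsymbol{\theta} = 1$:
$$p(\boldsymbol{\theta} \mid \boldsymbol{x}, \boldsymbol{\alpha}) = \frac{\prod_{k=1}^{K} \theta_k^{n_k + \alpha_k - 1}}{\int \prod_{k=1}^{K} \theta_k^{n_k + \alpha_k - 1}\,d\boldsymbol{\theta}} \quad (2.8)$$
$$p(\boldsymbol{\theta} \mid \boldsymbol{x}, \boldsymbol{\alpha}) = \frac{\Gamma\left(\sum_{k=1}^{K} (n_k + \alpha_k)\right)}{\prod_{k=1}^{K} \Gamma(n_k + \alpha_k)} \prod_{k=1}^{K} \theta_k^{n_k + \alpha_k - 1} \quad (2.9)$$
so $p(\boldsymbol{\theta} \mid \boldsymbol{x}, \boldsymbol{\alpha})$ is a Dirichlet distribution.

19. 2.2 Equation (2.10): the posterior expectation (compare (2.6)).
$$E_{p(\boldsymbol{\theta} \mid \boldsymbol{x}, \boldsymbol{\alpha})}[\theta_k] = \int \theta_k\,p(\boldsymbol{\theta} \mid \boldsymbol{x}, \boldsymbol{\alpha})\,d\boldsymbol{\theta} = \frac{n_k + \alpha_k}{\sum_{k'=1}^{K} (n_{k'} + \alpha_{k'})} \quad (2.10)$$

20. 2.2 Supplementary note on p. 30 of the book, concerning the Dirichlet distribution.

21. 2.3 The generative process of LDA: notation.
$\phi_{k,v}$: probability of word $v$ in topic $k$, with $\boldsymbol{\phi}_k = (\phi_{k,1}, \ldots, \phi_{k,V})$ and $v \in \{1, 2, 3, \ldots, V\}$.
$z_{d,i} \in \{1, 2, 3, \ldots, K\}$: topic assigned to the $i$-th word of document $d$.
$w_{d,i}$: the $i$-th word of document $d$.

22. 2.3 Example of LDA applied to the TASA corpus: words with their assigned topic numbers.
slope (071), music (077), concert (077), play (077), jazz (077), periods (078), audiences (082), play (082), play (082), read (254), game (166), comes (040), Don (180), play (166), boys (020).
The same word "play" is assigned to different topics (077, 082, 166) depending on its context.

23. 2.3 Example of LDA applied to the TASA corpus: word distributions of three topics.
Topic 77 ($\phi_{77,v}$): MUSIC .090, DANCE .034, SONG .033, PLAY .030, SING .026
Topic 82 ($\phi_{82,v}$): LITERATURE .031, POEM .028, PLAY .015, LITERARY .013
Topic 166 ($\phi_{166,v}$): PLAY .136, BALL .129, GAME .065, PLAYING .042, HIT .032

24. 2.3 The generative process of LDA, for $M$ documents, $K$ topics, a vocabulary of size $V$, and $n_d$ words in document $d$:
$\boldsymbol{\theta}_d = (\theta_{d,1}, \ldots, \theta_{d,K}) \sim \mathrm{Dir}(\boldsymbol{\alpha})$ for $d = 1, 2, \ldots, M$
$\boldsymbol{\phi}_k = (\phi_{k,1}, \ldots, \phi_{k,V}) \sim \mathrm{Dir}(\boldsymbol{\beta})$ for $k = 1, 2, \ldots, K$
$z_{d,i} \sim \mathrm{Multi}(\boldsymbol{\theta}_d)$ and $w_{d,i} \sim \mathrm{Multi}(\boldsymbol{\phi}_{z_{d,i}})$ for $i = 1, 2, \ldots, n_d$

25. 2.3 Graphical model of LDA. Nodes: $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$, $\boldsymbol{\theta}_d$, $\boldsymbol{\phi}_k$, $z_{d,i}$, $w_{d,i}$. Plates: $M$, $K$, $n_d$.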
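The generative process on slides 24 and 25 can be sketched in Python with NumPy. This is a minimal illustration, not code from the book or the slides: the function name `generate_lda_corpus`, the hyperparameter values, and the corpus sizes are all assumptions chosen for the example. Topics and words are 0-indexed here, whereas the slides count from 1.

```python
import numpy as np


def generate_lda_corpus(M, K, V, doc_lengths, alpha, beta, seed=0):
    """Sample a toy corpus from the LDA generative process.

    M: number of documents, K: number of topics, V: vocabulary size,
    doc_lengths: n_d for each document,
    alpha: length-K Dirichlet parameter, beta: length-V Dirichlet parameter.
    All names are illustrative, not taken from the book.
    """
    rng = np.random.default_rng(seed)
    # phi_k ~ Dir(beta): one word distribution per topic, shape (K, V)
    phi = rng.dirichlet(beta, size=K)
    # theta_d ~ Dir(alpha): one topic distribution per document, shape (M, K)
    theta = rng.dirichlet(alpha, size=M)
    topics, words = [], []
    for d in range(M):
        # z_{d,i} ~ Multi(theta_d): a topic for each word position
        z_d = rng.choice(K, size=doc_lengths[d], p=theta[d])
        # w_{d,i} ~ Multi(phi_{z_{d,i}}): a word drawn from the chosen topic
        w_d = np.array([rng.choice(V, p=phi[k]) for k in z_d], dtype=int)
        topics.append(z_d)
        words.append(w_d)
    return theta, phi, topics, words


theta, phi, topics, words = generate_lda_corpus(
    M=3, K=2, V=5, doc_lengths=[4, 6, 5],
    alpha=np.full(2, 0.5), beta=np.full(5, 0.5),
)
for d, (z_d, w_d) in enumerate(zip(topics, words)):
    print(f"document {d}: topics={z_d.tolist()} words={w_d.tolist()}")
```

Each row of `theta` and `phi` sums to 1, which matches the simplex constraint from Section 2.2. Inference goes in the opposite direction (from `words` back to `theta`, `phi` and `topics`) and is not covered by this sketch.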
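The worked numbers from Section 2.2 (slides 12, 13 and 19) can be checked with a few lines of Python. The sketch below uses only the standard library. The helper names `sequence_likelihood`, `multinomial_pmf` and `posterior_mean` are hypothetical, and pairing $\boldsymbol{\theta} = (0.7, 0.1, 0.2)$ with $\boldsymbol{\alpha} = (4, 3, 3)$ for the posterior step is an assumption made for illustration, since the slides list those values only as separate examples.

```python
from collections import Counter
from math import factorial, prod


def sequence_likelihood(x, theta):
    """Equation (2.1): probability of the ordered sequence x given theta."""
    # Outcomes are 1-indexed, as on the slides.
    return prod(theta[k - 1] for k in x)


def multinomial_pmf(counts, theta):
    """Equation (2.2): probability of the counts {n_k} over n trials."""
    coef = factorial(sum(counts))
    for n_k in counts:
        coef //= factorial(n_k)  # exact integer division at every step
    return coef * prod(t ** n_k for t, n_k in zip(theta, counts))


def posterior_mean(counts, alpha):
    """Equation (2.10): (n_k + alpha_k) / sum over k' of (n_k' + alpha_k')."""
    total = sum(counts) + sum(alpha)
    return [(n_k + a_k) / total for n_k, a_k in zip(counts, alpha)]


theta = (0.7, 0.1, 0.2)
alpha = (4, 3, 3)
x = (1, 2, 1, 3)
tally = Counter(x)
counts = [tally[k] for k in (1, 2, 3)]  # {n_k} = {2, 1, 1}

print(sequence_likelihood(x, theta))   # 0.7 * 0.1 * 0.7 * 0.2
print(multinomial_pmf(counts, theta))  # 4!/(2!1!1!) * 0.7^2 * 0.1 * 0.2
print(posterior_mean(counts, alpha))   # mean of the posterior Dir(6, 4, 4)
```

With these inputs the posterior parameters are $n_k + \alpha_k = (6, 4, 4)$, so the posterior mean is $(6/14, 4/14, 4/14)$. The multinomial probability is 12 times the sequence likelihood because 12 distinct orderings produce the counts {2,1,1}.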