Introducing Latent Semantic Analysis


References:
– Thomas K. Landauer et al., "An introduction to latent semantic analysis," Discourse Processes, Vol. 25 (2-3), pp. 259-284, 1998.
– Scott Deerwester et al., "Indexing by latent semantic analysis," Journal of the American Society for Information Science, Vol. 41 (6), pp. 391-407, 1990.
– Kirk Baker, "Singular Value Decomposition Tutorial," electronic document, 2005.

Hee-Gook Jun, Aug 22, 2014


Outline

– SVD
– SVD to LSA
– Conclusion


Eigendecomposition vs. Singular Value Decomposition

Eigendecomposition: A = P Λ P⁻¹
– The matrix must be diagonalizable
– The matrix must be square
– An n × n matrix must have n linearly independent eigenvectors (e.g., a symmetric matrix)

Singular Value Decomposition: A = U Σ Vᵀ
– Computable for a matrix of any size (m × n)
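A minimal R sketch of the contrast (the matrix here is an assumed example, not from the slides): eigen() requires a square matrix, while svd() accepts any rectangular one.

- R code -
A <- matrix(c(3, 4, 1, 0,
              4, 3, 0, 1), nrow = 2, byrow = TRUE)  # 2 x 4, not square

eigen(A %*% t(A))  # eigendecomposition needs the square 2 x 2 matrix A A^T
svd(A)             # SVD factorizes the rectangular A directly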

4 / 25

U: Left Singular Vectors of A

Unitary matrix
– Columns of U are orthonormal (orthogonal + normal)
– They are the orthonormal eigenvectors of AAᵀ

A = U Σ Vᵀ

Example: u1 = [0, 0, 0, 1] and u2 = [0, 1, 0, 0] are orthogonal:
u1 · u2 = (0 × 0) + (0 × 1) + (0 × 0) + (1 × 0) = 0
and u1 is a normal (unit) vector:
||u1|| = √(0² + 0² + 0² + 1²) = 1
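A quick R check of this property (the input is an arbitrary random matrix, assumed for illustration): for the U returned by svd(), UᵀU is the identity.

- R code -
s <- svd(matrix(rnorm(24), nrow = 6))  # any 6 x 4 matrix (random example)
round(crossprod(s$u), 10)              # t(U) %*% U: the 4 x 4 identity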


V: Right Singular Vectors of A

Unitary matrix
– Columns of V are orthonormal (orthogonal + normal)
– They are the orthonormal eigenvectors of AᵀA

A = U Σ Vᵀ


Σ (or S)

Diagonal matrix
– The diagonal entries are the singular values of A

Singular values
– The non-zero singular values are the square roots of the eigenvalues of AAᵀ (equivalently, of AᵀA), listed in descending order

A = U Σ Vᵀ


Calculation Procedure

1. U is the list of eigenvectors of AAᵀ
   1. Compute AAᵀ
   2. Compute the eigenvectors of AAᵀ
   3. Orthonormalize them
2. V is the list of eigenvectors of AᵀA
   1. Compute AᵀA
   2. Compute the eigenvectors of AᵀA
   3. Orthonormalize and transpose
3. Σ holds the square roots of the eigenvalues of AAᵀ (which equal the eigenvalues of AᵀA)

A = U Σ Vᵀ (① U, ② Σ, ③ Vᵀ)
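The whole procedure fits in a few lines of R. This is a sketch on a small assumed matrix (the slides' own worked numbers did not survive extraction); deriving V from U as V = AᵀU Σ⁻¹ keeps the eigenvector signs consistent.

- R code -
A  <- matrix(c(2, 2,
               1, -1), nrow = 2, byrow = TRUE)  # assumed example
eu <- eigen(A %*% t(A))           # step 1: eigenvectors of A A^T give U
U  <- eu$vectors
d  <- sqrt(eu$values)             # step 3: singular values, descending
V  <- t(A) %*% U %*% diag(1 / d)  # step 2: V with signs matching U
U %*% diag(d) %*% t(V)            # reconstructs A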


1.1 Matrix U – Compute AAᵀ

Start with the matrix A and compute its transpose Aᵀ. Then form the product AAᵀ.


1.2 Matrix U – Eigenvectors and Eigenvalues [1/2]

Eigenvector
– A nonzero vector v that satisfies the equation Av = λv
– A is a square matrix, λ is an eigenvalue (a scalar), and v is the corresponding eigenvector

Rearranging gives (A − λI)v = 0; for a nonzero solution v, set the determinant of the coefficient matrix to zero: det(A − λI) = 0.


1.2 Matrix U – Eigenvectors and Eigenvalues [2/2]

For each calculated eigenvalue, solve (A − λI)v = 0 for the eigenvector:
① For the first eigenvalue: eigenvector [1, 1]
② For the second eigenvalue: eigenvector [1, −1]

Thus, the set of eigenvectors is
[ 1   1 ]
[ 1  −1 ]
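As a sanity check in R, here is a hypothetical symmetric matrix with exactly these eigenvectors (the slide's own AAᵀ values were lost, so this matrix is an assumption):

- R code -
M <- matrix(c(3, 1,
              1, 3), nrow = 2, byrow = TRUE)  # assumed stand-in for A A^T
eigen(M)
# $values: 4 2; $vectors: columns proportional to [1, 1] and [1, -1]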


1.3 Matrix U – Orthonormalization

Gram-Schmidt orthonormalization

wk = vk − Σ_{i=1..k−1} (ui · vk) ui

Turning the set of eigenvectors {v1, v2} into an orthonormal matrix [u1 u2]:
1. Normalize v1 to get u1
2. Find w2 (orthogonal to u1)
3. Normalize w2 to get u2
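A small R sketch of this step, assuming the eigenvectors are the columns of a matrix (the function name gram_schmidt is ours, not from the slides):

- R code -
gram_schmidt <- function(V) {        # columns of V are v_1, ..., v_k
  U <- matrix(0, nrow(V), ncol(V))
  for (k in seq_len(ncol(V))) {
    w <- V[, k]
    for (i in seq_len(k - 1))        # subtract projections onto u_1..u_{k-1}
      w <- w - sum(U[, i] * V[, k]) * U[, i]
    U[, k] <- w / sqrt(sum(w^2))     # normalize w_k to get u_k
  }
  U
}
gram_schmidt(matrix(c(1, 1, 1, -1), nrow = 2))  # the eigenvectors from above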


2.1 Matrix Vᵀ – Compute AᵀA

Start with the matrix A and its transpose Aᵀ. Then form the product AᵀA.


2.2 Matrix Vᵀ – Eigenvectors and Eigenvalues [1/2]

Eigenvector
– A nonzero vector v that satisfies the equation Av = λv
– A is a square matrix, λ is an eigenvalue (a scalar), and v is the corresponding eigenvector

Rearranging gives (A − λI)v = 0; set the determinant of the coefficient matrix to zero, det(A − λI) = 0, and expand it by cofactor expansion.


2.2 Matrix Vᵀ – Eigenvectors and Eigenvalues [2/2]

For each of the three calculated eigenvalues, solve (A − λI)v = 0 for the eigenvector:
① For the first eigenvalue: its eigenvector
② For the second eigenvalue: its eigenvector
③ For the third eigenvalue: its eigenvector

Thus, the three eigenvectors together form the set of eigenvectors of AᵀA.


2.3 Matrix Vᵀ – Orthonormalization and Transposition

Gram-Schmidt orthonormalization

wk = vk − Σ_{i=1..k−1} (ui · vk) ui

Turning the set of eigenvectors {v1, v2, v3} into an orthonormal matrix [u1 u2 u3]:
1. Normalize v1 to get u1
2. Find w2 (orthogonal to u1), then normalize w2 to get u2
3. Find w3 (orthogonal to both u1 and u2), then normalize w3 to get u3

Finally, transpose the result to obtain Vᵀ.


3.1 Matrix Σ (= S)

Take the square roots of the non-zero eigenvalues and populate the diagonal with them in descending order. The diagonal entries of Σ are the singular values of A.
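In R, assuming the matrix A from the earlier sketch, Σ can be built directly from the eigenvalues of AᵀA:

- R code -
ev    <- eigen(t(A) %*% A)$values                 # eigenvalues of A^T A
Sigma <- diag(sqrt(sort(ev, decreasing = TRUE)))  # singular values on the diagonal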


Outline

– SVD
– SVD to LSA


Latent Semantic Analysis

Uses SVD (Singular Value Decomposition)
– to simulate human learning of word and passage meaning

Represents word and passage meaning
– as high-dimensional vectors in a semantic space


LSA Example

doc 1: "modem the steering linux. modem, linux the modem. steering the modem. linux"

doc 2: "linux; the linux. the linux modem linux. the modem, clutch the modem. petrol"

doc 3: "petrol! clutch the steering, steering, linux. the steering clutch petrol. clutch the petrol; the clutch"

doc 4: "the the the. clutch clutch clutch! steering petrol; steering petrol petrol; steering petrol"

First analysis – Document Similarity
Second analysis – Term Similarity


LSA Example: Build a Term Frequency Matrix

Let matrix A =

          d1  d2  d3  d4
linux      3   4   1   0
modem      4   3   0   1
the        3   4   4   3
clutch     0   1   4   3
steering   2   0   3   3
petrol     0   1   3   4
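The same matrix in R, ready for svd() (row and column names follow the table above):

- R code -
A <- matrix(c(3, 4, 1, 0,
              4, 3, 0, 1,
              3, 4, 4, 3,
              0, 1, 4, 3,
              2, 0, 3, 3,
              0, 1, 3, 4),
            nrow = 6, byrow = TRUE,
            dimnames = list(c("linux", "modem", "the", "clutch", "steering", "petrol"),
                            c("d1", "d2", "d3", "d4")))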


LSA Example: Compute SVD of Matrix A

A = U × S × Vᵀ
(6 × 4) = (6 × 4) × (4 × 4) × (4 × 4)

A =
               d1  d2  d3  d4
t1 (linux)      3   4   1   0
t2 (modem)      4   3   0   1
t3 (the)        3   4   4   3
t4 (clutch)     0   1   4   3
t5 (steering)   2   0   3   3
t6 (petrol)     0   1   3   4

U =
      dim1   dim2   dim3   dim4
t1   -0.33  -0.53   0.36  -0.14
t2   -0.32  -0.53  -0.48   0.35
t3   -0.61  -0.09   0.26  -0.14
t4   -0.37   0.42   0.60  -0.23
t5   -0.35   0.25  -0.68  -0.46
t6   -0.37   0.42   0.01   0.74

S =
      dim1  dim2  dim3  dim4
dim1  11.4  0     0     0
dim2  0     6.27  0     0
dim3  0     0     2.21  0
dim4  0     0     0     1.28

Vᵀ =
      d1     d2     d3     d4
dim1  -0.42  -0.56  -0.64  -0.29
dim2  -0.48  -0.52   0.61   0.33
dim3  -0.56   0.44   0.27  -0.63
dim4  -0.51   0.46  -0.35   0.63

- R code -
result <- svd(A)
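Expanding the one-liner: svd() returns the three factors, and multiplying them back recovers A (a sketch; the reconstruction check is ours):

- R code -
result <- svd(A)
result$u   # 6 x 4: the U above (columns may differ in sign)
result$d   # singular values: 11.4, 6.27, 2.21, 1.28
result$v   # 4 x 4: V, so t(result$v) is the V^T above
all.equal(A, result$u %*% diag(result$d) %*% t(result$v),
          check.attributes = FALSE)   # TRUE up to rounding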


LSA Example: Reduced SVD

Starting from the full factorization U (6 × 4) × S (4 × 4) × Vᵀ (4 × 4) above, keep only the two largest singular values and drop dim3 and dim4 from each factor:

U_k =
      dim1   dim2
t1   -0.33  -0.53
t2   -0.32  -0.53
t3   -0.61  -0.09
t4   -0.37   0.42
t5   -0.35   0.25
t6   -0.37   0.42

S_k =
      dim1  dim2
dim1  11.4  0
dim2  0     6.27

V_kᵀ =
      d1     d2     d3     d4
dim1  -0.42  -0.56  -0.64  -0.29
dim2  -0.48  -0.52   0.61   0.33

(6 × 2) × (2 × 2) × (2 × 4)
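The same truncation in R, reusing result from the previous slide (k and the _k names are ours):

- R code -
k  <- 2
Uk <- result$u[, 1:k]        # 6 x 2
Sk <- diag(result$d[1:k])    # 2 x 2
Vk <- result$v[, 1:k]        # 4 x 2, so t(Vk) is the 2 x 4 V_k^T
Uk %*% Sk %*% t(Vk)          # best rank-2 approximation of A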


LSA Example: Document Similarity

Multiply S_k (2 × 2) by V_kᵀ (2 × 4) to get one two-dimensional vector per document:

S_k × V_kᵀ =
      d1     d2     d3     d4
dim1  -4.83  -5.49  -6.49  -5.86
dim2  -3.52  -3.28   2.79   2.88

Sim(A, B) = cos θ = (A · B) / (|A| |B|)

For example, for d1 and d2:
Sim(d1, d2) = ((−4.83 × −5.49) + (−3.52 × −3.28)) / (√((−4.83)² + (−3.52)²) × √((−5.49)² + (−3.28)²)) ≈ 0.99

     d1    d2    d3    d4
d1   1     0.99  0.51  0.46
d2   0.99  1     0.58  0.54
d3   0.51  0.58  1     0.99
d4   0.46  0.54  0.99  1

The scores match the vocabulary: d1 and d2 share "linux" and "modem," while d3 and d4 share "clutch," "steering," and "petrol."
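A sketch of the comparison in R, reusing Sk and Vk from the truncation above (cos_sim is our helper, not from the slides):

- R code -
D <- Sk %*% t(Vk)   # 2 x 4: one column of coordinates per document

cos_sim <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
round(outer(1:4, 1:4,
            Vectorize(function(i, j) cos_sim(D[, i], D[, j]))), 2)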


LSA Example: Term Similarity

Multiply U_k (6 × 2) by S_k (2 × 2) to get one two-dimensional vector per term:

U_k × S_k =
      dim1   dim2
t1   -3.76  -3.33
t2   -3.65  -3.35
t3   -7.01  -0.61
t4   -4.30   2.63
t5   -4.09   1.59
t6   -4.24   2.65

Sim(A, B) = cos θ = (A · B) / (|A| |B|)

     t1    t2    t3    t4    t5    t6
t1   1     0.99  0.80  0.29  0.45  0.28
t2   0.99  1.00  0.79  0.27  0.44  0.26
t3   0.80  0.79  1     0.80  0.89  0.79
t4   0.29  0.27  0.80  1     0.98  0.99
t5   0.45  0.44  0.89  0.98  1     0.98
t6   0.28  0.26  0.79  0.99  0.98  1

Most similar terms:
– linux: modem, the
– modem: linux, the
– the: linux, modem, clutch, steering, petrol
– clutch: the, steering, petrol
– steering: the, clutch, petrol
– petrol: the, clutch, steering
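And the term side in R, reusing Uk, Sk, and cos_sim from above (Tm is our name for the term-coordinate matrix):

- R code -
Tm <- Uk %*% Sk   # 6 x 2: one row of coordinates per term
round(outer(1:6, 1:6,
            Vectorize(function(i, j) cos_sim(Tm[i, ], Tm[j, ]))), 2)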


Conclusion

Pros
– Computes document similarity even when the documents share no common words

Cons
– Lacks a statistical foundation → addressed by PLSA (probabilistic LSA)


Which dimensions should be dropped in the reduction? Keeping the dimensions with the largest singular values (dim1 and dim2 in this example) is the recommended choice.