Convolutional Neural Networks for Sentiment Classification
何云超 [email protected]

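Word Vectors
• Three ways to use word vectors in a CNN:
  • Treat them as network parameters: randomly initialized and learned during model training.
  • Pre-train them with a word-embedding model (word2vec, GloVe, etc.) and keep them fixed during model training.
  • Pre-train them with a word-embedding model (word2vec, GloVe, etc.), use them to initialize the network, and fine-tune them during model training.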
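The three options differ only in how the embedding table is initialized and whether gradient updates are allowed to change it. A minimal sketch in plain Python; the `EmbeddingTable` class, its `trainable` flag, and the toy vectors are illustrative assumptions, not part of the slides:

```python
import random


class EmbeddingTable:
    """Toy lookup table with one vector per word (hypothetical helper)."""

    def __init__(self, vocab, dim, pretrained=None, trainable=True, seed=0):
        rng = random.Random(seed)
        self.trainable = trainable
        self.vectors = {}
        for word in vocab:
            if pretrained is not None and word in pretrained:
                # Options 2 and 3: start from word2vec/GloVe-style vectors.
                self.vectors[word] = list(pretrained[word])
            else:
                # Option 1 (or a word missing from the pretrained set): random init.
                self.vectors[word] = [rng.uniform(-0.25, 0.25) for _ in range(dim)]

    def sgd_step(self, word, grad, lr=0.5):
        # Option 2 keeps the vectors fixed, so the update is skipped.
        if not self.trainable:
            return
        self.vectors[word] = [v - lr * g for v, g in zip(self.vectors[word], grad)]


pretrained = {"good": [1.0, 0.0], "bad": [-1.0, 0.0]}  # stand-in for word2vec/GloVe output
vocab = ["good", "bad"]

rand_init = EmbeddingTable(vocab, dim=2)                                       # option 1
static = EmbeddingTable(vocab, dim=2, pretrained=pretrained, trainable=False)  # option 2
fine_tuned = EmbeddingTable(vocab, dim=2, pretrained=pretrained)               # option 3

# The same gradient leaves the static table untouched and moves the fine-tuned one.
for table in (static, fine_tuned):
    table.sgd_step("good", grad=[1.0, 1.0])
```

In a real framework the same distinction is usually a single "freeze the embedding layer" switch; the class above only makes the behaviour explicit.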

Sentence Matrix
• Each row (or each column) of the matrix is a word vector.
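As a rough illustration, a sentence matrix can be built by stacking one word vector per token. The function names, the toy embeddings, and the choice of a zero vector for unknown words are assumptions made for this sketch:

```python
def sentence_matrix(tokens, embeddings, dim):
    """Stack one word vector per token: rows are words, columns are embedding dimensions."""
    zero = [0.0] * dim  # assumption: unknown words map to a zero vector
    return [list(embeddings.get(tok, zero)) for tok in tokens]


def transpose(matrix):
    """Switch between the row-per-word and column-per-word layouts."""
    return [list(col) for col in zip(*matrix)]


embeddings = {"i": [0.1, 0.2, 0.3], "love": [0.4, 0.5, 0.6], "it": [0.7, 0.8, 0.9]}
rows = sentence_matrix(["i", "love", "it", "!"], embeddings, dim=3)  # 4 words x 3 dims
cols = transpose(rows)                                               # 3 dims x 4 words
```

Whether words run along rows or columns is only a convention; what matters is that the convolution slides along the word axis.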

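Convolutional Layer
• Narrow convolution: output length $s - m + 1 = 7 - 5 + 1 = 3$
• Wide convolution: output length $s + m - 1 = 7 + 5 - 1 = 11$
• Here $s = 7$ is the sentence length and $m = 5$ is the filter width.
• Figure note: the red connections all have the same weight.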
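The two output lengths can be checked with a small 1-D convolution written in plain Python. This is a sketch under two assumptions: the filter is not flipped (as is common in neural network code), and the wide variant zero-pads m - 1 positions on each side. The names `conv1d`, `sentence`, and `weights` are illustrative.

```python
def conv1d(seq, filt, wide=False):
    """1-D convolution of a sequence of length s with a filter of width m.

    Narrow: only positions where the filter fits entirely, length s - m + 1.
    Wide: zero-pad m - 1 positions on both sides, length s + m - 1.
    """
    m = len(filt)
    if wide:
        pad = [0.0] * (m - 1)
        seq = pad + list(seq) + pad
    return [
        sum(w * x for w, x in zip(filt, seq[i:i + m]))
        for i in range(len(seq) - m + 1)
    ]


sentence = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]  # s = 7
weights = [1.0, 0.0, 0.0, 0.0, -1.0]            # m = 5, the same weights at every position
narrow = conv1d(sentence, weights)              # 7 - 5 + 1 = 3 outputs
wide = conv1d(sentence, weights, wide=True)     # 7 + 5 - 1 = 11 outputs
```

The wide variant lets every filter weight reach every word, including those at the sentence boundaries, and it still produces output when the sentence is shorter than the filter.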

Pooling Layer
• Max pooling: the idea is to capture the most important feature (the one with the highest value) for each feature map.
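A minimal sketch of max pooling over each feature map, assuming each feature map is simply a list of activations (the name `max_over_time` and the numbers are illustrative):

```python
def max_over_time(feature_maps):
    """Keep the single largest activation of each feature map."""
    return [max(fm) for fm in feature_maps]


feature_maps = [
    [0.1, 0.9, 0.3],       # output of one filter
    [0.5, 0.2, 0.4, 0.0],  # a different filter width gives a different length
]
pooled = max_over_time(feature_maps)
```

The pooled vector has one entry per filter, so its size does not depend on the sentence length.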

Dropout: A Simple Way to Prevent Neural Networks from Overfitting
• Consider a neural net with one hidden layer of H units.
• Each time we present a training example, we randomly omit each hidden unit with probability 0.5.
• So we are randomly sampling from 2^H different architectures.
• All architectures share weights.
• Dropout prevents units from co-adapting too much.
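The sampling step can be sketched as drawing a 0/1 mask over the H hidden units for every training example. The helper names and the toy activations are assumptions, and the sketch covers only the training-time masking described here, not any test-time rescaling:

```python
import random


def dropout_mask(num_hidden, p_drop=0.5, rng=random):
    """Sample a 0/1 mask: each hidden unit is omitted with probability p_drop."""
    return [0 if rng.random() < p_drop else 1 for _ in range(num_hidden)]


def apply_mask(hidden, mask):
    """Zero out the omitted units; the kept units pass through unchanged."""
    return [h * m for h, m in zip(hidden, mask)]


H = 4
hidden = [0.7, -1.2, 0.3, 2.0]
rng = random.Random(0)           # fixed seed only to make the sketch repeatable
mask = dropout_mask(H, rng=rng)  # one of the 2**H possible thinned architectures
thinned = apply_mask(hidden, mask)
num_architectures = 2 ** H
```

Because every mask indexes into the same weight matrices, all 2^H thinned networks share their weights.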

CNN for Sentence Classification [1]
• Two channels
• CNN-rand
• CNN-non-static
• CNN-static
• CNN-multichannel

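DCNN Overview [2]
• Convolutional Neural Networks with Dynamic k-Max Pooling
• Wide convolution
• Dynamic k-max pooling: $k_l = \max\left(k_{top}, \left\lceil \frac{L - l}{L}\, s \right\rceil\right)$
  • $l$: index of the current convolutional layer
  • $L$: total number of convolutional layers
  • $s$: sentence length
  • $k_{top}$: pooling parameter of the topmost convolutional layer
• Example
  • If $k_{top} = 3$, $L = 3$, $s = 18$
  • Then $k_1 = \max\left(3, \left\lceil \frac{3 - 1}{3} \cdot 18 \right\rceil\right) = \max(3, 12) = 12$
  • and $k_2 = \max\left(3, \left\lceil \frac{3 - 2}{3} \cdot 18 \right\rceil\right) = \max(3, 6) = 6$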
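The pooling parameter and the pooling step itself can be sketched in a few lines of Python. The function names are illustrative; `k_max_pool` is shown on a single row of values, on the assumption that a full implementation applies it to each row of the feature map separately.

```python
def dynamic_k(l, L, s, k_top):
    """k_l = max(k_top, ceil((L - l) / L * s)), using integer ceiling division."""
    needed = -((-(L - l) * s) // L)
    return max(k_top, needed)


def k_max_pool(values, k):
    """Keep the k largest values, preserving their original order."""
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(top)]


k1 = dynamic_k(l=1, L=3, s=18, k_top=3)  # max(3, 12) = 12
k2 = dynamic_k(l=2, L=3, s=18, k_top=3)  # max(3, 6) = 6
k3 = dynamic_k(l=3, L=3, s=18, k_top=3)  # topmost layer falls back to k_top
pooled = k_max_pool([1, 5, 2, 9, 3], k=3)
```

Unlike plain max pooling, k-max pooling keeps the relative order of the selected values, so some positional information survives into the next layer.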

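Folding
• Problem:
  • Convolution is applied to each row independently.
  • Complex dependencies are built up within the same row.
  • Until the fully connected layer, different rows remain independent of each other.
• Therefore:
  • Folding sums every two adjacent rows.
  • The d rows are reduced to d/2.
  • Each row then depends on two rows of the layer below.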
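Folding as described above amounts to a pairwise row sum. A minimal sketch, assuming the feature map is a list of d rows with d even (the name `fold` and the toy values are illustrative):

```python
def fold(matrix):
    """Sum every two adjacent rows, halving the number of rows (d -> d/2)."""
    if len(matrix) % 2 != 0:
        raise ValueError("folding assumes an even number of rows")
    return [
        [a + b for a, b in zip(matrix[i], matrix[i + 1])]
        for i in range(0, len(matrix), 2)
    ]


feature_map = [  # d = 4 rows (embedding dimensions), 3 columns (positions)
    [1, 2, 3],
    [10, 20, 30],
    [4, 5, 6],
    [40, 50, 60],
]
folded = fold(feature_map)  # 2 rows, each depending on two rows of the input
```

Folding adds no parameters; it only mixes information across rows before the next layer.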

Semantic Clustering [3]
• Figure labels: Sentence Matrix, Semantic Candidate Units, Semantic Units, Semantic Cliques
• m = 2, 3, …, (sentence length)/2

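seq-CNN [4]
• Inspired by the fact that images have multiple channels (RGB, CMYK): treat a sentence as an image and each word in it as a pixel, so a d-dimensional word vector can be seen as a pixel with d channels.
• Example (figure labels): vocabulary, sentence, sentence vector, multi-channel
  . . . [0 0 0] [0 0 0] [1 0 0] [0 0 1] [0 1 0]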
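The word-as-pixel idea can be sketched with one-hot vectors, in which case the number of channels equals the vocabulary size. The vocabulary, the sentence, and the rule that unknown words stay all-zero are hypothetical choices for this sketch; the slide's own vocabulary and sentence are not reproduced here.

```python
def one_hot_pixels(tokens, vocab):
    """Encode each word as a 'pixel' with one channel per vocabulary entry."""
    index = {w: i for i, w in enumerate(vocab)}
    pixels = []
    for tok in tokens:
        channels = [0] * len(vocab)
        if tok in index:  # assumption: unknown words stay all-zero
            channels[index[tok]] = 1
        pixels.append(channels)
    return pixels


vocab = ["love", "it", "i"]  # hypothetical 3-word vocabulary
pixels = one_hot_pixels(["oh", "i", "love", "it"], vocab)
```

A convolution over this representation sees small regions of consecutive words with their order intact, which is what distinguishes it from a bag-of-words input.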

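Enrich word vectors
• Use character-level embeddings: concatenate the word-level and character-level embeddings to form each word's vector representation. [5]
• Extend word vectors with traditional text features, mainly: number of capitalized words, emoticons, elongated words (Elongated Units), number of sentiment words, negation words, punctuation, clusters, and n-grams. [6]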
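A rough sketch of the second idea, assuming the extra features are simply concatenated onto an existing vector. The three features shown, the regular expressions, and the function names are illustrative choices, not the feature set of [6]:

```python
import re

ELONGATED = re.compile(r"(\w)\1{2,}")  # the same character three or more times in a row


def surface_features(text):
    """Count a few hand-crafted cues: all-caps words, elongated words, exclamation marks."""
    tokens = re.findall(r"\w+", text)
    n_caps = sum(1 for t in tokens if len(t) > 1 and t.isupper())
    n_elongated = sum(1 for t in tokens if ELONGATED.search(t))
    n_exclaim = text.count("!")
    return [float(n_caps), float(n_elongated), float(n_exclaim)]


def enrich(vector, extra):
    """Concatenate extra features onto an embedding-based vector."""
    return list(vector) + list(extra)


features = surface_features("I LOVE this moviiiie!!!")
enriched = enrich([0.2, -0.1], features)
```

The same `enrich` call also covers the first bullet if `extra` is a character-level embedding instead of hand-crafted counts.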

MVCNN: Multichannel Variable-Size Convolution [7]
• Different word embeddings cover different sets of words:
  • HLBL
  • Huang
  • GloVe
  • SENNA
  • Word2vec
• Handling words that are unknown to some of the embeddings:
  • Randomly initialized
  • Projection (mutual learning)

MVCNN: Training
• Pretraining
  • Unsupervised training
  • Average of context word vectors as a predicted representation of the middle word
  • To produce good initial values
• Training
  • Logistic regression
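The pretraining prediction can be sketched as an average over the vectors surrounding the middle word. The function name, the `window` parameter, and the toy vectors are assumptions; the loss used to compare the prediction with the true middle word is not specified here and is left out.

```python
def predict_middle(vectors, center, window):
    """Average the context word vectors around `center`, excluding the middle word itself."""
    lo = max(0, center - window)
    hi = min(len(vectors), center + window + 1)
    context = [vectors[i] for i in range(lo, hi) if i != center]
    if not context:
        raise ValueError("no context words available")
    dim = len(context[0])
    return [sum(vec[j] for vec in context) / len(context) for j in range(dim)]


vectors = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0], [5.0, 6.0], [7.0, 8.0]]
predicted = predict_middle(vectors, center=2, window=2)  # ignores the vector at index 2
```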

References

[1] Kim, Y. (2014). Convolutional Neural Networks for Sentence Classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).

[2] Kalchbrenner, N., Grefenstette, E., & Blunsom, P. (2014). A Convolutional Neural Network for Modelling Sentences. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).

[3] Wang, P., Xu, J., Xu, B., Liu, C. L., Zhang, H., Wang, F., & Hao, H. (2015). Semantic Clustering and Convolutional Neural Network for Short Text Categorization. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Vol. 2, pp. 352-357).

[4] Johnson, R., & Zhang, T. (2015). Effective Use of Word Order for Text Categorization with Convolutional Neural Networks. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

[5] dos Santos, C. N., & Gatti, M. (2014). Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts. In Proceedings of the 25th International Conference on Computational Linguistics (COLING), Dublin, Ireland.

[6] Tang, D., Wei, F., Qin, B., Liu, T., & Zhou, M. (2014). Coooolll: A Deep Learning System for Twitter Sentiment Classification. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014) (pp. 208-212).

[7] Yin, W., & Schütze, H. (2015). Multichannel Variable-Size Convolution for Sentence Classification. In Proceedings of the 19th SIGNLL Conference on Computational Natural Language Learning (CoNLL 2015, long paper), July 30-31, Beijing, China.

Thank you for listening
Q&A

何云超 [email protected]