A. Features
- Phrases come from Janus.
- Coarse-grained segmentation.
- 200-dimensional vector representation.
- 7,957 APKs and 232,274 sentences in total so far.
- Available via this link; it will be updated continuously.
B. Case: Finding similar words
>>> import gensim
>>> model = gensim.models.Word2Vec.load("/home/tong/Desktop/w2v/janus-embedding-model")
>>> # Print the ten nearest neighbours of '邮件' ("mail").
>>> # In gensim 4.0 and later, call model.wv.most_similar instead.
>>> for v in model.most_similar(positive=[u'邮件'], topn=10):
...     print(v[0])
...
电子邮件
电子
新邮件
电子书
电子邮箱
子书
新邮
帐户
邮箱
发送到
>>>
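For readers curious what the similarity query does, gensim's most_similar ranks words by the cosine similarity between their vectors. The following is only a minimal sketch of that idea in plain Python. The toy_vectors table, its English words, and its 3-dimensional values are invented for illustration and are not taken from the Janus model, which uses 200 dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(vectors, word, topn=10):
    """Return up to topn (word, score) pairs, most similar first.

    A simplified stand-in for the gensim call, not its implementation.
    """
    query = vectors[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word  # the query word itself is excluded
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical toy embeddings (assumed values, for illustration only).
toy_vectors = {
    "mail": [0.9, 0.1, 0.0],
    "email": [0.8, 0.2, 0.1],
    "inbox": [0.7, 0.3, 0.0],
    "camera": [0.0, 0.1, 0.9],
}

for neighbour, score in most_similar(toy_vectors, "mail", topn=2):
    print(neighbour, round(score, 3))
```

With these made-up vectors, "email" and "inbox" rank above "camera" because their vectors point in nearly the same direction as "mail".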
Enjoy!