NLP

Research on Automatic Representation of Privacy Policies Based on Natural Language Processing

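We examined 1,500 Chinese privacy policies collected from the Huawei app market. The results show that 38.5% of them are fake privacy policies, and that 92.5% of the remaining legitimate policies fail to meet the completeness requirements of the "Self-Assessment Guide". Building on the automatic representation of privacy policies, we designed a privacy policy scoring method; experiments show that most privacy policies score in the low range.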

Review Embedding Corpus for English Words and Phrases Released (2019.2.19)

A. Features

  1. 200-dimensional vector representation.
  2. 213,118 English sentences in total.
  3. Available via this Link; the corpus will be continuously updated.

B. Case: Finding similar words
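
A similar-word lookup over an embedding corpus is commonly done by ranking entries by cosine similarity to the query vector. The sketch below illustrates that idea only and is not the released tooling: the `cosine` and `most_similar` helpers and the 3-dimensional toy vectors are hypothetical and made up for the example, whereas the released corpus uses 200-dimensional vectors whose file format is not described here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=3):
    """Return the topn (word, score) pairs closest to `word`, best first."""
    query = vectors[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word  # skip the query word itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Toy vectors (assumption): invented values, 3 dimensions instead of 200.
toy_vectors = {
    "good": [0.9, 0.1, 0.0],
    "great": [0.8, 0.2, 0.1],
    "bad": [-0.7, 0.3, 0.0],
    "camera": [0.0, 0.1, 0.9],
}

for neighbour, score in most_similar("good", toy_vectors, topn=2):
    print(neighbour, round(score, 3))
```

With real data, the same ranking would run over the 200-dimensional vectors once they are loaded into a dictionary keyed by word or phrase.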

 

Janus Embedding Corpus for Chinese Words and Phrases Released (2019.2.15)

A. Features

  1. Phrases come from Janus.
  2. Coarse-grained segmentation.
  3. 200-dimensional vector representation.
  4. 7,957 APKs and 232,274 sentences in total so far.
  5. Available via this Link; the corpus will be continuously updated.

B. Case: Finding similar words

 

Enjoy!