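Computing the conceptual and relational similarity of two words in Java
I am implementing a readability formula in Java, based on this paper.
I have reached the point where I need to compute the conceptual similarity and the relational similarity of two or more words.
The authors write: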
We use Latent Semantic Analysis (LSA) tools to compute word similarity. LSA can derive semantic information, including similarity, from a word-document co-occurrence matrix. Word/term co-occurrences are counted in a moving window of a fixed size that scans the entire corpus. The co-occurrence models using window sizes of ±1 and ±4 considered as relational similarity and conceptual semantic models, respectively.
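For illustration, the windowed counting described in the quoted passage could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the paper's LSA pipeline: it builds raw word-by-word co-occurrence counts within ±window positions and compares two words by the cosine of their count vectors, and it skips the dimensionality reduction (SVD) that LSA normally applies. The class name `WindowCooccurrence`, its method names, and the toy sentence are hypothetical and come from neither the paper nor WS4J.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: windowed co-occurrence counts plus cosine similarity.
// Assumption: raw counts only, no SVD, so this is not full LSA.
public class WindowCooccurrence {

    /** For every token, counts how often each other token occurs within +-window positions. */
    public static Map<String, Map<String, Integer>> counts(List<String> tokens, int window) {
        Map<String, Map<String, Integer>> matrix = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            Map<String, Integer> row =
                    matrix.computeIfAbsent(tokens.get(i), k -> new HashMap<>());
            int from = Math.max(0, i - window);
            int to = Math.min(tokens.size() - 1, i + window);
            for (int j = from; j <= to; j++) {
                if (j != i) {
                    row.merge(tokens.get(j), 1, Integer::sum);
                }
            }
        }
        return matrix;
    }

    /** Cosine similarity between two sparse count vectors; returns 0 if either is empty. */
    public static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            double va = e.getValue();
            normA += va * va;
            Integer vb = b.get(e.getKey());
            if (vb != null) {
                dot += va * vb;
            }
        }
        for (int v : b.values()) {
            normB += (double) v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Similarity of two words under a given window size (1 or 4 in the quoted setup). */
    public static double similarity(List<String> tokens, int window, String w1, String w2) {
        Map<String, Map<String, Integer>> m = counts(tokens, window);
        Map<String, Integer> empty = Collections.emptyMap();
        return cosine(m.getOrDefault(w1, empty), m.getOrDefault(w2, empty));
    }

    public static void main(String[] args) {
        // Toy corpus, purely illustrative.
        List<String> tokens = Arrays.asList(
                "the cat chased the mouse and the dog chased the cat".split(" "));
        System.out.println("window 1: " + similarity(tokens, 1, "cat", "dog"));
        System.out.println("window 4: " + similarity(tokens, 4, "cat", "dog"));
    }
}
```

On a real corpus the count matrix would normally be weighted and reduced before comparing rows, so scores from this sketch should not be expected to match the numbers reported in the paper.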
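I looked at a few LSA implementations, such as this one, but could not find a straightforward way to get what I need.
I believe I need a word-by-word matrix, so I tried using the WS4J library to compute a matrix from two string arrays.
WS4J also has a method, calcRelatednessOfWords(), but the results it gives do not match those shown in the paper.
Is there a library that provides what I am looking for? Or can someone point me in the right direction?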