Answers to "User-based filtering: recommender systems"


1 answer

Anonymous, 1 day ago

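(Disclaimer: I am not well versed in this field and have only a passing knowledge of collaborative filtering. What follows is simply a collection of resources I have found useful.)

The basics are covered quite thoroughly in <a href="http://books.google.co.uk/books?id=fEsZ3Ey-Hq4C&lpg=PP1&pg=PA7#v=onepage&q&f=false">Chapter 2 of the "Programming Collective Intelligence" book</a>. The example code is in Python, which is another plus.

You may also find this site useful: <a href="http://guidetodatamining.com/">A Programmer's Guide to Data Mining</a>, in particular <a href="http://guidetodatamining.com/home/toc/chapter-2/">Chapter 2</a> and <a href="http://guidetodatamining.com/home/toc/chapter-3/">Chapter 3</a>, which discuss recommender systems and item-based filtering.

In short, you can determine how similar two users are from the items they have liked, bought, or voted on, using techniques such as the <a href="http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient">Pearson Correlation Coefficient</a>, <a href="http://en.wikipedia.org/wiki/Cosine_similarity">Cosine Similarity</a>, and <a href="http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm">k-nearest neighbours</a>.

Note that there are a number of Python libraries written for this purpose, for example <a href="http://code.google.com/p/pysuggest/">pysuggest</a>, <a href="https://github.com/muricoca/crab">Crab</a>, <a href="https://github.com/ocelma/python-recsys">python-recsys</a>, and <a href="http://www.scipy.org/doc/api_docs/SciPy.stats.stats.html#pearsonr">SciPy.stats.stats.pearsonr</a>.

For large datasets with more users than items, the solution scales better if you invert the data, compute the correlation between items instead (i.e. item-based filtering), and then use those correlations to infer similar users. Naturally, you would not do this in real time; you would schedule periodic recalculation as a backend task. Some of these approaches can be parallelised or distributed to shorten the computation time drastically (assuming you have the resources to throw at it).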
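To make the similarity measures above concrete, here is a minimal pure-Python sketch, not a production implementation. The `ratings` dictionary and the helper names (`pearson`, `cosine`, `nearest_users`, `invert`) are made up for illustration and do not come from any of the libraries listed. It assumes ratings are stored as a nested dict of user to item to score, computes Pearson only over items both users rated, and treats missing ratings as zero for cosine similarity.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating}); made-up data for illustration.
ratings = {
    "alice": {"A": 5.0, "B": 3.0, "C": 4.0},
    "bob":   {"A": 4.0, "B": 2.0, "C": 3.0},
    "carol": {"A": 1.0, "B": 5.0, "D": 4.0},
}

def pearson(prefs, a, b):
    """Pearson correlation over items rated by both a and b (0.0 if undefined)."""
    common = [item for item in prefs[a] if item in prefs[b]]
    n = len(common)
    if n < 2:
        return 0.0
    xs = [prefs[a][item] for item in common]
    ys = [prefs[b][item] for item in common]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # one user gave identical ratings; correlation is undefined
    return cov / sqrt(var_x * var_y)

def cosine(prefs, a, b):
    """Cosine similarity, treating items a user has not rated as 0."""
    dot = sum(r * prefs[b][item] for item, r in prefs[a].items() if item in prefs[b])
    norm_a = sqrt(sum(r * r for r in prefs[a].values()))
    norm_b = sqrt(sum(r * r for r in prefs[b].values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_users(prefs, user, k=2, similarity=pearson):
    """Return the k most similar other users as (score, name), best first."""
    scores = [(similarity(prefs, user, other), other)
              for other in prefs if other != user]
    scores.sort(reverse=True)
    return scores[:k]

def invert(prefs):
    """Flip user -> item -> rating into item -> user -> rating."""
    flipped = {}
    for user, items in prefs.items():
        for item, rating in items.items():
            flipped.setdefault(item, {})[user] = rating
    return flipped

if __name__ == "__main__":
    print(nearest_users(ratings, "alice", k=2))
    # The same functions compare items once the data is inverted.
    print(pearson(invert(ratings), "A", "B"))
```

Because `invert` only swaps the two dictionary levels, the same `pearson` and `nearest_users` functions can be reused for the item-based variant described above. For real workloads, a library routine such as `scipy.stats.pearsonr` is likely a better choice than hand-rolled loops.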
