Nested for loop through python lis

Published 2024-05-18 23:26:54


User


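I have to loop over a list of more than 4000 entries and check their similarity with a recommendation algorithm in Python.

The script takes a very long time to run (10-11 hours). I would like to add multithreading to speed it up, but I don't know exactly how to do that.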

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    data=pd.read_csv('data.csv',index_col=0, encoding="ISO-8859-1")       

    # Get list of unique items
    itemList=list(set(data["product_ref"].tolist()))

    # Get count of customers
    userCount=len(set(data["customer_id"].tolist()))

    # Create an empty data frame to store item affinity scores for items.
    itemAffinity= pd.DataFrame(columns=('item1', 'item2', 'score'))

    def itemUsers(ind):
      return data[data.product_ref==itemList[ind]]["customer_id"].tolist()

    rowCount=0
    for ind1 in range(len(itemList)): 
        item1Users = itemUsers(ind1) 
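        # Attempted parallelization, left commented out: Pool is never imported,
        # and loop2 and data_inputs are not defined anywhere in this script.
        # pool = Pool()
        # pool.map(loop2, data_inputs)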
        for ind2 in range(ind1+1, len(itemList)): 
            print(ind1, ":", ind2)       
            item2Users = itemUsers(ind2) 
            commonUsers= len(set(item1Users).intersection(set(item2Users))) 
            score=commonUsers / userCount
            itemAffinity.loc[rowCount] = [itemList[ind1],itemList[ind2],score] 
            rowCount +=1


1 answer

User

#1 · Posted 2024-05-18 23:26:54

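Multithreading will not reduce the running time here.

Think of it this way: this loop is pure computation, so with multiple threads the same computation time is just divided among the threads, when it could simply be spent in a single process. In CPython, the global interpreter lock also means that only one thread executes Python bytecode at a time, so the threads mostly take turns rather than run in parallel.

Threads can help when one of them is waiting, for example on user input, and you want to keep computing in the meantime, but that is not your situation.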
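Since the work is CPU-bound, a speedup is more likely to come from doing less work per pair within a single process. The following is only a minimal sketch of that idea, not something from the original answer: it builds each item's customer set once instead of filtering the data frame for every pair, and creates the result frame in one step instead of appending rows with `.loc`. The function name `item_affinity` and the sample data are hypothetical; the column names `product_ref` and `customer_id` are taken from the question.

```python
from collections import defaultdict
from itertools import combinations

import pandas as pd


def item_affinity(data: pd.DataFrame) -> pd.DataFrame:
    """Score each unordered item pair as shared customers / total customers."""
    user_count = len(set(data["customer_id"]))

    # Build every item's customer set once, in a single pass over the rows.
    users_by_item = defaultdict(set)
    for item, user in zip(data["product_ref"], data["customer_id"]):
        users_by_item[item].add(user)

    # Each pair now costs one set intersection instead of two frame filters.
    rows = [
        (item1, item2, len(users_by_item[item1] & users_by_item[item2]) / user_count)
        for item1, item2 in combinations(users_by_item, 2)
    ]

    # Create the frame once; growing it row by row inside the loop is slow.
    return pd.DataFrame(rows, columns=["item1", "item2", "score"])


# Tiny made-up data set, for illustration only.
sample = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3],
    "product_ref": ["A", "B", "A", "C", "B"],
})
affinity = item_affinity(sample)
print(affinity)
```

With 4000 items there are still roughly 8 million pairs to score, so the result frame itself will be large. How much time this saves on the real data is not something the sketch can promise.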
