My code partially writes list output to CSV as a DataFrame column, but ...

Posted 2024-04-27 18:42:39


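I have a dataset with two columns. I want to match the strings in the two columns and produce a match percentage in a third column. Then I want to write all three columns to a CSV file. Here is my code.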

    Data: 

    **RoS  FCRA**
    pink pinky 
    rose grass 
    thick thin 

Code:

from fuzzywuzzy import fuzz, process
import pandas as pd
import csv

df = pd.read_excel("/Users/shreyaagarwal/Desktop/fcra test.xlsx")
with open("myfile.csv", "w") as fh:
     writer = csv.writer(fh)
     for i in (df["RoS"]):
        for p in (df["FCRA"]):
            s = p.encode('ascii', 'ignore').decode('ascii')
            match = fuzz.partial_ratio(i,s)
            df["Fuzzymatch"] = match
            writer.writerow([i,s,match])



Desired Output: 
    **RoS  FCRA  Match**
    pink pinky 20
    pink grass 0
    pink thin 0
    rose pinky 0
    rose grass 0
    rose thin 0

1 answer
User
#1 · Posted 2024-04-27 18:42:39

You seem to be looping over the wrong things, and you introduce variables that you never use. I'm guessing you want:

from fuzzywuzzy import fuzz
import pandas as pd
import csv

df = pd.read_excel("test.xlsx")
# Python 3: newline="" lets the csv module control line endings
with open("myfile.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    # Compare every RoS value with every FCRA value
    for i in df["RoS"]:
        for p in df["FCRA"]:
            match = fuzz.partial_ratio(i, p)
            writer.writerow([i, p, match])
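
For anyone who wants to try the nested-loop approach without an Excel file or fuzzywuzzy installed, here is a rough standard-library sketch. Note the assumptions: the similarity function is a stand-in built on difflib (it is not fuzz.partial_ratio and will give different scores), the names similarity and write_pairs are made up for illustration, and the column values are hard-coded from the sample data in the question.

```python
import csv
import io
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in scorer (assumption): 0-100 similarity from difflib, not fuzzywuzzy."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def write_pairs(left, right, fh, scorer=similarity):
    """Write one CSV row per (left, right) pair with its score; return the row count."""
    writer = csv.writer(fh)
    writer.writerow(["RoS", "FCRA", "Match"])  # header row
    count = 0
    for a in left:
        for b in right:
            writer.writerow([a, b, scorer(a, b)])
            count += 1
    return count


# Sample values copied from the question's data
ros = ["pink", "rose", "thick"]
fcra = ["pinky", "grass", "thin"]

# In-memory file for the demo; for a real file use open("myfile.csv", "w", newline="")
buffer = io.StringIO()
rows_written = write_pairs(ros, fcra, buffer)
csv_text = buffer.getvalue()
```

Passing the scorer as a parameter means fuzz.partial_ratio could presumably be dropped in as scorer=fuzz.partial_ratio if fuzzywuzzy is available, since it also takes two strings and returns a 0-100 score.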

Here is an attempt at an MCVE:

import pandas as pd

df = pd.DataFrame(
    [['pink', 'pinky'], ['rose', 'grass'], ['thick', 'thin']],
    columns=['RoS', 'FCRA'])
for i in df["RoS"]:
    for p in df["FCRA"]:
        print(i, p)

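Result (as printed by Python 2; under Python 3 each pair prints as plain text such as "pink pinky", without parentheses or quotes):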

('pink', 'pinky')
('pink', 'grass')
('pink', 'thin')
('rose', 'pinky')
('rose', 'grass')
('rose', 'thin')
('thick', 'pinky')
('thick', 'grass')
('thick', 'thin')
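
The same nine pairs can also be generated with itertools.product, which some people find easier to read than two nested loops. This is only a sketch: it assumes the column values are available as plain lists (copied here from the sample data) rather than read from the DataFrame.

```python
from itertools import product

# Sample values copied from the MCVE above
ros = ["pink", "rose", "thick"]
fcra = ["pinky", "grass", "thin"]

# product(ros, fcra) yields every (RoS, FCRA) combination,
# in the same order as the nested for loops
pairs = list(product(ros, fcra))

for left, right in pairs:
    print(left, right)
```

With 3 values in each column this gives 3 x 3 = 9 pairs, so the approach grows quickly for large columns.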
