Python: matching two files and appending a flag

Posted 2024-05-12 23:05:06


Here is one file, result.csv:

M11251TH1230 
M11543TH4292 
M11435TDS144

Here is another file, sample.csv:

M11435TDS144,STB#1,Router#1 
M11543TH4292,STB#2,Router#1 
M11509TD9937,STB#3,Router#1
M11543TH4258,STB#4,Router#1

Can I write a Python program that compares these two files and, for each line in sample.csv, appends 1 if the first field of that line matches a line in result.csv, and appends 0 otherwise?


Tags: file, csv, sample, program, result, word, router, stb
3 answers

A solution using csv.reader and csv.writer (from the csv module):

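import csv

# Change the file paths to the actual ones.
# newline='' is what the csv module recommends for files it reads or writes.
with open('./data/result.csv', newline='') as csvfile:
    # One ID per row; strip() guards against trailing spaces.
    items = {''.join(line).strip() for line in csv.reader(csvfile) if line}

new_lines = []
with open('./data/sample.csv', newline='') as csvfile:
    for line in csv.reader(csvfile):
        if not line:
            continue  # skip blank lines
        # Append 1 if the first field appears in result.csv, otherwise 0.
        line.append(1 if line[0].strip() in items else 0)
        new_lines.append(line)

# Overwrite sample.csv with the flagged rows.
with open('./data/sample.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(new_lines)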

Contents of sample.csv afterwards:

M11435TDS144,STB#1,Router#1,1
M11543TH4292,STB#2,Router#1,1
M11509TD9937,STB#3,Router#1,0
M11543TH4258,STB#4,Router#1,0

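result.csv has only one column, so I wonder why you made it a CSV file at all. If it has no other columns, a plain file read is enough. Converting the data from result.csv into a dictionary also keeps the lookups fast.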

import csv

result_file = "result.csv"
sample_file = "sample.csv"

with open(result_file) as fp:
    # One key per non-empty line; the dict gives fast membership tests.
    result_dict = dict.fromkeys(line.strip() for line in fp if line.strip())
    # If result.csv had a few more fields, you could keep them as values instead:
    # result_dict = {}
    # for result in fp:
    #     key, other_field = result.split(",", 1)
    #     result_dict[key] = other_field.strip()

# Since sample.csv is a real CSV, use the csv reader and writer
# (Python 3: text mode with newline="" instead of "rb"/"wb").
with open(sample_file, newline="") as fp:
    output_data = []
    for data in csv.reader(fp):
        if not data:
            continue  # skip blank lines
        # int(True) == 1 and int(False) == 0
        output_data.append(data + [int(data[0] in result_dict)])

with open(sample_file, "w", newline="") as fp:
    data_writer = csv.writer(fp)
    data_writer.writerows(output_data)
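
The dictionary lookup suggested in this answer can also be sketched with a set, which gives the same fast membership test. The helper name flag_rows and the in-memory copies of the two files below are assumptions made for illustration; they are not part of the original answer.

```python
import csv
import io


def flag_rows(result_text, sample_text):
    """Return the sample rows with 1 appended if the first field is in result_text, else 0."""
    # One key per non-empty line; strip() drops trailing spaces and '\r'.
    keys = {line.strip() for line in result_text.splitlines() if line.strip()}
    rows = []
    for row in csv.reader(io.StringIO(sample_text)):
        if not row:
            continue  # skip blank lines
        fields = [field.strip() for field in row]
        rows.append(fields + [int(fields[0] in keys)])
    return rows


# Hypothetical in-memory copies of the two files from the question.
result_text = "M11251TH1230 \nM11543TH4292 \nM11435TDS144\n"
sample_text = (
    "M11435TDS144,STB#1,Router#1 \n"
    "M11543TH4292,STB#2,Router#1 \n"
    "M11509TD9937,STB#3,Router#1\n"
    "M11543TH4258,STB#4,Router#1\n"
)

for row in flag_rows(result_text, sample_text):
    print(",".join(str(field) for field in row))
```

Working on strings rather than file paths keeps the matching logic easy to check before pointing it at the real files.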
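A solution using pandas:

import pandas as pd

d1 = pd.read_csv("1.csv", names=["Type"])
d2 = pd.read_csv("2.csv", names=["Type", "Col2", "Col3"])

# 1 where Type also appears in 1.csv, otherwise 0.
# isin() replaces the row-by-row loop and avoids chained assignment.
d2["Index"] = d2["Type"].isin(d1["Type"]).astype(int)

# index=False keeps the pandas row index out of the output file.
d2.to_csv("3.csv", header=False, index=False)

Here 1.csv and 2.csv are your input CSV files, and 3.csv is the result you need.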
