Replace duplicate strings in a csv file with a number sequence, without using a dataframe
I have a csv file that contains 4 columns. I want to replace the strings in each column with a sequence of numbers; if a string is a duplicate, it should get the number that was previously assigned to it. For that I have written this Python code, which builds three dicts: dict1, dict2 and dict3. Now I want to write those dictionary values to a file, as shown in the figure below.
import csv

with open(tempFile, 'r', encoding="utf8") as csvfile:
    # creating a csv reader object
    csvreader = csv.reader(csvfile, delimiter=',')
    next(csvreader, None)  # skip the header row
    firstRow = next(csvreader)
    NameCount = 1
    AddressCount = 1
    EmailCOunt = 1
    # input_dict*: string -> sequence number, one dict per column
    input_dict = {firstRow[1]: NameCount}
    input_dict2 = {firstRow[2]: AddressCount}
    input_dict3 = {firstRow[3]: EmailCOunt}
    # dict1..dict3: job_Id -> sequence number for Name, Address, Email
    dict1 = {firstRow[0]: NameCount}
    dict2 = {firstRow[0]: AddressCount}
    dict3 = {firstRow[0]: EmailCOunt}
    for row in csvreader:
        value = input_dict.get(row[1])
        if value is None:
            NameCount = NameCount + 1
            input_dict.update({row[1]: NameCount})
            dict1.update({row[0]: NameCount})
        else:
            input_dict.update({row[1]: value})
            dict1.update({row[0]: value})
        value1 = input_dict2.get(row[2])
        if value1 is None:
            AddressCount = AddressCount + 1
            input_dict2.update({row[2]: AddressCount})
            dict2.update({row[0]: AddressCount})
        else:
            input_dict2.update({row[2]: value1})
            dict2.update({row[0]: value1})
        value2 = input_dict3.get(row[3])
        if value2 is None:
            EmailCOunt = EmailCOunt + 1
            input_dict3.update({row[3]: EmailCOunt})
            dict3.update({row[0]: EmailCOunt})
        else:
            input_dict3.update({row[3]: value2})
            dict3.update({row[0]: value2})
print('dict1-', dict1)
print('dict2-', dict2)
print('dict3-', dict3)

[Image: my input csv file, in which I have replaced the duplicated strings of columns 1, 2 and 3 with sequence numbers by using dicts]
[Image: how I need my output to look after the string replacement]
This is the input data in the csv file:
job_Id  Name          Address     Email
1       snehil singh  marathalli  ss@gmail.com
2       salman        marathalli  ss@gmail.com
3       Amir          HSR         ar@gmail.com
4       Rakhesh       HSR         rakesh@gmail.com
5       Ram           marathalli  r@gmail.com
6       Shyam         BTM         ss@gmail.com
7       salman        HSR         ss@gmail.com
8       Amir          BTM         ar@gmail.com
9       snehil singh  Majestic    sne@gmail.com
The required output, which I am unable to get, is:
job_Id  Name  Address  Email
1       1     1        1
2       2     1        1
3       3     2        2
4       4     2        3
5       5     1        4
6       6     3        1
7       2     2        1
8       3     3        2
9       1     4        5
Please help.
Hi guys, I tried it this way and it works:
count = 1
iter_obj1 = iter(dict1.values())
iter_obj2 = iter(dict2.values())
iter_obj3 = iter(dict3.values())
while True:
    try:
        element1 = next(iter_obj1)
        element2 = next(iter_obj2)
        element3 = next(iter_obj3)
        s = count, element1, element2, element3
        print(s)
        with open("snehil.csv", 'w') as f:
            f.write('\n')
            f.write(json.dumps(s)+'\n')
            f.write(line)
        count = count + 1
    except StopIteration:
        break
The output is:
(1, 1, 1, 1)
(2, 2, 1, 1)
(3, 3, 2, 2)
(4, 4, 2, 3)
(5, 5, 1, 4)
(6, 6, 3, 1)
(7, 2, 2, 1)
(8, 3, 3, 2)
(9, 1, 4, 5)
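This is the correct output, but I cannot get it into the csv file: the file only shows the last row, (9, 1, 4, 5), which suggests that all the data ends up on a single line. For writing, I used: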
with open("snehil.csv", 'w') as f:
    #f.write('\n')
    f.write(json.dumps(s)+'\n')
I even tried to write it to a csv file using a DataFrame, but got the following error: AttributeError: 'tuple' object has no attribute 'values'. For the dataframe, I wrote:
df=pd.DataFrame.from_dict(s, orient='index')
print(df)
Please help me get this into a csv file, with all the rows written and each value in a separate cell. Thanks.
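A likely cause of the single-row file is that open("snehil.csv", 'w') sits inside the loop: mode 'w' truncates the file every time it is opened, so only the last tuple survives. A minimal sketch of one way around this, assuming dict1, dict2 and dict3 are keyed by job_Id as in the code above (the small dicts and the write_rows helper here are hypothetical stand-ins for illustration), is to open the output once and let csv.writer emit one row per job_Id:

```python
import csv
import io

# Hypothetical stand-ins for dict1, dict2 and dict3 (job_Id -> sequence number).
dict1 = {"1": 1, "2": 2, "3": 3}
dict2 = {"1": 1, "2": 1, "3": 2}
dict3 = {"1": 1, "2": 1, "3": 2}


def write_rows(outfile, names, addresses, emails):
    """Write one csv row per job_Id to an already-open text file."""
    writer = csv.writer(outfile)
    writer.writerow(["job_Id", "Name", "Address", "Email"])
    for job_id in names:
        writer.writerow([job_id, names[job_id], addresses[job_id], emails[job_id]])


# An in-memory buffer keeps the sketch self-contained; with a real file, use
# open("snehil.csv", "w", newline="") once, outside any loop.
buffer = io.StringIO()
write_rows(buffer, dict1, dict2, dict3)
print(buffer.getvalue())
```

Because csv.writer handles the separators, each value lands in its own cell, which json.dumps on a tuple does not do.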
Program that reads a csv file, replaces the strings with numbers and writes the result to a csv file:
import csv
import os
from io import StringIO

# tempFile="input1.csv"
with open("input1.csv", 'r') as csvfile:
    # creating a csv reader object
    reader = csv.reader(csvfile, delimiter=',')
    next(reader, None)
    data = {}
    for row in reader:
        for header, value in row.items():
            try:
                data[header].append(value)
            except KeyError:
                data[header] = [value]

for key in data.keys():
    values = data[key]
    things = list(sorted(set(values), key=values.index))
    for i, x in enumerate(data[key]):
        data[key][i] = things.index(x) + 1

with open("snehil.csv", "w") as outfile:
    writer = csv.writer(outfile)
    # Write headers
    writer.writerow(data.keys())
    # Make one row equal to one value from each list
    rows = zip(*data.values())
    # Write rows
    writer.writerows(rows)
When I run this program, I get an error:
    for header, value in row.items():
AttributeError: 'list' object has no attribute 'items'
Please help me, I don't understand why I am getting this error.
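You can read your csv as a dictionary, build a list of the values for each key (column), and then use the set of unique values as an index. First, we read the data.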
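A minimal sketch of the reading step, assuming csv.DictReader is what "read as a dictionary" refers to. DictReader consumes the header line itself and yields each row as a dict keyed by column name, which is what makes row.items() work; csv.reader yields plain lists, and lists have no items() method. The in-memory sample is a hypothetical stand-in for input1.csv.

```python
import csv
import io

# Hypothetical stand-in for input1.csv, using the first rows from the question.
sample = io.StringIO(
    "job_Id,Name,Address,Email\n"
    "1,snehil singh,marathalli,ss@gmail.com\n"
    "2,salman,marathalli,ss@gmail.com\n"
    "3,Amir,HSR,ar@gmail.com\n"
)

# DictReader takes the field names from the first line, so no next() is needed.
reader = csv.DictReader(sample)
rows = list(reader)

# Each row is a dict, so .items() yields (header, value) pairs.
for header, value in rows[0].items():
    print(header, value)
```

With a real file, the same reader can be built from open("input1.csv", newline="").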
Then we restructure the data into a set of keys, each holding a list of values, {key_1: [], key_2: []}. Next, we assign a unique identifier to each value in each list.
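The two steps above can be sketched as follows. This is an illustration under assumptions rather than the answer's exact code: the helper names columns_from_rows and encode_column are hypothetical, the sample rows are made up from the question's data, and a dict of first-seen values is swapped in for the list-index lookup that appears in the question's copy of the program.

```python
# Hypothetical sample: rows as dicts, the shape csv.DictReader would yield.
rows = [
    {"job_Id": "1", "Name": "snehil singh", "Address": "marathalli", "Email": "ss@gmail.com"},
    {"job_Id": "2", "Name": "salman", "Address": "marathalli", "Email": "ss@gmail.com"},
    {"job_Id": "3", "Name": "Amir", "Address": "HSR", "Email": "ar@gmail.com"},
    {"job_Id": "4", "Name": "salman", "Address": "HSR", "Email": "ss@gmail.com"},
]


def columns_from_rows(rows):
    """Regroup row dicts into {header: [value, value, ...]}."""
    data = {}
    for row in rows:
        for header, value in row.items():
            data.setdefault(header, []).append(value)
    return data


def encode_column(values):
    """Number each distinct value in order of first appearance, starting at 1."""
    ids = {}
    encoded = []
    for value in values:
        if value not in ids:
            ids[value] = len(ids) + 1
        encoded.append(ids[value])
    return encoded


data = {header: encode_column(values)
        for header, values in columns_from_rows(rows).items()}
print(data)
# {'job_Id': [1, 2, 3, 4], 'Name': [1, 2, 3, 2], 'Address': [1, 1, 2, 2], 'Email': [1, 1, 2, 1]}
```

The dict lookup gives the same numbering as calling index() on a list of unique values, but avoids rescanning that list for every cell.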
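How do we save data in a new csv file? The csv writer's writerows() expects an iterable of rows, while data holds one list per column, so we need to restructure the data so that each row takes one value from each list. This can be achieved with zip().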
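A minimal sketch of the writing step, assuming data is a dict that maps each header to a list of already-encoded values (the sample dict and the write_columns helper are hypothetical). zip(*data.values()) takes the i-th element of every column list and groups them into the i-th row.

```python
import csv
import io

# Hypothetical column-oriented data: header -> list of encoded values.
data = {
    "job_Id": [1, 2, 3],
    "Name": [1, 2, 3],
    "Address": [1, 1, 2],
    "Email": [1, 1, 2],
}


def write_columns(data, outfile):
    """Write column-oriented data as csv rows to an already-open text file."""
    writer = csv.writer(outfile)
    # The dict keys become the header row.
    writer.writerow(data.keys())
    # zip(*columns) transposes the columns into rows.
    writer.writerows(zip(*data.values()))


# An in-memory buffer keeps the sketch self-contained; with a real file, use
# open("snehil.csv", "w", newline="") as the csv module documentation recommends.
buffer = io.StringIO()
write_columns(data, buffer)
print(buffer.getvalue())
```

Note that zip() stops at the shortest list, so this relies on every column having the same number of values.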