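Converting CSV to JSON and adding a group category to each dictionary
Let me start by saying that I'm a beginner and I'm working with a pretty poor "database". Below is the JSON structure I currently produce from my CSV file (a rough outline). What I want is to add the "group" from column A (Information Technology) to every "data" dictionary, i.e. a "group" key:value pair such as "group": "Information Technology". Everything below row 5 (Consumer Discretionary) should likewise get "group": "Consumer Discretionary".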
{
    "stocks": [
        {
            "data": {
                "portfolio_average_weight": "5.985",
                "portfolio_total_return": "27.948"
            },
            "name": "Google Inc "
        },
        {
            "data": {
                "portfolio_average_weight": "2.896",
                "portfolio_total_return": "24.292"
            },
            "name": "Mastercard Inc "
        }]
}
Column A                 Column B       Column C       Column D
Information Technology   [blank cell]   [blank cell]   [blank cell]
[blank cell]             Google         5.985          27.948
[blank cell]             Mastercard     2.896          24.292
Consumer Discretionary   [blank cell]   [blank cell]   [blank cell]
[blank cell]             xxxxxx         xxxxxxxxx      xxxxxxxxx
This is my current code:
import csv

def test_two_level(line):
    # Build the "data" dictionary from the two numeric columns of a stock row.
    data = {
        "portfolio_average_weight": line[2],
        "portfolio_total_return": line[3]}
    return data

with open('test.csv', 'rU') as csvfile:
    lines = csv.reader(csvfile)
    for line in lines:
        # Stock rows have an empty first column and a name in the second.
        if line[0] == "" and line[1] != "":
            data = test_two_level(line)
            bottom_level = {
                "name": line[1],
                "data": data}
This is what I'd like the final output to look like:
{
    "stocks": [
        {
            "data": {
                "portfolio_average_weight": "5.985",
                "portfolio_total_return": "27.948",
                "group": "Information Technology"
            },
            "name": "Google Inc "
        },
        {
            "data": {
                "portfolio_average_weight": "2.896",
                "portfolio_total_return": "24.292",
                "group": "Information Technology"
            },
            "name": "Mastercard Inc "
        }]
}
Here is the CSV file:
Information Technology,,,
,Google Inc ,5.985,27.948
,Mastercard Inc ,2.896,24.292
Consumer Discretionary,,,
1 Answer
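I prefer csv.DictReader over csv.reader because code written with it is easier to follow. Each row is read as a dictionary, which keeps the code tidy, especially when building JSON objects, since those are usually made up of one or more dictionaries themselves.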
import csv, json

# Python 2 code: there the csv module expects the file opened in binary mode.
with open('csv_to_json_test.csv', 'rb') as csvfile:
    csvfields = 'group', 'name', 'average_weight', 'total_return'
    reader = csv.DictReader(csvfile, fieldnames=csvfields)
    database = {}
    stocks = database['stocks'] = []  # initialize item to be parsed
    group = None
    for row in reader:
        if row['group']:
            group = row['group']  # group header row: remember it for later rows
        else:
            stocks.append(
                {
                    'data': {
                        "portfolio_average_weight": row['average_weight'],
                        "portfolio_total_return": row['total_return']
                    },
                    'name': row['name'].rstrip(),  # strips trailing spaces
                    'group': group,
                }
            )

print 'database =',
print json.dumps(database, indent=4)
Output:
database = {
    "stocks": [
        {
            "group": "Information Technology",
            "data": {
                "portfolio_average_weight": "5.985",
                "portfolio_total_return": "27.948"
            },
            "name": "Google Inc"
        },
        {
            "group": "Information Technology",
            "data": {
                "portfolio_average_weight": "2.896",
                "portfolio_total_return": "24.292"
            },
            "name": "Mastercard Inc"
        }
    ]
}
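Note that this output places "group" next to "data" and "name", whereas the desired output in the question nests it inside "data" and keeps the trailing space in each name. If that exact shape matters, something like the following might work. It is only a sketch, written for Python 3 rather than Python 2; the helper name build_database, the SAMPLE_CSV string (copied from the question's file) and the column names are my own assumptions, not part of the original code.

```python
import csv
import io
import json

# Sample rows copied from the question's CSV file (assumed to be representative).
SAMPLE_CSV = (
    "Information Technology,,,\n"
    ",Google Inc ,5.985,27.948\n"
    ",Mastercard Inc ,2.896,24.292\n"
    "Consumer Discretionary,,,\n"
)

# Illustrative column names; the file itself has no header row.
FIELDS = ("group", "name", "average_weight", "total_return")


def build_database(csvfile):
    """Return {'stocks': [...]} with each stock's group stored inside 'data'."""
    reader = csv.DictReader(csvfile, fieldnames=FIELDS)
    stocks = []
    group = None  # most recent group header seen so far
    for row in reader:
        if row["group"]:
            group = row["group"]  # group header row: remember it, emit nothing
        elif row["name"]:
            stocks.append({
                "data": {
                    "portfolio_average_weight": row["average_weight"],
                    "portfolio_total_return": row["total_return"],
                    "group": group,
                },
                "name": row["name"],  # trailing space kept, as in the desired output
            })
    return {"stocks": stocks}


database = build_database(io.StringIO(SAMPLE_CSV))
print(json.dumps(database, indent=4))
```

To read a real file instead of the in-memory sample, you could pass `open('test.csv', newline='')` to build_database inside a `with` block.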