Converting CSV to JSON, appending the group category to each dictionary

Asked 2025-04-17 21:57

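Let me preface this by saying I'm a beginner working with a pretty poor "database". Below is the JSON structure I currently produce from my CSV file (a rough outline). What I want is to add the "group" from Column A (Information Technology) to each "data" dictionary, i.e. a key/value pair like "group": "Information Technology". Everything below row 5 (Consumer Discretionary) should likewise get "group": "Consumer Discretionary".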

{
  "stocks": [
    {
      "data": {
        "portfolio_average_weight": "5.985",
        "portfolio_total_return": "27.948"
      },
      "name": "Google Inc              "
    },
    {
      "data": {
        "portfolio_average_weight": "2.896",
        "portfolio_total_return": "24.292"
      },
      "name": "Mastercard Inc          "
    }]
}

Column A                           Column B         Column C        Column D

Information Technology           [blank cell]     [blank cell]     [blank cell]
[blank cell]                        Google            5.985           27.948
[blank cell]                     Mastercard           2.896           24.292
Consumer Discretionary           [blank cell]     [blank cell]     [blank cell]
[blank cell]                        xxxxxx         xxxxxxxxx          xxxxxxxxx

Here is my current code:

import csv

def test_two_level(line):
    # Build the inner "data" dictionary from the two numeric columns.
    data = {
        "portfolio_average_weight": line[2],
        "portfolio_total_return": line[3]}
    return data

with open('test.csv', newline='') as csvfile:
    lines = csv.reader(csvfile)
    for line in lines:
        # Stock rows have an empty first column and a name in the second.
        if line[0] == "" and line[1] != "":
            data = test_two_level(line)
            bottom_level = {
                "name": line[1],
                "data": data}

This is what I would like the final output to look like:

{
  "stocks": [
    {
      "data": {
        "portfolio_average_weight": "5.985",
        "portfolio_total_return": "27.948",
        "group": "Information Technology"
      },
      "name": "Google Inc              "
    },
    {
      "data": {
        "portfolio_average_weight": "2.896",
        "portfolio_total_return": "24.292",
        "group": "Information Technology"
      },
      "name": "Mastercard Inc          "
    }]
}

Here is the CSV file:

Information Technology,,,
,Google Inc              ,5.985,27.948
,Mastercard Inc          ,2.896,24.292
Consumer Discretionary,,,

1 Answer


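I prefer csv.DictReader to csv.reader because the resulting code is easier to read. Each row is read as a dictionary, which also keeps the code tidy when building JSON objects, since those are usually made up of one or more dictionaries.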
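To illustrate what csv.DictReader yields when it is given explicit fieldnames, here is a minimal sketch. It reads two of the question's sample rows from an in-memory string instead of a file, which is an assumption made only to keep the example self-contained.

```python
import csv
import io

# Sample rows copied from the question's CSV, held in memory for illustration.
sample = (
    "Information Technology,,,\n"
    ",Google Inc              ,5.985,27.948\n"
)

fields = ('group', 'name', 'average_weight', 'total_return')
rows = list(csv.DictReader(io.StringIO(sample), fieldnames=fields))

# A group row has a non-empty 'group' and empty strings under the other keys.
print(rows[0]['group'])
# A stock row has an empty 'group' and the stock's values under the other keys.
print(repr(rows[1]['group']), rows[1]['name'].rstrip(), rows[1]['total_return'])
```

Because every row arrives keyed by name, telling a group heading from a stock row comes down to checking whether row['group'] is empty.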

import csv
import json

# newline='' is what the csv module recommends for files opened in text mode.
with open('csv_to_json_test.csv', newline='') as csvfile:
    csvfields = 'group', 'name', 'average_weight', 'total_return'
    reader = csv.DictReader(csvfile, fieldnames=csvfields)
    database = {}
    stocks = database['stocks'] = []  # list that collects the parsed stocks
    group = None  # most recent group heading seen
    for row in reader:
        if row['group']:
            # Group heading row: remember it for the stock rows that follow.
            group = row['group']
        else:
            stocks.append(
                {
                    'group': group,
                    'data': {
                        "portfolio_average_weight": row['average_weight'],
                        "portfolio_total_return": row['total_return']
                    },
                    'name': row['name'].rstrip(),  # strips trailing spaces
                }
            )

print('database =', json.dumps(database, indent=4))

Output:

database = {
    "stocks": [
        {
            "group": "Information Technology",
            "data": {
                "portfolio_average_weight": "5.985",
                "portfolio_total_return": "27.948"
            },
            "name": "Google Inc"
        },
        {
            "group": "Information Technology",
            "data": {
                "portfolio_average_weight": "2.896",
                "portfolio_total_return": "24.292"
            },
            "name": "Mastercard Inc"
        }
    ]
}
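The code above stores "group" next to "data" and "name", whereas the desired output in the question nests it inside "data" and keeps the trailing spaces in "name". A minimal sketch of that variant might look like the following. The helper name build_database and the in-memory CSV_TEXT are hypothetical, introduced only so the example runs on its own.

```python
import csv
import io
import json

# The question's CSV, held in memory (an assumption for a self-contained example).
CSV_TEXT = (
    "Information Technology,,,\n"
    ",Google Inc              ,5.985,27.948\n"
    ",Mastercard Inc          ,2.896,24.292\n"
    "Consumer Discretionary,,,\n"
)

def build_database(csv_text):
    """Hypothetical helper: parse the CSV text, tagging each stock with its group."""
    fields = ('group', 'name', 'average_weight', 'total_return')
    stocks = []
    group = None
    for row in csv.DictReader(io.StringIO(csv_text), fieldnames=fields):
        if row['group']:
            group = row['group']  # remember the most recent group heading
        else:
            stocks.append({
                'data': {
                    'portfolio_average_weight': row['average_weight'],
                    'portfolio_total_return': row['total_return'],
                    'group': group,  # nested inside "data", as in the question
                },
                'name': row['name'],  # trailing spaces kept, as in the question
            })
    return {'stocks': stocks}

database = build_database(CSV_TEXT)
print(json.dumps(database, indent=2))
```

To read from a file instead, the same loop can be run over a file object opened with newline='' in place of the io.StringIO wrapper.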
