Generating a JSON structure from a file in Python, based on the number of columns in the file

3 Answers

User

Answer 1 · edited 2024-05-16 07:58:49

You can use pandas and .to_json(orient='records'):

import pandas as pd

df = pd.read_csv(file)  # `file` is the path to the CSV file
df.to_json(orient='records')

This outputs as many records as there are IDs in the file:

[{"Member_ID":"M1000","User_ID":"U1000","A_ID":"A1000","Login_ID":"Jim1","First_Name":"Jim","Last_Name":"Kong"},...,{"Member_ID":"M2000","User_ID":"U2000","A_ID":"A2000","Login_ID":"OlilaJ","First_Name":"Olila","Last_Name":"Jayavarman"}]
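For readers who would rather avoid the pandas dependency, roughly the same records-shaped output can be sketched with the standard library. This is only an illustration: the inline sample data and the helper name `csv_to_records_json` are assumptions, not part of the answer above, and pandas may infer numeric types where this sketch keeps every value as a string.

```python
import csv
import io
import json


def csv_to_records_json(text):
    """Return a JSON array with one object per CSV row (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    records = [dict(row) for row in reader]
    # Compact separators mimic the whitespace-free output shown above.
    return json.dumps(records, separators=(",", ":"))


# Assumed sample data, modelled on the output above.
sample = (
    "Member_ID,User_ID,A_ID,Login_ID,First_Name,Last_Name\n"
    "M1000,U1000,A1000,Jim1,Jim,Kong\n"
)
result = csv_to_records_json(sample)
print(result)
```

Each header name becomes a key, so a file with more or fewer columns yields correspondingly larger or smaller objects without any change to the code.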

User

Answer 2 · edited 2024-05-16 07:58:49

Use DictReader to get the file's header:

import csv
with open('names.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    print(reader.fieldnames)  # the file's header (list of column names)
    for row in reader:
        Member_ID = row["Member_ID"]
        User_ID = row["User_ID"]
        Proxy_ID = row.get("Proxy_ID", "")
        A_ID = row.get("A_ID", "")

        if Proxy_ID:
            ....
        else:
            ....
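What goes in the two branches depends on the payload you need, so it is left open here. The part that can be shown safely is why `row.get("Proxy_ID", "")` matters: it returns an empty string when the column is missing from the header, so the same loop copes with both file layouts. The sketch below assumes two small inline sample files and a hypothetical helper `present_ids`; neither comes from the answer itself.

```python
import csv
import io

# Two assumed sample files: one with a Proxy_ID column, one without.
with_proxy = "Member_ID,User_ID,Proxy_ID,A_ID\nM1000,U1000,P1000,A1000\n"
without_proxy = "Member_ID,User_ID,A_ID\nM1000,U1000,A1000\n"

CANDIDATES = ["Member_ID", "User_ID", "Proxy_ID", "A_ID"]


def present_ids(text):
    """Return, per row, the candidate ID columns that hold a non-empty value."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        # row.get() falls back to "" when the column is absent from the header,
        # so a missing Proxy_ID is simply skipped instead of raising KeyError.
        rows.append({name: row[name] for name in CANDIDATES if row.get(name, "")})
    return rows


print(present_ids(with_proxy))
print(present_ids(without_proxy))
```

Using `row["Proxy_ID"]` directly would raise `KeyError` for the second file, which is why the optional columns go through `.get()`.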

User

Answer 3 · edited 2024-05-16 07:58:49

As others have suggested, using the csv module is probably easier, but this can also be done with plain string methods:

delim = "," # Just in case we switch to tsv or something

with open('test.txt', 'r') as file:
    # Create a list of (index, name) pairs for the header columns that end in '_ID'
    header = [(i, col) for i, col in enumerate(next(file).rstrip().split(delim)) if col.endswith('_ID')]

    # Create a list of rows, each split into its comma-separated values
    data = [l.rstrip().split(delim) for l in file.readlines()]

    # Go through each record to create a payload
    for record in data:

        # Here we use the header index to retrieve the respective data to create the dictionary with list comprehension
        payload = {'IndividualInfo': [{key: record[i], 'Identifiertype': '001', 'EType':'01'} for i, key in header]}

        # Do whatever you need with json.dumps(payload)

The results are as follows:

# the index/header pairs
# [(0, 'Member_ID'), (1, 'User_ID'), (2, 'Proxy_ID'), (3, 'A_ID'), (4, 'Login_ID')]

# the separated data
# [['M1000', 'U1000', 'P1000', 'A1000', 'Jim1', 'Jim', 'Kong'], ['M2000', 'U2000', 'P2000', 'A2000', 'OlilaJ', 'Olila', 'Jayavarman'], ['M3000', 'U3000', 'P3000', 'A3000', 'LisaKop', 'Lisa', 'Kopkingg'], ['M4000', 'U4000', 'P4000', 'A4000', 'KishoreP', 'Kishore', 'Pindhar'], ['M5000', 'U5000', 'P5000', 'A5000', 'Gobi123', 'Gobi', 'Nadar']]

# The payloads
# {'IndividualInfo': [{'Member_ID': 'M1000', 'Identifiertype': '001', 'EType': '01'}, {'User_ID': 'U1000', 'Identifiertype': '001', 'EType': '01'}, {'Proxy_ID': 'P1000', 'Identifiertype': '001', 'EType': '01'}, {'A_ID': 'A1000', 'Identifiertype': '001', 'EType': '01'}, {'Login_ID': 'Jim1', 'Identifiertype': '001', 'EType': '01'}]}
# {'IndividualInfo': [{'Member_ID': 'M2000', 'Identifiertype': '001', 'EType': '01'}, {'User_ID': 'U2000', 'Identifiertype': '001', 'EType': '01'}, {'Proxy_ID': 'P2000', 'Identifiertype': '001', 'EType': '01'}, {'A_ID': 'A2000', 'Identifiertype': '001', 'EType': '01'}, {'Login_ID': 'OlilaJ', 'Identifiertype': '001', 'EType': '01'}]}
# {'IndividualInfo': [{'Member_ID': 'M3000', 'Identifiertype': '001', 'EType': '01'}, {'User_ID': 'U3000', 'Identifiertype': '001', 'EType': '01'}, {'Proxy_ID': 'P3000', 'Identifiertype': '001', 'EType': '01'}, {'A_ID': 'A3000', 'Identifiertype': '001', 'EType': '01'}, {'Login_ID': 'LisaKop', 'Identifiertype': '001', 'EType': '01'}]}
# {'IndividualInfo': [{'Member_ID': 'M4000', 'Identifiertype': '001', 'EType': '01'}, {'User_ID': 'U4000', 'Identifiertype': '001', 'EType': '01'}, {'Proxy_ID': 'P4000', 'Identifiertype': '001', 'EType': '01'}, {'A_ID': 'A4000', 'Identifiertype': '001', 'EType': '01'}, {'Login_ID': 'KishoreP', 'Identifiertype': '001', 'EType': '01'}]}
# {'IndividualInfo': [{'Member_ID': 'M5000', 'Identifiertype': '001', 'EType': '01'}, {'User_ID': 'U5000', 'Identifiertype': '001', 'EType': '01'}, {'Proxy_ID': 'P5000', 'Identifiertype': '001', 'EType': '01'}, {'A_ID': 'A5000', 'Identifiertype': '001', 'EType': '01'}, {'Login_ID': 'Gobi123', 'Identifiertype': '001', 'EType': '01'}]}

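Note: I used enumerate() to create the index/header pairs because, if other columns sit between the _ID columns, the stored index gives you an exact way to locate the corresponding data.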
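To make the point about indexes concrete, here is a minimal sketch in which a non-ID column sits between two `_ID` columns. The sample lines are assumed for illustration and are not the data from the question; the payload layout is the one used in the answer.

```python
import json

delim = ","

# Assumed sample lines: First_Name sits between two *_ID columns.
lines = [
    "Member_ID,First_Name,User_ID,Last_Name",
    "M1000,Jim,U1000,Kong",
]

columns = lines[0].split(delim)
# Keep the original column position alongside each *_ID header.
header = [(i, col) for i, col in enumerate(columns) if col.endswith("_ID")]

payloads = []
for line in lines[1:]:
    record = line.split(delim)
    payload = {
        "IndividualInfo": [
            {key: record[i], "Identifiertype": "001", "EType": "01"}
            for i, key in header
        ]
    }
    payloads.append(payload)
    print(json.dumps(payload))
```

Here `header` is `[(0, 'Member_ID'), (2, 'User_ID')]`, so `User_ID` is read from position 2 rather than position 1, and the `First_Name` value never leaks into the payload.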

Edit:

For Python 2.7, use the following instead (sample on repl.it):

delim = "," # Just in case we switch to tsv or something

with open('test.txt', 'r') as file:
    # Create a list of (index, name) pairs for the header columns that end in '_ID'
    header = [(i, col) for i, col in enumerate(next(file).rstrip().split(delim)) if col.endswith('_ID')]
    # Create a list of rows, each split into its comma-separated values
    data = []
    for f in file:
        data.append(f.rstrip().split(delim))

# We're done with reading the file,
# We can proceed outside the `with` context manager from this point

# Go through each record to create a payload
for record in data:

    # Here we use the header index to retrieve the respective data to create the dictionary with list comprehension
    payload = {'IndividualInfo': [{key: record[i], 'Identifiertype': '001', 'EType':'01'} for i, key in header]}

    # Do whatever you need with json.dumps(payload)
