Python: CSV to an array of JSON objects, with each unique value from the CSV as one JSON object, where more than

Posted 2024-04-26 12:37:46


Given this CSV:

Domain,IP,Server,PoweredBy,MetaGenerator,Email
http://www.example1.com,1.1.1.1,,,,
http://www.example2.com,2.2.2.2,Apache,PHP/5.5.9-1ubuntu4.20,,
http://www.example3.com,3.3.3.3,Apache,PHP/5.5.9-1ubuntu4.20,Easy Digital Downloads v2.4.9;Powered by Visual Composer - drag and drop page builder for WordPress.,info@example3.com;sales@example3.com

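I am trying to build an array of JSON objects in which each object is one unique combination of the CSV values, for fields that contain several values separated by ";".

In the sample above, www.example3.com has several MetaGenerator values and several email addresses.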

In this case the array of JSON objects should look like the following, with each combination as its own JSON object in the array:

[{'Domain': 'http://www.example1.com',
  'Email': '',
  'IP': '1.1.1.1',
  'MetaGenerator': '',
  'PoweredBy': '',
  'Server': ''},
 {'Domain': 'http://www.example2.com',
  'Email': '',
  'IP': '2.2.2.2',
  'MetaGenerator': '',
  'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
  'Server': 'Apache'},
 {'Domain': 'http://www.example3.com',
  'Email': 'sales@example3.com',
  'IP': '3.3.3.3',
  'MetaGenerator': 'Easy Digital Downloads v2.4.9',
  'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
  'Server': 'Apache'},
 {'Domain': 'http://www.example3.com',
  'Email': 'sales@example3.com',
  'IP': '3.3.3.3',
  'MetaGenerator': 'Powered by Visual Composer - drag and drop page builder for WordPress.',
  'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
  'Server': 'Apache'},
 {'Domain': 'http://www.example3.com',
  'Email': 'info@example3.com',
  'IP': '3.3.3.3',
  'MetaGenerator': 'Easy Digital Downloads v2.4.9',
  'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
  'Server': 'Apache'},
 {'Domain': 'http://www.example3.com',
  'Email': 'info@example3.com',
  'IP': '3.3.3.3',
  'MetaGenerator': 'Powered by Visual Composer - drag and drop page builder for WordPress.',
  'PoweredBy': 'PHP/5.5.9-1ubuntu4.20',
  'Server': 'Apache'}]

I have the following Python code:

import csv
import pprint
import json

with open("results.csv", 'r') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    out=[]
    d=dict()
    for row in reader:
        if ';' in row['Email']:
          val = row['Email'].split(';')
          for v in val:
            d['Email']=v
            out.append(d)    
        if ';' in row['MetaGenerator']:
          val = row['MetaGenerator'].split(';')
          for v in val:
            d['MetaGenerator']=v
            out.append(d)
        else:
          d=row
          out.append(d) 


pprint.pprint(out)

But it does not work correctly.

How can I achieve my goal? Pseudocode is fine too. Order does not matter. Which module should I use?

Thank you.


1 answer
Answer #1 · Posted 2024-04-26 12:37:46

Try this (see the itertools documentation):

import csv
import pprint
import json
import itertools

out=[]
with open("results.csv", 'r') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    for row in reader:

        Domains = row['Domain'].split(";")
        Ips = row['IP'].split(";")
        Servers = row['Server'].split(";")
        Emails = row['Email'].split(";")
        MetaGenerators = row['MetaGenerator'].split(";")
        PoweredBy = row['PoweredBy'].split(";")

        for comb in itertools.product(Domains, Ips, Servers, Emails, MetaGenerators, PoweredBy):
            (cDomain, cIp, cServer, cEmail, cMeta, cPowered) = comb

            out.append({
                    'Domain': cDomain,
                    'IP': cIp,
                    'Server': cServer,
                    'Email': cEmail,
                    'MetaGenerator': cMeta,
                    'PoweredBy': cPowered
                })

pprint.pprint(out)
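
Rows with empty fields still produce exactly one object, because `"".split(";")` returns `[""]` rather than an empty list, so every factor passed to `itertools.product` has at least one element. A minimal sketch of that behavior, using shortened values loosely based on the sample CSV (the variable names are illustrative only):

```python
import itertools

# An empty CSV field still splits into a one-element list.
assert "".split(";") == [""]

servers = "Apache".split(";")
emails = "info@example3.com;sales@example3.com".split(";")
metas = "Easy Digital Downloads v2.4.9;Powered by Visual Composer".split(";")

# product() yields every combination, taking one value from each list.
combos = list(itertools.product(servers, emails, metas))
print(len(combos))  # 1 * 2 * 2 = 4
```

The number of objects generated per row is the product of the value counts in each column, so a row with many multi-valued columns can expand quickly.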

Here is a less readable but smarter solution that does not depend on the CSV field names:

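import csv
import itertools
import pprint

out = []
with open("results.csv", 'r', newline='') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    headers = reader.fieldnames

    for row in reader:
        # Split every column on ";", in header order (a missing value counts as empty).
        fields = [(row[name] or "").split(";") for name in headers]
        # One dict per combination; zip pairs each header with its value.
        out += [dict(zip(headers, comb)) for comb in itertools.product(*fields)]

pprint.pprint(out)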
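
Since the question asks for JSON rather than a pretty-printed Python list, the result can be serialized with `json.dumps`. The following is only a sketch under stated assumptions: `expand_rows` and `SAMPLE` are hypothetical names introduced here, the CSV is read from an in-memory string instead of `results.csv`, and "Gen A" / "Gen B" are placeholder MetaGenerator values.

```python
import csv
import io
import itertools
import json

# Shortened stand-in for results.csv; "Gen A" and "Gen B" are placeholders.
SAMPLE = """Domain,IP,Server,PoweredBy,MetaGenerator,Email
http://www.example1.com,1.1.1.1,,,,
http://www.example3.com,3.3.3.3,Apache,PHP/5.5.9-1ubuntu4.20,Gen A;Gen B,info@example3.com;sales@example3.com
"""


def expand_rows(csv_text, sep=";"):
    """Return one dict per combination of the sep-separated values in each row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = []
    for row in reader:
        # Split each column in header order; treat a missing value as empty.
        fields = [(row[name] or "").split(sep) for name in reader.fieldnames]
        out.extend(dict(zip(reader.fieldnames, comb))
                   for comb in itertools.product(*fields))
    return out


records = expand_rows(SAMPLE)
# json.dumps produces real JSON (double quotes), unlike pprint.
print(json.dumps(records, indent=2))
```

With this sample the first row yields one object and the second yields four (two MetaGenerator values times two emails).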
