Grouping by month in a JSON file

# import json module for parsing
import json
import re

# define a list of keywords
keywords = ('tax', 'Tax', 'policy', 'Policy', 'regulation', 'Regulation',
            'spending', 'Spending', 'budget', 'Budget', 'oil', 'Oil',
            'Holyrood', 'holyrood', 'Scottish parliament',
            'Scottish Parliament', 'scottish parliament')

with open('Aberdeen2005.json') as json_file:
    # read json file line by line
    for line in json_file.readlines():
        json_dict = json.loads(line)
        if any(keyword in json_dict["body"].lower() for keyword in keywords):
            print(json_dict['date'].split()[0])

2 answers

User

Answer 1 · edited 2024-05-12 18:13:00

This is only an example, since you did not show what your JSON file looks like.

import re

months = ('January', 
         'February', 
         'March', 
         'April',
         'May', 
         'June', 
         'July',
         'August',
         'September',
         'October',
         'November',
         'December')

file_content = '''
December 29, 2005 Thursday
December 15, 2005 Thursday
April 21, 2005
April 6, 2005
January 19, 2005
January 19, 2005
January 11, 2005
'''

d = {m:0 for m in months}

for line in file_content.splitlines():
    if line != '':
        # filter out empty strings from the split
        data = list(filter(lambda x: x != '', re.split(r'[,\s]+', line)))
        d[data[0]] += 1 # Grouping

print(d)
print(d['January'])

Output

{'August': 0, 'July': 0, 'November': 0, 'December': 2, 'April': 2, 'May': 0, 'October': 0, 'January': 3, 'September': 0, 'June': 0, 'March': 0, 'February': 0}
3
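
To connect this back to the question's file, here is a minimal sketch that applies the same counting idea to JSON lines. It assumes each line is a JSON object with "body" and "date" fields (as in the question's code) and that the date string starts with the month name, as in the sample above. The helper name count_by_month and the sample records are made up for illustration.

```python
import json
from collections import Counter

# Lowercase keywords only: the body is lowercased before matching.
KEYWORDS = ('tax', 'policy', 'regulation', 'spending', 'budget', 'oil',
            'holyrood', 'scottish parliament')


def count_by_month(lines, keywords=KEYWORDS):
    """Count keyword-matching records per month name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        body = record["body"].lower()
        if any(keyword in body for keyword in keywords):
            # Assumes the date starts with the month name, e.g. "April 21, 2005"
            counts[record["date"].split()[0]] += 1
    return counts


# Made-up sample records standing in for Aberdeen2005.json
sample_lines = [
    '{"date": "December 29, 2005 Thursday", "body": "Oil prices rise"}',
    '{"date": "December 15, 2005 Thursday", "body": "New budget announced"}',
    '{"date": "April 21, 2005", "body": "Local football results"}',
    '{"date": "January 19, 2005", "body": "Holyrood debates tax policy"}',
]

result = count_by_month(sample_lines)
print(dict(result))
```

With the real file, the open file object can be passed directly, since iterating over it yields one line at a time. Months with no matches are simply absent from the Counter and read as 0 when looked up.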

User

Answer 2 · edited 2024-05-12 18:13:00

You can try this with pandas:

import pandas
import json

# Note: if this works, the file holds one JSON object per line rather than a single valid JSON document
df = pandas.DataFrame([json.loads(l) for l in open('Aberdeen2005.json')])

# Parse dates and set index
df.date = pandas.to_datetime(df.date)
df.set_index('date', inplace=True)

# match keywords (uses the keywords tuple defined in the question)
matchingbodies = df[df.body.str.contains("|".join(keywords))].body

# Count by month
counts = matchingbodies.groupby(lambda x: x.month).agg(len)
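
The keyword step above joins the keywords into a single regular-expression alternation. As a rough standard-library sketch of that idea (not the pandas code itself), the pattern can be compiled once with re.IGNORECASE, which makes the capitalised duplicates in the question's keyword list unnecessary. The names pattern and matches_keywords and the sample strings are illustrative assumptions.

```python
import re

keywords = ('tax', 'policy', 'regulation', 'spending', 'budget', 'oil',
            'holyrood', 'scottish parliament')

# re.escape guards against keywords that contain regex metacharacters.
pattern = re.compile('|'.join(map(re.escape, keywords)), re.IGNORECASE)


def matches_keywords(text):
    """Return True if any keyword occurs in text (hypothetical helper)."""
    return pattern.search(text) is not None


print(matches_keywords("Scottish Parliament votes on Budget"))
print(matches_keywords("Local football results"))
```

This is still substring matching, as in the question, so "oil" also matches inside words such as "boiler"; wrapping the alternation in \b word boundaries is one option if that is unwanted. In the pandas version, str.contains accepts case=False for the same case-insensitive effect.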
