Arranging data from a text file with Python: grouping data by month

Posted 2024-03-28 20:46:34


I have a text file named data.txt that contains the following information:

03/05/2016  502
04/05/2016  502
05/05/2016  501
07/05/2016  504
09/05/2016  505
13/05/2016  506
23/05/2016  501
30/05/2016  501
02/06/2016  502
04/06/2016  502
06/06/2016  501
07/06/2016  504
08/06/2016  505
13/06/2016  506
25/06/2016  499
31/06/2016  501
04/07/2016  501

I want the output to look like this. The data should be stored in another file named reslt.txt (updated):

^{pr2}$

The third column in reslt.txt is the sum of the second-column values in data.txt. I am using Python 2.7, but I don't know how to achieve this. Please help me out.

Update 2

03/05/2016  502
04/05/2016  502.2
05/05/2016  501.9
07/05/2016  504.6
09/05/2016  505
13/05/2016  506.1
23/05/2016  501.3
30/05/2016  501.4
02/06/2016  502
04/06/2016  502
06/06/2016  501
07/06/2016  504
08/06/2016  505
13/06/2016  506
25/06/2016  499
31/06/2016  501
04/07/2016  501 

2 Answers
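import re
from collections import defaultdict

def sum_months(data_path):
    # Map each '/MM/YYYY' key to the running total of the second column.
    sumdict = defaultdict(float)
    with open(data_path, 'r') as f:
        for row in f:
            if not row.strip():
                continue  # skip blank lines
            month = re.findall(r"/\d{2}/\d{4}", row)[0]
            # Match an integer or decimal value at the end of the line,
            # tolerating trailing whitespace.
            value = re.findall(r"(\d+(?:\.\d+)?)\s*$", row)[0]
            sumdict[month] += float(value)  # parse the number without eval()
    return sumdict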

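def pad_strings_and_create_rows(sumdict):
    rows = []
    # .items() works on both Python 2.7 and Python 3
    for k, v in sumdict.items():
        # k looks like '/05/2016'; day 30 is hardcoded as the end of the range
        rows.append('01' + k + ' - ' + '30' + k + ' ' + str(v))
    return sorted(rows)

def write_result_to_file(results_lst):
    # 'w' overwrites reslt.txt, so rerunning the script does not duplicate rows
    with open('reslt.txt', 'w') as f:
        for row in results_lst:
            f.write(row + '\n')

write_result_to_file(pad_strings_and_create_rows(sum_months('data.txt')))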
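
The rows above always end the range on day 30, which does not match months such as May (31 days) or February. A minimal sketch of computing the real last day with the standard library's `calendar.monthrange`, assuming the same '/MM/YYYY' keys that `sum_months` returns. The helper names `month_range_label` and `sort_key` are hypothetical and not part of the original answer.

```python
import calendar


def month_range_label(month_key):
    """Turn a '/MM/YYYY' key into 'DD/MM/YYYY - DD/MM/YYYY' for the whole month.

    month_range_label is a hypothetical helper, not part of the original answer.
    """
    _, month, year = month_key.split('/')
    # monthrange returns (weekday of the 1st, number of days in the month)
    last_day = calendar.monthrange(int(year), int(month))[1]
    return '01/{m}/{y} - {d:02d}/{m}/{y}'.format(m=month, y=year, d=last_day)


def sort_key(month_key):
    """Sort '/MM/YYYY' keys chronologically (year first, then month)."""
    _, month, year = month_key.split('/')
    return int(year), int(month)


print(month_range_label('/05/2016'))
print(month_range_label('/06/2016'))
```

Passing `key=sort_key` to `sorted()` over the dictionary keys would also keep the output in date order when the data spans more than one year, which plain string sorting of the finished rows does not guarantee.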

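It looks like the output requirements have changed a bit! Still, this should give you enough of a starting point to smooth out the rough edges.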

dataStore = {}

# Method to process an input line
def processLine(dateStr, val):
  if dateStr not in dataStore:
    dataStore[dateStr] = val
  else:
    dataStore[dateStr] += val

# Method to read input file line by line
def doStuff(inFile, outFile):
  with open(inFile, 'r') as fp:
    for line in fp:
      dateStr, val = line.split()

      # parse the value as a float so decimals such as 502.2 are accepted
      val = float(val)

      # process the date string to only keep the month and year
      dateStr = dateStr.split('/')
      dateStr = "/".join((dateStr[1], dateStr[2]))

      processLine(dateStr, val)

  # once you are done reading file, generate output
  writeBuf = []
  for key in dataStore:
    writeBuf.append((key, dataStore[key]))
  writeBuf.sort()

  # text mode ('w') so writing str lines also works on Python 3
  with open(outFile, 'w') as fp:
    for tup in writeBuf:
      line = '01/'+tup[0]+' - 30/'+tup[0] + '  ' + str(tup[1]) + '\n'
      fp.write(line)

if __name__ == '__main__':
  inFile = 'data.txt'
  outFile = 'result.txt'

  doStuff(inFile, outFile)

You can easily include the day as well. Just modify the part where I process dateStr. The processLine method would also change.
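
As a rough illustration of that remark, the sketch below takes the grouping key as a parameter, so switching between per-month and per-day totals only means passing a different function. The names `group_totals`, `by_month` and `by_day` are hypothetical and not from the answer above. Values are parsed with `float`, on the assumption that the decimal data from Update 2 should be accepted too.

```python
from collections import defaultdict


def by_month(date_str):
    """'03/05/2016' -> '05/2016' (drop the day)."""
    day, month, year = date_str.split('/')
    return month + '/' + year


def by_day(date_str):
    """Keep the full 'DD/MM/YYYY' string as the key."""
    return date_str


def group_totals(lines, key_func=by_month):
    """Sum the second column of 'DD/MM/YYYY value' lines per key."""
    totals = defaultdict(float)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        date_str, value = parts
        totals[key_func(date_str)] += float(value)
    return dict(totals)


sample = ['03/05/2016  502', '04/05/2016  502.2', '02/06/2016  502']
print(group_totals(sample))
print(group_totals(sample, key_func=by_day))
```

Because the totals live in a local dictionary rather than a module-level one, the function can be called on several files in the same run without the sums mixing.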


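StackOverflow is not a place to have others do all of your homework. Show your current progress, and feel free to ask for help with fixing errors and making improvements. Please keep this in mind the next time you ask for help here.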
