How do I fix a date-handling error in Python?

0 votes
1 answer
35 views
Asked 2025-04-14 18:01

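I have just started learning Python and am trying some challenges I found online. I want to extract dates from a string so that I can put them into a column. The input I am given is: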

"131594", "", "BIDGROUP", 1, 0, 0, 2, "", 0:00, 0:00, 01JAN2009, 01JAN2009, 01JAN2009, 01JAN2009, false, 0,

"131594", "AWARD", "UNTOUCHABLE", 1, 1, 0, 1, "", 0:00, 0:00, 10JUN2014, 13JUN2014 23:59, 01JAN2009, 01JAN2009, false, 100,

"131594", "AWARD", "ADVANCED_TRIP", 1, 2, 0, 0, "740025Jun2014,705406Jun2014,737722Jun2014,696130Jun2014", 0:00, 0:00, 01JAN2009, 01JAN2009, 01JAN2009, 01JAN2009, false, 15,

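First, I look for the element "ADVANCED_TRIP"; then, for each identifier found in that string, I need to create a new row named "TRIP_ID" and keep the dates mentioned earlier. This is the result I get from my attempt: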

"131594", "", "BIDGROUP", 1, 0, 0, 2, "", 0:00, 0:00, 01JAN2009, 01JAN2009, 01JAN2009, 01JAN2009, false, 0,

"131594", "AWARD", "UNTOUCHABLE", 1, 1, 0, 1, "", 0:00, 0:00, 10JUN2014, 13JUN2014 23:59, 01JAN2009, 01JAN2009, false, 100,

"131594", "AWARD", "TRIP_ID", 1, 2, 0, 0, "7400", 0:00, 0:00, 01JAN2009, 01JAN2009, 01JAN2009, 01JAN2009, false, 15,

"131594", "AWARD", "TRIP_ID", 1, 3, 0, 0, "7054", 0:00, 0:00, 01JAN2009, 01JAN2009, 01JAN2009, 01JAN2009, false, 15,

"131594", "AWARD", "TRIP_ID", 1, 4, 0, 0, "7377", 0:00, 0:00, 01JAN2009, 01JAN2009, 01JAN2009, 01JAN2009, false, 15,

"131594", "AWARD", "TRIP_ID", 1, 5, 0, 0, "6961", 0:00, 0:00, 01JAN2009, 01JAN2009, 01JAN2009, 01JAN2009, false, 15,

The correct output, however, should look like this:

"131594", "", "BIDGROUP", 1, 0, 0, 5, "", 0:00, 0:00, 01JAN2009, 01JAN2009, 01JAN2009, 01JAN2009, false, 0,

"131594", "AWARD", "UNTOUCHABLE", 1, 1, 0, 1, "", 0:00, 0:00, 10JUN2014, 13JUN2014 23:59, 01JAN2009, 01JAN2009, false, 100,

"131594", "AWARD", "TRIP_ID", 1, 2, 0, 0, "7400", 0:00, 0:00, 25Jun2014, 01JAN2009, 01JAN2009, 01JAN2009, false, 15,

"131594", "AWARD", "TRIP_ID", 1, 3, 0, 0, "7054", 0:00, 0:00, 06Jun2014, 01JAN2009, 01JAN2009, 01JAN2009, false, 15,

"131594", "AWARD", "TRIP_ID", 1, 4, 0, 0, "7377", 0:00, 0:00, 22Jun2014, 01JAN2009, 01JAN2009, 01JAN2009, false, 15,

"131594", "AWARD", "TRIP_ID", 1, 5, 0, 0, "6961", 0:00, 0:00, 30Jun2014, 01JAN2009, 01JAN2009, 01JAN2009, false, 15,

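The only thing I do not understand is how to extract the date next to each "TRIP_ID" identifier and put it in the corresponding column, which is the eleventh column. For example, in my output I have: "7400", 0:00, 0:00, 01JAN2009, 01JAN2009, but it should be: "7400", 0:00, 0:00, 25Jun2014, 01JAN2009.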

This is the code I wrote:

import sys

lines = []

for line in sys.stdin:
    lines.append(line.strip())

output_lines = []

for line in lines:
    elements = line.split(", ")
    if elements[2] == '"ADVANCED_TRIP"':
        elements[2] = '"TRIP_ID"'
        trip_ids = elements[7].split(",")
        for i, trip_id in enumerate(trip_ids):
            trip_id = trip_id.strip('"')
            output_line = elements[:7] + [f'"{trip_id[:4]}"'] + elements[8:]
            output_line[4] = str(int(output_line[4]) + i)
            output_lines.append(output_line)
    else:
        output_lines.append(elements)

for output_line in output_lines:
    print(", ".join(output_line))

Does anyone know how I should proceed?

1 Answer

0

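You are on the right track. To extract the date that belongs to each "TRIP_ID" correctly, you can use a regular expression to recognize the date format inside the string. You could try this: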

import sys
import re

lines = []

for line in sys.stdin:
    # Skip blank lines so that elements[2] below cannot raise an IndexError.
    if line.strip():
        lines.append(line.strip())

output_lines = []

for line in lines:
    elements = line.split(", ")
    if elements[2] == '"ADVANCED_TRIP"':
        elements[2] = '"TRIP_ID"'
        trip_ids = elements[7].split(",")
        # Search only the eighth field, so the date columns later in the row are never picked up.
        dates = re.findall(r'\d{2}[A-Za-z]{3}\d{4}', elements[7])  # e.g. 25Jun2014
        for i, (trip_id, date) in enumerate(zip(trip_ids, dates)):
            trip_id = trip_id.strip('"')
            output_line = elements[:7] + [f'"{trip_id[:4]}"'] + elements[8:]
            output_line[4] = str(int(output_line[4]) + i)
            output_line[10] = date  # Replace the placeholder date with the extracted date
            output_lines.append(output_line)
    else:
        output_lines.append(elements)

for output_line in output_lines:
    print(", ".join(output_line))
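
If it helps, here is a small standalone sketch of just the extraction step. It assumes every token in the eighth field is a four-digit trip ID followed directly by a DDMonYYYY date; that is inferred from your sample, not from any known spec. The names TOKEN_RE, split_trip_field and parse_date are made up for illustration. parse_date relies on strptime's %b, which expects English month abbreviations under the default locale.

```python
import re
from datetime import datetime

# Assumed token layout: 4-digit trip ID, then a DDMonYYYY date (e.g. 740025Jun2014).
TOKEN_RE = re.compile(r'(\d{4})(\d{2}[A-Za-z]{3}\d{4})')


def split_trip_field(field):
    """Return a list of (trip_id, date_text) pairs found in one field."""
    return TOKEN_RE.findall(field)


def parse_date(date_text):
    """Convert text such as '25Jun2014' into a datetime.date (English month names assumed)."""
    return datetime.strptime(date_text, "%d%b%Y").date()


field = '"740025Jun2014,705406Jun2014,737722Jun2014,696130Jun2014"'
for trip_id, date_text in split_trip_field(field):
    # Print the ID, the raw date text, and the same date in ISO form.
    print(trip_id, date_text, parse_date(date_text).isoformat())
```

Capturing the ID and the date in one pattern keeps each pair together, so you do not have to rely on two separate lists lining up. Parsing the text with strptime is optional, but it will raise a ValueError if a token contains something that only looks like a date.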

Hope this helps! Good luck with learning Python!
