Parsing a CSV file and writing rows to a file based on relative size

Posted 2024-05-29 03:00:24


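I have a long list of weather variables that I have already filtered to remove records that do not meet certain criteria. For example, all data points lie between 11 am (11) and 5 pm (17). The data between 11 and 17 on a given day represents a single event, but not every day contains an event. I am trying to determine on which days an event occurred. I know that whenever a new event starts, the value in the time (HH24) column is lower than the value before it. For example, if the value 16 (4 pm) is followed by 11, 12, 13, 14 or 15, I know the data has moved on to a new date and event.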

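The code I am trying to write would take the row containing the last value (for example, 17) and write it to a file, and then write the next row as well. That way the new CSV file would contain the start time (plus the other information) and the end time of each event. I assume I will need a for loop, but I do not know how to use the csv writer for this particular challenge. Below is an outline of my code, including the part I need help with.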

import csv

with open("weather_out_2000_2006_time_filtered_and_speed_filtered.csv", "rb") as input, open("X:\weatherresults\seabreezeevents.csv", "wb") as wanted:
    reader = csv.DictReader(input, delimiter=",", skipinitialspace=True)
    fieldnames = reader.fieldnames
    writer_wanted = csv.DictWriter(wanted, fieldnames, delimiter=",")
    writer_wanted.writeheader()

    for line_number, row in enumerate(reader):
        try:
            if float(row["HH24"]) < #the value in the subsequent row:
                writer_wanted.writerow(row) # and also write subsequent row
        except:
            print "Failed to parse line", line_number
            print row

My data file looks like this. I have included the transition of HH24 from a high value to a low value so you can see what I mean.

hd,Station Number,Year Month Day Hours Minutes in YYYY,MM,DD,HH24,MI format in Local time,Year Month Day Hours Minutes in YYYY,MM,DD,HH24,MI format in Local standard time,Year Month Day Hours Minutes in YYYY,MM,DD,HH24,MI format in Universal coordinated time,Precipitation since last (AWS) observation in mm,Quality of precipitation since last (AWS) observation value,Air Temperature in degrees Celsius,Quality of air temperature,Air temperature (1-minute maximum) in degrees Celsius,Quality of air temperature (1-minute maximum),Air temperature (1-minute minimum) in degrees Celsius,Quality of air temperature (1-minute minimum),Wet bulb temperature in degrees Celsius,Quality of Wet bulb temperature,Wet bulb temperature (1 minute maximum) in degrees Celsius,Quality of wet bulb temperature (1 minute maximum),Wet bulb temperature (1 minute minimum) in degrees Celsius,Quality of wet bulb temperature (1 minute minimum),Dew point temperature in degrees Celsius,Quality of dew point temperature,Dew point temperature (1-minute maximum) in degrees Celsius,Quality of Dew point Temperature (1-minute maximum),Dew point temperature (1 minute minimum) in degrees Celsius,Quality of Dew point Temperature (1 minute minimum),Relative humidity in percentage %,Quality of relative humidity,Relative humidity (1 minute maximum) in percentage %,Quality of relative humidity (1 minute maximum),Relative humidity (1 minute minimum) in percentage %,Quality of Relative humidity (1 minute minimum),Wind (1 minute) speed in km/h,Wind (1 minute) speed quality,Minimum wind speed (over 1 minute) in km/h,Minimum wind speed (over 1 minute) quality,Wind (1 minute) direction in degrees true,Wind (1 minute) direction quality,Standard deviation of wind (1 minute),Standard deviation of wind (1 minute) direction quality,Maximum wind gust (over 1 minute) in km/h,Maximum wind gust (over 1 minute) quality,Visibility (automatic - one minute data) in km,Quality of visibility (automatic - one minute data),Mean sea level pressure in hPa,Quality of mean sea level pressure,Station level pressure in hPa,Quality of station level pressure,QNH pressure in hPa,Quality of QNH pressure,#
    hd,40842,2000,3,22,13,40,2000,3,22,13,40,2000,3,22,13,40,0,N,20.4,N,20.5,N,20.4,N,20.2,N,20.2,N,20.1,N,20.1,N,20.1,N,20,N,98,N,,N,,N,9,N,8,N,18,N,7,N,11,N,,N,1013.3,N,1012.2,N,1013.3,N,#
    hd,40842,2000,3,22,13,47,2000,3,22,13,47,2000,3,22,13,47,0,N,20.5,N,20.5,N,20.5,N,20.2,N,20.2,N,20.2,N,20.1,N,20.1,N,20,N,97,N,,N,,N,4,N,0,N,56,N,75,N,5,N,,N,1013.2,N,1012.1,N,1013.2,N,#
    hd,40842,2000,3,23,11,0,2000,3,23,11,0,2000,3,23,11,0,0,N,23.4,N,23.4,N,23.3,N,21.3,N,21.4,N,21.3,N,20.2,N,20.3,N,20.2,N,82,N,,N,,N,8,N,5,N,66,N,2,N,9,N,,N,1013.6,N,1012.5,N,1013.6,N,#
    hd,40842,2000,3,23,11,1,2000,3,23,11,1,2000,3,23,11,1,0,N,23.4,N,23.4,N,23.4,N,21.4,N,21.4,N,21.3,N,20.3,N,20.3,N,20.2,N,82,N,,N,,N,8,N,5,N,68,N,3,N,9,N,,N,1013.6,N,1012.5,N,1013.6,N,#

1 answer
User
#1 · Posted 2024-05-29 03:00:24

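You want to write a row whenever the date changes, so I think it is best to build a date variable and compare on that. (I mentioned in a comment why comparing only the "HH24" values cannot determine when a new date has been reached.)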
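To illustrate the point about dates versus hours, here is a small sketch in Python 3 (the question's own code is Python 2). The readings and the helper names `hour_drops` and `date_changes` are made up for illustration and are not from the original data. The second day starts at a later hour than the first day ended, so the hour never drops even though the date changes.

```python
import datetime

# Hypothetical (year, month, day, hour) readings, not from the real file.
readings = [
    (2000, 3, 22, 11),
    (2000, 3, 22, 12),
    (2000, 3, 23, 13),
    (2000, 3, 23, 14),
]

def hour_drops(rows):
    """Indexes where the hour is lower than in the previous reading."""
    return [i for i in range(1, len(rows)) if rows[i][3] < rows[i - 1][3]]

def date_changes(rows):
    """Indexes where the calendar date is later than in the previous reading."""
    dates = [datetime.date(y, m, d) for y, m, d, _ in rows]
    return [i for i in range(1, len(dates)) if dates[i] > dates[i - 1]]

print(hour_drops(readings))    # [] : the hour check misses the new day
print(date_changes(readings))  # [2] : the date check finds it
```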

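It is much easier to keep track of and write out the previous row (since you have already processed it) than the next row, so that is how you should approach it. Something like the following should help (untested):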

...
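import datetime

# Column names are taken from the sample header in the question. The header
# repeats "MM" and "DD", so check which copy DictReader actually returns.
prev_row = None
for line_number, row in enumerate(reader):
    try:
        # DictReader yields strings, so convert before building the date.
        dt = datetime.date(year=int(row["Year Month Day Hours Minutes in YYYY"]),
                           month=int(row["MM"]), day=int(row["DD"]))
        # A later date means prev_row ended one event and row starts the next.
        if prev_row is not None and dt > prev_row['dt']:
            writer_wanted.writerow(prev_row['row'])
            writer_wanted.writerow(row)
        prev_row = {'row': row, 'dt': dt}
    except (KeyError, ValueError):
        print "Failed to parse line", line_number
        print row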
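For anyone on Python 3, the same previous-row idea can be sketched as a self-contained script. This is only a sketch under assumptions: the shortened header (`YYYY`, `MM`, `DD`, `HH24`, `MI`) and the helper name `event_boundaries` are hypothetical, and the sample rows reuse just the first date and time columns from the data in the question.

```python
import csv
import datetime
import io

def event_boundaries(rows, year_key="YYYY", month_key="MM", day_key="DD"):
    """Yield (last_row_of_day, first_row_of_next_day) pairs."""
    prev_row, prev_date = None, None
    for row in rows:
        date = datetime.date(int(row[year_key]), int(row[month_key]), int(row[day_key]))
        if prev_date is not None and date > prev_date:
            yield prev_row, row
        prev_row, prev_date = row, date

# In-memory stand-ins for the input and output files.
sample = io.StringIO(
    "YYYY,MM,DD,HH24,MI\n"
    "2000,3,22,13,40\n"
    "2000,3,22,13,47\n"
    "2000,3,23,11,0\n"
    "2000,3,23,11,1\n"
)
out = io.StringIO()
reader = csv.DictReader(sample)
writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
writer.writeheader()
for last_row, first_row in event_boundaries(reader):
    writer.writerow(last_row)   # end of the earlier event
    writer.writerow(first_row)  # start of the next event
print(out.getvalue())
```

Like the loop above, this only emits the rows on either side of a date change, so the very first row of the file (the start of the first event) and the very last row (the end of the last event) are not written unless you handle them separately.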

Edit:

In your program, the first part of this line:

with open("weather_out_2000_2006_time_filtered_and_speed_filtered.csv", "rb") as input

opens the named .csv file for input (because the mode is 'rb'; see the Python documentation on file modes).

The next part of the same line:

open("X:\weatherresults\seabreezeevents.csv", "wb") as wanted

opens the named output file ('wb' mode); see the same reference as above.

At this point, the variable names input and wanted both refer to file objects.
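The 'rb' and 'wb' modes are what the Python 2 csv module expects. In Python 3 the csv module works on text files, and its documentation recommends opening them with newline="". A minimal sketch, using a temporary directory and a hypothetical file name in place of the real paths:

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")  # hypothetical file name

    # Text mode plus newline="" lets the csv module control line endings.
    with open(path, "w", newline="") as wanted:
        csv.writer(wanted).writerows([["HH24", "MI"], ["13", "47"]])

    with open(path, "r", newline="") as source:
        rows = list(csv.reader(source))

print(rows)  # [['HH24', 'MI'], ['13', '47']]
```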

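Next, your program uses the csv module to read the file in a way that helps parse comma-separated text files, and assigns that reference to the reader variable. Similarly, it assigns a csv.DictWriter to the variable writer_wanted, which handles formatting as rows are written to the output file (referenced by wanted).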
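One detail that may matter with this data: the sample header repeats names such as HH24 (once each for local time, local standard time and UTC). The Python 3 sketch below uses a made-up header with repeated names to show how csv.DictReader handles that case: the fieldnames list keeps every name, but each row dict holds only the last value for a repeated key.

```python
import csv
import io

# Hypothetical header that repeats "DD" and "HH24", like the real one does.
source = io.StringIO("DD,HH24,DD,HH24\n22,13,23,11\n")
reader = csv.DictReader(source, skipinitialspace=True)

print(reader.fieldnames)  # ['DD', 'HH24', 'DD', 'HH24']
row = next(reader)
print(row["HH24"])        # 11 (the later column overrides the earlier one)
```

If the three time groups can differ in your file, it may be safer to rename the header columns, or to read by position with csv.reader instead.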

After that, it reads one row at a time:

for line_number, row in enumerate(reader):

and writes one row at a time:

writer_wanted.writerow(row)

If you want more detail, your best bet is to work through some Python tutorials (Google is your friend).
