Plotting non-uniform time series data from a text file

Asked 2025-04-12 14:02

This is a follow-up to the question on how to read a .txt file to plot a graph.

I have a file that stores time series data in the following format:

00:01:28,102,103,103 20-03-2024
00:02:16,111,110,110
00:02:33,108,109,109
00:02:49,107,108,108
...24 hours read...  # not in the measurement file
23:58:54,111,112,112
23:59:11,109,110,110
23:59:47,115,116,117
00:00:04,115,116,116 21-03-2024
00:00:20,121,122,120
00:00:36,124,125,125
...24 hours read...
23:59:02,115,115,116
23:59:19,114,114,114
23:59:51,113,114,115
00:00:07,113,114,115 22-03-2024
00:00:24,116,117,115
00:00:45,115,115,116
...24 hours read...
23:59:08,101,101,100
23:59:32,103,103,102
23:59:48,102,102,102
...Next day...

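Each line has a timestamp and three numeric readings, and sometimes a date that marks the start of a new day. I want to plot this data with pandas and matplotlib, but I have run into two main problems: the x-axis labels (hours) overlap, and the plot is slow to load.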

This is how I currently plot it:

plt.figure(figsize=(15,9))
plt.xlabel('Day')
plt.ylabel('Voltage')
# Plot three series from the data
plt.plot(C0Temp, C1Temp, label="Voltage", color=LineColorTemp1Text)
plt.plot(C2Temp, C3Temp, label="Max", color='r')
plt.plot(C4Temp, C5Temp, label="Min", color='g')
plt.legend()

# Attempt to format x-axis to handle daily data
locator = mdates.AutoDateLocator(minticks=12, maxticks=24)
plt.gcf().axes[0].xaxis.set_major_locator(locator)
plt.xticks(rotation=45)

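I would appreciate suggestions on how to plot this data effectively by day, or even by month, so that the x-axis labels stay readable and the plot loads faster.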

1 Answer


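Because the format of this text file is not uniform, it has to be read line by line. That makes it possible to handle the different cases: some lines carry a date, most do not, and some are not data at all (for example "...24 hours read..." and "...Next day..."). Reading line by line lets the script separate data entries from metadata or comments, so only the relevant information is processed. The result is a structured dataset built from an irregular file, ready for analysis and visualization.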

I recommend standardizing the format of the measurement output.
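
To illustrate what a uniform format could look like: if the logger wrote a full timestamp on every line, the whole file could be loaded with a single `pd.read_csv` call and no custom parsing. This is only a sketch. The column names and the sample rows are assumptions made for the example, not the actual device output.

```python
import io
import pandas as pd

# Hypothetical uniform log: every row carries its own full timestamp.
uniform_log = io.StringIO(
    "2024-03-20 00:01:28,102,103,103\n"
    "2024-03-20 00:02:16,111,110,110\n"
    "2024-03-21 00:00:04,115,116,116\n"
)

# Column names are made up for this example.
uniform_df = pd.read_csv(
    uniform_log,
    header=None,
    names=["DateTime", "Value1", "Value2", "Value3"],
    parse_dates=["DateTime"],
    index_col="DateTime",
)
print(uniform_df)
```

With a real file, the `io.StringIO` object would simply be replaced by the file path.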

Parsing the file

import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# Initialize variables
timestamps = []
values1 = []
values2 = []
values3 = []
current_date = None

# Parse line by line, because only some lines carry a date
# 00_test.txt is the data from the OP in a text file
with open('00_test.txt', "r") as file:
    for line in file:
        line = line.strip()

        parts = line.split(',')
        if len(parts) != 4:
            continue  # Skip blank lines and non-data lines such as "...Next day..."

        if parts[-1].count('-') == 2:  # The last field ends with a date, e.g. "103 20-03-2024"
            # A dated line only updates current_date; its own readings are not stored,
            # so the first reading of each day does not end up in the DataFrame
            date = parts[3].split(' ')[1]
            current_date = pd.to_datetime(date, format="%d-%m-%Y")
        else:
            # Process data lines without a date
            time, val1, val2, val3 = parts
            if current_date is not None:  # Ensure a date has been set
                datetime_str = f"{current_date.date()} {time}"
                datetime_obj = pd.to_datetime(datetime_str, format="%Y-%m-%d %H:%M:%S")
                timestamps.append(datetime_obj)
                values1.append(float(val1))
                values2.append(float(val2))
                values3.append(float(val3))
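
If the readings on the dated lines should be kept as well, the loop can be restructured so that a dated line first updates the date and is then stored like any other row. The following is a sketch under that assumption. `parse_log` and the `sample` list are hypothetical names introduced here for illustration, and the function takes any iterable of lines so it can be tried without a file.

```python
import pandas as pd

def parse_log(lines):
    """Hypothetical helper: build a timestamp-indexed DataFrame from raw log lines."""
    rows = []
    current_date = None
    for raw in lines:
        parts = raw.strip().split(",")
        if len(parts) != 4:
            continue  # blank lines and markers such as "...Next day..."
        time_str, val1, val2, last = parts
        # The last field is either "103" or "103 20-03-2024".
        val3, _, date_str = last.partition(" ")
        if date_str:
            current_date = pd.to_datetime(date_str.strip(), format="%d-%m-%Y")
        if current_date is None:
            continue  # no date seen yet, so a full timestamp cannot be built
        stamp = pd.to_datetime(
            f"{current_date.date()} {time_str}", format="%Y-%m-%d %H:%M:%S"
        )
        rows.append((stamp, float(val1), float(val2), float(val3)))
    frame = pd.DataFrame(rows, columns=["DateTime", "Value1", "Value2", "Value3"])
    return frame.set_index("DateTime")

sample = [
    "00:01:28,102,103,103 20-03-2024",
    "00:02:16,111,110,110",
    "...24 hours read...",
    "23:59:47,115,116,117",
    "00:00:04,115,116,116 21-03-2024",
]
parsed = parse_log(sample)
print(parsed)
```

For a real file, `parse_log(open('00_test.txt'))` would work the same way, since a file object iterates over its lines.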

Creating the DataFrame

# Ensure the DataFrame is created outside the loop
df = pd.DataFrame({'DateTime': timestamps, 'Value1': values1, 'Value2': values2, 'Value3': values3})
df.set_index('DateTime', inplace=True)
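
On the speed question: with a reading every 20 to 30 seconds, a month of data is on the order of 100,000 points per series, and drawing every point with a marker is slow. One option, which is my suggestion rather than something the question requires, is to downsample with `DataFrame.resample` before plotting. The sketch below uses synthetic data as a stand-in for the parsed DataFrame; the bin widths are assumptions to adjust.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the parsed data: one reading every 20 s for 3 days.
idx = pd.date_range("2024-03-20", periods=3 * 24 * 180, freq="20s")
rng = np.random.default_rng(0)
demo = pd.DataFrame({"Value1": 110 + rng.normal(0, 3, len(idx))}, index=idx)

# 15-minute means suit a multi-day view; daily min/mean/max suit a monthly view.
per_15min = demo.resample("15min").mean()
per_day = demo["Value1"].resample("D").agg(["min", "mean", "max"])

print(len(demo), len(per_15min), len(per_day))
```

`resample` bins by timestamp, so it works the same way on irregularly spaced readings. Plotting `per_15min` or `per_day` instead of the raw frame draws far fewer points.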

Plotting

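The plot draws the DataFrame with a marker at every data point and labeled axes. Major ticks on the x-axis show the date in year-month-day format, and minor ticks mark every 4 hours within each day. Major tick labels are rotated 90 degrees and centered, while minor tick labels stay horizontal and centered. Grid lines are drawn for both major and minor ticks in different styles, so dates and times are easy to tell apart. The layout is tightened so that the rotated labels fit and remain readable.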

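There are already many questions that cover plotting a pandas DataFrame and formatting its datetime x-axis. I encourage you to look at those and adapt the plot to your needs. If you have further plotting questions or need specific adjustments, consider posting a new question that references the existing discussions.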

# Plot the DataFrame directly
# x_compat=True keeps pandas from applying its own date tick handling,
# so the matplotlib locators and formatters set below are the ones used
ax = df.plot(marker='.', figsize=(15, 9), xlabel='Time', ylabel='Voltage', x_compat=True)

# Setting the major ticks to display the date in 'Y-m-d' format
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))

# Setting the minor ticks to display the time (04:00, 08:00, 12:00, 16:00, 20:00)
ax.xaxis.set_minor_locator(mdates.HourLocator(byhour=range(4, 21, 4)))  # Adjust the interval as needed
ax.xaxis.set_minor_formatter(mdates.DateFormatter('%H:%M'))

# Enhance the display for readability
plt.setp(ax.xaxis.get_majorticklabels(), rotation=90, ha="center")  # Rotate major tick labels for better visibility
plt.setp(ax.xaxis.get_minorticklabels(), rotation=0, ha="center")  # Keep minor tick labels horizontal and centered

ax.xaxis.grid(True, which='major', linestyle='-', linewidth=0.5, color='black')  # Major grid lines
ax.xaxis.grid(True, which='minor', linestyle=':', linewidth=0.5, color='gray')  # Minor grid lines

plt.tight_layout()  # Adjust layout to make room for tick labels
plt.show()
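
For longer ranges, such as a full month, fixed daily and 4-hourly ticks become crowded again. A possible alternative is to let matplotlib choose the tick spacing with `AutoDateLocator` and shorten the labels with `ConciseDateFormatter`. The sketch below uses a small synthetic frame in place of the parsed data, and the `minticks`/`maxticks` values are assumptions to tune.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop it for interactive use
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# Small stand-in frame: hourly values over three days.
idx = pd.date_range("2024-03-20", periods=72, freq="60min")
demo = pd.DataFrame({"Value1": range(72)}, index=idx)

fig, ax = plt.subplots(figsize=(15, 9))
ax.plot(demo.index, demo["Value1"], linewidth=1)  # plain line, no per-point markers

# The locator picks a tick spacing that fits the visible range;
# the formatter drops redundant date parts from the labels.
locator = mdates.AutoDateLocator(minticks=4, maxticks=10)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

ax.set_xlabel("Time")
ax.set_ylabel("Voltage")
fig.tight_layout()
```

Because the locator adapts to the visible range, the same code can serve a one-day and a one-month view without changing the tick settings.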

df

                     Value1  Value2  Value3
DateTime                                   
2024-03-20 00:02:16   111.0   110.0   110.0
2024-03-20 00:02:33   108.0   109.0   109.0
2024-03-20 00:02:49   107.0   108.0   108.0
2024-03-20 23:58:54   111.0   112.0   112.0
2024-03-20 23:59:11   109.0   110.0   110.0
2024-03-20 23:59:47   115.0   116.0   117.0
2024-03-21 00:00:20   121.0   122.0   120.0
2024-03-21 00:00:36   124.0   125.0   125.0
2024-03-21 23:59:02   115.0   115.0   116.0
2024-03-21 23:59:19   114.0   114.0   114.0
2024-03-21 23:59:51   113.0   114.0   115.0
2024-03-22 00:00:24   116.0   117.0   115.0
2024-03-22 00:00:45   115.0   115.0   116.0
2024-03-22 23:59:08   101.0   101.0   100.0
2024-03-22 23:59:32   103.0   103.0   102.0
2024-03-22 23:59:48   102.0   102.0   102.0
