复制粘贴解决数据处理错误?

2024-04-20 06:02:43 发布

您现在位置:Python中文网/ 问答频道 /正文

在使用python2.7处理linux16.04下的数据时,我遇到了一个非常奇怪的问题。 我使用以下函数创建.csv文件:

from ast import literal_eval
    with open('logs.csv') as f:
    data = [literal_eval(line) for line in f]

文件已正确创建,如下所示:

('2017-04-01 12:05:00','0.01770001','0.0177887','0.01780275','0.01770001')
('2017-04-01 12:10:00','0.0177887','0.01771308','0.01785263','0.01771039')
('2017-04-01 12:15:00','0.01773','0.01780092','0.01780092','0.01773')
('2017-04-01 12:20:00','0.0178','0.01781212','0.01784922','0.01774015')
('2017-04-01 12:25:00','0.01781212','0.01774528','0.01782994','0.01774528')
('2017-04-01 12:30:00','0.01774529','0.0178732','0.01788145','0.01774509')
('2017-04-01 12:35:00','0.01788145','0.01793318','0.01793318','0.01788145')
('2017-04-01 12:40:00','0.01794','0.01780093','0.01799984','0.01780092')
('2017-04-01 12:45:00','0.01785694','0.01806699','0.01807519','0.01785694')
('2017-04-01 12:50:00','0.01807999','0.01819687','0.01827573','0.018027')
('2017-04-01 12:55:00','0.01819687','0.01825402','0.0184','0.01800011')
('2017-04-01 13:00:00','0.01822416','0.01830994','0.01835554','0.0181777')
('2017-04-01 13:05:00','0.01825415','0.01810171','0.01830986','0.01810008')
('2017-04-01 13:10:00','0.01810174','0.01818991','0.01818991','0.01810173')
('2017-04-01 13:15:00','0.01818991','0.01818002','0.01819687','0.01818001')
('2017-04-01 13:20:00','0.01818002','0.01821999','0.01822','0.01818001')

然后我把它通过这个代码来画一个图:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import dates, ticker
import matplotlib as mpl
from mpl_finance import candlestick_ohlc
from ast import literal_eval

mpl.style.use('default')


data = []
ohlc_data = [] 

with open('logsXMR.csv') as f:
    data = [literal_eval(line) for line in f]


for line in data:
        #ohlc_data.append((np.float64(line[0]), np.float64(line[1]), np.float64(line[2]), np.float64(line[3]), np.float64(line[4])))
        ohlc_data.append((dates.datestr2num(line[0]), np.float64(line[1]), np.float64(line[2]), np.float64(line[3]), np.float64(line[4])))

fig, ax1 = plt.subplots()
candlestick_ohlc(ax1, ohlc_data, width = 0.5/((24*60)/5), colorup = 'g', colordown = 'r', alpha = 0.8)

#ax1.xaxis.set_major_formatter(dates.DateFormatter('%d/%m/%Y %H:%M'))
ax1.xaxis.set_major_locator(ticker.MaxNLocator(10))

plt.xticks(rotation = 30)
plt.grid()
plt.xlabel('Date')
plt.ylabel('Price')
plt.title('Historical Data XMRUSD')
plt.tight_layout()
plt.show()

但每次我犯错误时:

Traceback (most recent call last):
  File "CSVing.py", line 15, in <module>
    data = [literal_eval(line) for line in f]
  File "/usr/lib/python2.7/ast.py", line 49, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/usr/lib/python2.7/ast.py", line 37, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 2
    ('2017-04-01 12:10:00','0.0177887','0.01771308','0.01785263','0.01771039')
    ^

我不明白为什么会出现这个错误,因为如果我只是简单地将数据复制并粘贴到另一个文件中,一切正常,我就能完美地绘制图形。我只是不明白,因为这两个数据文件是相同的,没有额外的空间或任何东西。你知道吗

是什么导致了这个错误?我怎样才能直接使用生成的数据文件,而不需要将数据复制粘贴到另一个文件中?你知道吗

提前谢谢

像素


Tags: 文件infromimportfordataaseval
1条回答
网友
1楼 · 发布于 2024-04-20 06:02:43

我建议你重新考虑一下你的数据格式。我不知道这些数据是从哪里来的,但是以一种不包含偏执等内容的方式存储是合理的

如果您真的需要使用这种数据格式,您仍然可以使用例如pandas,并通过删除无用的字符来清理格式。你知道吗

u = """('2017-04-01 12:05:00','0.01770001','0.0177887','0.01780275','0.01770001')
('2017-04-01 12:10:00','0.0177887','0.01771308','0.01785263','0.01771039')
('2017-04-01 12:15:00','0.01773','0.01780092','0.01780092','0.01773')
('2017-04-01 12:20:00','0.0178','0.01781212','0.01784922','0.01774015')
('2017-04-01 12:25:00','0.01781212','0.01774528','0.01782994','0.01774528')
('2017-04-01 12:30:00','0.01774529','0.0178732','0.01788145','0.01774509')
('2017-04-01 12:35:00','0.01788145','0.01793318','0.01793318','0.01788145')
('2017-04-01 12:40:00','0.01794','0.01780093','0.01799984','0.01780092')
('2017-04-01 12:45:00','0.01785694','0.01806699','0.01807519','0.01785694')
('2017-04-01 12:50:00','0.01807999','0.01819687','0.01827573','0.018027')
('2017-04-01 12:55:00','0.01819687','0.01825402','0.0184','0.01800011')
('2017-04-01 13:00:00','0.01822416','0.01830994','0.01835554','0.0181777')
('2017-04-01 13:05:00','0.01825415','0.01810171','0.01830986','0.01810008')
('2017-04-01 13:10:00','0.01810174','0.01818991','0.01818991','0.01810173')
('2017-04-01 13:15:00','0.01818991','0.01818002','0.01819687','0.01818001')
('2017-04-01 13:20:00','0.01818002','0.01821999','0.01822','0.01818001')"""

import io
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import dates
from mpl_finance import candlestick_ohlc

replace = {"\(" : "", "\)" : "", "'" : ""}
df = pd.read_csv(io.StringIO(u), sep=",",  header=None).replace(replace, regex=True)
# use pd.read_csv("myfilename.txt", ...)  here for your real file

df[0] = dates.datestr2num(df[0])
df.iloc[:,1:] = df.iloc[:,1:].astype(float)

fig, ax1 = plt.subplots()
candlestick_ohlc(ax1, df.values, width = 0.5/((24*60)/5), 
                 colorup = 'g', colordown = 'r', alpha = 0.8)

ax1.xaxis.set_major_formatter(dates.DateFormatter('%d/%m/%Y %H:%M'))
ax1.xaxis.set_major_locator(dates.MinuteLocator((0,15,30,45)))

plt.xticks(rotation = 30)
plt.grid()
plt.xlabel('Date')
plt.ylabel('Price')
plt.title('Historical Data XMRUSD')
plt.tight_layout()
plt.show()

enter image description here

请注意,数据似乎也不是Ohlc格式,因此图形看起来很奇怪。但由于对数据一无所知,您需要自己找出正确的顺序。你知道吗

相关问题 更多 >