TypeError: Could not convert to numeric
Here is my code:
file_path = 'TEST3.csv' # Update the path to your CSV file
columns = ['Year','T', 'TM', 'Tm', 'PP', 'Yields_Blé_dur']
n_steps_in, n_steps_out = 3, 1
test_set_years = 5
df = load_data(file_path)
# Preprocess the data
df_processed = preprocess_data_columns(df, columns)
# Extract 'Year' for plotting purposes
years = df_processed['Year'].values
[This is my database](https://i.stack.imgur.com/SZgMR.png)
GitHub link: https://github.com/Moiz1500/LSTM-model
I got this error:
TypeError Traceback (most recent call last)
Cell In[44], line 10
7 df = load_data(file_path)
9 # Preprocess the data
---> 10 df_processed = preprocess_data_columns(df, columns)
11 # Extract 'Year' for plotting purposes
12 years = df_processed['Year'].values
Cell In[33], line 2
1 def preprocess_data_columns(df, columns):
----> 2 df = df[columns].fillna(df.mean()) #Fill NaN values with the mean of the column
3 return df
File c:\Users\hemic\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\frame.py:11335, in DataFrame.mean(self, axis, skipna, numeric_only, **kwargs)
11327 @doc(make_doc("mean", ndim=2))
11328 def mean(
11329 self,
(...)
11333 **kwargs,
11334 ):
> 11335 result = super().mean(axis, skipna, numeric_only, **kwargs)
11336 if isinstance(result, Series):
11337 result = result.__finalize__(self, method="mean")
...
-> 1678 raise TypeError(f"Could not convert {x} to numeric")
1679 try:
1680 x = x.astype(np.complex128)
TypeError: Could not convert ['1/1/20121/2/20121/3/201212/31/2023'] to numeric.
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings...
How can I fix this error?
I don't know how to convert the "Year" column to a date.
1 Answer
I don't know how to convert the Year column to a date.
You can avoid the error by converting the Year column to datetime format with pd.to_datetime:
# assumes import pandas as pd
df['Year'] = pd.to_datetime(df['Year'])
Then compute the mean.
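As a minimal sketch of that conversion, assuming the Year column holds month/day/year strings such as '1/1/2012' (which is what the error message suggests), passing an explicit format avoids ambiguous parsing. The small DataFrame below is made up for illustration and is not taken from TEST3.csv.

```python
import pandas as pd

# Hypothetical sample rows; the real data lives in TEST3.csv
df = pd.DataFrame({
    'Year': ['1/1/2012', '1/2/2012', '12/31/2023'],
    'T': [10.5, None, 12.0],
})

# Assumption: the strings are month/day/year
df['Year'] = pd.to_datetime(df['Year'], format='%m/%d/%Y')

# The calendar year is then available through the .dt accessor
year_numbers = df['Year'].dt.year.tolist()
```

If only the year is needed for plotting, df['Year'].dt.year can be used in place of the full timestamps.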
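Better still, be more selective about which columns get their missing values (NaNs) replaced by the mean. The Year column has no missing values, so there is no need to compute its mean at all.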
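One way to make that selection explicit, sketched here with a made-up DataFrame rather than the real TEST3.csv, is to pick the numeric columns with select_dtypes and compute the mean only for those. As far as I know, on pandas 2.x calling df.mean() on a frame that still contains a string column raises the TypeError from the question, so restricting the columns (or passing numeric_only=True) sidesteps it. The helper name fill_numeric_nans is hypothetical.

```python
import pandas as pd

def fill_numeric_nans(df):
    """Return a copy with NaNs in numeric columns replaced by the column mean."""
    out = df.copy()
    # Only numeric columns take part; string or datetime columns are left alone
    numeric_cols = out.select_dtypes(include='number').columns
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].mean())
    return out

# Hypothetical sample with a string Year column and a few gaps
df = pd.DataFrame({
    'Year': ['1/1/2012', '1/2/2012', '1/3/2012'],
    'T': [10.0, None, 14.0],
    'PP': [0.0, 2.0, None],
})
filled = fill_numeric_nans(df)
```

Working on a copy keeps the caller's DataFrame unchanged, which makes it easier to compare the data before and after filling.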
The following code runs without warnings or errors on pandas 1.5.3.
from urllib.request import urlretrieve
import pandas as pd
url = 'https://raw.githubusercontent.com/Moiz1500/LSTM-model/main/TEST3.csv'
file_path = 'TEST3.csv'
urlretrieve(url, file_path)
def preprocess_data_columns(df, columns):
    # Fill NaNs only in the selected columns, using each column's own mean
    df[columns] = df[columns].fillna(df[columns].mean())
    return df
df = pd.read_csv(file_path)
df['Year'] = pd.to_datetime(df['Year'])
# Preprocess the data
columns = df.columns[1:]
df_processed = preprocess_data_columns(df, columns)
# Extract 'Year' for plotting purposes
years = df_processed['Year'].values
years