High validation loss and mean squared error

0 votes
1 answer
41 views
Asked 2025-04-12 18:17

I have a very large wave energy dataset that I'm using to practice with neural networks, but my mean squared error (MSE) and validation loss (val_loss) are both extremely high. I tried using a correlation matrix, did a two-step split, and used three hidden layers with regularization. The MSE and validation loss still come out around 12 trillion. Here is the link to my data

# Imports required by the code below
import pandas as pd
from sklearn.preprocessing import scale, MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from tensorflow import keras
from tensorflow.keras import layers

# Load the data
perth_49 = pd.read_csv(r'WEC_Perth_49.csv')
sydney_49 = pd.read_csv(r'WEC_sydney_49.csv')
perth_100 = pd.read_csv(r'WEC_perth_100.csv')
sydney_100 = pd.read_csv(r'WEC_sydney_100.csv')

# Stack the four dataframes into a single frame
merged_data = pd.concat([perth_49, sydney_49, perth_100, sydney_100])

# Define the target variable "total power output"
target_variable = 'Total_Power'

# Define the potential features
features = [f'X{i}' for i in range(1, 101)] + [f'Y{i}' for i in range(1, 101)] + [f'Power{i}' for i in range(1, 101)] + ['qW']

# Compute the correlation matrix
correlation_matrix = merged_data[features + [target_variable]].corr()

# Sort the correlations with the target variable in descending order
correlation_with_target = correlation_matrix[target_variable].sort_values(ascending=False)

print("Correlation with target variable:")
print(correlation_with_target)

# Choose the top N features with the highest correlation
top_features = correlation_with_target.head(5).index.tolist()
top_features = top_features[1:5]  # Exclude the target variable itself

# Select the relevant features from the dataset
selected_data = merged_data[top_features + [target_variable]]

# Replace NaN values with 0 in selected_data
# selected_data.fillna(0, inplace=True)

# Standardize the features (zero mean, unit variance)
selected_data[top_features] = scale(selected_data[top_features])

# Build the feature matrix and target vector
X = selected_data[top_features].values
y = selected_data[target_variable].values.reshape(-1, 1)

# Split the data into training, validation, and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=42)

# Print the sizes of each set
print("Training set size:", len(X_train))
print("Validation set size:", len(X_val))
print("Testing set size:", len(X_test))

# Scaling the features using MinMaxScaler
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)

# Creating a neural network with Keras
#model = keras.Sequential([
#    layers.Dense(128, activation='relu', input_shape=(len(top_features),)),
#    layers.Dense(64, activation='relu'),
#    layers.Dense(32, activation='relu'),
#    layers.Dense(16, activation='relu'),
#    layers.Dense(1)  # a single output (total power)
#])
##########
from tensorflow.keras import regularizers

model = keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(len(top_features),), kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(32, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(16, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(1)  # a single output (total power)
])

# Compiling the model
optimizer = keras.optimizers.Adam(learning_rate=0.00001)
model.compile(optimizer=optimizer, loss='mean_squared_error')

# Early stopping
#early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights= True)

# Training the model
model.fit(X_train_scaled, y_train, epochs=4000, batch_size=32, validation_data=(X_val_scaled, y_val))

# Training the model with early stopping
#model.fit(X_train_scaled, y_train, epochs=4000, batch_size=32, validation_data=(X_val_scaled, y_val), callbacks=[early_stopping])


# Evaluating the model on the test set
loss = model.evaluate(X_test_scaled, y_test)
print("Test Loss:", loss)

# Making predictions
predictions = model.predict(X_test_scaled)

# Calculating mean squared error
mse = mean_squared_error(y_test, predictions)
print("Mean Squared Error:", mse)

1 Answer

1

My first thought is that your 4000-epoch training run is the result of combining a very low learning rate with a fairly deep network. I'm running this on an ordinary desktop, so at my current speed I couldn't get through all 4000 epochs within 7 days. That said, at epoch 60 my loss was around 777 billion with a learning rate of 0.001 and only the first hidden layer. Yes, adding hidden layers lets the model learn more complex functions, and lowering the learning rate makes training smoother, but both require more epochs to reach a given level of performance. If your machine is faster than mine, try a smaller network with a higher learning rate and see what you get.
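For concreteness, here is a minimal sketch of that suggestion, reusing the variables from the question (top_features, X_train_scaled, y_train, X_val_scaled, y_val). The layer sizes, the learning rate of 0.1, and the epoch count are illustrative starting points rather than tuned values, and the early-stopping callback is the one commented out in the question:

from tensorflow import keras
from tensorflow.keras import layers

# A smaller network: two hidden layers instead of four
small_model = keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(len(top_features),)),
    layers.Dense(64, activation='relu'),
    layers.Dense(1)  # a single output (total power)
])

# A higher learning rate than 1e-5 so the loss moves within a reasonable number of epochs
small_model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.1),
                    loss='mean_squared_error')

# Early stopping so you don't have to guess the right epoch count
early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10,
                                               restore_best_weights=True)

small_model.fit(X_train_scaled, y_train, epochs=100, batch_size=32,
                validation_data=(X_val_scaled, y_val),
                callbacks=[early_stopping])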

Edit: I just ran a few small tests, for your reference:

100 epochs with the setup described above:
Test Loss: 490669342720.0
Mean Squared Error: 490669524959.7527


100 epochs with first 2 hidden layers and learning rate=0.1:
Test Loss: 388628578304.0
Mean Squared Error: 388628638043.2276


100 epochs with first 2 hidden layers, learning rate=1:
Test Loss: 449328513024.0
Mean Squared Error: 449328481916.0099
