KeyError: "['Building Age', 'Floor', 'Number of Floors'] not in index"

Asked 2025-04-12 21:25
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import category_encoders as ce

# Read the data
transactions_master_df = pd.read_csv('my_data.csv')

# Calculate the average house price for each district
avg_price_per_district = transactions_master_df.groupby('District')['Price'].mean().reset_index()
avg_price_per_district.rename(columns={'Price': 'AvgPrice'}, inplace=True)

# Print the average price for each district alongside the district column
print(avg_price_per_district)

# Merge the average price information with the original DataFrame
transactions_master_df = pd.merge(transactions_master_df, avg_price_per_district, on='District', how='left')

# Binary encode the 'District' feature
encoder = ce.BinaryEncoder(cols=['District'], base=6)
transactions_encoded = encoder.fit_transform(transactions_master_df)

# Additional features to append to the encoded DataFrame
additional_features = ['Building Age', 'Floor', 'Number of Floors', 'Elevator', 
                      'number of bathrooms', 'Otopark', 'steeped alley', 
                      'material used and luxuriness', 'view', 
                      'prestige of that district and its vicinity']

# Check if additional features are present in the transactions_encoded DataFrame
for feature in additional_features:
    if feature not in transactions_encoded.columns:
        print(f"Warning: {feature} column not found in transactions_encoded DataFrame.")

# Concatenate additional features to the encoded DataFrame
final_features = pd.concat([transactions_encoded[['District_0', 'District_1', 'District_2', 'SquareMeter']], 
                            transactions_encoded[additional_features]], axis=1)

# Ensure 'final_features' contains the necessary columns for training
print(final_features.head())


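Hello,

In this code I am building a model for my house-price dataset. I first encoded some non-numeric features; then, when I tried to combine the remaining features into the final_features variable, I got the following error: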

final_features = pd.concat([transactions_encoded[['District_0', 'District_1', 'District_2', 'SquareMeter']], 
---> 38                             transactions_encoded[additional_features]], axis=1)
KeyError: "['Building Age', 'Floor', 'Number of Floors'] not in index"

The strange thing is that these features do exist in my dataset, so I don't understand why I get this error.

1 Answer


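It looks like the column names 'Building Age', 'Floor', and 'Number of Floors' in your data contain extra whitespace. Because of that whitespace, the exact-match lookup for those names fails, and pandas raises the KeyError.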
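
If stray whitespace is indeed the cause, one way to confirm and fix it is sketched below. It is only an illustration: the inline CSV text and the names `csv_text` and `wanted` are made up for the example and stand in for your real file and feature list. Printing the `repr()` of each column name makes hidden spaces visible, and `Index.str.strip()` removes them.

```python
import io
import pandas as pd

# Hypothetical stand-in for my_data.csv: some header names carry stray spaces.
csv_text = (
    "District, Building Age ,Floor , Number of Floors\n"
    "A,10,2,5\n"
    "B,3,1,4\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Columns we expect to select (assumed names, taken from the question).
wanted = ['Building Age', 'Floor', 'Number of Floors']

# repr() shows leading/trailing whitespace that a plain print would hide.
print([repr(c) for c in df.columns])

# Before cleaning, the exact names are not found in the column index.
missing_before = [c for c in wanted if c not in df.columns]

# Strip leading and trailing whitespace from every column name.
df.columns = df.columns.str.strip()

# After cleaning, the same selection works without a KeyError.
missing_after = [c for c in wanted if c not in df.columns]
subset = df[wanted]
print(subset.head())
```

In the original code, the equivalent step would be to strip `transactions_master_df.columns` right after `pd.read_csv`, before the merge and the encoding, so that every later lookup sees the cleaned names.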
