How do I handle merged cells in an Excel sheet with Python's pandas?

---------------------------------------------------------
Name          | property 1 | property 2 | property 3 |
---------------------------------------------------------
variableName1 | X1         | Y1         | Z1         |
---------------------------------------------------------
variableName2 | X2         | Y2         | Z2         |
---------------------------------------------------------
variableName3 | X3         | Y3         | Z31        |
              |            |            |------------|
              |            |            | Z32        |
---------------------------------------------------------
variableName4 | X4         | Y4         | Z4         |
---------------------------------------------------------

{"Name":"VariableName1","Property1":"X1","Property2":"Y1","Property3":"Z1"}
{"Name":"VariableName2","Property1":"X2","Property2":"Y2","Property3":"Z2"}
{"Name":"VariableName3","Property1":"X3","Property2":"Y3","Property3":"Z31"}
{"Name":null,"Property1":null,"Property2":null,"Property3":"Z32"}
{"Name":"VariableName4","Property1":"X4","Property2":"Y4","Property3":"Z4"}

1 answer

User

#1 · Posted on 2024-04-19 23:26:10

You can do the following:

import pandas as pd

# Read the sheet; cells covered by a merged cell come back as NaN
df = pd.read_excel('testExcel.xlsx',
                   sheet_name='Hoja1',
                   na_values='NA',
                   skiprows=2)

# Drop columns that are entirely empty
df = df.dropna(axis='columns', how='all')

# Fill the 'Name' values down into the rows left blank by the merged cells
df['Name'] = df['Name'].ffill()

# Aggregate function: join the non-null values of a group into one string
def join_values(series):
    return ', '.join(series.dropna().astype(str))

# Group by 'Name' and aggregate every other column with join_values
df = df.groupby(by='Name').aggregate(join_values)

# Turn the 'Name' index back into a regular column
df = df.reset_index()

# Serialize to a JSON array of records
json_output = df.to_json(orient='records')
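
To try these steps without the Excel file, here is a minimal sketch that builds the frame in memory. The `raw` frame and the helper name `collapse_merged_rows` are assumptions for illustration; `raw` stands in for what `read_excel` would return for the table in the question, with the merged cells showing up as missing values. It also passes `sort=False` so the groups keep the order in which they appear in the sheet.

```python
import json
import pandas as pd

# Hypothetical stand-in for the frame read from the sheet in the question
raw = pd.DataFrame({
    "Name": ["variableName1", "variableName2", "variableName3", None, "variableName4"],
    "property 1": ["X1", "X2", "X3", None, "X4"],
    "property 2": ["Y1", "Y2", "Y3", None, "Y4"],
    "property 3": ["Z1", "Z2", "Z31", "Z32", "Z4"],
})

def join_values(series):
    # Join the non-null values of one column within one group
    return ", ".join(series.dropna().astype(str))

def collapse_merged_rows(df, key="Name"):
    out = df.copy()
    # Fill the key down into the rows left blank by the merged cell
    out[key] = out[key].ffill()
    # sort=False keeps the groups in order of first appearance
    return out.groupby(key, sort=False).agg(join_values).reset_index()

result = collapse_merged_rows(raw)
records = json.loads(result.to_json(orient="records"))
print(records[2])
```

With this input, the two rows that belong to variableName3 collapse into one record whose "property 3" value is the joined string "Z31, Z32".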

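Note that this solution aggregates all rows that share the same 'Name' value into a single row, even if they were separate entries in the original sheet.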
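
If separate entries with the same 'Name' need to stay apart, one possible variation (not part of the answer above) is to group on a block number instead of on the name itself. The sketch below assumes each non-empty 'Name' cell starts a new block; the sample data and the helper name `collapse_by_block` are hypothetical.

```python
import pandas as pd

# Hypothetical data: "a" appears twice as two separate entries
raw = pd.DataFrame({
    "Name": ["a", None, "b", "a"],
    "property 3": ["Z1", "Z2", "Z3", "Z4"],
})

def join_values(series):
    # Join the non-null values of one column within one group
    return ", ".join(series.dropna().astype(str))

def collapse_by_block(df, key="Name"):
    # Each non-null key starts a new block; blank rows inherit the block number
    block_id = df[key].notna().cumsum().rename("block")
    grouped = df.groupby(block_id, sort=False).agg(
        # Keep the block's first non-null key, join everything else
        {col: ("first" if col == key else join_values) for col in df.columns}
    )
    return grouped.reset_index(drop=True)

result = collapse_by_block(raw)
print(result)
```

Here the second "a" stays in its own row, while the blank row under the first "a" is still folded into it.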
