Converting the data in a DataFrame to lists under a special condition, in pandas and Python

list 1, all WD1 values: [flu-like symptoms, dizziness, major mood swings, lots of anxiety, tiredness, Dizziness, headaches, neck pain, headache, nausea]
list 2, comment_id: [1, 1, 1, 1, 1, 14, 14, 14, 17, 17]
list 3, drug_id: [lex.1, lex.1, lex.1, lex.1, lex.1, lex14, lex14, lex14, lex18, lex18]

1 answer

User

#1 · Posted on 2024-05-19 00:06:31

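You can use cumcount() to create a rowid that gives each row its position within its (comment_id, drug_id) group. That position becomes the column index once you set the two id columns plus rowid as the index and unstack the rowid level: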

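# number the rows within each (comment_id, drug_id) group, then pivot that number into columns
df1 = (df.assign(rowid=df.groupby(["comment_id", "drug_id"]).cumcount() + 1)
       .set_index(["comment_id", "drug_id", "rowid"])
       .rename_axis(["comment_id", "drug_id", ""]).unstack(level=2))

# rename columns from multi-index to single index, e.g. ("WD", 1) -> "WD1"
df1.columns = [''.join(map(str, col)) for col in df1.columns]
# reset_index() returns a new frame, so assign it back
df1 = df1.reset_index()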
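
To make the role of cumcount() concrete, here is a minimal sketch on a small made-up frame. The names demo, rowids and wide and all the values are hypothetical, chosen only for illustration. It shows the per-group numbering and the shape you get after unstacking.

```python
import pandas as pd

# Hypothetical toy data, not the data from the question
demo = pd.DataFrame({
    "comment_id": [1, 1, 1, 14, 14],
    "drug_id": ["a", "a", "a", "b", "b"],
    "WD": ["w1", "w2", "w3", "w4", "w5"],
})

# cumcount() numbers rows 0, 1, 2, ... within each group; + 1 makes it 1-based
demo["rowid"] = demo.groupby(["comment_id", "drug_id"]).cumcount() + 1
rowids = demo["rowid"].tolist()
print(rowids)  # [1, 2, 3, 1, 2]

# One row per group, one column per rowid; shorter groups are padded with NaN
wide = demo.set_index(["comment_id", "drug_id", "rowid"])["WD"].unstack()
print(wide.shape)  # (2, 3)
```

The second group has only two rows, so its third column is filled with NaN.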

Data setup:

import pandas as pd

WDs = ["flu-like symptoms", "dizziness", "major mood swings", "lots of anxiety", "tiredness", "Dizziness", "headaches", "neck pain", "headache", "nausea"]
comment_id = [1, 1, 1, 1, 1, 14, 14, 14, 17, 17]
drug_id = ["lex.1", "lex.1", "lex.1", "lex.1", "lex.1", "lex14", "lex14", "lex14", "lex18", "lex18"]

# long-format frame: one row per (comment_id, drug_id, WD) observation
df = pd.DataFrame({"WD": WDs, "comment_id": comment_id, "drug_id": drug_id})

Update:

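It looks like you want the opposite result. Given the wide DataFrame df1, you can first convert it to long format. Each column of the long frame is then one of the lists you need, and you can convert it with tolist():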

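# assumes comment_id and drug_id are ordinary columns of df1;
# dropna() discards the NaN padding of shorter groups if stack() keeps it
df2 = df1.set_index(["comment_id", "drug_id"]).stack().dropna().rename("WD").reset_index()
comment_id, drug_id, WD = df2.comment_id.tolist(), df2.drug_id.tolist(), df2.WD.tolist()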
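
As a self-contained alternative sketch, the same wide-to-long conversion can also be written with melt() instead of stack(). This is a different technique from the one above, not the answer's own method. The wide frame below is rebuilt by hand from the lists in the question, and it assumes the wide columns are named WD1, WD2, and so on. The names wide, long_df and col are introduced only for this example.

```python
import pandas as pd

# Wide frame rebuilt from the question's lists (assumed column names WD1..WD5)
wide = pd.DataFrame({
    "comment_id": [1, 14, 17],
    "drug_id": ["lex.1", "lex14", "lex18"],
    "WD1": ["flu-like symptoms", "Dizziness", "headache"],
    "WD2": ["dizziness", "headaches", "nausea"],
    "WD3": ["major mood swings", "neck pain", None],
    "WD4": ["lots of anxiety", None, None],
    "WD5": ["tiredness", None, None],
})

# melt() turns the WD* columns into rows; drop the empty padding cells
long_df = (wide.melt(id_vars=["comment_id", "drug_id"], var_name="col", value_name="WD")
               .dropna(subset=["WD"]))

# Recover the numeric position from the column name so that sorting is
# numeric ("WD10" after "WD9") rather than alphabetical
long_df["rowid"] = long_df["col"].str.replace("WD", "", regex=False).astype(int)
long_df = long_df.sort_values(["comment_id", "drug_id", "rowid"])

comment_id = long_df["comment_id"].tolist()
drug_id = long_df["drug_id"].tolist()
WD = long_df["WD"].tolist()
```

melt() emits the values column by column, so the explicit sort is what restores the original row-by-row order within each group.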
