Conditional DataFrame splitting and sorting

Posted 2024-05-15 16:59:27


I have a DataFrame whose first few entries look like this:

df:

idx no                  surity      name        percentage  result
0   0.29999999999999993 0.974185    computer    0.84        1
1   0.18000000000000016 1.0         vegetables  1.14        1
2   0.27                1.0         electronics 1.32        1
3   0.17999999999999994 0.999655    books       1.59        0
4   0.8399889999999997  0.99992     fruits      1.770008    2

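I want to export this to JSON in two different ways, each as a single file. In the first export, the "initial" section should carry the value of result in place of name (with surity set to null), and the "final" section should contain just the first four columns without that change: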

"initial":[
 {
        "no":0.3,
        "surity":"null",
        "name":"1",
        "percentage":"0.84",
    },
    {
        "no":0.18,
        "surity":"null",
        "name":"1",
        "percentage":"1.14",
    },
    {
        "no":0.27,
        "surity":"null",
        "name":"1",
        "percentage":"1.32",
    },
    {
        "no":0.18,
        "surity":"null",
        "name":"0",
        "percentage":"1.59",
    },
    {
        "no":0.83999,
        "surity":"null",
        "name":"2",
        "percentage":"1.770007",
    }
]
"final":
[
 {
        "no":0.3,
        "surity":"0.973225",
        "name":"computer",
        "percentage":"0.84",
    },
    {
        "no":0.18,
        "surity":"1.0",
        "name":"vegetables",
        "percentage":"1.14",
    },
    {
        "no":0.27,
        "surity":"1.0",
        "name":"electronics",
        "percentage":"1.32",
    },
    {
        "no":0.18,
        "surity":"0.999663",
        "name":"books",
        "percentage":"1.59",
    },
    {
        "no":0.83999,
        "surity":"0.99991",
        "name":"fruits",
        "percentage":"1.770007",
    }
]

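For the second export, I want the same layout, except that in "initial" all rows sharing the same name value (in the same order) are collapsed into a single entry with no summed, as follows: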

"initial":[
 {
        "no":3.3,
        "surity":"null",
        "name":"1",
        "percentage":"0.84",
    },
    {
        "no":0.18,
        "surity":"null",
        "name":"0",
        "percentage":"1.59",
    },
    {
        "no":0.83999,
        "surity":"null",
        "name":"2",
        "percentage":"1.770007",
    }
]
"final":
[
 {
        "no":0.3,
        "surity":"0.973225",
        "name":"computer",
        "percentage":"0.84",
    },
    {
        "no":0.18,
        "surity":"1.0",
        "name":"vegetables",
        "percentage":"1.14",
    },
    {
        "no":0.27,
        "surity":"1.0",
        "name":"electronics",
        "percentage":"1.32",
    },
    {
        "no":0.18,
        "surity":"0.999663",
        "name":"books",
        "percentage":"1.59",
    },
    {
        "no":0.83999,
        "surity":"0.99991",
        "name":"fruits",
        "percentage":"1.770007",
    }
]

1 answer
User
#1 · Posted 2024-05-15 16:59:27

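It is all a matter of working through your specification systematically:

  • to_dict(orient="records") gives the record layout you need once the DataFrame is prepared
  • drop() and rename() on the columns produce the first export
  • groupby() with agg() does the aggregation for the second export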

import io

import numpy as np
import pandas as pd

df = pd.read_csv(io.StringIO("""idx no  surity  name    percentage  result
0   0.29999999999999993 0.974185    computer    0.84    1
1   0.18000000000000016 1.0 vegetables  1.14    1
2   0.27    1.0 electronics 1.32    1
3   0.17999999999999994 0.999655    books   1.59    0
4   0.8399889999999997  0.99992 fruits  1.770008    2"""), sep=r"\s+", index_col=0)

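# Round away floating-point noise in "no" (e.g. 0.29999999999999993 -> 0.3).
df["no"] = df["no"].round(4)

# First export: "initial" swaps name for result and blanks surity; "final" only drops result.
exp1 = {"initial":df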
        .drop(columns="name")
        .rename(columns={"result":"name"})
        .assign(surity=np.nan)
        .to_dict(orient="records")
,"final":df.drop(columns="result").to_dict(orient="records")
}

# Second export: one record per group, keeping the first "no" and summing "percentage".
exp2 = {
    "initial": df.groupby(["result"])
        .agg({"no": "first", "percentage": "sum"})
        .reset_index()
        .rename(columns={"result": "name"})
        .assign(surity=np.nan)
        .to_dict(orient="records"),
    "final": df.groupby(["name"])
        .agg({"no": "first", "percentage": "sum", "surity": "first"})
        .reset_index()
        .to_dict(orient="records"),
}

exp1

{'initial': [{'no': 0.3, 'surity': nan, 'percentage': 0.84, 'name': 1},
  {'no': 0.18, 'surity': nan, 'percentage': 1.14, 'name': 1},
  {'no': 0.27, 'surity': nan, 'percentage': 1.32, 'name': 1},
  {'no': 0.18, 'surity': nan, 'percentage': 1.59, 'name': 0},
  {'no': 0.84, 'surity': nan, 'percentage': 1.770008, 'name': 2}],
 'final': [{'no': 0.3,
   'surity': 0.974185,
   'name': 'computer',
   'percentage': 0.84},
  {'no': 0.18, 'surity': 1.0, 'name': 'vegetables', 'percentage': 1.14},
  {'no': 0.27, 'surity': 1.0, 'name': 'electronics', 'percentage': 1.32},
  {'no': 0.18, 'surity': 0.999655, 'name': 'books', 'percentage': 1.59},
  {'no': 0.84, 'surity': 0.99992, 'name': 'fruits', 'percentage': 1.770008}]}
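
The dict above is not yet a JSON file, and its nan values would be written as NaN by the standard json module, which is not valid JSON and not the null the question asks for. A minimal sketch of one way to handle that, using only the standard library: the helper name nan_to_none and the two-record sample dict are made up for illustration and only mimic the shape of exp1.

```python
import json
import math


def nan_to_none(value):
    """Recursively replace float NaN with None so json emits null."""
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, dict):
        return {key: nan_to_none(item) for key, item in value.items()}
    if isinstance(value, list):
        return [nan_to_none(item) for item in value]
    return value


# Hypothetical stand-in shaped like exp1 (one record per section, for brevity).
sample = {
    "initial": [{"no": 0.3, "surity": float("nan"), "percentage": 0.84, "name": 1}],
    "final": [{"no": 0.3, "surity": 0.974185, "name": "computer", "percentage": 0.84}],
}

# allow_nan=False makes json raise instead of silently writing NaN,
# so any value the helper missed is caught here.
text = json.dumps(nan_to_none(sample), indent=4, allow_nan=False)
print(text)
```

To write a file instead of a string, json.dump(nan_to_none(exp1), fh, indent=4, allow_nan=False) with an open file handle fh should work the same way. If your pandas version returns NumPy scalars from to_dict(), they may need converting to plain Python numbers first; the sketch does not handle that.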

exp2

{'initial': [{'name': 0, 'no': 0.18, 'percentage': 1.59, 'surity': nan},
  {'name': 1, 'no': 0.3, 'percentage': 3.3, 'surity': nan},
  {'name': 2, 'no': 0.84, 'percentage': 1.770008, 'surity': nan}],
 'final': [{'name': 'books',
   'no': 0.18,
   'percentage': 1.59,
   'surity': 0.999655},
  {'name': 'computer', 'no': 0.3, 'percentage': 0.84, 'surity': 0.974185},
  {'name': 'electronics', 'no': 0.27, 'percentage': 1.32, 'surity': 1.0},
  {'name': 'fruits', 'no': 0.84, 'percentage': 1.770008, 'surity': 0.99992},
  {'name': 'vegetables', 'no': 0.18, 'percentage': 1.14, 'surity': 1.0}]}
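
Note that exp2 keeps the first no of each group and sums percentage, while the question text asks for no to be summed and lists the groups in order of first appearance (1, 0, 2). The question's sample output is ambiguous on this point, so the following is only a sketch of that alternative reading, assuming no is summed, the first percentage is kept, and group order is preserved. The frame is rebuilt inline with the already-rounded values, and the name initial_by_result is illustrative.

```python
import pandas as pd

# Sample data from the question, with "no" already rounded to 4 places.
df = pd.DataFrame({
    "no": [0.3, 0.18, 0.27, 0.18, 0.84],
    "surity": [0.974185, 1.0, 1.0, 0.999655, 0.99992],
    "name": ["computer", "vegetables", "electronics", "books", "fruits"],
    "percentage": [0.84, 1.14, 1.32, 1.59, 1.770008],
    "result": [1, 1, 1, 0, 2],
})

initial_by_result = (
    df.groupby("result", sort=False)   # sort=False keeps first-appearance order: 1, 0, 2
      .agg(no=("no", "sum"), percentage=("percentage", "first"))  # named aggregation
      .round({"no": 4})                # tidy floating-point noise in the summed column
      .reset_index()
      .rename(columns={"result": "name"})
      .assign(surity=None)             # None rather than NaN, so it serializes as null
      .to_dict(orient="records")
)

print(initial_by_result)
```

Swapping "sum" and "first" in the agg() call gives back the behaviour of exp2 above, so it is easy to pick whichever reading matches the intended output.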
