How to reshape a DataFrame when the raw data is poorly structured

Posted 2024-06-11 10:49:43


I get raw data from an API as a JSON document like this:

{
  "lowestResellPrice": {
    "stockX": 313,
    "flightClub": 332,
    "goat": 332,
    "stadiumGoods": 380
  },
  "resellPrices": {
    "stockX": {
      "4": 313,
      "5": 382,
      "6": 404,
      "7": 425,
      "8": 366,
      "9": 373,
      "10": 325,
      "11": 330,
      "12": 374,
      "13": 374,
      "14": 335,
      "16": 350,
      "4.5": 375,
      "5.5": 390,
      "6.5": 409,
      "7.5": 398,
      "8.5": 344,
      "9.5": 355,
      "10.5": 340,
      "11.5": 400,
      "12.5": 407,
      "13.5": 405,
      "14.5": 427
    },
    "goat": {
      "4": 332,
      "5": 400,
      "6": 410,
      "7": 440,
      "8": 378,
      "9": 365,
      "10": 336,
      "11": 355,
      "12": 380,
      "13": 374,
      "14": 370,
      "16": 409,
      "17": 730,
      "4.5": 395,
      "5.5": 377,
      "6.5": 424,
      "7.5": 420,
      "8.5": 360,
      "9.5": 351,
      "10.5": 355,
      "11.5": 400,
      "12.5": 420,
      "13.5": 420,
      "14.5": 448
    },
    "stadiumGoods": {
      "4": 2915,
      "5": 4189,
      "6": 4084,
      "7": 4383,
      "8": 3664,
      "9": 3559,
      "10": 3484,
      "11": 3372,
      "12": 4039,
      "13": 5133,
      "16": 4039,
      "4.5": 4189,
      "5.5": 4563,
      "6.5": 4346,
      "7.5": 4308,
      "8.5": 4039,
      "9.5": 3589,
      "10.5": 3447,
      "11.5": 5133
    },
    "flightClub": {
      "4": 332,
      "5": 400,
      "6": 410,
      "7": 440,
      "8": 378,
      "9": 369,
      "10": 336,
      "11": 370,
      "12": 380,
      "13": 374,
      "14": 370,
      "16": 409,
      "17": 730,
      "4.5": 400,
      "5.5": 405,
      "6.5": 424,
      "7.5": 420,
      "8.5": 360,
      "9.5": 369,
      "10.5": 355,
      "11.5": 400,
      "12.5": 420,
      "13.5": 420,
      "14.5": 532
    }
  },
  "imageLinks": [
    "https://image.goat.com/attachments/product_template_additional_pictures/images/033/925/077/medium/585885_01.jpg.jpeg?1583776607",
    "https://image.goat.com/attachments/product_template_additional_pictures/images/035/012/435/medium/585885_03.jpg.jpeg?1585958414",
    "https://image.goat.com/attachments/product_template_additional_pictures/images/035/012/439/medium/585885_06.jpg.jpeg?1585958413",
    "https://image.goat.com/attachments/product_template_additional_pictures/images/035/012/433/medium/585885_08.jpg.jpeg?1585958414",
    "https://image.goat.com/attachments/product_template_additional_pictures/images/035/012/436/medium/585885_04.jpg.jpeg?1585958413"
  ],
  "_id": "5f92732d82c05921d4602bab",
  "shoeName": "adidas Yeezy Boost 350 V2 Cinder",
  "brand": "adidas",
  "silhoutte": "adidas Yeezy Boost 350 V2",
  "styleID": "FY2903",
  "make": "adidas Yeezy Boost 350 V2",
  "colorway": "Cinder/Cinder/Cinder",
  "retailPrice": 220,
  "thumbnail": "https://stockx.imgix.net/adidas-Yeezy-Boost-350-V2-Cinder-Product.jpg?fit=fill&bg=FFFFFF&w=700&h=500&auto=format,compress&trim=color&q=90&dpr=2&updated_at=1594236988",
  "releaseDate": "2020-03-21",
  "description": "The Yeezy Boost 350 V2 'Cinder' features a neutral look on its signature construction. Built with Primeknit, the Cinder upper includes a tonal monofilament stripe on the lateral side. A heel pull-loop provides easy on and off, while a similar finish marks the cage around the Boost midsole. A gum rubber outsole provides traction.",
  "urlKey": "adidas-yeezy-boost-350-v2-cinder",
  "resellLinks": {
    "stockX": "https://stockx.com/adidas-yeezy-boost-350-v2-cinder",
    "flightClub": "https://www.flightclub.com/yeezy-boost-350-v2-cinder-fy2903",
    "goat": "https://www.goat.com/sneakers/yeezy-boost-350-v2-cinder-fy2903",
    "stadiumGoods": "https://www.stadiumgoods.com/adidas-yeezy-boost-350-v2-cinder-fy2903"
  }
}

I only want the "resellPrices" part, and I tried to reshape it like this:

df = pd.json_normalize(data["resellPrices"])

The output is not what I want:

   stockX.4  stockX.5  stockX.6  stockX.7  ...  flightClub.11.5  flightClub.12.5  flightClub.13.5  flightClub.14.5
0       313       382       404       425  ...              400              420              420              532

[1 rows x 90 columns]
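
For context, json_normalize flattens nested dictionaries into dotted column names, which is why a dict of dicts collapses into a single wide row. A minimal sketch, assuming a trimmed-down subset of the payload above (only two retailers and two sizes are kept for brevity):

```python
import pandas as pd

# Assumption: a small subset of data["resellPrices"], kept short for illustration
resell_prices = {
    "stockX": {"4": 313, "5": 382},
    "goat": {"4": 332, "5": 400},
}

# Each nested key becomes "<outer>.<inner>", so everything lands in one row
flat = pd.json_normalize(resell_prices)
print(flat.shape)
print(sorted(flat.columns))
```

With the full payload the same flattening yields the 1 x 90 frame shown above.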

I want the DataFrame to be shaped like this:

Columns: stockX, goat, stadiumGoods, flightClub

Rows: 4, 5, 6

Values: 332, 400, 410

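I found that the cause is that the contents of resellPrices are dicts rather than lists. Since I can't modify the raw JSON (it is generated directly by the API), I hope someone can give me some advice.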


1 Answer
User
#1 · Posted 2024-06-11 10:49:43

Problem solved! Just use DataFrame.from_dict like this:

df = pd.DataFrame.from_dict(data["resellPrices"])

The result is:

      stockX  goat  stadiumGoods  flightClub
4      313.0   332        2915.0         332
5      382.0   400        4189.0         400
6      404.0   410        4084.0         410
7      425.0   440        4383.0         440
8      366.0   378        3664.0         378
9      373.0   365        3559.0         369
10     325.0   336        3484.0         336
11     330.0   355        3372.0         370
12     374.0   380        4039.0         380
13     374.0   374        5133.0         374
14     335.0   370           NaN         370
16     350.0   409        4039.0         409
4.5    375.0   395        4189.0         400
5.5    390.0   377        4563.0         405
6.5    409.0   424        4346.0         424
7.5    398.0   420        4308.0         420
8.5    344.0   360        4039.0         360
9.5    355.0   351        3589.0         369
10.5   340.0   355        3447.0         355
11.5   400.0   400        5133.0         400
12.5   407.0   420           NaN         420
13.5   405.0   420           NaN         420
14.5   427.0   448           NaN         532
17       NaN   730           NaN         730
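
Note that the index above holds the sizes as strings in the order they first appear, so 4.5 comes after 16, and a size that a retailer does not list shows up as NaN (which is why those columns become float). If numeric ordering is wanted, one possible follow-up is to cast the index to float and sort it. A minimal sketch, assuming a small subset of the data and a hypothetical index name "size":

```python
import pandas as pd

# Assumption: a small subset of data["resellPrices"], kept short for illustration
resell_prices = {
    "stockX": {"4": 313, "10": 325, "4.5": 375},
    "goat": {"4": 332, "10": 336, "17": 730, "4.5": 395},
}

# Outer keys become columns, inner keys become the (string) index
df = pd.DataFrame.from_dict(resell_prices)

# Convert the string sizes to numbers so they sort numerically, not by insertion order
df.index = df.index.astype(float)
df = df.sort_index()
df.index.name = "size"  # hypothetical label, not part of the API payload
print(df)
```

Size 17 is missing for stockX in this subset, so that cell stays NaN after sorting.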

Great! I love pandas.
