Pandas: a more efficient index dump?

asns = df.index.levels[0]
countries = df.index.levels[1]

country_by_asn = {}
asn_by_country = {}

for asn in asns:
    by_asn = df.loc[[d == asn for d in df.index.get_level_values("asn")]]
    country_by_asn[asn] = list(by_asn.index.get_level_values("country"))

for country in countries:
    by_country = df.loc[[d == country for d in df.index.get_level_values("country")]]
    asn_by_country[country] = list(by_country.index.get_level_values("asn"))

1 answer

User

#1 · Posted 2024-05-16 05:37:25

Use reset_index with groupby, convert the values to lists, and finish with to_json. In testing, this ran in 2.2 seconds on 68,000 rows of data:

df1 = df.reset_index()

a = df1.groupby('asn')['country'].apply(list).to_json()
b = df1.groupby('country')['asn'].apply(list).to_json()
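To make the groupby approach easier to try out, here is a minimal sketch on a tiny frame. The sample rows and the "value" column are made up for illustration; only the two index level names, asn and country, come from the question. It uses to_dict instead of to_json so the grouped lists can be inspected directly as Python objects.

```python
import pandas as pd

# Hypothetical sample data: a two-level (asn, country) index as in the question.
index = pd.MultiIndex.from_tuples(
    [(12345, "US"), (12345, "MX"), (54321, "US")],
    names=["asn", "country"],
)
df = pd.DataFrame({"value": [1, 2, 3]}, index=index)  # "value" is a placeholder column

# Move both index levels into ordinary columns so they can be grouped.
df1 = df.reset_index()

# Collect, per group, the values of the other column into a list.
country_by_asn = df1.groupby("asn")["country"].apply(list).to_dict()
asn_by_country = df1.groupby("country")["asn"].apply(list).to_dict()

print(country_by_asn)
print(asn_by_country)
```

Swapping to_dict() for to_json() gives the JSON strings used in the answer.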

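Or a pure Python solution: first create a list of tuples, then build the dictionaries, and finally convert to JSON. In testing, this ran in 0.06 seconds on 68,000 rows of data: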

import json

l = df.index.tolist()

a, b = {}, {}
for x, y in l:
    a.setdefault(x, []).append(y)
    b.setdefault(y, []).append(x)

a = json.dumps(a)
b = json.dumps(b)

A similar solution, which also ran in 0.06 seconds on 68,000 rows of data in testing:

l = df.index.tolist()

import json
from collections import defaultdict

a, b = defaultdict(list), defaultdict(list)

for n, v in l:
    a[n].append(v)
    b[v].append(n)

a = json.dumps(a)
b = json.dumps(b)

@stevendesu's "novice" solution, which also ran in 0.06 seconds on 68,000 rows of data in testing:

import json

l = df.index.tolist()

a, b = {}, {}

for n, v in l:
    if n not in a:
        a[n] = []
    if v not in b:
        b[v] = []
    a[n].append(v)
    b[v].append(n)

a = json.dumps(a)
b = json.dumps(b)

print(a)
{"12345": ["US", "MX"], "54321": ["US"]}

print(b)
{"MX": [12345], "US": [12345, 54321]}
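The pure Python variants can be checked without pandas at all. The sketch below is an illustration under assumptions: the list named pairs is a hand-written stand-in for df.index.tolist(), and the longer variable names are chosen here for readability. It also shows two details of the JSON step: json.dumps turns the integer ASN keys into strings, and sort_keys=True is used for the second dictionary so the keys come out alphabetically (plain dictionaries would otherwise keep insertion order).

```python
import json
from collections import defaultdict

# Stand-in for df.index.tolist(): one (asn, country) tuple per row.
pairs = [(12345, "US"), (12345, "MX"), (54321, "US")]

country_by_asn = defaultdict(list)
asn_by_country = defaultdict(list)

# A single pass over the index fills both lookup tables.
for asn, country in pairs:
    country_by_asn[asn].append(country)
    asn_by_country[country].append(asn)

# Integer keys become JSON strings; list values keep their integers.
a = json.dumps(country_by_asn)
b = json.dumps(asn_by_country, sort_keys=True)

print(a)
print(b)
```

One pass over the tuples replaces the per-ASN and per-country scans of the whole index in the question's loops, which is the likely reason for the large speedup reported above.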
