I am trying to find all rows whose values fall in the ranges 0-500, 500-5000, 5000-35000, 35000-65000, and >60000

Posted 2024-06-10 20:29:23


import pandas as pd

df = pd.read_csv('/Users/gfidarov/Desktop/daylite/export_daylite_v0.2.csv')
#print(df)
df1 = df[df['Итог'] >'60000']
a = len(df1)
df5 = df[df['Итог'].isin(['40565', '60000'])]
f = len(df5)
df2 = df[df['Итог'].isin(['5000', '35000'])]
b = len(df2)
df3 = df[df['Итог'].isin(['500', '5000'])]
c = len(df3)
df4 = df[df['Итог'].isin(['0', '500'])]
d = len(df4)
#print(df2)
print(a)    # >60000
print(b)    # 5000- 35000
print(c)    # 500 - 5000
print(d)    # 0 - 500
print(f)    # 35000 - 60000

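My code runs and gives me some values, but my CSV contains values between 35000 and 65000, for example, and the count for that range comes out as zero. In other words, my code does not see those values.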

My values look like this:

44300
23400
4050
31230
12
45333
12341
64500
3430
13
95844
330
2
32
78
0

This is the result I get.


3 Answers

You could consider using pd.cut, as shown below:

import numpy as np
import pandas as pd

lst = [44300, 23400, 4050, 31230, 12, 45333,
       12341, 64500, 3430, 13, 95844, 330, 2,
       32, 78, 0]
df = pd.DataFrame({"a":lst})

# np.inf is used because the np.infty alias was removed in NumPy 2.0
bins = [0, 500, 5000, 35000, 60000, np.inf]
df["bins"] = pd.cut(df["a"], bins)

df.groupby("bins").size()

bins
(0.0, 500.0]          6
(500.0, 5000.0]       2
(5000.0, 35000.0]     3
(35000.0, 60000.0]    2
(60000.0, inf]        2
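
Note that the counts above add up to 15 although the list has 16 values: with the default right-closed bins, the value 0 falls outside (0, 500]. A minimal sketch of one way to keep it, assuming left-closed bins such as [0, 500) are acceptable for your ranges:

```python
import numpy as np
import pandas as pd

lst = [44300, 23400, 4050, 31230, 12, 45333,
       12341, 64500, 3430, 13, 95844, 330, 2,
       32, 78, 0]
df = pd.DataFrame({"a": lst})

bins = [0, 500, 5000, 35000, 60000, np.inf]
# right=False closes each bin on the left: [0, 500), [500, 5000), ...
# so a value of exactly 0 lands in the first bin instead of becoming NaN.
df["bins"] = pd.cut(df["a"], bins, right=False)

# observed=False keeps every bin in the result, even an empty one.
counts = df.groupby("bins", observed=False).size()
print(counts)
```

With the sample values, all 16 rows are counted and the first bin holds 7 of them. If you would rather keep right-closed bins, passing include_lowest=True to pd.cut is another option.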
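
import pandas as pd

df = pd.read_csv('/Users/gfidarov/Desktop/daylite/export_daylite_v0.2.csv')
# Convert the column to numbers; entries that cannot be parsed become NaN and are dropped
df['Итог'] = pd.to_numeric(df['Итог'], errors='coerce')
df = df[df['Итог'].notnull()]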

df1 = df[df['Итог'] > 60000]
a = len(df1)
df2 = df[df['Итог'].between(40565, 60000)]
b = len(df2)
df3 = df[df['Итог'].between(5000, 35000)]
c = len(df3)
df4 = df[df['Итог'].between(500, 5000)]
d = len(df4)
df5 = df[df['Итог'].between(0, 500)]
f = len(df5)

print(a)
print(b)
print(c)
print(d)
print(f)
print(a+b+c+d+f)

Output:

22
15
585
570
326
1518
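
One caveat with between: it includes both endpoints by default, so a value that sits exactly on a boundary (500 or 5000, say) is counted in two adjacent ranges and the total can exceed the number of rows. A small sketch of the difference, using a made-up Series rather than the CSV from the question:

```python
import pandas as pd

# Hypothetical values chosen to sit on the range boundaries.
s = pd.Series([0, 500, 5000, 35000, 44300, 60000])

# Default between(): closed on both ends, so 500 and 5000 both match.
closed_both = s.between(500, 5000)

# Half-open alternative written with plain comparisons: 500 <= x < 5000.
half_open = (s >= 500) & (s < 5000)

print(closed_both.sum(), half_open.sum())
```

Recent pandas versions also accept an inclusive argument on between (for example inclusive="left"); the explicit comparison above is used here only because it does not depend on the pandas version.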

After reading the CSV, you can convert the column to a numeric type:

import pandas as pd

df = pd.read_csv('/Users/gfidarov/Desktop/daylite/export_daylite_v0.2.csv')
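# Convert the column to numbers; entries that cannot be parsed become NaN and are dropped
df['Итог'] = pd.to_numeric(df['Итог'], errors='coerce')
df = df[df['Итог'].notnull()]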

Then all the numeric comparisons will work as expected:

#print(df)
df1 = df[df['Итог'] > 60000]

df5 = df[df['Итог'].between(40565, 60000)]

df2 = df[df['Итог'].between(5000, 35000)]

df3 = df[df['Итог'].between(500, 5000)]

df4 = df[df['Итог'].between(0, 500)]
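
To see why the code in the question returns zero for some ranges, here is a small sketch on made-up string values (assuming the column was read as text, which the quoted numbers in the question suggest). Comparing strings is lexicographic, and isin only tests exact membership in the list; it does not test a range.

```python
import pandas as pd

# Hypothetical sample: the same kind of values, but stored as strings.
s = pd.Series(['44300', '64500', '78', '330'])

# Lexicographic comparison: '78' > '60000' is True because '7' > '6'.
str_mask = s > '60000'

# isin checks for the exact strings '40565' and '60000', not the range between them.
isin_mask = s.isin(['40565', '60000'])

# After conversion, the comparison is numeric and the range test behaves as intended.
num = pd.to_numeric(s, errors='coerce')
range_mask = num.between(35000, 60000)

print(str_mask.tolist(), isin_mask.tolist(), range_mask.tolist())
```

Here the string comparison wrongly treats '78' as greater than '60000', isin matches nothing, and only the numeric version picks out 44300.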
