Working with pandas groups

Posted 2024-06-09 14:30:44


I have a problem that is making my head spin. Suppose I have the following DataFrame:

import numpy as np
import pandas as pd

df2 = pd.DataFrame(np.random.randint(0,3,size=(10, 4)),columns=['ONE', 'TWO', 'CARS', 'FOUR'])
df2['NAMES'] = ['Peter','Jon','Mary','Mary','Peter','Peter','BONIFACE','Michael','Lucy','Gilari']
df2['CARS'] = ['Mercedes','BMW','Ford','BMW','BMW','Dacia','Ford','Pontiac','Chevrolet','Tesla']

For example, I group it by car:

agrupe = df2.groupby('CARS')

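The problem is that once it is grouped, I want to operate on the groups. For example, within the BMW group, for the rows that have 2 in column ONE, I want to assign the value of column TWO to column FOUR. Let's see if I can learn to work with it: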

g = agrupe.get_group('BMW')

Going from this:

   ONE  TWO CARS  FOUR  NAMES
1    1    0  BMW     1    Jon
3    2    1  BMW     1   Mary
4    0    1  BMW     0  Peter

to this:

   ONE  TWO CARS  FOUR  NAMES
1    1    0  BMW     1    Jon
3    2    1  BMW     1   Mary
4    0    1  BMW     1  Peter

1 answer
User
#1 · Posted 2024-06-09 14:30:44

It seems you need groupby.apply with a custom function f:

np.random.seed(100)
df2 = pd.DataFrame(np.random.randint(0,3,size=(10, 4)),columns=['ONE', 'TWO', 'CARS', 'FOUR'])
df2['NAMES'] = ['Peter','Jon','Mary','Mary','Peter','Peter','BONIFACE','Michael','Lucy','Gilari']
df2['CARS'] = ['Mercedes','BMW','Ford','BMW','BMW','Dacia','Ford','Pontiac','Chevrolet','Tesla']
print (df2)
   ONE  TWO       CARS  FOUR     NAMES
0    0    0   Mercedes     2     Peter
1    2    0        BMW     1       Jon
2    2    2       Ford     2      Mary
3    1    0        BMW     0      Mary
4    0    2        BMW     1     Peter
5    1    2      Dacia     0     Peter
6    0    1       Ford     1  BONIFACE
7    0    0    Pontiac     1   Michael
8    1    2  Chevrolet     2      Lucy
9    1    1      Tesla     2    Gilari
def f(x):
    if (x.name == 'BMW'):
        x.loc[x.ONE == 2, 'FOUR'] = x.TWO
    return x

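# group_keys=False keeps the original row index (needed on pandas 2.x to match the output below)
agrupe = df2.groupby('CARS', group_keys=False).apply(f)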
print (agrupe)
   ONE  TWO       CARS  FOUR     NAMES
0    0    0   Mercedes     2     Peter
1    2    0        BMW     0       Jon
2    2    2       Ford     2      Mary
3    1    0        BMW     0      Mary
4    0    2        BMW     1     Peter
5    1    2      Dacia     0     Peter
6    0    1       Ford     1  BONIFACE
7    0    0    Pontiac     1   Michael
8    1    2  Chevrolet     2      Lucy
9    1    1      Tesla     2    Gilari
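
To make the role of x.name easier to see, here is a minimal sketch of the same idea on a small fixed frame. The data and the function name set_four_for_bmw are hypothetical, chosen only so the result is reproducible; the sketch assumes that apply exposes the group key as the name attribute of each group, as the function f above relies on. Newer pandas versions may warn about apply operating on the grouping column.

```python
import pandas as pd

# Hypothetical fixed data (not the random frame above).
df = pd.DataFrame({
    'ONE':  [2, 2, 1],
    'TWO':  [0, 7, 5],
    'CARS': ['BMW', 'Ford', 'BMW'],
    'FOUR': [1, 1, 1],
})

def set_four_for_bmw(group):
    # The group key ('BMW' or 'Ford') is read before copying,
    # because the copy does not carry the name attribute.
    key = group.name
    group = group.copy()
    if key == 'BMW':
        # Only rows of this group with ONE == 2 take their TWO value.
        group.loc[group['ONE'] == 2, 'FOUR'] = group['TWO']
    return group

# group_keys=False keeps the original row labels in the result.
result = df.groupby('CARS', group_keys=False).apply(set_four_for_bmw)
print(result)
```

Only row 0 (BMW with ONE equal to 2) is updated; the Ford row with ONE equal to 2 is left alone because its group key does not match.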

A better solution is to first select all rows where column CARS is BMW and column ONE is 2, and then set FOUR from column TWO:

df2.loc[(df2.CARS == 'BMW') & (df2.ONE == 2), 'FOUR'] = df2.TWO
print (df2)
   ONE  TWO       CARS  FOUR     NAMES
0    0    0   Mercedes     2     Peter
1    2    0        BMW     0       Jon
2    2    2       Ford     2      Mary
3    1    0        BMW     0      Mary
4    0    2        BMW     1     Peter
5    1    2      Dacia     0     Peter
6    0    1       Ford     1  BONIFACE
7    0    0    Pontiac     1   Michael
8    1    2  Chevrolet     2      Lucy
9    1    1      Tesla     2    Gilari
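
The one-liner above works because the Series on the right-hand side is aligned on the index, so each selected row receives the TWO value from the same row. A minimal sketch on hypothetical fixed data (not the random frame above) makes this easier to check by eye:

```python
import pandas as pd

# Hypothetical fixed data so the result is reproducible.
df = pd.DataFrame({
    'ONE':  [2, 1, 2, 0],
    'TWO':  [0, 5, 7, 9],
    'CARS': ['BMW', 'BMW', 'Ford', 'BMW'],
    'FOUR': [1, 1, 1, 1],
})

# Boolean mask: True only where both conditions hold (row 0 here).
mask = (df['CARS'] == 'BMW') & (df['ONE'] == 2)

# The right-hand Series is aligned on the index, so the selected row
# gets TWO from the same row; all other rows are left untouched.
df.loc[mask, 'FOUR'] = df['TWO']

print(df)
```

Row 2 also has ONE equal to 2, but it is a Ford, so the combined mask excludes it.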

Or, if you need to change every row that has 2 in column ONE, regardless of car, set column FOUR from column TWO:

np.random.seed(13)
df2 = pd.DataFrame(np.random.randint(0,3,size=(10, 4)),columns=['ONE', 'TWO', 'CARS', 'FOUR'])
df2['NAMES'] = ['Peter','Jon','Mary','Mary','Peter','Peter','BONIFACE','Michael','Lucy','Gilari']
df2['CARS'] = ['Mercedes','BMW','Ford','BMW','BMW','Dacia','Ford','Pontiac','Chevrolet','Tesla']
print (df2)
   ONE  TWO       CARS  FOUR     NAMES
0    2    0   Mercedes     0     Peter
1    2    2        BMW     1       Jon
2    0    2       Ford     0      Mary
3    2    2        BMW     2      Mary
4    1    1        BMW     1     Peter
5    0    2      Dacia     1     Peter
6    2    1       Ford     2  BONIFACE
7    0    0    Pontiac     0   Michael
8    2    2  Chevrolet     0      Lucy
9    1    1      Tesla     2    Gilari


df2.loc[df2.ONE == 2, 'FOUR'] = df2.TWO
print (df2)
   ONE  TWO       CARS  FOUR     NAMES
0    2    0   Mercedes     0     Peter
1    2    2        BMW     2       Jon
2    0    2       Ford     0      Mary
3    2    2        BMW     2      Mary
4    1    1        BMW     1     Peter
5    0    2      Dacia     1     Peter
6    2    1       Ford     1  BONIFACE
7    0    0    Pontiac     0   Michael
8    2    2  Chevrolet     2      Lucy
9    1    1      Tesla     2    Gilari
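
If the original frame should stay unchanged, one possible alternative (not part of the answer above) is Series.mask, which replaces values where the condition is True and returns a new Series. The sketch below uses hypothetical fixed data and the hypothetical name updated:

```python
import pandas as pd

# Hypothetical fixed data so the result is reproducible.
df = pd.DataFrame({
    'ONE':  [2, 1, 2],
    'TWO':  [0, 5, 7],
    'FOUR': [1, 1, 1],
})

# Where ONE == 2, take the value from TWO; elsewhere keep FOUR.
new_four = df['FOUR'].mask(df['ONE'] == 2, df['TWO'])

# assign returns a new DataFrame; df itself is not modified.
updated = df.assign(FOUR=new_four)
print(updated)
```

The .loc assignment shown earlier modifies df2 in place, while this variant leaves the input as it was, which can be convenient in a chain of operations.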
