What happens when you OR two Series that don't match?

# df is the question's DataFrame, indexed by last_name
# loc and iloc also allow conditional statements to filter rows of data
# using loc on the logic test above only returns rows where the result is True
only_billys = df.loc[df["first_name"] == "Billy", :]
print(only_billys)
only_peters = df.loc[df["first_name"] == "Peter", :]
print(only_peters)
print()
only_richardsons = df.loc["Richardson", :]
print(only_richardsons)
print()
isBilly = (df["first_name"] == "Billy").sort_index()
print(isBilly.describe())
print()
isPeter = (df["first_name"] == "Peter")
print(isPeter.describe())
print()
billy_or_peter = isPeter | isBilly
print(billy_or_peter.describe())
print(billy_or_peter)

(only_billys)
           id first_name      Phone Number       Time zone
last_name
Clark      20      Billy  62-(213)345-2549   Asia/Makassar
Andrews    23      Billy  86-(859)746-5367  Asia/Chongqing
Price      59      Billy  86-(878)547-7739   Asia/Shanghai

(only_peters)
            id first_name     Phone Number      Time zone
last_name
Richardson   1      Peter  7-(789)867-9023  Europe/Moscow

(only_richardsons)
            id first_name      Phone Number      Time zone
last_name
Richardson   1      Peter   7-(789)867-9023  Europe/Moscow
Richardson  25     Donald  62-(259)282-5871   Asia/Jakarta

(isBilly.describe() - sorted index)
count       100
unique        2
top       False
freq         97
Name: first_name, dtype: object

(isPeter.describe())
count       100
unique        2
top       False
freq         99
Name: first_name, dtype: object

(billy_or_peter.describe() - 126 rows???)
count       126
unique        2
top       False
freq        121
Name: first_name, dtype: object

(billy_or_peter listing - notice 4 Richardsons where before there were only 2)
last_name
Adams          False
Allen          False
Andrews         True
Austin         False
Baker          False
Banks          False
Bell           False
Berry          False
Bishop         False
Black          False
Brooks         False
Brown          False
Bryant         False
Bryant         False
Bryant         False
Bryant         False
Burke          False
Butler         False
Butler         False
Butler         False
Butler         False
Carroll        False
Chapman        False
Chavez         False
Clark           True
Collins        False
Cook           False
Day            False
Day            False
Day            False
...
Price           True
Reid           False
Reyes          False
Rice           False
*Richardson     True
*Richardson     True
*Richardson    False
*Richardson    False
Riley          False
Roberts        False
Robertson      False
Robinson       False
Rogers         False
Scott          False
Shaw           False
Shaw           False
Shaw           False
Shaw           False
Simmons        False
Snyder         False
Sullivan       False
Torres         False
Tucker         False
Vasquez        False
Wagner         False
Walker         False
Washington     False
Watkins        False
Wells          False
Williamson     False
Name: first_name, Length: 126, dtype: bool

1 Answer

User

Reply #1 · Posted 2024-05-23 14:02:25

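The mismatch is not the problem here: pandas aligns the two Series on their index before applying |. Your problem is caused by duplicate labels in the index. When the indexes differ and contain duplicates, the alignment is performed as an outer join on matching labels, so every occurrence of a label on one side is paired with every occurrence on the other. Two Richardsons in one Series and two Richardsons in the other therefore produce 4 rows in the output.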
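The same effect can be reproduced on a small scale. The sketch below uses made-up data (the three names and index labels are hypothetical, chosen only to mirror the duplicated "Richardson" label in the question); sorting one mask gives it a different index from the other, which is what triggers the alignment.

```python
import pandas as pd

# Hypothetical miniature of the question's data: last names as the index,
# with "Richardson" appearing twice.
first_name = pd.Series(
    ["Peter", "Billy", "Donald"],
    index=["Richardson", "Clark", "Richardson"],
    name="first_name",
)

is_peter = first_name == "Peter"                 # original row order
is_billy = (first_name == "Billy").sort_index()  # sorted, so the index differs

# The indexes are not equal, so pandas aligns them with an outer join.
# Each "Richardson" on the left pairs with each "Richardson" on the right.
combined = is_peter | is_billy

print(len(first_name), len(combined))  # prints: 3 5
print(combined)
```

Three input rows become five: one for "Clark" and 2 x 2 = 4 for "Richardson".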

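To make this clearer, look at what happens when we add strings whose indexes contain duplicates and are not aligned. Index 1 yields 6 (2 x 3) rows, the Cartesian product of its occurrences on each side: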

import pandas as pd

df1 = pd.DataFrame(list('abcd'), index=[1,1,2,3])
df2 = pd.DataFrame(list('1243'), index=[1,1,3,1])
df1+df2

     0
1   a1
1   a2
1   a3
1   b1
1   b2
1   b3
2  NaN
3   d4
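One way to avoid the extra rows, sketched here under the assumption that both masks come from the same column, is to combine them before any reordering. When the two indexes are identical, pandas skips the alignment step and applies | row by row. The data below is hypothetical and only mirrors the question's duplicated "Richardson" label; `isin` is shown as an alternative that avoids combining two Series at all.

```python
import pandas as pd

# Hypothetical miniature of the question's data, with a duplicated label.
first_name = pd.Series(
    ["Peter", "Billy", "Donald"],
    index=["Richardson", "Clark", "Richardson"],
    name="first_name",
)

# Option 1: build both masks from the same Series without reordering either.
# Their indexes are identical, so no join takes place.
mask = (first_name == "Peter") | (first_name == "Billy")

# Option 2: a single membership test produces the same mask directly.
mask_isin = first_name.isin(["Peter", "Billy"])

# Sort after combining, if a sorted listing is wanted.
print(mask.sort_index())
print(len(mask))
```

Either mask has exactly one entry per original row, so it can be passed straight to `loc` to filter the data.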
