I tried the Titanic model from Kaggle. Strangely, isna().sum() gives the wrong result.
import os
import pandas as pd
import numpy as np
import statsmodels.api as sm
from google.colab import auth
auth.authenticate_user()
import gspread
from oauth2client.client import GoogleCredentials
gc = gspread.authorize(GoogleCredentials.get_application_default())
worksheet = gc.open('titanic_train').sheet1
titanic = worksheet.get_all_records()
titanic = pd.DataFrame(titanic)
titanic
titanic.info()
titanic.isna().sum()
The output is as follows:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 12 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 PassengerId 891 non-null int64
1 Survived 891 non-null int64
2 Pclass 891 non-null int64
3 Name 891 non-null object
4 Sex 891 non-null object
5 Age 891 non-null object
6 SibSp 891 non-null int64
7 Parch 891 non-null int64
8 Ticket 891 non-null object
9 Fare 891 non-null float64
10 Cabin 891 non-null object
11 Embarked 891 non-null object
dtypes: float64(1), int64(5), object(6)
memory usage: 83.7+ KB
PassengerId 0
Pclass 0
Name 0
Sex 0
Age 0
SibSp 0
Parch 0
Ticket 0
Fare 0
Cabin 0
Embarked 0
dtype: int64
It reports 0 NaN values, but the Age column has several missing values. Why doesn't it detect the NaNs? Is it because of the data type?
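It does this because there are no NaNs in the DataFrame. As you can see in the df.info() output, there are no null values: every column reports 891 non-null. Age has dtype object rather than a numeric type, which suggests the blank cells in the sheet were read in as empty strings, and isna() does not treat an empty string as missing.
Another answer attributes this to the pandas version (1.2.4) and reports that the NaN values appear after downgrading to 0.24 or another lower version.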
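The following is a minimal sketch of one way to make the missing values visible, assuming the blank cells really do arrive as empty strings. The `records` list is a hypothetical stand-in for `worksheet.get_all_records()`, and its values are made up for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for worksheet.get_all_records();
# blank sheet cells are assumed to arrive as empty strings.
records = [
    {"PassengerId": 1, "Age": 22, "Cabin": ""},
    {"PassengerId": 2, "Age": "", "Cabin": "C85"},
    {"PassengerId": 3, "Age": 26, "Cabin": ""},
]
titanic = pd.DataFrame(records)

# An empty string is an ordinary value, so isna() does not count it.
before = titanic.isna().sum()

# Replace empty strings with NaN, then convert Age to a numeric dtype.
cleaned = titanic.replace("", np.nan)
cleaned["Age"] = pd.to_numeric(cleaned["Age"], errors="coerce")

# The blank cells are now counted as missing.
after = cleaned.isna().sum()

print(before)
print(after)
print(cleaned.dtypes)
```

With the real sheet, the same two steps would be applied to the DataFrame built from `worksheet.get_all_records()`. Converting Age with `pd.to_numeric` also matters for later modelling, since an object column cannot be used directly as a numeric feature.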