Grouping variables in Python

ingresos <- sample(0:100, 15, replace = T)
sexo <- sample(0:1, 15, replace = T)
base2 <- data.frame(ingresos, sexo)
base2$grupo[as.numeric(base2$ingresos) >= 0 & as.numeric(base2$ingresos) <= 29] <- 1
base2$grupo[as.numeric(base2$ingresos) >= 30 & as.numeric(base2$ingresos) <= 49] <- 2
base2$grupo[as.numeric(base2$ingresos) >= 50 & as.numeric(base2$ingresos) <= 69] <- 3
base2$grupo[as.numeric(base2$ingresos) >= 70] <- 4
base2

1 answer

User

#1 · Posted 2024-05-16 22:44:31

In R, you can use cut():

base2$bins = cut(base2$ingresos, breaks=c(0,30,50,70,+Inf),
                 include.lowest=TRUE, right=FALSE)

   ingresos sexo     bins
1        38    0  [30,50)
2        98    0 [70,Inf]
3        17    1   [0,30)
4        76    1 [70,Inf]
5        54    0  [50,70)
6        91    1 [70,Inf]
7         4    0   [0,30)
8        68    0  [50,70)
9         9    0   [0,30)
10       32    0  [30,50)
11       13    0   [0,30)
12       64    1  [50,70)
13       35    1  [30,50)
14       44    0  [30,50)
15       63    0  [50,70)

In pandas, you can do the following:

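import numpy as np
import pandas as pd

# np.inf replaces np.Inf, which was removed in NumPy 2.0
base2['bins'] = pd.cut(base2['Ingreso'],
                       bins=[0, 30, 50, 70, np.inf], include_lowest=True, right=False)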


    Ingreso  Grupo          bins
Id
1        57      1  [50.0, 70.0)
2        71      2   [70.0, inf)
3        25      3   [0.0, 30.0)
4        45      4  [30.0, 50.0)
5        26      5   [0.0, 30.0)
6         1      6   [0.0, 30.0)
7        51      7  [50.0, 70.0)
8        39      8  [30.0, 50.0)
9        67      9  [50.0, 70.0)
10       78     10   [70.0, inf)
11       58     11  [50.0, 70.0)
12       27     12   [0.0, 30.0)
13       48     13  [30.0, 50.0)
14       75     14   [70.0, inf)
15       22     15   [0.0, 30.0)
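
If you want the numeric group codes 1 to 4 from the question rather than interval labels, one option is to pass labels to pd.cut. The sketch below is only an illustration: the sample data, the random seed, and the lowercase column names (taken from the question's R code) are assumptions, not the data used in the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the question's R setup (seed chosen arbitrarily).
rng = np.random.default_rng(0)
base2 = pd.DataFrame({
    "ingresos": rng.integers(0, 101, size=15),  # incomes in 0..100 inclusive
    "sexo": rng.integers(0, 2, size=15),        # 0 or 1
})

# Left-closed bins [0,30), [30,50), [50,70), [70,inf) labelled 1..4,
# matching the grupo codes assigned in the question.
base2["grupo"] = pd.cut(
    base2["ingresos"],
    bins=[0, 30, 50, 70, np.inf],
    right=False,
    labels=[1, 2, 3, 4],
).astype(int)

print(base2)
```

With right=False each bin includes its left edge and excludes its right edge, so an income of exactly 30 goes to group 2, as in the question's hand-written conditions. The astype(int) call assumes no value falls outside the bins; a negative income would yield NaN and make the conversion fail.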
