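I have a dataset that unfortunately contains sporadic datetime values where there should be int or str values.
For example, how can I edit these values by iterating over the data and replacing 2019-05-03 00:00:00 with 5-3?
I tried a few loops, but they had no effect. Is there a shortcut?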
,age,menopause,tumor-size,inv-nodes,node-caps,deg-malig,breast,breast-quad,irradiat,Class
0,40-49,premeno,15-19,0-2,yes,3,right,left_up,no,recurrence-events
1,50-59,ge40,15-19,0-2,no,1,right,central,no,no-recurrence-events
2,50-59,ge40,35-39,0-2,no,2,left,left_low,no,recurrence-events
3,40-49,premeno,35-39,0-2,yes,3,right,left_low,yes,no-recurrence-events
4,40-49,premeno,30-34,2019-05-03 00:00:00,yes,2,left,right_up,no,recurrence-events
5,50-59,premeno,25-29,2019-05-03 00:00:00,no,2,right,left_up,yes,no-recurrence-events
6,50-59,ge40,40-44,0-2,no,3,left,left_up,no,no-recurrence-events
7,40-49,premeno,2014-10-01 00:00:00,0-2,no,2,left,left_up,no,no-recurrence-events
8,40-49,premeno,0-4,0-2,no,2,right,right_low,no,no-recurrence-events
9,40-49,ge40,40-44,15-17,yes,2,right,left_up,yes,no-recurrence-events
10,50-59,premeno,25-29,0-2,no,2,left,left_low,no,no-recurrence-events
11,60-69,ge40,15-19,0-2,no,2,right,left_up,no,no-recurrence-events
12,50-59,ge40,30-34,0-2,no,1,right,central,no,no-recurrence-events
13,50-59,ge40,25-29,0-2,no,2,right,left_up,no,no-recurrence-events
14,40-49,premeno,25-29,0-2,no,2,left,left_low,yes,recurrence-events
15,30-39,premeno,20-24,0-2,no,3,left,central,no,no-recurrence-events
16,50-59,premeno,2014-10-01 00:00:00,2019-05-03 00:00:00,no,1,right,left_up,no,no-recurrence-events
17,60-69,ge40,15-19,0-2,no,2,right,left_up,no,no-recurrence-events
18,50-59,premeno,40-44,0-2,no,2,left,left_up,no,no-recurrence-events
19,50-59,ge40,20-24,0-2,no,3,left,left_up,no,no-recurrence-events
20,50-59,lt40,20-24,0-2,?,1,left,left_low,no,recurrence-events
21,60-69,ge40,40-44,2019-05-03 00:00:00,no,2,right,left_up,yes,no-recurrence-events
22,50-59,ge40,15-19,0-2,no,2,right,left_low,no,no-recurrence-events
23,40-49,premeno,2014-10-01 00:00:00,0-2,no,1,right,left_up,no,no-recurrence-events
24,30-39,premeno,15-19,2019-08-06 00:00:00,yes,3,left,left_low,yes,recurrence-events
25,50-59,ge40,20-24,2019-05-03 00:00:00,yes,2,right,left_up,no,no-recurrence-events
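You can use a custom function that uses a regex to find the datetime strings and replace them with a month-day string without zero padding (on Linux you could also use strftime with "%-m-%-d"…). Here is one way: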
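# Assumes the datetimes are stored as strings; turns "2019-05-03 00:00:00" into "5-3" and leaves other values alone
df['inv-nodes'] = df['inv-nodes'].str.replace(r'^\d{4}-0?(\d{1,2})-0?(\d{1,2}) 00:00:00$', r'\1-\2', regex=True)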
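The custom-function approach mentioned in the answer is not shown in full, so the following is only a minimal sketch of what it might look like. The names `fix_datetime` and `DATETIME_RE` are hypothetical, and the small DataFrame is made up to mirror the shape of the question's data. Converting each value with `str()` first means the function should also cope with real `Timestamp` objects, not only strings.

```python
import re

import pandas as pd

# Matches values such as "2019-05-03 00:00:00" and captures month and day.
DATETIME_RE = re.compile(r"^\d{4}-(\d{2})-(\d{2}) \d{2}:\d{2}:\d{2}$")


def fix_datetime(value):
    """Return 'M-D' without zero padding for datetime-like values,
    and return every other value unchanged."""
    match = DATETIME_RE.match(str(value))
    if match is None:
        return value
    month, day = match.groups()
    # int() drops the leading zero, so "05" becomes 5.
    return f"{int(month)}-{int(day)}"


# Illustrative sample only; not the full dataset from the question.
df = pd.DataFrame({
    "tumor-size": ["15-19", "2014-10-01 00:00:00", "0-4"],
    "inv-nodes": ["0-2", "2019-05-03 00:00:00", "2019-08-06 00:00:00"],
})

# Apply the same fix to every column that may contain stray datetimes.
for column in ["tumor-size", "inv-nodes"]:
    df[column] = df[column].map(fix_datetime)

print(df)
```

Mapping a plain function over each column avoids an explicit row loop and keeps the regex in one place, which makes it easy to reuse if more columns turn out to be affected.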