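When I pass multiple DataFrame columns as arguments to a function, the datetime column refuses to work with the formatting function below. I can get by with the inline solution shown... but it would be good to know why... Should I be using a different date-like datatype? Thanks (p.s. pandas = great)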
import pandas as pd
import numpy as np
import datetime as dt
def fmtfn(arg_dttm, arg_int):
    retstr = arg_dttm.strftime(':%Y-%m-%d') + '{:0>3}'.format(arg_int)
    # bombs with: 'numpy.datetime64' object has no attribute 'strftime'
    # retstr = '{:%Y-%m-%d}~{:0>3}'.format(arg_dttm, arg_int)
    # bombs with: invalid format specifier
    return retstr

def fmtfn2(arg_dtstr, arg_int):
    retstr = '{}~{:0>3}'.format(arg_dtstr, arg_int)
    return retstr

# The source data.
# I want to add a 3rd column newhg that carries e.g. 2017-06-25~066
# i.e. a concatenation of the other two columns.
df1 = pd.DataFrame({'mydt': ['2017-05-07', '2017-06-25', '2015-08-25'],
                    'myint': [66, 201, 100]})
df1['mydt'] = pd.to_datetime(df1['mydt'], errors='raise')
# THIS WORKS (without calling a function)
print('\nInline solution')
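# Rows are indexed by column label here; positional x[0] / x[1] on a labelled Series is deprecated in recent pandas.
df1['newhg'] = df1[['mydt', 'myint']].apply(lambda x: '{:%Y-%m-%d}~{:0>3}'.format(x['mydt'], x['myint']), axis=1)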
print(df1)
# THIS WORKS
print('\nConvert to string first')
df1['mydt2'] = df1['mydt'].apply(lambda x: x.strftime('%Y-%m-%d'))
df1['newhg'] = np.vectorize(fmtfn2)(df1['mydt2'], df1['myint'])
print(df1)
# Bombs in the function - see above
print('\nPass a datetime')
df1['newhg'] = np.vectorize(fmtfn)(df1['mydt'], df1['myint'])
print(df1)
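
A likely explanation for the failure, offered as a sketch rather than a definitive diagnosis: np.vectorize works on the underlying NumPy array, so the function can receive raw numpy.datetime64 scalars, which have no strftime method and do not understand date format specifiers. Going through pandas (apply, iloc) yields pandas.Timestamp objects, which do. The snippet below checks both element types and then uses a hypothetical variant of fmtfn, named fmtfn_ts here, that wraps the argument in pd.Timestamp before formatting. The names df, raw, boxed and fmtfn_ts are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'mydt': pd.to_datetime(['2017-05-07', '2017-06-25', '2015-08-25']),
                   'myint': [66, 201, 100]})

# Element taken from the underlying NumPy array: a numpy.datetime64 scalar.
raw = df['mydt'].to_numpy()[0]
# Element taken through pandas: a pandas.Timestamp.
boxed = df['mydt'].iloc[0]
print(type(raw).__name__, hasattr(raw, 'strftime'))
print(type(boxed).__name__, hasattr(boxed, 'strftime'))

def fmtfn_ts(arg_dttm, arg_int):
    # Hypothetical variant of fmtfn: convert whatever arrives
    # (numpy.datetime64 or Timestamp) to a pandas.Timestamp first.
    ts = pd.Timestamp(arg_dttm)
    return '{:%Y-%m-%d}~{:0>3}'.format(ts, arg_int)

df['newhg'] = np.vectorize(fmtfn_ts)(df['mydt'], df['myint'])
print(df)
```

So a different date datatype is probably not needed; the conversion inside the function is one way to make the np.vectorize route tolerate either element type.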
You can also use pandas' built-in functions, which makes this easier to read:
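
One way this could look, as a minimal sketch: it assumes the built-ins meant are the vectorized accessors Series.dt.strftime and Series.str.zfill, which may not be the exact calls the answer had in mind. The frame is rebuilt here under the name df so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({'mydt': ['2017-05-07', '2017-06-25', '2015-08-25'],
                   'myint': [66, 201, 100]})
df['mydt'] = pd.to_datetime(df['mydt'])

# Format the dates column-wise, zero-pad the integers to width 3,
# and join the two string columns with '~'.
df['newhg'] = (df['mydt'].dt.strftime('%Y-%m-%d')
               + '~'
               + df['myint'].astype(str).str.zfill(3))
print(df)
```

This avoids both np.vectorize and a row-wise apply, so the datetime values never leave pandas as raw NumPy scalars.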