Any help with pandas is greatly appreciated.
import pandas as pd

def csv_reader(fileName):
    # Read only the columns that are needed from the CSV file.
    reqcols = ['_id__$oid', 'payload', 'channel']
    io = pd.read_csv(fileName, sep=",", usecols=reqcols)
    print(io['payload'].values)
    return io
One row of io['payload'] prints as:
{
"destination_ip": "172.31.14.66",
"date": "2014-10-19T01:32:36.669861",
"classification": "Potentially Bad Traffic",
"proto": "UDP",
"source_ip": "172.31.0.2",
"priority": "2",
"header": "1:2003195:5",
"signature": "ET POLICY Unusual number of DNS No Such Name Responses ",
"source_port": "53",
"destination_port": "34638",
"sensor": "5cda4a12-4730-11e4-9ee4-0a0b6e7c3e9e"
}
I am trying to extract specific data from the ndarray object. What is the way to extract the following fields from the DataFrame:
"destination_ip": "172.31.13.124",
"proto": "ICMP",
"source_ip": "201.158.32.1",
"date": "2014-09-28T14:49:43.391463",
"sensor": "139cfdf2-471e-11e4-9ee4-0a0b6e7c3e9e"
Using @jezrael's sample df.
Solution: use str.cat to smash all of the payload strings together, then pd.read_json to parse the whole thing in one pass.
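A minimal sketch of the str.cat plus pd.read_json idea follows. The two-row df is a hypothetical sample assembled from values in the question, and the names wanted, json_array, parsed and extracted are only illustrative. The payloads are joined with commas and wrapped in brackets so that they form one JSON array; StringIO is used because recent pandas versions deprecate passing a literal JSON string to read_json.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample standing in for the payload column read from the CSV.
df = pd.DataFrame({
    "payload": [
        '{"destination_ip": "172.31.14.66", "proto": "UDP", '
        '"source_ip": "172.31.0.2", "date": "2014-10-19T01:32:36.669861", '
        '"source_port": "53", '
        '"sensor": "5cda4a12-4730-11e4-9ee4-0a0b6e7c3e9e"}',
        '{"destination_ip": "172.31.13.124", "proto": "ICMP", '
        '"source_ip": "201.158.32.1", "date": "2014-09-28T14:49:43.391463", '
        '"sensor": "139cfdf2-471e-11e4-9ee4-0a0b6e7c3e9e"}',
    ]
})

# Fields the question asks for.
wanted = ["destination_ip", "proto", "source_ip", "date", "sensor"]

# Join every payload with commas and wrap in brackets: one JSON array.
json_array = "[" + df["payload"].str.cat(sep=",") + "]"

# Parse the whole array in one call; dtype=False and convert_dates=False
# keep the values as the strings found in the payload.
parsed = pd.read_json(
    StringIO(json_array), orient="records", dtype=False, convert_dates=False
)

# Keep only the requested fields.
extracted = parsed[wanted]
print(extracted)
```

This assumes every payload is valid JSON; a single malformed row makes the whole parse fail, which is the trade-off of parsing everything in one call.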
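I think you first need to convert the string representation of dicts in the payload column into dictionaries, row by row, with json.loads or ast.literal_eval. Then create a new DataFrame with the constructor, filter the columns by subset and, if necessary, add the original columns back.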
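The steps above could be sketched roughly as follows. The sample df, the channel values and the variable names are hypothetical, and pd.concat is only one assumed way of adding the original columns back. json.loads fits payloads that are valid JSON; ast.literal_eval would be the alternative if the strings were Python dict literals (for example, single-quoted).

```python
import json

import pandas as pd

# Hypothetical sample: a channel column plus JSON strings in payload.
df = pd.DataFrame({
    "channel": ["channel_a", "channel_b"],  # placeholder values
    "payload": [
        '{"destination_ip": "172.31.14.66", "proto": "UDP", '
        '"source_ip": "172.31.0.2", "date": "2014-10-19T01:32:36.669861", '
        '"source_port": "53", '
        '"sensor": "5cda4a12-4730-11e4-9ee4-0a0b6e7c3e9e"}',
        '{"destination_ip": "172.31.13.124", "proto": "ICMP", '
        '"source_ip": "201.158.32.1", "date": "2014-09-28T14:49:43.391463", '
        '"sensor": "139cfdf2-471e-11e4-9ee4-0a0b6e7c3e9e"}',
    ],
})

wanted = ["destination_ip", "proto", "source_ip", "date", "sensor"]

# Step 1: turn each payload string into a dict.
dicts = df["payload"].apply(json.loads)

# Step 2: build a new DataFrame from the dicts, reusing the original index.
parsed = pd.DataFrame(dicts.tolist(), index=df.index)

# Step 3: filter the columns by subset.
extracted = parsed[wanted]

# Step 4 (optional): put the remaining original columns next to the result.
result = pd.concat([df.drop(columns="payload"), extracted], axis=1)
print(result)
```

Keys missing from a payload (such as source_port in the second row) simply become NaN in parsed, so rows with different key sets do not break the constructor.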
Accessing columns in pandas is fairly straightforward. Just pass a list of the desired columns.
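As a small illustration of passing a list of columns, assuming the payload fields are already separate columns (the frame below is a hypothetical sample built from values in the question):

```python
import pandas as pd

# Hypothetical frame whose columns are already the parsed payload fields.
df = pd.DataFrame({
    "destination_ip": ["172.31.14.66", "172.31.13.124"],
    "proto": ["UDP", "ICMP"],
    "source_ip": ["172.31.0.2", "201.158.32.1"],
    "source_port": ["53", None],
})

# Indexing with a list of names returns a DataFrame with just those columns,
# in the order given.
subset = df[["source_ip", "proto"]]
print(subset)
```

Indexing with a single name (df["proto"]) returns a Series instead; the double brackets are what keep the result a DataFrame.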