Creating a dictionary-like Python object from protocol buffers for use in pandas

8 votes

1 answer

9157 views

Asked 2025-04-18 12:50

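I am currently interacting with a server that serves protocol buffers, and I may receive a very large number of messages. My current workflow is to read the protocol buffers and convert them to a pandas DataFrame (this is not strictly required, but pandas offers good tools for analysing datasets). The steps are: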

1. Read the protocol buffer, which yields a Google protobuf object.
2. Convert the protocol buffer to a dictionary with protobuf_to_dict.
3. Use pandas.DataFrame.from_records to get a DataFrame.
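For readers who want something concrete, the three steps above can be sketched roughly as follows. This is only an illustration that runs on the standard library: `fake_messages`, `FIELDS` and `message_to_dict` are hypothetical stand-ins (for the server's messages, their field names, and protobuf_to_dict respectively), not part of any real API.

```python
from types import SimpleNamespace

FIELDS = ("name", "id")  # assumed field names, for illustration only


def fake_messages():
    """Step 1 stand-in: pretend these were decoded from the server."""
    return [
        SimpleNamespace(name="Alice", id=1),
        SimpleNamespace(name="Bob", id=2),
    ]


def message_to_dict(msg, fields=FIELDS):
    """Step 2 stand-in: plays the role of protobuf_to_dict."""
    return {field: getattr(msg, field) for field in fields}


def build_records(messages):
    """Steps 1-2 combined: one intermediate dict is built per message."""
    return [message_to_dict(msg) for msg in messages]


records = build_records(fake_messages())
# Step 3 would then be: pandas.DataFrame.from_records(records)
print(records)
```

The per-message dictionary built in `build_records` is the intermediate step whose cost the question is about.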

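This works well, but given the large number of messages I read from protobuf, converting to a dictionary first and then to a pandas DataFrame is inefficient. My question: is it possible to write a class that makes a Python protobuf object look like a dictionary, so that the second step can be dropped? Any references or pseudocode would be much appreciated.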


1 Answer

Take a look at the Python package ProtoText. It lets you access your protobuf objects directly, as if they were dictionaries.

For example, suppose you have a Python protobuf object person_obj.

import ProtoText
# person_obj is assumed to be an existing protobuf message with a 'name' field
print(person_obj['name'])      # print person_obj.name
person_obj['name'] = 'David'   # set the attribute 'name' to 'David'
# set the attribute 'name' to 'David' again, this time in batch mode
person_obj.update({'name': 'David'})
print('name' in person_obj)    # print whether the 'name' attribute is set in person_obj
# the 'in' operator is more forgiving than Google's HasField method:
# it does not raise an exception even if the field is not defined
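If adding a dependency is not an option, the same dictionary-style access can be approximated by hand. The following is a minimal sketch, not ProtoText's implementation: `MessageView` is a hypothetical name, and a `SimpleNamespace` stands in for a real protobuf message so the example runs on the standard library alone. With a real message, the field names could be taken from `msg.DESCRIPTOR.fields`.

```python
from collections.abc import MutableMapping
from types import SimpleNamespace


class MessageView(MutableMapping):
    """Dictionary-style view over an object's named fields (hypothetical helper)."""

    def __init__(self, msg, field_names):
        # For a real protobuf message one could pass
        # [f.name for f in msg.DESCRIPTOR.fields] as field_names.
        self._msg = msg
        self._fields = tuple(field_names)

    def __getitem__(self, key):
        if key not in self._fields:
            raise KeyError(key)
        return getattr(self._msg, key)

    def __setitem__(self, key, value):
        if key not in self._fields:
            raise KeyError(key)
        setattr(self._msg, key, value)

    def __delitem__(self, key):
        raise TypeError("fields cannot be removed from a message view")

    def __iter__(self):
        return iter(self._fields)

    def __len__(self):
        return len(self._fields)


# Stand-in for a protobuf message, for illustration only.
person = SimpleNamespace(name="Alice", id=1)
view = MessageView(person, ["name", "id"])

view["name"] = "David"   # writes through to person.name
view.update({"id": 7})   # batch update, inherited from MutableMapping
# Here 'in' reports whether the field is *defined*, not whether it is set,
# and it returns False for unknown names instead of raising.
print(view["name"], "email" in view, dict(view))
```

The view reads fields lazily and copies nothing until something iterates over it. Whether pandas consumes such objects without first copying them into plain dictionaries internally is not something this sketch establishes, so the saving should be measured on real data.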

Answered 2025-04-18 by Python大师
