Converting an RPy2 ListVector to a Python dictionary
The natural Python counterpart of a named list in R is a dict, but RPy2 hands you a ListVector object instead.
import rpy2.robjects as robjects
a = robjects.r('list(foo="barbat", fizz=123)')
At this point, a is a ListVector object:
<ListVector - Python:0x108f92a28 / R:0x7febcba86ff0>
[StrVector, FloatVector]
foo: <class 'rpy2.robjects.vectors.StrVector'>
<StrVector - Python:0x108f92638 / R:0x7febce0ae0d8>
[str]
fizz: <class 'rpy2.robjects.vectors.FloatVector'>
<FloatVector - Python:0x10ac38fc8 / R:0x7febce0ae108>
[123.000000]
What I want is something I can use like an ordinary Python dictionary. My current workaround is:
def as_dict(vector):
    """Convert an RPy2 ListVector to a Python dict"""
    result = {}
    for i, name in enumerate(vector.names):
        if isinstance(vector[i], robjects.ListVector):
            result[name] = as_dict(vector[i])
        elif len(vector[i]) == 1:
            result[name] = vector[i][0]
        else:
            result[name] = vector[i]
    return result
as_dict(a)
{'foo': 'barbat', 'fizz': 123.0}
b = robjects.r('list(foo=list(bar=1, bat=c("one","two")), fizz=c(123,345))')
as_dict(b)
{'fizz': <FloatVector - Python:0x108f7e950 / R:0x7febcba86b90>
[123.000000, 345.000000],
'foo': {'bar': 1.0, 'bat': <StrVector - Python:0x108f7edd0 / R:0x7febcba86ea0>
[str, str]}}
So my question is: is there a better way to do this, or something built into RPy2 that I should be using?
7 Answers
3
Here is a function I wrote that converts an rpy2 ListVector into a Python dict and also handles nested lists:
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
def r_list_to_py_dict(r_list):
    """Recursively convert an rpy2 ListVector into a Python dict."""
    converted = {}
    for name in r_list.names:
        val = r_list.rx(name)[0]
        if isinstance(val, ro.vectors.DataFrame):
            # ri2py_dataframe comes from rpy2 2.x; later releases may not provide it.
            converted[name] = pandas2ri.ri2py_dataframe(val)
        elif isinstance(val, ro.vectors.ListVector):
            converted[name] = r_list_to_py_dict(val)
        elif isinstance(val, (ro.vectors.FloatVector, ro.vectors.StrVector)):
            if len(val) == 1:
                converted[name] = val[0]
            else:
                converted[name] = list(val)
        else:  # any other type is kept as-is
            converted[name] = val
    return converted
6
You can also do this:
Input
dict(a.items())
Output
{'foo': R object with classes: ('character',) mapped to:
['barbat'], 'fizz': R object with classes: ('numeric',) mapped to:
[123.000000]}
9
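I ran into a similar problem involving many levels of nesting across different rpy2 vector types. I could not find a direct answer on StackOverflow, so I decided to share my solution.
Building on CT Zhu's answer, I wrote the code below, which recursively converts the whole structure to Python types.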
from collections import OrderedDict
import numpy as np
from rpy2.robjects.vectors import DataFrame, FloatVector, IntVector, StrVector, ListVector, Matrix
def recurse_r_tree(data):
    """
    Step through an R object recursively and convert the types to Python types as appropriate.
    Leaves will be converted to e.g. numpy arrays or lists as appropriate and the whole tree to a dictionary.
    """
    r_dict_types = [DataFrame, ListVector]
    r_array_types = [FloatVector, IntVector, Matrix]
    r_list_types = [StrVector]
    if type(data) in r_dict_types:
        return OrderedDict(zip(data.names, [recurse_r_tree(elt) for elt in data]))
    elif type(data) in r_list_types:
        return [recurse_r_tree(elt) for elt in data]
    elif type(data) in r_array_types:
        return np.array(data)
    else:
        if hasattr(data, "rclass"):  # An unsupported R class
            raise KeyError('Could not proceed, type {} is not defined. '
                           'To add support for this type, just add it to the imports '
                           'and to the appropriate type list above.'.format(type(data)))
        else:
            return data  # We reached the end of recursion
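One design detail worth noting in the function above: `type(data) in r_array_types` is an exact type match, so an object whose class merely inherits from one of the listed types would fall through to the KeyError branch. The sketch below illustrates the difference using plain Python stand-in classes (both class names are hypothetical and are not part of rpy2); whether this matters in practice depends on which classes your rpy2 version returns.

```python
class VectorStandIn(list):
    """Hypothetical stand-in for a listed R vector class."""


class MatrixStandIn(VectorStandIn):
    """Hypothetical subclass of the listed class."""


array_types = (VectorStandIn,)


def exact_match(obj):
    # Mirrors the `type(data) in r_array_types` test: subclasses do not match.
    return type(obj) in array_types


def subclass_match(obj):
    # isinstance also accepts instances of subclasses.
    return isinstance(obj, array_types)


m = MatrixStandIn([1.0, 2.0])
exact_result = exact_match(m)
subclass_result = subclass_match(m)
print(exact_result, subclass_result)  # False True
```

If subclasses should be converted too, swapping the membership tests for `isinstance(data, tuple_of_types)` is one option, at the cost of having to order the branches so that more specific types are checked first.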
18
Converting a simple R list to a Python dict:
>>> import rpy2.robjects as robjects
>>> a = robjects.r('list(foo="barbat", fizz=123)')
>>> d = { key : a.rx2(key)[0] for key in a.names }
>>> d
{'foo': 'barbat', 'fizz': 123.0}
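Note that the `[0]` in the comprehension above keeps only the first element of each entry, so a longer vector such as c(123, 345) would be reduced to its first value. A minimal sketch of one way around that is shown below. The helper names `unwrap` and `to_plain_dict` are made up for illustration, and plain Python lists stand in for R vectors so the example runs without R; it has not been run against real rpy2 objects here.

```python
def unwrap(value):
    """Return a scalar for length-1 sequences, a list for longer ones."""
    if isinstance(value, (str, bytes)):
        return value  # already a plain scalar string
    try:
        items = list(value)
    except TypeError:
        return value  # not iterable, so leave it untouched
    return items[0] if len(items) == 1 else items


def to_plain_dict(pairs):
    """Build a dict from (name, value) pairs, unwrapping each value."""
    return {name: unwrap(value) for name, value in pairs}


# Plain lists stand in for R vectors here (an assumption for the demo).
# With rpy2, the pairs could come from zip(a.names, a) instead.
pairs = [("foo", ["barbat"]), ("fizz", [123.0, 345.0])]
plain = to_plain_dict(pairs)
print(plain)  # {'foo': 'barbat', 'fizz': [123.0, 345.0]}
```

This sketch is not recursive, so a nested list would come back as a list of still-wrapped values.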
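To convert an arbitrary R object into a Python object, serialize it to JSON with R's RJSONIO package and deserialize it in Python.
On the R server, the package can be installed with: install.packages("RJSONIO", dependencies = TRUE)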
>>> import json
>>> import rpy2.robjects as robjects
>>> robjects.r("library(RJSONIO)")
<StrVector - Python:0x300b8c0 / R:0x3fbccb0>
[str, str, str, ..., str, str, str]
>>> rjson = robjects.r(' toJSON( list(foo="barbat", fizz=123, lst=list(33,"bb")) ) ')
>>> pyobj = json.loads( rjson[0] )
>>> pyobj
{u'lst': [33, u'bb'], u'foo': u'barbat', u'fizz': 123}
>>> pyobj['lst']
[33, u'bb']
>>> pyobj['lst'][0]
33
>>> pyobj['lst'][1]
u'bb'
>>> rjson = robjects.r(' toJSON( list(foo="barbat", fizz=123, lst=list( key1=33,key2="bb")) ) ')
>>> pyobj = json.loads( rjson[0] )
>>> pyobj
{u'lst': {u'key2': u'bb', u'key1': 33}, u'foo': u'barbat', u'fizz': 123}
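The transcript above was recorded on Python 2, hence the u'...' prefixes; on Python 3 the same values are ordinary str objects. The Python half of the round trip can be tried without R, as in the sketch below. The JSON text is hand-written to resemble what toJSON might return for such a list (an assumption; the exact formatting RJSONIO emits may differ), and the key "plain" is added purely for illustration.

```python
import json

# Hand-written JSON standing in for rjson[0]; not actual RJSONIO output.
rjson_text = (
    '{"foo": "barbat", "fizz": 123, '
    '"lst": {"key1": 33, "key2": "bb"}, '
    '"plain": [33, "bb"]}'
)

pyobj = json.loads(rjson_text)

# JSON objects become dicts, JSON arrays become lists.
print(type(pyobj["lst"]).__name__)    # dict
print(type(pyobj["plain"]).__name__)  # list
print(pyobj["lst"]["key2"])           # bb
```

The trade-off of this route is that everything is reduced to JSON's own types, so R-specific type information does not survive the trip.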
24
I don't think getting an R vector into a dictionary needs to be that complicated. Try this:
In [290]:
dict(zip(a.names, list(a)))
Out[290]:
{'fizz': <FloatVector - Python:0x08AD50A8 / R:0x10A67DE8>
[123.000000],
'foo': <StrVector - Python:0x08AD5030 / R:0x10B72458>
['barbat']}
In [291]:
dict(zip(a.names, map(list,list(a))))
Out[291]:
{'fizz': [123.0], 'foo': ['barbat']}
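Of course, if you don't mind using pandas, it is even simpler. The result contains numpy.array values rather than lists, but in most cases that is fine. (pandas.rpy only exists in older pandas releases; it was later removed in favor of rpy2's own pandas2ri module.)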
In [294]:
import pandas.rpy.common as com
com.convert_robj(a)
Out[294]:
{'fizz': [123.0], 'foo': array(['barbat'], dtype=object)}