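rapmedusa Python package - PyPI

A Python implementation of MapReduce queries over Redis

Detailed description of the rapmedusa Python project

#Rapmedusa

Rapmedusa is a Python module that enables MapReduce queries over the Redis key-value store. It is intended to provide functionality similar (in some respects) to CouchDB's views feature and MongoDB's mapReduce database command.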

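##Dependencies

Rapmedusa depends on Andy McCurdy's redis-py module, which is available from [https://github.com/andymccurdy/redis-py][1]. You will of course also need a running [Redis][2] instance to connect to. In both cases, any version >= 2.0 should be compatible with Rapmedusa.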

##Installation

$ sudo pip install rapmedusa

or

$ sudo easy_install rapmedusa

or from source:

$ sudo python setup.py install

##Overview

First, import the required modules:

:::python

>>> import redis
>>> from rapmedusa import emit, map_reduce

Next, connect to a running Redis instance in the usual way:

:::python

>>> redis = redis.StrictRedis(host='localhost', port=6379, db=0)

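Finally, you must provide implementations of the map and reduce functions and pass them, together with the active connection to the Redis server, into a call to the map_reduce() function: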

:::python

>>> def myMap(key, val):
        ...
        emit(newKey, newVal)

>>> def myReduce(key, values):
        ...
        return newVal

>>> result = map_reduce(redis, myMap, myReduce)

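This returns a Python dictionary object containing the results of running the MapReduce job. Each key in the dictionary corresponds to a key passed to the reduce function, and holds the value the reduce function computed for that key.

##Details

Now it is time to look more closely at how Rapmedusa performs a MapReduce job. There are essentially 6 steps: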

1. Read the input data set from a specified Redis hash.
2. Pass each key/value pair from the input data set to the registered map function.
3. Organize key/value pairs emitted by the map function into a set of Redis lists, one list per distinct emitted key.
4. Each of these lists is passed to the registered reduce function, along with the corresponding key.
5. The result of each call to reduce is stored in the Redis hash reserved for the job output, under the key used in the reduce call.
6. A Python dictionary representing the contents of the job output hash is returned.
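
The six steps can be imitated without a Redis server. The following is a minimal sketch, not Rapmedusa's actual implementation: plain Python dictionaries stand in for the Redis hashes and lists, `run_map_reduce` is a hypothetical name, and `emit` is passed to the map function explicitly, whereas Rapmedusa exposes it as a module-level function. Values also stay as Python integers here, since nothing is written to or read back from Redis.

```python
from collections import defaultdict


def run_map_reduce(inputs, map_fn, reduce_fn):
    """In-memory imitation of the six steps (dicts stand in for Redis)."""
    # Step 1: 'inputs' plays the role of the input hash.
    # This mapping stands in for the per-key Redis lists of step 3.
    sorted_vals = defaultdict(list)

    def emit(key, value):
        # Step 3: append each emitted value to the list for its key.
        sorted_vals[key].append(value)

    # Step 2: pass every key/value pair to the map function.
    for key, value in inputs.items():
        map_fn(key, value, emit)

    # Steps 4 and 5: reduce each list and store the result under its key.
    outputs = {}
    for key, values in sorted_vals.items():
        outputs[key] = reduce_fn(key, values)

    # Step 6: return the job output as a dictionary.
    return outputs


def age_map(key, record, emit):
    # Emit the person's age as the new key, with a count of 1.
    emit(str(record["age"]), 1)


def count_reduce(key, values):
    # Sum the counts collected for one age.
    return sum(values)


people = {
    1: {"name": "Chad", "age": 43},
    2: {"name": "Ron", "age": 21},
    3: {"name": "George", "age": 54},
    4: {"name": "Alice", "age": 54},
}
age_counts = run_map_reduce(people, age_map, count_reduce)
print(age_counts)
```

Grouping the emitted values by key before reducing is what lets the reduce function see all values for one key in a single call.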

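A natural question at this point is how to specify the input and output hash keys. These (as well as the other keys used in the steps above) can be specified in the call to map_reduce(). Here is the list of additional optional parameters that can be specified in the call: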

* inKey – specifies the key under which the input data set is stored (defaults to 'rapmedusa:inputs')
* outKey – specifies the key under which the job output is stored (defaults to 'rapmedusa:outputs')
* sortKey – specifies the key prefix under which the output of the map function (step 3 above) is stored (defaults to 'rapmedusa:sortedVals')
* sortedKeySet – specifies the key under which the set formed from the list keys of step 3 is stored (defaults to 'rapmedusa:sortedKeySet')
* cleanUp – a boolean value indicating whether the temporary keys (sortKey, sortedKeySet) should be deleted from the Redis store upon completion of the MapReduce job (defaults to True)

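You will rarely need to override the defaults for sortKey and sortedKeySet, since a naming conflict is extremely unlikely. However, you may wish to specify custom, easier-to-remember values for inKey and outKey.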
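
If several jobs share one Redis database, it may help to derive all of these keys from a single job name. The helper below is only a sketch: `job_keys` and the `job:<name>:...` naming scheme are hypothetical, and only the parameter names (inKey, outKey, sortKey, sortedKeySet, cleanUp) come from the list above.

```python
def job_keys(job_name, clean_up=True):
    """Build map_reduce() keyword arguments namespaced by a job name.

    The 'job:<name>:...' scheme is an assumption made for illustration.
    """
    prefix = "job:%s" % job_name
    return {
        "inKey": prefix + ":inputs",
        "outKey": prefix + ":outputs",
        "sortKey": prefix + ":sortedVals",
        "sortedKeySet": prefix + ":sortedKeySet",
        "cleanUp": clean_up,
    }


kwargs = job_keys("ages")
print(kwargs["inKey"])
```

With a live connection, the resulting dictionary could then be unpacked into the call, for example `map_reduce(conn, myMap, myReduce, **kwargs)`.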

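##Examples

###Example 1: Counting ages

This example demonstrates a MapReduce job in which the input keys are mapped to personal records, and the map function generates keys based on one of the record entries, age.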

:::python

>>> import redis
>>> from rapmedusa import *

>>> conn = redis.StrictRedis(host='localhost', port=6379, db=0)
>>> # Store each personal record as a string in the 'myInput' hash.
>>> conn.hset('myInput', 1, "{'name': 'Chad', 'age': 43}")
>>> conn.hset('myInput', 2, "{'name': 'Ron', 'age': 21}")
>>> conn.hset('myInput', 3, "{'name': 'George', 'age': 54}")
>>> conn.hset('myInput', 4, "{'name': 'Alice', 'age': 54}")

>>> def myMap(key, value):
        # Parse the stored record and emit its age with a count of '1'.
        obj = eval(value)
        emit(str(obj['age']), '1')

>>> def myReduce(key, vals):
        # Sum the counts emitted for this age.
        total = 0
        for v in vals:
            total += int(v)
        return total

>>> result = map_reduce(conn, myMap, myReduce, inKey='myInput')
>>> print(result)
{'54': '2', '21': '1', '43': '1'}
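
The map function in this example parses each record with eval(), which executes whatever expression is stored in the hash. As a hedged alternative, assuming the records are Python-literal strings like the ones stored here, a parser based on the standard library's ast.literal_eval could be used instead. The name `parse_record` is hypothetical and not part of Rapmedusa.

```python
import ast


def parse_record(value):
    """Parse a record stored as a Python-literal string without eval()."""
    if isinstance(value, bytes):
        # Redis clients may return bytes; decode before parsing.
        value = value.decode("utf-8")
    # literal_eval accepts only literals (dicts, strings, numbers, ...),
    # so arbitrary expressions in the stored value are rejected.
    return ast.literal_eval(value)


record = parse_record("{'name': 'Chad', 'age': 43}")
print(record["age"])
```

Inside a map function, `obj = parse_record(value)` would then replace `obj = eval(value)`.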

##Author

Rapmedusa is developed and maintained by Greg Leighton (grleighton@gmail.com). The latest version is available for download at [https://github.com/gleighto/rapmedusa][3].

[1]: https://github.com/andymccurdy/redis-py
[2]: http://redis.io
[3]: https://github.com/gleighto/rapmedusa


rapmedusa 1.0
