skinner Python package - PyPI

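Detailed description of the skinner Python project

Skinner

Skinner is a new Python framework for reinforcement learning (RL).

It is built for beginners in RL.

It is still under development and its API design is not yet polished, but it runs stably. For grid worlds, it is already mature enough.

Enjoy skinner!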

Requirements

gym
numpy

Download

Download from GitHub, or install from PyPI with the pip command pip install skinner.

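Design

We follow the observer design pattern. The env and the agents in it generally observe each other. An agent observes the env to decide how to act and to receive rewards, while the env observes the agents and other objects in order to render them in the viewer and to record information.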
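
The mutual observation described above can be sketched in a few lines. This is only an illustration of the observer pattern, not skinner's implementation: the names Observable, ToyEnv, ToyAgent, attach, notify and update are all hypothetical, and the reward rule is made up for the example.

```python
class Observable:
    """Minimal subject: keeps a list of observers and notifies them."""

    def __init__(self):
        self._observers = []

    def attach(self, observer):
        self._observers.append(observer)

    def notify(self, event):
        # Every attached observer receives the sender and the event.
        for observer in self._observers:
            observer.update(self, event)


class ToyEnv(Observable):
    """Hypothetical env: records what agents do and answers with a reward."""

    def __init__(self):
        super().__init__()
        self.log = []

    def update(self, agent, action):
        # The env observes the agent: record the action, then send a reward.
        self.log.append((agent.name, action))
        reward = 1.0 if action == "right" else 0.0  # made-up reward rule
        self.notify(reward)


class ToyAgent(Observable):
    """Hypothetical agent: announces actions and accumulates rewards."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.total_reward = 0.0

    def act(self, action):
        self.notify(action)

    def update(self, env, reward):
        # The agent observes the env: collect the reward it sends back.
        self.total_reward += reward


env = ToyEnv()
agent = ToyAgent("robot")
agent.attach(env)   # the env observes the agent
env.attach(agent)   # the agent observes the env
agent.act("right")
agent.act("left")
```

After the two calls to act, the env's log holds both actions and the agent has collected the single reward for moving right.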

Features

It is really simple.

Usage

Quick start

Run demo.py in examples. There are other examples as well: demo1.py and demo2.py.

You can also watch an animation of the demo.

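Examples

The author provides 3 examples. Users are advised to read the code. Define the objects in objects.py, define a new environment in simple_grid.py, and then write the demo program in a script (see demo.py).

Define envs

If you just want to build a simple env, the following is one option, a grid world.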

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-

    """Demo of RL

    An env with some traps and a gold.
    """

    from skinner import *
    from gym.envs.classic_control import rendering

    from objects import *


    class MyGridWorld(GridMaze, SingleAgentEnv):
        """Grid world

        A robot playing the grid world tries to find the gold (yellow circle),
        while it has to avoid the traps (black circles).

        Extends:
            GridMaze: grid world with walls
            SingleAgentEnv: there is only one agent
        """

        # configure the env

        # get the positions of the objects (done automatically)
        CHARGER = ...
        TRAPS = ...
        DEATHTRAPS = ...
        GOLD = ...

        def __init__(self, *args, **kwargs):
            super(MyGridWorld, self).__init__(*args, **kwargs)
            # NOTE: conf, traps, deathtraps, charger and gold are not defined in
            # this excerpt; presumably they come from the configuration (see conf.yaml).
            self.add_walls(conf['walls'])
            self.add_objects((*traps, *deathtraps, charger, gold))

        # Define the condition under which an RL episode stops.
        def is_terminal(self):
            return (self.agent.position in self.DEATHTRAPS
                    or self.agent.position == self.GOLD
                    or self.agent.power <= 0)

        def is_successful(self):
            return self.agent.position == self.GOLD

        # The following methods are optional; they only record the RL process.
        def post_process(self):
            if self.is_successful():
                self.history['n_steps'].append(self.agent.n_steps)
            else:
                self.history['n_steps'].append(self.max_steps)
            self.history['reward'].append(self.agent.total_reward)
            self.agent.post_process()

        def pre_process(self):
            self.history['n_steps'] = []
            self.history['reward'] = []

        def end_process(self):
            import pandas as pd
            data = pd.DataFrame(self.history)
            data.to_csv('history.csv')
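
The end_process method above writes the recorded history to history.csv. The following is a minimal sketch of summarizing such a file with the standard library only. It assumes the file has the columns n_steps and reward (the keys of the history dict) plus the unnamed index column that pandas writes by default. The function name summarize_history and the sample rows are hypothetical.

```python
import csv
import io


def summarize_history(csv_file):
    """Return episode count, mean steps and mean reward from a history CSV.

    csv_file is any text file object, e.g. open('history.csv', newline='').
    """
    rows = list(csv.DictReader(csv_file))
    if not rows:
        return {"episodes": 0, "mean_steps": None, "mean_reward": None}
    steps = [float(row["n_steps"]) for row in rows]
    rewards = [float(row["reward"]) for row in rows]
    return {
        "episodes": len(rows),
        "mean_steps": sum(steps) / len(steps),
        "mean_reward": sum(rewards) / len(rewards),
    }


# Made-up sample in the layout pandas would produce (index column first).
sample = io.StringIO(",n_steps,reward\n0,12,5.0\n1,30,-2.0\n2,8,7.5\n")
summary = summarize_history(sample)
```

Because post_process stores max_steps for unsuccessful episodes, a mean step count close to max_steps suggests that the agent rarely reaches the gold.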

Configure the env and its objects

See conf.yaml for an example. The object classes are defined in objects.py.

Define objects

the shape of the object (a circle by default)
the drawing method (do not override it if the shape is simple)

    # Same imports as in the environment example
    from skinner import *
    from gym.envs.classic_control import rendering


    class _Object(Object):
        props = ('name', 'position', 'color', 'size')
        # Default values reduce the code needed when creating an object.
        default_position = (0, 0)


    class Gold(_Object):
        def draw(self, viewer):
            '''This method is the most direct way to determine how the object is plotted.
            You should define the shape and the coordinates.
            '''
            ...


    class Charger(_Object):
        def create_shape(self):
            '''Redefine the shape; here we define a square with edge length 40.
            The default shape is a circle.
            '''
            a = 20
            self.shape = rendering.make_polygon([(-a, -a), (a, -a), (a, a), (-a, a)])
            self.shape.set_color(*self.color)
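
The props tuple and the default_position attribute above suggest a convention in which class-level defaults fill in properties that are not passed to the constructor. The sketch below shows one plausible way such a convention could work. It is an assumption for illustration, not skinner's actual Object class; SketchObject, SketchGold and default_color are hypothetical names.

```python
class SketchObject:
    """Hypothetical base class; NOT skinner's Object implementation."""

    props = ()

    def __init__(self, **kwargs):
        for prop in self.props:
            # An explicit keyword wins; otherwise fall back to the class
            # attribute default_<prop>, and to None if there is none.
            default = getattr(self, "default_" + prop, None)
            setattr(self, prop, kwargs.get(prop, default))


class SketchGold(SketchObject):
    props = ("name", "position", "color", "size")
    default_position = (0, 0)
    default_color = (1, 1, 0)  # yellow, chosen for the example


gold = SketchGold(name="gold", position=(3, 4))
origin_gold = SketchGold(name="gold at origin")
```

Here gold takes the position passed in and the default color, while origin_gold falls back to default_position.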

Define agents

the transition function $f(s, a)$
the reward function $r(s, a, s')$

    from skinner import *


    class MyRobot(StandardAgent):
        actions = Discrete(4)

        # define the shape
        size = 30
        color = (0.8, 0.6, 0.4)

        def _reset(self):
            # define the initial state
            ...

        def _next_state(self, state, action):
            """transition function: s, a -> s'
            """
            ...

        def _get_reward(self, state0, action, state1):
            """reward function: s, a, s' -> r
            """
            ...


    # define parameters
    agent = MyRobot(alpha=0.3, gamma=0.9)
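
To make the two functions concrete, here is a self-contained sketch for a small grid world that does not use skinner at all. The grid size, the positions, the reward values and the action encoding are made up. The last function assumes that alpha and gamma play their usual roles of learning rate and discount factor in a tabular Q-learning update; the description above does not state which algorithm StandardAgent uses.

```python
from collections import defaultdict

# Hypothetical 4x3 grid; all names and values are illustrative only.
N_COLS, N_ROWS = 4, 3
GOLD = (3, 2)
TRAPS = {(1, 1)}
MOVES = {0: (0, 1), 1: (0, -1), 2: (-1, 0), 3: (1, 0)}  # up, down, left, right


def next_state(state, action):
    """Transition function f(s, a): move one cell, staying put at the border."""
    dx, dy = MOVES[action]
    x, y = state[0] + dx, state[1] + dy
    if 0 <= x < N_COLS and 0 <= y < N_ROWS:
        return (x, y)
    return state


def get_reward(state0, action, state1):
    """Reward function r(s, a, s'), depending here only on the new state."""
    if state1 == GOLD:
        return 10.0
    if state1 in TRAPS:
        return -5.0
    return -1.0  # small cost for every other step


def q_update(q, state0, action, reward, state1, alpha=0.3, gamma=0.9):
    """One tabular Q-learning step on the table q[(state, action)]."""
    best_next = max(q[(state1, a)] for a in MOVES)
    q[(state0, action)] += alpha * (reward + gamma * best_next - q[(state0, action)])


q = defaultdict(float)          # unseen (state, action) pairs start at 0.0
s0 = (2, 2)
s1 = next_state(s0, 3)          # move right, onto the gold
r = get_reward(s0, 3, s1)
q_update(q, s0, 3, r, s1)
```

With an empty table, the single update moves the value of the chosen action from 0 toward the reward by the fraction alpha.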

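Example

Code

See the scripts in examples.

Results

Commemoration

In memory of the great American psychologist B. F. Skinner (1904-1990). RL is mainly inspired by his behaviorism. Of the many contributors in the history of behaviorist psychology, he is probably the best known.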

skinner 0.2.1
