gym-super-mario-bros: Python package on PyPI

Super Mario Bros. for OpenAI Gym

gym-super-mario-bros

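An OpenAI Gym environment for Super Mario Bros. and Super Mario Bros. 2 (Lost Levels) on the Nintendo Entertainment System (NES), using the nes-py emulator.

Installation

The preferred way to install gym-super-mario-bros is from pip: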

pip install gym-super-mario-bros

Usage

Python

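You must import gym_super_mario_bros before trying to make an environment, because Gym environments are registered at runtime. By default, gym_super_mario_bros environments use the full NES action space of 256 discrete actions. To constrain this, gym_super_mario_bros.actions provides three action lists (RIGHT_ONLY, SIMPLE_MOVEMENT, and COMPLEX_MOVEMENT) for the nes_py.wrappers.JoypadSpace wrapper. See gym_super_mario_bros/actions.py for a breakdown of the legal actions in each of these three lists.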

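from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

# Importing gym_super_mario_bros registers the environments with Gym.
env = gym_super_mario_bros.make('SuperMarioBros-v0')
# Restrict the 256-action NES space to the SIMPLE_MOVEMENT action list.
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for step in range(5000):
    # Start a new episode whenever the previous one has ended.
    if done:
        state = env.reset()
    # Take a uniformly random action from the restricted action space.
    state, reward, done, info = env.step(env.action_space.sample())
    env.render()

env.close()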

NOTE: gym_super_mario_bros.make is just an alias to gym.make for convenience.

NOTE: remove calls to render in training code for a speedup.
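As a minimal sketch of the note about render, the loop above can be wrapped in a helper that never renders and stops at the end of an episode. The helper name run_random_episode is hypothetical and not part of the package; it assumes the classic Gym API shown above, where reset() returns the state and step() returns (state, reward, done, info).

```python
def run_random_episode(env, max_steps=5000):
    """Play one episode with uniformly random actions, without rendering.

    Hypothetical helper (not part of gym-super-mario-bros). Assumes the
    classic Gym API: reset() -> state, step() -> (state, reward, done, info).
    """
    env.reset()
    total_reward = 0.0
    steps = 0
    info = {}
    for _ in range(max_steps):
        # Sample a random action from whatever action space the env exposes.
        action = env.action_space.sample()
        _, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
        if done:
            break
    # Return the accumulated reward, the step count, and the last info dict.
    return total_reward, steps, info
```

With the env built in the example above, this would be called as run_random_episode(env), followed by env.close() once training or evaluation is finished.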

Command Line

gym_super_mario_bros features a command line interface for playing environments using either the keyboard or uniform random movement.

gym_super_mario_bros -e <the environment ID to play> -m <`human` or `random`>

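NOTE: by default, -e is set to SuperMarioBros-v0 and -m is set to human.

Environments

These environments allow 3 attempts (lives) to make it through the 32 stages in the game. The environments only send reward-able gameplay frames to agents; no cut-scenes, loading screens, etc. are sent from the NES emulator to an agent, nor can an agent perform actions during these instances. If a cut-scene cannot be skipped by hacking the NES's RAM, the environment will lock the Python process until the emulator is ready for the next action.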

Environment	Game	ROM
SuperMarioBros-v0	SMB	standard
SuperMarioBros-v1	SMB	downsample
SuperMarioBros-v2	SMB	pixel
SuperMarioBros-v3	SMB	rectangle
SuperMarioBros2-v0	SMB2	standard
SuperMarioBros2-v1	SMB2	downsample

Individual Stages

These environments allow a single attempt (life) to make it through a single stage of the game.

Use the template

SuperMarioBros-<world>-<stage>-v<version>

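where:

<world> is a number in {1, 2, 3, 4, 5, 6, 7, 8} indicating the world
<stage> is a number in {1, 2, 3, 4} indicating the stage within a world
<version> is a number in {0, 1, 2, 3} specifying the ROM mode to use
- 0: standard ROM
- 1: downsampled ROM
- 2: pixel ROM
- 3: rectangle ROM

For example, to play 4-2 on the downsampled ROM, you would use the environment id SuperMarioBros-4-2-v1.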
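The template can be filled in programmatically. The sketch below is only an illustration: the function stage_env_id and the ROM_MODES mapping are hypothetical names, not part of the package, and the range checks simply mirror the sets listed above.

```python
# ROM modes as listed above, keyed by version number.
ROM_MODES = {0: 'standard', 1: 'downsample', 2: 'pixel', 3: 'rectangle'}


def stage_env_id(world, stage, version=0):
    """Build an id of the form SuperMarioBros-<world>-<stage>-v<version>."""
    if world not in range(1, 9):
        raise ValueError('world must be in 1..8, got {!r}'.format(world))
    if stage not in range(1, 5):
        raise ValueError('stage must be in 1..4, got {!r}'.format(stage))
    if version not in ROM_MODES:
        raise ValueError('version must be in 0..3, got {!r}'.format(version))
    return 'SuperMarioBros-{}-{}-v{}'.format(world, stage, version)


# Stage 4-2 on the downsampled ROM, matching the example in the text.
print(stage_env_id(4, 2, 1))
```

The resulting string can then be passed to gym_super_mario_bros.make in place of 'SuperMarioBros-v0'.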

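Random Stage Selection

The random stage selection environment randomly selects a stage and allows a single attempt to clear it. Upon a death and subsequent call to reset, the environment randomly selects a new stage. This is only available for the standard Super Mario Bros. game, not Lost Levels (at the moment). To use these environments, append RandomStages to the SuperMarioBros id. For example, to use the standard ROM with random stage selection, use SuperMarioBrosRandomStages-v0. To seed the random stage selection, use the seed method of the env, i.e., env.seed(1), before any calls to reset.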
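A minimal sketch of seeding before the first reset follows. Both function names are hypothetical. It assumes that the version suffix selects the ROM mode for random-stage ids in the same way as for the other environments, and that the installed Gym version still exposes env.seed as used in the text.

```python
def random_stages_env_id(version=0):
    """Append RandomStages to the SuperMarioBros id (version is assumed
    to pick the ROM mode as it does for the other environments)."""
    return 'SuperMarioBrosRandomStages-v{}'.format(version)


def make_seeded_random_stages_env(seed, version=0):
    """Create a random-stage environment and seed it before any reset."""
    # Imported lazily; importing registers the environments with Gym.
    import gym_super_mario_bros

    env = gym_super_mario_bros.make(random_stages_env_id(version))
    env.seed(seed)  # must happen before the first call to reset()
    return env
```

Calling make_seeded_random_stages_env(1) and then env.reset() should give a reproducible sequence of stages under these assumptions.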

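Step

Information about the rewards and info returned by the step method.

Reward Function

The reward function assumes the objective of the game is to move as far right as possible (increase the agent's x value), as fast as possible, without dying. To model this game, three separate variables compose the reward: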

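v: the difference in agent x values between states
- in this case this is the instantaneous velocity for the given step
- v = x1 - x0
  - x0 is the x position before the step
  - x1 is the x position after the step
- moving right ⇔ v > 0
- moving left ⇔ v < 0
- not moving ⇔ v = 0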
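c: the difference in the game clock between frames
- the penalty prevents the agent from standing still
- c = c1 - c0
  - c0 is the clock reading before the step
  - c1 is the clock reading after the step
- no clock tick ⇔ c = 0
- clock tick ⇔ c < 0 (the clock counts down, so c1 < c0)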
d: a death penalty that penalizes the agent for dying in a state
- this penalty encourages the agent to avoid death
- alive ⇔ d = 0
- dead ⇔ d = -15

r = v + c + d

The reward is clipped into the range (-15, 15).
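The arithmetic above can be sketched in a few lines. This is an illustration of the formula as described, not the package's actual implementation; the function name step_reward and its parameters are hypothetical, and the clock term c is passed in directly as a value that is 0 or negative.

```python
DEATH_PENALTY = -15   # value of d when the agent dies
REWARD_RANGE = (-15, 15)


def step_reward(x_before, x_after, clock_term, died):
    """Combine the three terms r = v + c + d and clip the result.

    clock_term is the c term from the text: 0 when the clock did not tick,
    negative when it did.
    """
    v = x_after - x_before              # instantaneous velocity
    c = clock_term                      # clock penalty (<= 0)
    d = DEATH_PENALTY if died else 0    # death penalty
    low, high = REWARD_RANGE
    return max(low, min(high, v + c + d))


# Moving 3 pixels right during a clock tick of -1 gives 3 - 1 + 0.
print(step_reward(10, 13, -1, False))
```

Large jumps in x are clipped at 15, so under this sketch a single step can never outweigh the death penalty.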

`info` dictionary

The info dictionary returned by the step method contains the following keys:

Key	Type	Description
coins	int	The number of collected coins
flag_get	bool	True if Mario reached a flag or ax
life	int	The number of lives left, i.e., {3, 2, 1}
score	int	The cumulative in-game score
stage	int	The current stage, i.e., {1, ..., 4}
status	str	Mario's status, i.e., {'small', 'tall', 'fireball'}
time	int	The time left on the clock
world	int	The current world, i.e., {1, ..., 8}
x_pos	int	Mario's x position in the stage (from the left)
y_pos	int	Mario's y position in the stage (from the bottom)
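As a small illustration of reading these values, the sketch below formats an info dictionary into a one-line progress summary. The helper summarize_info is hypothetical, and the key names it reads (coins, flag_get, life, score, stage, time, world, x_pos) are assumed to match those of the upstream package; the sample dictionary is made up for the example.

```python
def summarize_info(info):
    """Format the info dict from step() as a one-line progress summary."""
    outcome = 'cleared' if info['flag_get'] else 'in progress'
    template = ('World {world}-{stage} ({outcome}): x={x_pos}, '
                'score={score}, coins={coins}, lives={life}, time={time}')
    # Keys that the template does not mention are simply ignored.
    return template.format(outcome=outcome, **info)


# Made-up sample values, shaped like the table above.
sample = {
    'coins': 3, 'flag_get': False, 'life': 2, 'score': 1200, 'stage': 1,
    'status': 'small', 'time': 350, 'world': 1, 'x_pos': 412, 'y_pos': 79,
}
print(summarize_info(sample))
```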

Citation

Please cite gym-super-mario-bros if you use it in your research.

@misc{gym-super-mario-bros,
  author = {Christian Kauten},
  title = {{S}uper {M}ario {B}ros for {O}pen{AI} {G}ym},
  year = {2018},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/Kautenja/gym-super-mario-bros}},
}

gym-super-mario-bros 7.2.3
