A probabilistic logic programming framework to represent and solve MDPs.
MDP-ProbLog is a Python3 framework to represent and solve (infinite-horizon) MDPs using Probabilistic Logic Programming.
Installation

Python3 must be installed.
$ pip3 install mdpproblog
Usage
$ mdp-problog --help
usage: mdp-problog {list,show,simulate,solve} [-m DOMAIN INSTANCE] [OPTIONS]

MDP-ProbLog is a Python3 framework to represent and solve Markovian Decision
Processes by Probabilistic Logic Programming.

This project is free software. Please check the documentation at
http://pythonhosted.org/mdpproblog/.

positional arguments:
  {list,show,solve,simulate}
                        available commands: list examples, show and solve
                        models or simulate optimal policy

optional arguments:
  -h, --help            show this help message and exit
  -m MODEL MODEL, --model MODEL MODEL
                        list of domain and instance files
  -x EXAMPLE, --example EXAMPLE
                        select model from examples
  -g GAMMA, --gamma GAMMA
                        discount factor (default=0.9)
  -e EPSILON, --epsilon EPSILON
                        maximum error (default=0.1)
  -t TRIALS, --trials TRIALS
                        number of trials (default=100)
  -z HORIZON, --horizon HORIZON
                        simulation horizon (default=30)
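The optional flags compose freely with the subcommands. For instance, the lines below sketch how one might solve a bundled example with a custom discount factor and error tolerance, and solve a user-supplied model from a domain/instance pair (the file paths are illustrative, not part of the package):

```
$ mdp-problog solve -x sysadmin1 -g 0.95 -e 0.01
$ mdp-problog solve -m models/sysadmin/domain.pl models/sysadmin/instance.pl
```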
Input

Domain specification for the SysAdmin planning problem (models/sysadmin/domain.pl).
% Network topology properties
accTotal([], A, A).
accTotal([_|T], A, X) :- B is A+1, accTotal(T, B, X).
total(L, T) :- accTotal(L, 0, T).
total_connected(C, T) :- connected(C, L), total(L, T).

accAlive([], A, A).
accAlive([H|T], A, X) :- running(H, 0), B is A+1, accAlive(T, B, X).
accAlive([H|T], A, X) :- not(running(H, 0)), B is A, accAlive(T, B, X).
alive(L, A) :- accAlive(L, 0, A).
total_running(C, R) :- connected(C, L), alive(L, R).

% State fluents
state_fluent(running(C)) :- computer(C).

% Actions
action(reboot(C)) :- computer(C).
action(reboot(none)).

% Transition model
1.00::running(C, 1) :- reboot(C).
0.05::running(C, 1) :- not(reboot(C)), not(running(C, 0)).
P::running(C, 1) :- not(reboot(C)), running(C, 0),
                    total_connected(C, T), total_running(C, R),
                    P is 0.45 + 0.50*R/T.

% Utility attributes
% costs
utility(reboot(C), -0.75) :- computer(C).
utility(reboot(none), 0.00).
% rewards
utility(running(C, 0), 1.00) :- computer(C).
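The domain above leaves `computer/1` and `connected/2` to be supplied by an instance file. The actual contents of the bundled sysadmin1 instance are not shown here; the following is a hypothetical instance sketch, consistent with the three computers c1–c3 that appear in the example output, assuming a simple ring topology:

```prolog
% Hypothetical instance file (illustrative only; not the bundled sysadmin1).
% Declares the computers in the network.
computer(c1).
computer(c2).
computer(c3).

% connected(C, L): computer C depends on the list of neighbors L,
% as consumed by total_connected/2 and total_running/2 in the domain.
connected(c1, [c2, c3]).
connected(c2, [c1]).
connected(c3, [c1]).
```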
Example
$ mdp-problog simulate -x sysadmin1
Value(running(c1,0)=0, running(c2,0)=0, running(c3,0)=0) = 16.829
Value(running(c1,0)=1, running(c2,0)=0, running(c3,0)=0) = 19.171
Value(running(c1,0)=0, running(c2,0)=1, running(c3,0)=0) = 19.205
Value(running(c1,0)=1, running(c2,0)=1, running(c3,0)=0) = 23.028
Value(running(c1,0)=0, running(c2,0)=0, running(c3,0)=1) = 19.206
Value(running(c1,0)=1, running(c2,0)=0, running(c3,0)=1) = 23.029
Value(running(c1,0)=0, running(c2,0)=1, running(c3,0)=1) = 21.392
Value(running(c1,0)=1, running(c2,0)=1, running(c3,0)=1) = 25.607
Policy(running(c1,0)=0, running(c2,0)=0, running(c3,0)=0) = reboot(c1)
Policy(running(c1,0)=1, running(c2,0)=0, running(c3,0)=0) = reboot(c3)
Policy(running(c1,0)=0, running(c2,0)=1, running(c3,0)=0) = reboot(c1)
Policy(running(c1,0)=1, running(c2,0)=1, running(c3,0)=0) = reboot(c3)
Policy(running(c1,0)=0, running(c2,0)=0, running(c3,0)=1) = reboot(c1)
Policy(running(c1,0)=1, running(c2,0)=0, running(c3,0)=1) = reboot(c2)
Policy(running(c1,0)=0, running(c2,0)=1, running(c3,0)=1) = reboot(c1)
Policy(running(c1,0)=1, running(c2,0)=1, running(c3,0)=1) = reboot(none)
>> Value iteration converged in 0.196sec after 40 iterations.
>> Average time per iteration = 0.005sec.
Expectation(running(c1,0)=0, running(c2,0)=0, running(c3,0)=0) = 16.733
Expectation(running(c1,0)=1, running(c2,0)=0, running(c3,0)=0) = 19.433
Expectation(running(c1,0)=0, running(c2,0)=1, running(c3,0)=0) = 19.108
Expectation(running(c1,0)=1, running(c2,0)=1, running(c3,0)=0) = 23.377
Expectation(running(c1,0)=0, running(c2,0)=0, running(c3,0)=1) = 19.546
Expectation(running(c1,0)=1, running(c2,0)=0, running(c3,0)=1) = 23.287
Expectation(running(c1,0)=0, running(c2,0)=1, running(c3,0)=1) = 21.785
Expectation(running(c1,0)=1, running(c2,0)=1, running(c3,0)=1) = 25.849
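The "Value iteration converged" line refers to the standard value iteration scheme for infinite-horizon discounted MDPs; the sketch below is the textbook formulation, with the discount factor and error tolerance matching the `--gamma` and `--epsilon` defaults above (mdpproblog's internal implementation details are not shown in this output and are assumed here):

```latex
% Bellman backup applied to every state s at iteration k:
V_{k+1}(s) \;=\; \max_{a}\Big[\, R(s,a) \;+\; \gamma \sum_{s'} P(s' \mid s, a)\, V_k(s') \,\Big]
% Iteration stops once the largest change falls below the error bound:
\max_{s}\, \big| V_{k+1}(s) - V_k(s) \big| \;<\; \epsilon
```

In the listing above, the Value lines report the converged value function for each of the 2^3 joint states of the fluents running(c1,0), running(c2,0), running(c3,0), while the Expectation lines report the discounted return averaged over the simulated trials under the extracted policy.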
License

Copyright (c) 2016-2017 Thiago Pereira Bueno. All rights reserved.
MDP-ProbLog is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

MDP-ProbLog is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with MDP-ProbLog. If not, see http://www.gnu.org/licenses/.