Brian Tanner :: brian@tannerpages.com
This document describes how to use the Matlab RL-Glue Codec, a software library that provides socket compatibility with the RL-Glue Reinforcement Learning software library. The Matlab codec is brand new, created specifically for the RL-Glue 3.0 release. Special thanks (or pokes) should go to Dale, Doina, and Yaki, who gave voices to the countless others who have probably scorned us for not supporting Matlab earlier. It turned out not to be that difficult :)
For general information and motivation about the RL-Glue project, please refer to the documentation provided with that project.
This codec will allow you to create agents, environments, and experiment programs in Matlab.
This software project is licensed under the Apache 2.0 license. We're not lawyers, but our intention is that this code should be used however it is useful. We'd appreciate hearing what you're using it for, and getting credit if appropriate.
This project has a home here:
http://glue.rl-community.org/Home/Extensions/matlab-codec
Compiling and running components with this codec requires Matlab. The codec was developed on Matlab 7.6.0.x and has not been tested extensively on other versions. Reports from the community suggest that Matlab 7.5 or higher is required to run this codec.
This Matlab codec uses the RL-Glue Java Extension, which means that Matlab needs to be running with the Java Virtual Machine enabled (it is by default). The Java extension does not need to be installed independently.
Possible Contribution: Someone with Matlab experience could help us find out what exact version of Matlab is required to use this codec, and could help us update the codec to be as robust as possible to older versions.
The .tar.gz distribution can be found here:
http://code.google.com/p/rl-glue-ext/wiki/Matlab
To check the code out of subversion:
svn checkout http://rl-glue-ext.googlecode.com/svn/trunk/projects/codecs/Matlab Matlab-Codec
To run the installer, call the installRLGlue() function from within Matlab, in the main directory of the package you downloaded. This will suggest installing the codec to your home directory at ~/rl-glue/codecs/matlab. If you would prefer an alternate location, you can call the function with a path, like:
>>installRLGlue('~/desired/path/to/codec')
The skeleton contains all of the bare-bones plumbing that is required to create an agent / environment / experiment with this codec and might be a good starting point for creating your own components.
The mines-sarsa-sample contains a fully functional tabular Sarsa learning algorithm, a discrete-observation grid world problem, and an experiment program that can run these together and gather results. More details are given below in Section 2.6.
In the following sections, we will describe the skeleton project. Running and using the mines-sarsa-sample is analogous.
There are several functions that need to be written. They are all contained in a single Matlab source file:
examples/skeleton/skeleton_agent.m
The skeleton_agent.m file has a public function that returns a structure with function pointers to all of the required RL-Glue functions. The structure looks like this:
>> theAgent=skeleton_agent()
theAgent =
    agent_init: @skeleton_agent_init
    agent_start: @skeleton_agent_start
    agent_step: @skeleton_agent_step
    agent_end: @skeleton_agent_end
    agent_cleanup: @skeleton_agent_cleanup
    agent_message: @skeleton_agent_message
Alternatively, these different functions could each be in their own skeleton_agent_{init, start, step, end, cleanup, message}.m files. This is a personal choice.
This agent does not learn anything and randomly chooses integer action 0 or 1.
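The single-file layout described above can be sketched as follows. This is a minimal illustration, not the contents of skeleton_agent.m: the file name my_random_agent.m and all my_agent_* names are hypothetical, and every handler signature except agent_step (which appears later in this document) is an assumption. It also assumes the codec is installed, so that the Java class org.rlcommunity.rlglue.codec.types.Action is available, and that its intArray field can be assigned like the Observation example in this document.

```matlab
% File: my_random_agent.m (hypothetical name, modeled on skeleton_agent.m)
function theAgent = my_random_agent()
    % Public function: bundle the handlers into one struct of function handles.
    theAgent.agent_init    = @my_agent_init;
    theAgent.agent_start   = @my_agent_start;
    theAgent.agent_step    = @my_agent_step;
    theAgent.agent_end     = @my_agent_end;
    theAgent.agent_cleanup = @my_agent_cleanup;
    theAgent.agent_message = @my_agent_message;
end

function my_agent_init(taskSpec)
    % Nothing to set up: this agent ignores the task spec (assumed signature).
end

function theAction = my_agent_start(theObservation)
    % First action of an episode (assumed signature).
    theAction = random_action();
end

function theAction = my_agent_step(theReward, theObservation)
    % Every later action; the reward and observation are ignored.
    theAction = random_action();
end

function my_agent_end(theReward)
    % Called when the episode ends; nothing to learn (assumed signature).
end

function my_agent_cleanup()
    % Nothing to release (assumed signature).
end

function theResponse = my_agent_message(theMessage)
    % Reply to any message with a fixed string (assumed signature).
    theResponse = 'I don''t know how to respond to your message';
end

function theAction = random_action()
    % Build a codec Action holding one integer, chosen uniformly from {0, 1}.
    % floor(rand()*2) is used instead of randi for older Matlab versions.
    theAction = org.rlcommunity.rlglue.codec.types.Action();
    theAction.intArray = [floor(rand() * 2)];
end
```

Keeping the handlers as local functions in one file means only my_random_agent() is visible from outside; the struct is the only way to reach them.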
You can run the agent from the Matlab prompt like this:
>> cd examples/skeleton
>> theAgent=skeleton_agent();
>> runAgent(theAgent);
skeleton_agent() creates a struct with function pointers to the other skeleton_agent methods. runAgent(theAgent) then connects to RL-Glue and runs one step at a time until RL-Glue disconnects.
Alternatively, for a more interactive experience, you can run the agent manually one step at a time:
>> cd examples/skeleton/
>> theAgent=skeleton_agent()
>> connectAgent(theAgent);
>> runAgentLoop(theAgent); %run one step
>> runAgentLoop(theAgent); %run one step
>> runAgentLoop(theAgent); %run one step
...
Using this method, you can stop and examine what your agent is learning, and potentially modify, visualize, or analyze it however you like.
You will see something like:
>> runAgent(theAgent)
RL-Glue Matlab Agent Codec Version: 1.0 ($Revision: 688 $)
Connecting to rl_glue at host: localhost on port 4096
This means that the skeleton_agent is running, and trying to connect to the rl_glue executable server on the local machine through port 4096!
You can kill the process by pressing CTRL-c on your keyboard.
The Skeleton agent is very simple and well documented, so we won't spend any more time talking about it in these instructions. Please open it up and take a look.
We have provided a skeleton environment with the codec that is a good starting point for environments that you may write in the future. It implements all the required functions and provides a good example of how to compile a simple environment. This section will follow the same pattern as the agent version (Section 2.1). This section will be less detailed because many ideas are similar or identical.
The pertinent file is:
examples/skeleton/skeleton_environment.m
This environment is episodic, with 21 states, labeled {0, 1, ..., 20}. States {0, 20} are terminal and return rewards of {-1, +1} respectively. The other states return a reward of 0.
There are two actions, {0, 1}. Action 0 decrements the state number, and action 1 increments it. The environment starts in state 10.
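The dynamics just described can be sketched in plain Matlab, without any codec types. This is only an illustration of the state, action, and reward rules under a random policy; the variable names are hypothetical and the code is not taken from skeleton_environment.m.

```matlab
% Simulate one episode of the 21-state chain with a random policy.
state = 10;            % the environment starts in state 10
totalReward = 0;
numSteps = 0;
terminal = false;

while ~terminal
    action = floor(rand() * 2);      % random action, 0 or 1

    % Action 0 decrements the state number, action 1 increments it.
    if action == 0
        state = state - 1;
    else
        state = state + 1;
    end

    % States 0 and 20 are terminal, with rewards -1 and +1.
    if state == 0
        reward = -1;
        terminal = true;
    elseif state == 20
        reward = 1;
        terminal = true;
    else
        reward = 0;
    end

    totalReward = totalReward + reward;
    numSteps = numSteps + 1;
end

fprintf('Ended in state %d after %d steps, total reward %d\n', ...
        state, numSteps, totalReward);
```

Because every non-terminal reward is 0, the total reward of a finished episode is always -1 or +1, and no episode can finish in fewer than 10 steps.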
You can run the environment from the Matlab prompt like this:
>> cd examples/skeleton
>> theEnv=skeleton_environment();
>> runEnvironment(theEnv);
You will see something like:
RL-Glue Matlab Environment Codec Version: 1.0 ($Revision: 688 $)
Connecting to rl_glue at host: localhost on port 4096
This means that the skeleton_environment is running, and trying to connect to the rl_glue executable server on the local machine through port 4096!
You can kill the process by pressing CTRL-c on your keyboard.
The Skeleton environment is very simple and well documented, so we won't spend any more time talking about it in these instructions. Please open it up and take a look.
The pertinent file is:
examples/skeleton/skeleton_experiment.m
This experiment runs RL_episode a few times, sends some messages to the agent and environment, and then steps through one episode using RL_step.
>> cd examples/skeleton
>> skeleton_experiment();
You will see something like:
Experiment starting up!
RL-Glue Matlab Experiment Codec Version: 1.0 ($Revision: 688 $)
Connecting to rl_glue at host: 127.0.0.1 on port 4096
This means that the skeleton_experiment is running, and trying to connect to the rl_glue executable server on the local machine through port 4096!
You can kill the process by pressing CTRL-c on your keyboard.
The Skeleton experiment is very simple and well documented, so we won't spend any more time talking about it in these instructions. Please open it up and take a look.
The following will work if the rl_glue socket server is installed at /usr/local/bin/rl_glue.
We will run all of the components in the Matlab interpreter. Alternatively, it may be convenient to run rl_glue in a terminal window of its own. In the Matlab interpreter:
>> cd examples/skeleton
%Start the rl_glue socket server as a background process
>> !/usr/local/bin/rl_glue &
>> runAllTogether();
If RL-Glue is not installed in the default location, you'll have to start the rl_glue executable server using its full path (unless it's in your PATH environment variable):
>> !/path/to/rl-glue/bin/rl_glue &
In the Matlab window, you should see the following if it worked:
>> runAllTogether
RL-Glue Matlab Agent Codec Version: 1.0 ($Revision: 688 $)
Connecting to rl_glue at host: 127.0.0.1 on port 4096
Agent Codec Connected
RL-Glue Matlab Environment Codec Version: 1.0 ($Revision: 688 $)
Connecting to rl_glue at host: 127.0.0.1 on port 4096
Environment Codec Connected
RL-Glue Matlab Experiment Codec Version: 1.0 ($Revision: 688 $)
Connecting to rl_glue at host: 127.0.0.1 on port 4096
Experiment Codec Connected
Experiment starting up!
RL_init called, the environment sent task spec: VERSION RL-Glue-3.0 PROBLEMTYPE episodic DISCOUNTFACTOR 1.0 OBSERVATIONS INTS (0 20) ACTIONS INTS (0 1) REWARDS (-1.0 1.0) EXTRA skeleton_environment(Matlab) by Brian Tanner.
----------Sending some sample messages----------
Agent responded to 'what is your name?' with: my name is skeleton_agent, Matlab edition!
Agent responded to 'If at first you don't succeed; call it version 1.0 ' with: I don\'t know how to respond to your message
Environment responded to 'what is your name?' with: my name is skeleton_environment, Matlab edition!
Environment responded to 'If at first you don't succeed; call it version 1.0 ' with: I don\'t know how to respond to your message
----------Running a few episodes----------
Episode 0 48 steps -1.000000 total reward natural end 1
Episode 1 100 steps 0.000000 total reward natural end 0
Episode 2 74 steps -1.000000 total reward natural end 1
Episode 3 34 steps -1.000000 total reward natural end 1
Episode 4 100 steps 0.000000 total reward natural end 0
Episode 5 1 steps 0.000000 total reward natural end 0
Episode 6 70 steps 1.000000 total reward natural end 1
----------Stepping through an episode----------
First observation and action were: 10 and: 1
----------Summary----------
It ran for 86, total reward was: -1.000000
Congratulations, you have run an RL-Glue agent / environment / experiment together, all using the Matlab codec!
The runRLGlueMultiExperiment function takes a struct as a parameter, and the struct can have any combination of three fields set (at least one should be set).
As long as you set one or more of these fields, you can call runRLGlueMultiExperiment and the Matlab codec will make sure that everything gets called when it needs to. Don't forget to run any components that you don't specify with another codec or another Matlab instance!
More details about the mines-sarsa sample project can be found at their RL-Library home:
http://library.rl-community.org/packages/mines-sarsa-sample
The task specification string is created in a semi-automated way using the Java RL-Glue Extension task spec parser/builder.
The SARSA agent parses the task specification string using the Java RL-Glue Extension task spec parser. This agent can receive special messages from the experiment program to pause/unpause learning, pause/unpause exploring, save the current value function to a file, and load the value function from a file.
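For readers unfamiliar with tabular Sarsa, a single update can be sketched in plain Matlab as below. This is the generic textbook rule, not code from the mines-sarsa sample: all variable names, the table size, the preset values, and the parameters alpha, gamma, and epsilon are illustrative assumptions.

```matlab
% One tabular Sarsa update with epsilon-greedy action selection.
numStates = 21;
numActions = 2;
alpha = 0.1;      % step size (assumed value)
gamma = 1.0;      % discount factor (assumed value)
epsilon = 0.1;    % exploration rate (assumed value)

% Matlab indices are 1-based, so 0-based state s and action a
% are stored at Q(s + 1, a + 1).
Q = zeros(numStates, numActions);
Q(12, :) = [0.5 0.2];            % pretend state 11 already has estimates

% One observed transition: (s, a, r, s').
lastState = 10;
lastAction = 1;
reward = 0;
newState = 11;

% Epsilon-greedy choice of the next action a'.
if rand() < epsilon
    newAction = floor(rand() * numActions);
else
    [bestValue, bestIndex] = max(Q(newState + 1, :));
    newAction = bestIndex - 1;
end

% Sarsa: move Q(s, a) toward r + gamma * Q(s', a').
target = reward + gamma * Q(newState + 1, newAction + 1);
oldValue = Q(lastState + 1, lastAction + 1);
Q(lastState + 1, lastAction + 1) = oldValue + alpha * (target - oldValue);
```

Pausing learning, as the messages above allow, would correspond to skipping the last three lines; pausing exploration would correspond to always taking the greedy branch.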
The sample experiment then tells the agent to save the value function to a file, and then resets the experiment (and agent) to initial conditions. After verifying that the agent's initial policy is bad, the experiment tells the agent to load the value function from the file. The agent is evaluated again using this previously-learned value function, and performance is dramatically better.
Finally, the experiment sends a message to specify that the environment should use a fixed (instead of random) starting state, and runs the agent from that fixed start state for a while.
The Matlab codec uses the task spec parser implementation from the RL-Glue Java Extension. This task spec parser/builder can be used in environments to create task specification strings for env_init. The sample mines environment in Section 2.6.1 provides an example of creating a task spec in this way. There are also several advanced examples of this in the RL-Library. The task spec parser/builder can also be used by agents to decode the task spec string for agent_init. The sample sarsa agent in Section 2.6.2 demonstrates how to do this.
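As a rough sketch of the decoding side, an agent could parse the skeleton environment's task spec like this. It assumes the codec is installed, so that the Java class org.rlcommunity.rlglue.codec.taskspec.TaskSpec is on Matlab's Java class path, and that the accessor names below match the version of the Java Extension you have; check them against the Java codec documentation before relying on them.

```matlab
% Task spec string as printed by the skeleton environment.
taskSpecString = ['VERSION RL-Glue-3.0 PROBLEMTYPE episodic ' ...
                  'DISCOUNTFACTOR 1.0 OBSERVATIONS INTS (0 20) ' ...
                  'ACTIONS INTS (0 1) REWARDS (-1.0 1.0) ' ...
                  'EXTRA skeleton_environment(Matlab) by Brian Tanner.'];

% Parse it with the Java task spec parser (class and method names are
% assumed from the Java Extension, not defined by the Matlab codec).
theTaskSpec = org.rlcommunity.rlglue.codec.taskspec.TaskSpec(taskSpecString);

% Observations and actions are 0-based integer ranges, so count = max + 1.
numStates  = theTaskSpec.getDiscreteObservationRange(0).getMax() + 1;
numActions = theTaskSpec.getDiscreteActionRange(0).getMax() + 1;
discount   = theTaskSpec.getDiscountFactor();

fprintf('%d states, %d actions, discount factor %g\n', ...
        numStates, numActions, discount);
```

A tabular agent would typically use numStates and numActions to size its value table in agent_init.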
In these cases, you can tell your Matlab agent, environment, or experiment program to connect on a custom port and/or to a custom host using the RL_set_port() and RL_set_host() Matlab functions.
For example:
>> RL_set_port(4097);
>> RL_set_host('yahoo.ca')
>> cd examples/skeleton
>> skeleton_experiment();
These commands could give output like:
RL-Glue Matlab Experiment Codec Version: 1.0 ($Revision: 688 $)
Connecting to rl_glue at host: yahoo.ca on port 4097
This works for agents, environments, and experiments. In practice though, remember that yahoo.ca probably isn't running an RL-Glue server.
You can specify the port, the host, neither, or both. Ports must be numbers; hosts can be hostnames or IP addresses. The default port is 4096 and the default host is 127.0.0.1.
Remember, on most *nix systems, you need superuser privileges to listen on ports lower than 1024, so you probably want to pick one higher than that.
Since the Matlab codec is built on top of the Java codec, many of the underlying data structures are from the Java codec. We recommend checking out the Java codec documentation (available as PDF, HTML, and Javadoc) for more information.
So in a given Matlab method, like
function theAction=skeleton_agent_step(theReward, theObservation),
theObservation is actually of type: org.rlcommunity.rlglue.codec.types.Observation.
Java and Matlab play very well together, so you can do things like:
>> testObs=org.rlcommunity.rlglue.codec.types.Observation();
>> testObs.intArray=[1 2 3 4];
>> testObs.doubleArray=[0.1 0.5];
>> testObs.charArray='fun things!';
>> testObs.toString()
ans =
numInts: 4 numDoubles: 2 numChars: 11 1 2 3 4 0.1 0.5 f u n t h i n g s !
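Reading values back out of such an object is worth a short sketch, because the Java arrays arrive in Matlab as typed column vectors. The example below assumes the codec is installed so the Observation class is available; the variable names are illustrative.

```matlab
% Build an observation the same way as in the example above.
testObs = org.rlcommunity.rlglue.codec.types.Observation();
testObs.intArray = [1 2 3 4];
testObs.charArray = 'fun things!';

% Copy the Java fields into Matlab variables before indexing them.
ints  = testObs.intArray;     % Java int[] arrives as an int32 column vector
chars = testObs.charArray;    % Java char[] arrives as a char column vector

% Matlab indexing is 1-based: the first element is ints(1).
firstValue = double(ints(1)); % convert to double for ordinary arithmetic
rowOfInts  = double(ints(:))';
text       = chars(:)';       % reshape the characters into a row string

fprintf('first value: %d, text: %s\n', firstValue, text);
```

Converting to double early avoids surprises from integer arithmetic, for example when the value is used to index a table or is mixed with floating-point rewards.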
Useful utility methods for connecting, disconnecting, and running with the rl_glue executable server are in the agent directory of the Matlab codec source.
Useful utility methods for connecting, disconnecting, and running with the rl_glue executable server are in the environment directory of the Matlab codec source.
The online FAQ may be more current than this document, which may have been distributed some time ago.
We're happy to answer any questions about RL-Glue. Of course, try to search through previous messages first in case your question has been answered before.
Revision Number: $Rev: 688 $
Last Updated By: $Author: brian@tannerpages.com $
Last Updated : $Date: 2009-02-09 12:13:44 -0700 (Mon, 09 Feb 2009) $
$URL: https://rl-glue-ext.googlecode.com/svn/trunk/projects/codecs/Matlab/docs/MatlabCodec.tex $
The translation was initiated by Brian Tanner on 2009-02-09