Welcome to DeeR’s documentation!

DeeR (Deep Reinforcement) is a Python library for training an agent to behave in a given environment so as to maximize a cumulative sum of rewards. It is based on the original deep Q-learning algorithm described in: Mnih, Volodymyr, et al. “Human-level control through deep reinforcement learning.” Nature 518.7540 (2015): 529-533 (see What is deep reinforcement learning?).
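
To make the objective concrete, here is a minimal sketch of a cumulative sum of rewards. It assumes a discount factor `gamma`, which is not specified above (with `gamma=1.0` it reduces to a plain sum). The helper `discounted_return` is illustrative only and is not part of DeeR.

```python
def discounted_return(rewards, gamma=0.9):
    """Return sum_t gamma**t * rewards[t], accumulated from the last step backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# With gamma=1.0 the result is the plain sum of rewards.
print(discounted_return([1.0, 0.0, 2.0], gamma=1.0))  # 3.0
# With gamma=0.5 later rewards count less: 1.0 + 0.5 * 0.0 + 0.25 * 2.0.
print(discounted_return([1.0, 0.0, 2.0], gamma=0.5))  # 1.5
```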

Here are key advantages of the library:

Unlike the original code, this package provides a more general framework in which an observation can be made up of any number of elements: scalars, vectors and frames (the paper mentioned above uses a single type of frame only). The belief state from which the agent builds the Q-function is made up of a history of any length for each element of the observation.
You can easily add a validation phase that lets you stop training before the agent overfits. This is useful when the environment depends on scarce data (e.g. limited time series).
Advanced techniques such as Double Q-learning and prioritized experience replay are readily available in the library.
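
The idea of keeping a separate history for each element of the observation can be sketched as follows. This is a minimal illustration under stated assumptions, not DeeR's implementation: the class `ObservationHistory` and its methods `push` and `state` are hypothetical names chosen for this example.

```python
from collections import deque


class ObservationHistory:
    """Keep the most recent values of each observation element (hypothetical helper)."""

    def __init__(self, history_lengths):
        # history_lengths[i] is how many past values of element i are kept.
        self._buffers = [deque(maxlen=n) for n in history_lengths]

    def push(self, observation):
        # `observation` holds one entry per element (a scalar, a vector or a frame).
        if len(observation) != len(self._buffers):
            raise ValueError("expected one entry per observation element")
        for buffer, element in zip(self._buffers, observation):
            buffer.append(element)

    def state(self):
        # One list of recent values per element, oldest first.
        return [list(buffer) for buffer in self._buffers]


# Two elements: a scalar with a history of 3 and a vector with a history of 1.
history = ObservationHistory([3, 1])
for t in range(4):
    history.push([t, [t, -t]])
print(history.state())  # [[1, 2, 3], [[3, -3]]]
```

Each element gets its own bounded buffer, so a slowly varying scalar can carry a long history while a large frame keeps only its latest value.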

In addition, the framework is designed so that it is easy to:

build any environment
modify any part of the learning process
use your favorite Python-based framework to code your own neural network architecture. The provided neural network architectures are based on Theano, but you can easily use another framework.

DeeR is a work in progress and input is welcome. Please submit any contribution via pull request.

What is new

Version 0.2

Standalone Python package (you can simply run pip install deer)
New example environments: the pendulum on a cart, a PLE environment and a Gym environment
Double Q-learning and prioritized experience replay
Augmented documentation
First automated tests

Future extensions:

Add planning (e.g. MCTS-based when the environment is deterministic)
Support several agents interacting in the same environment
...

User Guide

API reference

If you are looking for information on a specific function, class or method, this API is for you.