sofian edited this page Jul 5, 2013 · 11 revisions

Roadmap

0.1 - Released 2011.10.27

Objective: Working implementation of RL & neural nets within the main architecture

  • main abstract components: Agent, Environment, Observation, Action
  • reinforcement learning module:
      • neural network
      • modular policy (e-greedy, softmax)
      • modular reward function
  • basic documentation
  • examples
  • README / LICENSE / INSTALL
  • easy switching between AVR and PC targets
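
To make the "modular policy (e-greedy, softmax)" item concrete, here is a minimal sketch of the two action-selection rules over an array of action values. The function names and signatures are hypothetical and are not taken from the library. The uniform random draws are passed in by the caller, which is only an assumption made here so the same code can run on AVR or PC with whatever random source is available.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Index of the largest value in q[0..n-1] (the first one wins on ties).
std::size_t argmax(const float* q, std::size_t n) {
  assert(n > 0);
  std::size_t best = 0;
  for (std::size_t i = 1; i < n; ++i)
    if (q[i] > q[best]) best = i;
  return best;
}

// Epsilon-greedy: with probability epsilon pick a uniformly random action,
// otherwise pick the greedy one. u1 and u2 are uniform draws in [0, 1)
// supplied by the caller (hypothetical interface).
std::size_t epsilonGreedy(const float* q, std::size_t n, float epsilon,
                          float u1, float u2) {
  assert(n > 0);
  if (u1 < epsilon) {
    std::size_t a = static_cast<std::size_t>(u2 * n);
    return a < n ? a : n - 1;  // guard in case u2 is exactly 1
  }
  return argmax(q, n);
}

// Softmax (Boltzmann) selection: action i is chosen with probability
// proportional to exp(q[i] / temperature). u is a uniform draw in [0, 1).
std::size_t softmax(const float* q, std::size_t n, float temperature, float u) {
  assert(n > 0 && temperature > 0.0f);
  // Subtract the maximum before exponentiating to avoid overflow.
  const float qmax = q[argmax(q, n)];
  float total = 0.0f;
  for (std::size_t i = 0; i < n; ++i)
    total += std::exp((q[i] - qmax) / temperature);
  // Walk the cumulative distribution until it passes the scaled draw.
  const float threshold = u * total;
  float cumulative = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    cumulative += std::exp((q[i] - qmax) / temperature);
    if (threshold < cumulative) return i;
  }
  return n - 1;  // only reached through rounding error
}
```

A low temperature makes softmax behave almost greedily, while a high temperature approaches uniform random choice. Epsilon plays the corresponding role for epsilon-greedy.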

0.2 - Released 2013.07.04

Objective: Stabilization and documentation

  • reorganize file structure
  • document and comment all classes
  • create more concise examples
  • clean up the build process:
      • the objects (.o) should compile in sub-directories (build/)
      • create a standard installation procedure (configure -> make -> install)
  • allow saving/loading of models
  • make the library platform-independent (look at Design Patterns for ideas)
  • think about simulation vs hardware, for example to allow for offline pre-training
  • extras:
      • libmapper plugin
      • arduino plugin
  • possibility to stop learning (just act with some policy)

0.3

Objective: Integrate Behavior Trees

  • Port libbehavior to a "static" version
  • Implement a BehaviorTreeAgent
  • Create an example and unit test

References:

0.4

Objective: Integrate one or more types of swarm algorithms / cellular automata / L-systems (*)

(*) Not sure yet if we really need this; we'll see.

0.5

Objective: Stabilization and documentation

0.6

Objective: Augment the RL library

  • implement discrete states (i.e. without neural nets)
  • implement continuous-action methods
  • advanced methods in RL (to revisit)
  • gaussian exploration (variant of softmax)
  • ways to solve exploration vs exploitation:
      • directed exploration strategies
      • limiting greedy exploration
      • E3
      • R-Max
  • Actor-critic
  • hierarchical RL
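
As an illustration of what a discrete-state (non-neural) learner could look like, here is a minimal sketch of a tabular Q-learning update. Q-learning is only one common choice and the roadmap does not commit to it. The `QTable` type, the table sizes, and the function names are hypothetical. Fixed-size arrays are used on the assumption that dynamic allocation should be avoided on AVR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical compile-time sizes so the table needs no dynamic allocation.
const std::size_t N_STATES = 2;
const std::size_t N_ACTIONS = 2;

// One action value per (state, action) pair.
struct QTable {
  float q[N_STATES][N_ACTIONS];
};

// Largest action value available in state s.
float maxValue(const QTable& t, std::size_t s) {
  assert(s < N_STATES);
  float best = t.q[s][0];
  for (std::size_t a = 1; a < N_ACTIONS; ++a)
    if (t.q[s][a] > best) best = t.q[s][a];
  return best;
}

// Standard Q-learning update:
//   Q(s, a) += alpha * (reward + gamma * max_a' Q(sNext, a') - Q(s, a))
// alpha is the learning rate and gamma the discount factor.
void qLearningUpdate(QTable& t, std::size_t s, std::size_t a, float reward,
                     std::size_t sNext, float alpha, float gamma) {
  assert(s < N_STATES && a < N_ACTIONS && sNext < N_STATES);
  const float target = reward + gamma * maxValue(t, sNext);
  t.q[s][a] += alpha * (target - t.q[s][a]);
}
```

The table replaces the neural network as the value function, so memory grows with the number of states times the number of actions. That is only practical when the observation space is small or coarsely discretized.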

0.7

Objective: Stabilization and documentation

0.8

Genetic algorithms? See "Implementing Genetic Algorithms on Arduino Micro-Controllers".

0.9

More advanced learning methods

1.0

Full-fledged agent-based framework
