I was going through the code and thinking about what would need to change for a more complex game (chess, for example), where ideally the allowed actions would be a list of tuples (piece position, target position). I have a slight suspicion that the agent code would have to be altered in some way. Any input on this?
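
To make the question concrete, here is a minimal sketch of the kind of action list I have in mind, along with one possible way to flatten each tuple into a single integer. It assumes an 8x8 board, squares written as (row, col), and an agent that indexes actions by integer; none of that is taken from the actual code, and every name below is hypothetical.

```python
from typing import List, Tuple

BOARD_SIZE = 8                          # assumption: standard 8x8 board
NUM_SQUARES = BOARD_SIZE * BOARD_SIZE   # 64 squares

Square = Tuple[int, int]                # (row, col), hypothetical convention
Move = Tuple[Square, Square]            # (piece position, target position)


def square_to_index(square: Square) -> int:
    """Map a (row, col) square to a flat index in [0, 63]."""
    row, col = square
    return row * BOARD_SIZE + col


def index_to_square(index: int) -> Square:
    """Inverse of square_to_index."""
    return divmod(index, BOARD_SIZE)


def encode_move(move: Move) -> int:
    """Flatten a (source, target) tuple into one integer in [0, 4095]."""
    source, target = move
    return square_to_index(source) * NUM_SQUARES + square_to_index(target)


def decode_move(action: int) -> Move:
    """Recover the (source, target) tuple from a flat action index."""
    source_index, target_index = divmod(action, NUM_SQUARES)
    return index_to_square(source_index), index_to_square(target_index)


# Hypothetical list of allowed moves as (piece position, target position).
allowed_moves: List[Move] = [((6, 4), (4, 4)), ((7, 6), (5, 5))]

# Flat integer actions that an index-based agent could consume.
allowed_actions: List[int] = [encode_move(m) for m in allowed_moves]
```

If something like this were acceptable, the tuples could stay on the environment side and the agent might only ever see integers, but I'm not sure whether that fits how the agent currently stores and looks up actions.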