When a malfunction occurs, the provided context contains two kinds of constraint rules: ones describing actions that have already taken place, and ones that require wait actions for the duration of the malfunction.

The action at timestep 14 in this example causes an issue. The encoding must generate this action, but the malfunction already occurs at timestep 14, and a malfunctioning train ignores all given actions.

This means primary encodings cannot currently solve environments with malfunctions. Removing the last action from the past actions in the context might still cause desync issues. The train must wait at the first timestep of the malfunction and ignore the last recorded action, since that action had no effect in Flatland either.
Alternatively, a primary encoding can use the malfunction(ID,Duration,Timestep) fact to simulate the lost action. I will try this next; it may be better than changing the provided context.
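
To make the idea of simulating the lost action concrete, here is a minimal Python sketch (not the ASP encoding itself). It assumes that a malfunction starting at timestep T with duration D forces a wait at timesteps T through T+D-1; the exact window in Flatland may differ by one step. The action names and the function name effective_actions are hypothetical and only serve the illustration.

```python
from typing import Dict

WAIT = "wait"  # hypothetical action name, chosen for this sketch


def effective_actions(planned: Dict[int, str], start: int, duration: int) -> Dict[int, str]:
    """Return the actions a train actually performs under one malfunction.

    Assumption: the malfunction blocks timesteps start .. start + duration - 1,
    so any planned action in that window is lost and replaced by a wait.
    """
    blocked = range(start, start + duration)
    return {t: (WAIT if t in blocked else action) for t, action in planned.items()}


# Planned actions per timestep for one train (illustrative values only).
planned = {12: "move_forward", 13: "move_left", 14: "move_forward", 15: "move_forward"}

# Mirrors a fact like malfunction(ID, 2, 14): duration 2, starting at timestep 14.
actual = effective_actions(planned, start=14, duration=2)
```

Under this assumption the planned action at timestep 14 is still generated by the encoding but is replaced by a wait in the effective plan, so the past actions in the context stay unchanged.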