Skip to content

Conversation

@Joeytje50
Copy link
Contributor

The game Conquest is a minigame that is part of the MMORPG 'RuneScape 3', but it functions fully like a standalone board game. This implementation makes it possible to play the game using TAG, and supports MCTS agents. Included are also the json files I've used to analyse the game, but if desired I can remove those from my PR in order to clean up the possible clutter in the JSON folder.

Pim Bax added 30 commits October 5, 2024 15:28
The random and MCTS players are still not working yet, but most of the
core gameplay for Conquest is now implemented for human players using
the GUI.
From now on I'll try splitting up changes into actual sensible commits.

Important gameplay features / GUI features that are still missing are:
- Making troops disappear from the board when they are defeated. Not
  hard, just not implemented yet.
- Commands properly being hidden from the opponent, before use
- Being able to set up a layout without having to mess with the files
- Proper user-friendly GUI (e.g. using double click to select / move /
  attack troops), which is a very low priority unless playtesting turns
  into a significant time investment.
- Probably a bunch of other things I'm missing
There seems to be an issue however, since the deck won't be hidden from
the other player.
Computing available actions now actually lists an individual action for
each possible action. That is, a separate action is listed for each
possible troop that can be selected, or move that can be made, or attack
that can be executed, or command that can be applied.

However, the GUI suffers greatly from this; due to this change, the
buttons at the bottom are basically a massive list of 'select' actions,
after which they become a list of 'move' actions, etc. So, to remedy
that, I've made the updateButtons fetch a separate list of generic
actions, which rely on the highlighted cell/troop/command. I'm not sure
if circumventing the ForwardModel is wise, though...

Still to do: computing the actual maximum amount of possible moves at
one time. This is currently 10, which is too small.
Made a lot of generic action methods work through a single abstract
CQAction class, which makes it a lot easier to check some things.

This was mainly used to improve the GUI's checking which actions are
allowed, based on the highlighted cells. This makes it possible to use
the same Action objects, without creating a massive list of possible
action buttons.

Also introduced a utility class which could be useful to move some
methods from CQGameState into, since there are several utility functions
in the CQGameState class now, which don't necessarily have to be in
there per se.
Apply Floodfill to the GUI, because calculating distances requires
access to the gamestate, so the update() function would need to
pre-calculate all distances in order to make drawCell get access to the
info. Since floodfill is more efficient than doing A* on all possible
cells, that is done on every update instead.
Heuristic may need improvement but for now it's probably fine.
Double click your troop -> select it
Double click an empty (accessible) cell -> move there
Double click another (attackable) troop -> attack it
Red and green is both visible on bright backgrounds. Not sure if these
will be final though.
After restructuring the order of the methods in CQGameState, the utility
functions generally fit better in the list of gamestate methods, or in
another class.
Also added a verification to check if a command is already applied to a
troop, to prevent double berserking or something. That would make Winds
of Fate too overpowered (and is not part of the original game).
There still seem to be some issues with copying of commands, which
causes MCTS to be able to execute commands without having to wait for
the cooldown. I'll investigate the root cause of this at some later
point.
The heuristic score works better if it uses the same formula all the way
through, instead of switching to the game score at a terminal state.
Also made some small changes to some other parts: the GUI now shows a
small Conquest icon instead of a specific command, if the enemy has not
shown its commands. Also, command actions now get properly checked
whether or not they can be applied.

But most importantly, this commit implements the different strategies
for Conquest, as described by the RuneScape Wiki's strategy guide. These
will be put up in a tournament to determine the best strategy.
This requires a few modifications:
 - Allow the number of threads nThreads to be specified; default is the
   old behaviour of single-threaded (sequential) execution
 - Change all non-recursive matchup evaluations to be executed by a
   single thread executor, with a pool of `nThreads` threads
 - Wait for all threads to finish; no timeout is specified, but
   potentially threadTimeout could be added as a parameter (note 1)
 - To avoid race conditions, for now the actual execution of the
   game.run() is still all synchronized. However, this is mainly to
   avoid the gamestate from being overwritten (note 2)
 - updatePoints now requires a gamestate to be passed along, because
   there is no longer a guarantee `this.game.getGameState()` is actually
   the relevant gamestate.

[1] awaiting termination requires a timeout to be set, but since this is
not present during normal execution either, a timeout of infinity hours
is set.
[2] If `game.run()` simply returns a copy of the final game state, this
synchronized block can be reduced to a smaller part of the code.
Future improvements after that will have to be based on game.run()
having a local copy of an initial gamestate, instead of using a shared
gamestate inside Game.
Because parallel games require completely separate game states, forward
models, and player agents, I've made a separate runInstance() function,
which uses none of the Game's own variables, instead using scoped
variables that are copies of the Game variables.

Because these also need to be passed along to the terminate() and
oneAction() functions, these need wrapper functions that allow calling
without passing these variables along (for non-parallel running)

With this, basically only the logging needs to be synchronized, with the
rest just making use of their own game instances.

Running this will actually give reasonable results, indicating that it
works correctly.
Having all of this tournament code in the run() method, instead of
having its dedicated method, makes it harder to work with the run method
itself. Splitting it off makes a lot of sense, since it's an entire
functionality that is only needed in some instances.
Due to the way PS is implemented currently, it's not possible to
parallelize the individual evaluations within each run, which would be
far superior time improvement compared to just running all runs in
parallel. This is because the individual evaluations are all overseen by
the NTBEA library, which controls the loop, and has no parallelized
`fitness` loop function, nor can it be @overwritten (since it's
package-private).

However, the individual runs can be made parallel:
 - NTBEA objects have a copy function, to ensure they do not interfere
   with eachother; NTBEA runs are executed on a copy of the main NTBEA
   object.
 - Non-multithreaded runs work the same, but instead just pass on
   `this`.
 - After parallel runs are completed, some final tallying of the scores
   is done in order to get all data in the right place

Still a big TODO: Round Robin tournaments should have the exhaustive
self play converted to an iterative version, instead of a recursive one.
Without iterative version, it is significantly harder to parallelize (or
perhaps impossible; I don't want to know).

After that is done, I think the most important parts of the software has
been parallelized.
Using a separate method to generate a list of matchups, we can
iteratively call each of the matchup evaluations, meaning we can
parallelize the evaluation calls, when parallelization is enabled.

My IDE also decided to clean up the Math.sqrt -> sqrt.
When doing ParameterSearch, only the repeats of the runs are
parallelized, meaning there is no need to allocate more threads than
that. Also modified the param documentation to explain this fact, and
updated the param doc to explain the effect `nThreads` has.
Instead of checking whether or not to parallelize multiple separate
times, all of that decision making is now handled by a single wrapper
method.
For parametersearch, the parallel evaluations make it confusing which
thread is doing what, when debugging is on. With these additions to
GameEvaluator, the hashCode for the evaluator that is outputting the
specific debugging line is also printed, in order to be able to
reconstruct which instance has run which order of matchups.

I did not implement the same in other debug messages, because in those
cases of parallelized messages, I feel like the information in stdout
doesn't necessarily need to be reconstructed in the same way.

I've also made the `verbose` parameter get passed on through to the
tournament after ParameterSearch, since this seems to me like expected
behaviour. If this is not desired I'll just change it back.
Pim Bax added 27 commits February 3, 2025 16:42
Also some small tweaks to agent json
This was only possible in combination with Winds of Fate, which caused
there to be no actions in the rollout. Fixed the rules no longer allow
this. I would need to test if there are indeed guardrails in place in
the original game, but I would assume so.
Also modify the game to allow an existing MCTS parameter to be set to
change the setup that's used.
Also fix an issue with shield wall not functioning properly.
With the old code there was technically a difference between the
intended setup order and the actual order, being flipped by the x and y
axes. Because this means both sides were flipped by the same axes, this
did not change any outcomes, but it is cleaner to do it like this for
consistency
Comment on lines +126 to +129
@Override
public CQMCTSPlayer copy() {
return this;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should actually copy the agent - especially given that you have state in the agent. In RunGames this can cause odd effects, as it copies the agent for each game (see Pull Request 329).

Comment on lines +23 to +24
// HACK: This uses `omaVisits` as a parameter to register which troop setup it uses.
// This avoids having to create a separate MCTSparam for this playout.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honest :-). I think you can get away with this given it's safely in a game-specific subclass.

@Override
protected List<AbstractAction> _computeAvailableActions(AbstractGameState gameState) {
CQGameState cqgs = (CQGameState) gameState;
return cqgs.getAvailableActions();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am very uncomfortable with this, as it side-steps a core design point that this logic should be in the forward model and not in the game state.
Could you just move this method into forward model please? (State and Model are deliberately in the same package so that Model can easily access the package-private elements of State)

*
* @return the full list of actions available at the current point in the game.
*/
public List<AbstractAction> getAvailableActions() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should definitely in the forward Model to avoid bloating the game state with game logic.

Comment on lines +446 to +473

public void endTurn() {
if (checkWin()) return; // game has ended, and checkWin() has finished it
if (getCurrentPlayer() == getFirstPlayer() && gamePhase.equals(CQGamePhase.SetupPhase)) {
// Setup phase was completed by starting player, but not by the 2nd player
setGamePhase(CQGamePhase.SetupPhase);
return;
} else {
setGamePhase(CQGamePhase.SelectionPhase);
}
selectedTroop = -1;
for (Troop troop : getTroops(-1)) {
troop.step(getCurrentPlayer());
}
int nextPlayer = getCurrentPlayer() ^ 1;
if (getTroops(nextPlayer).size() == chastisedTroopCount(nextPlayer)) {
removeChastise(nextPlayer);
}
for (Troop troop : getTroops(nextPlayer)) {
if (troop.hasMoved())
System.out.println("Troop was not stepped correctly, somehow...");
}
commandPoints[nextPlayer] += 25;
List<Command> list = chosenCommands[nextPlayer].getComponents();
for (Command cmd : list) {
cmd.step();
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should really in the Forward Model (which is the only place that calls it, in line with the design goal of keeping the rules in FM and only the state here)

Comment on lines +595 to +605
/**
* Flood fill to generate distances to the whole board. (currently only used in GUI)
* Used in case where all distances need to be known, due to being cheaper than performing A* on all cells.
*
* @param source Source cell from which to calculate distances
* @param maxDistance maximum distance to perform floodFill on, or 0 if no maximum is set.
* @return 2d integer array containing distances to the target square, or 9999 if unreachable.
*/
public int[][] floodFill(@NotNull Cell source, int maxDistance) {
int w = cells.length, h = cells[0].length;
List<Cell> openSet = source.getNeighbors(cells); // initial set of neighbors
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is only used in the GUI, then that is where the code should be. Like astar, this does not seem to actually use (or update) any of the game state data.

getGamePhase(), Arrays.deepHashCode(cells), troops, locationToTroopMap, selectedTroop, getCurrentPlayer(),
gridBoard, Arrays.hashCode(chosenCommands), Arrays.hashCode(commandPoints), getGameTick()
);
} catch (Exception e) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 2 issues with hashCode():

  1. You include some items here which are not in equals(), e.g. and selectedTroop. This breaks the Java equals/hashcode contract which says that if A.equals(B) then they must have the same hashcode.
  2. You are also missing the action stack of extended actions and other superclass stuff. You have bought in game phase and tick explicitly, but a much better way to do this is use Objects.hash(super.hashCode(), ....subclass specific items...). This is already done for equals (as _equals() above is called by the superclass equals() method)

* Deterministic randomness; given a certain game state, this will always result in the exact same random number
* due to being initialised on the hashcode of the current gamestate every time it gets called. Since this game only
* has exactly 1 place that involves randomness, this won't affect very much at all, except that it makes the game
* essentially deterministic. Taking action A in game state S will always result in the exact same game state S'.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting...I'm curious as to what that place is, and why you want to make it deterministic if the game is genuinely random?
However, if there is any randomness then could you add a game parameter to run the game in either deterministic or random mode in CQParameters?


@Override
public abstract boolean equals(Object obj);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, none of the sub-classes have any additional state - they only differ in terms of the instantiating class (and game logic).

It is a good principle that equals/hashcode should refer to data at the same level. And this will avoid in this case a lot of code duplication, which is always risky.

A better pattern here is to implement equals, checking all of the fields included in the hashcode, and having a new _equals() that is abstract. The child classes can then just implement one line for this on the lines of return (obj instanceof SelectTroop)

Troop troop = cqgs.getTroopByLocation(highlight);
cqgs.setSelectedTroop(troop.getComponentID());
cqgs.setGamePhase(CQGameState.CQGamePhase.MovementPhase);
return gs.setActionInProgress(this);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a compilation error for me. setAction Inprogress no longer returns a boolean (and to be honest, neither should execute).just return true here.

@hopshackle
Copy link
Collaborator

Hi Pim,

I pressed the button before writing a summary. This all looks good, and it would be great to get conquest into the main repository. The main issues I currently have are:

  1. There is a chunk of game and other logic in CQGameState that should really be elsewhere (ForwardModel or a utility class for example). This is bloating the class and getting away from the key responsibility of GameState of holding a master copy of the data for the game and providing access to it (with suitable small helper methods).

  2. GameState also needs to have its hashcode standardised (and there is a small amount of work on the CQAction hierarchy to simplify this).

Congratulations on defending your thesis, and apologies that it took me so long to get round to reviewing this.

Cheers,

James

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants