Algorithm Crates
The algorithm crates are where all the magic happens. Currently, each algorithm sits in its own crate ("crate" is just a fancy name for a Rust project) and is compiled separately. The idea is that the frontend doesn't need to load all algorithms at once, but for the time being that doesn't seem to happen.
Common code is shared between two crates: graph and algorithm. The graph crate is introduced in another wiki article; we'll focus on the algorithm crate now.
This crate should be thought of as a framework. It defines a code architecture that the crates implementing the algorithms should follow. This gives consistency across the code base and allows for easy code reuse. It also gives the frontend a consistent interface that doesn't change much between different algorithms. Do note that some algorithms might want to expose additional functionality to the frontend, so they might opt to do some wasm-bindgen work themselves in addition to what the macros from the algorithm crate already define.
export_algorithm_to_wasm! is a declarative macro within the algorithm crate and should be called in each crate implementing an algorithm. It takes exactly one parameter, a struct name. The Algorithm trait (explained later) needs to be implemented for said struct. If you take a look at the macro, you will see that it mostly passes through functionality from the private module of the crate and annotates functions with wasm-bindgen. This is common practice with macros: place only the minimal required code inside the macro, and extract as much functionality into private functions as possible. Do note though that the Rust compiler will not treat this module as private: it is a public module that we call private to convey its intent of not being used by other crates, and we give it the #[doc(hidden)] annotation to ensure it does not appear in any generated documentation. Also, such modules usually do not come with any stability guarantees, which is less relevant for a private project but is good to know anyway.
So the TLDR is to always have one export_algorithm_to_wasm!(MyAlgorithm) line in each algorithm-implementing crate, and to place functionality that is (a) exported to the frontend and (b) common to all algorithms into this macro. This leads us to the Algorithm trait that is required to be implemented for the algorithm struct.
This trait enables the macro in the previous section to do its magic. Traits (like interfaces in Java) bundle some functionality (mostly functions, but they may also contain type definitions or constants) for some type. The Algorithm trait asks for two associated types, VisualisationState and Pseudocode, a list of example graphs, lists of required and incompatible graph properties, and a function to obtain, from an initial state, a state machine that can be used to run the algorithm.
The VisualisationState trait asks for the graph configuration and a function to obtain an initial state. Further, it requires the type to be (de)serializable using serde. This is the most important part of this trait: the serialized version is what gets passed through to the frontend.
The Pseudocode trait is used to pass the pseudocode to the frontend. It also allows the algorithm to say at which point in the pseudocode it currently is, and to pass the variables in the current pseudocode line to the frontend. Unlike the other traits, you wouldn't implement this one directly, but rather use the provided proc-macro.
The Pseudocode macro is a procedural macro that you can use with the #[derive(Pseudocode)] syntax. You are probably familiar with this syntax from other macros, be it Debug or Clone from the Rust standard library or Serialize from serde. Most proc macros simply implement a trait for the type they are derived on, and the Pseudocode macro is no exception. It expects to receive an enum as its input, where the doc "comments" (I prefer to call them doc attributes, as they really are just syntactic sugar) of each variant are passed to the frontend as the pseudocode description, and the variants themselves tell the frontend which pseudocode line we are currently at. Do note that the frontend expects the description as LaTeX, but we do some preprocessing: essentially, the doc attributes are parsed as if they were format strings. If you don't know format strings, take a look at std::fmt. The variables used in the format string are provided by each variant. Just take a look at one of the existing algorithms to get an idea of how this is used in practice.
Well ... I have to start by saying that developing procedural macros is hard. Proc macros just receive some Rust tokens as input and produce Rust tokens as output, and those output tokens are then processed by the compiler as if they were written at the source code location the macro is called at. You want to have a good understanding of Rust before attempting to do anything with proc macros. And as always, it is a good idea to keep as much code as possible in "private" modules outside the macro. However, proc macros are also really cool to play with and enable so many cool things, which is why I love them anyway.
The proc macro is declared in its own crate, as all proc macros must be. This is because the compiler has to compile them to shared libraries first and then load them during the compile phase. Any code you write into the algorithm-impl crate will only live at compile time. Also, we're using the standard set of libraries for writing proc macros, namely syn and quote. Make sure you are familiar with them before touching any proc macros.
Our macro then does a bunch of things: we start by ensuring that we have an enum with only unit or struct variants, no tuple variants. Each variable in a variant may be renamed to fit with weird "identifiers" in the LaTeX code such as v', which is certainly not a valid Rust identifier. We also parse all of the doc attributes as Rust format strings and split them up into the text and variable portions we need to implement the trait. In case you are unfamiliar with doc attributes, I just want to mention that the following two lines are exactly identical to the compiler:
///Lorem ipsum
#[doc = "Lorem ipsum"]

This means that doc "comments" are not comments and are not removed by the compiler. Rather, all doc "comments" are syntactic sugar for doc attributes, are read by the compiler as if they were attributes, and are preserved and passed to our macro. At the end, our macro just assembles all of the input we parsed into a suitable implementation of the Pseudocode trait. During this phase, we use canonical paths (such as ::algorithm::Pseudocode) everywhere and no imports. This is because the macro has no control over the environment it is expanded in. Someone could define their own trait and call it Pseudocode even if it has nothing to do with the trait we write an implementation for. This is a really important aspect of every macro (this partly applies to declarative macros as well, which can be screwed up just the same) and why it is so easy to write macros that produce compile-time errors half of the time.
On the note of compile-time errors, we want to use syn::Error or equivalent for error reporting in proc macros, at least until the diagnostics API in Rust is finally stabilised. This gives us spans: the ability to link an error message to a specific location in the macro's input. Also, it allows us to produce multiple error messages at once. Panicking in proc macros is discouraged: it produces worse error messages, loses span information and allows for only one error per compiler invocation.
After this fun excursion into macros, let's get into another fun excursion, this time into futures. But let's get one thing out of the way first: nothing happens asynchronously. Nothing. We just abuse the way Rust futures work. Please start by forgetting everything you have ever heard about asynchronous programming or futures. With a completely erased mind, read this nice part of the async book: The Future Trait.
Now that we have forgotten everything about async and know the internals of Future, we can move on. Let's first agree on discarding the waker; we don't need it. At no point will our algorithm depend on external data, so we just write some dummy waker that does nothing and will never be called. Also, we don't need any output from the future, as the value of the algorithm for this project lies in its execution, not its result. Basically, we are now left with
trait SimplestFuture {
    fn poll(&mut self) -> Poll;
}

enum Poll {
    Ready,
    Pending,
}

This is not the way Rust defines futures, but it is its most essential part. So when we have a future, we want it to return Pending every time we've hit a breakpoint in the algorithm, and return Ready when the algorithm is completely done. The naming is a bit off for us, as we aren't waiting for anything with Pending and we don't have any result to show off when Ready, but it's what we've got to work with. Now let's look at the Rust compiler. When we give it a function like this
async fn my_algorithm(state: &mut State) {
    // do some work
    state.breakpoint().await;
    // do more work
    state.breakpoint().await;
}

then the Rust compiler gives us an implementation of the Future trait for free. This is what we use to build our state machine. Our state machine has storage for the current state and passes a reference with write access to our async function. Whenever we hit a breakpoint, we store the current state in that storage and await the breakpoint future. The breakpoint future is designed so that it returns Pending exactly once and Ready thereafter. However, remember that futures always run as long as they can: when a breakpoint returns Pending, that is the result of the current poll invocation, but when it returns Ready, we don't return but just continue on until we hit the next breakpoint. That way, we can step through the algorithm with breakpoints, all powered by the Rust compiler itself.
This has a lot of advantages. The first and most prominent one is that the algorithm is defined in one readable function. It is also enough of a common interface for the Algorithm trait to work without any further abstraction. And it is extremely cool to use Rust to its absolute limits. However, it also comes with some drawbacks. For example, we cannot guarantee at compile time that only breakpoints and no other futures are await-ed. That won't work, as other futures might actually wait for something, which our dummy waker doesn't support. Also, it means that stepping through the code is strictly forward: going back is not possible, and a history has to be implemented differently. And lastly, you just had to read through all of this text to gain an understanding of how this works.
Well, maybe there's another drawback. We cannot send this weird future thing through the backend-frontend glue. Actually, we cannot even name its type. It just exists in Rust, so that's where we have to store it. This happens in the private module that was mentioned earlier when we discussed the macros. We store all state machines in a mutex-guarded map that assigns a number to each state machine. That number can then be shared with the frontend. It has no meaning to the frontend, but the frontend can send it back whenever it wants something from the backend, and we can then use it to look up the corresponding state machine again.
Besides that, the private module has all the code necessary to translate the requests coming from the frontend into something we can pass to the Algorithm implementation. We use wasm-bindgen in combination with the json interface from gloo-utils. wasm-bindgen alone just doesn't have enough types mapped to be a feasible option, and going via JsValue was used previously but turned out to be an absolute pain: it also uses serde, but in an extremely fragile way, and basic changes to the Rust code completely imploded the glue between the backend and frontend. This is why we settled on the json interface: it gives us room to make changes to the backend code without affecting the frontend too much. However, we do need to keep in mind that the floating-point values NaN, Inf and -Inf do not exist in JSON. You can look at the dijkstra algorithm to see how we worked around this limitation in the past.
As we discussed earlier, our state machine does not allow us to go back. We still realise this mechanism (outside the state machine) in the rust/algorithm/src/history module and use it in the private module rust/algorithm/src/private.rs: essentially, we store every visualisation state together with the corresponding pseudocode line by cloning the state and pushing it, along with an unsigned value, into a vector.
We then use a counter which tracks at which point of the history we are; you can imagine it as flipping one book page at a time.
Okay, enough introductory talk. Let's actually write an algorithm. First, we add a new folder in the rust directory with the name of our algorithm. We need to create at least two files: Cargo.toml and src/lib.rs. The first one is pretty simple and could look like this:
[package]
name = "GIVE ME A NAME PLEASE"
version = "0.1.0"
publish = false
edition = "2021"

[lib]
crate-type = ["cdylib", "rlib"]

[dependencies]
algorithm = { path = "../algorithm" }
graph = { path = "../graph" }
console_error_panic_hook = "0.1.7"
serde = { version = "1.0", features = ["derive"] }
wasm-bindgen = "0.2.83"

The first section is pretty boring for us. We give our crate a name and ignore almost everything else we could define here. Most of the settings in the [package] section are only useful if we were to publish our code to crates.io, but we won't be doing that.
The next section, [lib], is just some boilerplate for wasm.
The [dependencies] section then loads some necessary dependencies, including our algorithm and graph libraries. The example above contains the bare minimum of dependencies; depending on the algorithm you are implementing, it might be helpful to add more.
Let's get started with the source code. Don't worry, the full code is ready for you to copy once you've read through this wiki. We'll first add some minimal boilerplate to the src/lib.rs file.
#![warn(rust_2018_idioms, unreachable_pub)]
#![forbid(elided_lifetimes_in_paths, unsafe_code)]

use wasm_bindgen::prelude::wasm_bindgen;

/// Initialise the panic handler.
#[wasm_bindgen(start)]
pub fn init_panic_handler() {
    console_error_panic_hook::set_once();
}

Okay, now let's define the algorithm. We start by creating a type alias for our graph configuration and defining a visualisation state whose values will be shared with the frontend:
// These imports assume the items are re-exported from the algorithm crate;
// check rust/algorithm/example/ for the exact paths.
use algorithm::{
    export_algorithm_to_wasm, Algorithm, ExampleGraph, Property, Pseudocode,
    StateMachine, States, VisualisationState,
};
use graph::Graph;
use serde::{Deserialize, Serialize};

type MyAlgoConfiguration = (usize,);

#[derive(Clone, Debug, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
pub struct MyAlgoVisualisationState {
    pub graph: Graph<MyAlgoConfiguration>,
    pub helptext: String,
}

impl VisualisationState for MyAlgoVisualisationState {
    type Configuration = MyAlgoConfiguration;

    fn new(graph: Graph<MyAlgoConfiguration>, _start_node: usize) -> Self {
        Self {
            graph: graph.preprocess_links_as_undirected(),
            helptext: "Initialized".to_string(),
        }
    }
}

Okay, that's great. Now onto some pseudocode:
#[derive(Clone, Copy, Debug, Eq, PartialEq, Pseudocode)]
enum MyAlgoPseudocode {
    Initialise,
    /// Eat {food}.
    Eat { food: &'static str },
    /// Sleep for {hours}.
    Sleep { hours: usize },
    /// Repeat.
    Repeat,
}

And finally our algorithm:
async fn my_algo(
    mut state: MyAlgoVisualisationState,
    out: States<MyAlgoPseudocode, MyAlgoVisualisationState>,
) {
    let mut apple = true;
    loop {
        if apple {
            state.helptext = "Eat an apple to stay healthy".into();
            out.yield_state(
                MyAlgoPseudocode::Eat { food: "an apple" },
                state.clone(),
            ).await;
        } else {
            state.helptext = "Eat some rice to refill your energy".into();
            out.yield_state(
                MyAlgoPseudocode::Eat { food: "some rice" },
                state.clone(),
            ).await;
        }
        apple = !apple;
        state.helptext = "Get a good night's rest".into();
        out.yield_state(
            MyAlgoPseudocode::Sleep { hours: 8 },
            state.clone(),
        ).await;
        state.helptext = "Rinse and repeat".into();
        out.yield_state(MyAlgoPseudocode::Repeat, state.clone()).await;
    }
}

All that's left to do now is export everything to the frontend.
struct MyAlgo;

impl Algorithm for MyAlgo {
    type VisualisationState = MyAlgoVisualisationState;
    type Pseudocode = MyAlgoPseudocode;

    const EXAMPLES: &'static [ExampleGraph<'static>] = &[];
    const REQUIRED_PROPERTIES: &'static [Property] = &[];
    const INCOMPATIBLE_PROPERTIES: &'static [Property] = &[];

    fn new(state: Self::VisualisationState) -> StateMachine<Self::Pseudocode, Self::VisualisationState> {
        StateMachine::new(|out| my_algo(state, out))
    }
}

export_algorithm_to_wasm!(MyAlgo);

To ensure your new algorithm gets built properly, add it to the workspace (the Cargo.toml file in the root of the project) and update the PACKAGE_FOLDERS list in the rust/build.js file.
If you want this code ready for copy-pasting, take a look at the rust/algorithm/example/ folder.
If you require more functionality that the user should be able to access, you need to export it additionally using wasm_bindgen. Make sure to rename the exported functions (for example via #[wasm_bindgen(js_name = ...)]) so that they appear in camelCase rather than snake_case.
There is no strict rule for how you should do the unit tests, but for the sake of uniformity we would prefer that you create a tests.rs module in the respective src folder for your algorithm. To test parts of your algorithm, you need to run a state machine and validate intermediate states. You may take a look at other unit tests, such as those for Dijkstra, to get a better understanding of how to do this.
To run tests in general, you can use cargo test.
As a member of the Backend, you are also responsible for the bindings. That means: do not forget to add the dependency on your generated wasm binaries to the package.json file.