Compiler

Lennart Augustsson edited this page Apr 11, 2026 · 8 revisions

The compiler is written in Micro Haskell. It takes the name of a module (or a file name) and compiles it for a target (see Targets below). The module should contain a function main of type IO (), which becomes the entry point of the program.
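As a minimal sketch of the paragraph above, the following writes a one-function module and compiles it to an executable. The module name Example and its contents are illustrative, and the sketch assumes the current directory is on the module search path (add -i. if it is not). The compile step runs only when mhs is found on PATH.

```shell
# Write a tiny module whose main has type IO (), as the compiler expects.
cat > Example.hs <<'EOF'
module Example(main) where

main :: IO ()
main = putStrLn "Hello from MicroHs"
EOF

# Compile and run only if mhs is installed; otherwise just leave the source file.
if command -v mhs >/dev/null 2>&1; then
  mhs Example -oExample && ./Example
else
  echo "mhs not found on PATH; wrote Example.hs only"
fi
```

Since the output name Example ends in neither .comb nor .c, the -o rule described under Compiler flags yields a regular executable.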

Compiler flags

  • --version show version number
  • -i set module search path to empty
  • -iDIR append DIR to module search path
  • -oFILE output file. If the FILE ends in .comb it will produce a textual combinator file. If FILE ends in .c it will produce a C file with the combinators. For all other FILE it will compile the combinators together with the runtime system to produce a regular executable.
  • -r run directly
  • --interactive enter interactive mode, even with module arguments
  • -v be more verbose, flag can be repeated
  • -CW write compilation cache to .mhscache at the end of compilation
  • -CR read compilation cache from .mhscache at the start of compilation
  • -C short for -CW and -CR
  • -T generate dynamic function usage statistics
  • -z compress combinator code generated in the .c file
  • -l show every time a module is loaded
  • -s show compilation speed in lines/s
  • -XCPP run cpphs on source files
  • -Dxxx passed to cpphs
  • -Ixxx passed to cpphs
  • -tTARGET select target
  • -a set package search path to empty
  • -aDIR prepend DIR to package search path
  • -PPKG create package PKG
  • -L[FILE] list all modules in a package
  • -Q FILE [DIR] install package
  • -ddump-PASS debug, show AST after PASS. Possible passes: parse, derive, typecheck, desugar, toplevel, combinator, all
  • -- marks end of compiler arguments
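The -o flag picks the kind of output from the file name's suffix. The sketch below restates that rule as a small shell function and, when mhs is installed and an Example.hs module exists (both are assumptions), runs the three corresponding compilations. The function output_kind is purely illustrative and is not part of the compiler.

```shell
# Illustration only: mirrors the documented -o rule, not the compiler's own code.
output_kind() {
  case "$1" in
    *.comb) echo "textual combinator file" ;;
    *.c)    echo "C file with the combinators" ;;
    *)      echo "regular executable" ;;
  esac
}

for out in Example.comb Example.c Example; do
  echo "-o$out -> $(output_kind "$out")"
  # Only attempt the real compilation when mhs is installed and Example.hs exists.
  if command -v mhs >/dev/null 2>&1 && [ -f Example.hs ]; then
    mhs Example "-o$out"
  fi
done
```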

With the -v flag the processing time for each module is reported. E.g.

importing done MicroHs.Exp, 284ms (91 + 193)

which means that processing the module MicroHs.Exp took 284ms, with parsing taking 91ms and typecheck&desugar taking 193ms.
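If you want to collect these timings, a line in this format can be split into its parts with sed. This is a sketch that assumes the line format is exactly as in the sample above. The sample line is hard-coded here, whereas in practice it would be captured from the output of mhs -v.

```shell
# Sample line copied from the text above.
line='importing done MicroHs.Exp, 284ms (91 + 193)'

# Capture groups: module name, total ms, parse ms, typecheck&desugar ms.
pattern='^importing done \([^,]*\), \([0-9]*\)ms (\([0-9]*\) + \([0-9]*\))$'
module=$(printf '%s\n' "$line" | sed -n "s/$pattern/\1/p")
total=$(printf '%s\n' "$line" | sed -n "s/$pattern/\2/p")
parse=$(printf '%s\n' "$line" | sed -n "s/$pattern/\3/p")
rest=$(printf '%s\n' "$line" | sed -n "s/$pattern/\4/p")

echo "$module: total=${total}ms parse=${parse}ms typecheck+desugar=${rest}ms"

# The two parts should add up to the reported total.
[ $((parse + rest)) -eq "$total" ] && echo "parts sum to total"
```

For the sample line, the two parts (91 and 193) add up to the reported 284ms.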

With the -C flag, the compiler writes out its internal cache of compiled modules to the file .mhscache at the end of compilation. At startup it reads this file if it exists, and then validates the contents by an MD5 checksum for all the files in the cache. This can make compilation much faster since the compiler will not parse and typecheck a module if it is in the cache. Do NOT use -C when you are changing the compiler itself; if the cached data types change the compiler will probably just crash.
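A typical use of the cache might look like the sketch below. It assumes mhs is on PATH and that a module Example exists; both are illustrative. The final rm is an optional tidy-up suggested here, not something the compiler requires, for when you are about to work on the compiler itself.

```shell
if command -v mhs >/dev/null 2>&1 && [ -f Example.hs ]; then
  mhs -C Example -oExample   # first run: writes .mhscache at the end of compilation
  mhs -C Example -oExample   # later runs: read .mhscache and reuse cached modules
fi

# Before changing the compiler itself: drop any existing cache and build without -C.
rm -f .mhscache
```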

Interactive mode

If no module name is given the compiler enters interactive mode. You can enter expressions to be evaluated, or top level definitions (including import). Simple line editing and TAB completion are available.

All input lines are saved in ~/.mhsi. This file is read on startup so the command history is persisted.

Available commands:

  • :quit Quit the interactive system
  • :clear Get back to start state
  • :del STR Delete all definitions that begin with STR
  • :reload Reload all modules
  • :type EXPR Show type of EXPR
  • :kind TYPE Show kind of TYPE
  • :set [FLAG] Set flag
  • :edit [FILE] edit a file or the last error location
  • :save FILE save all definitions to file
  • :find NAME go to the definition of NAME
  • :main [ARGS] run main with the given arguments
  • :! CMD execute shell command
  • :help show help message
  • expr Evaluate expression.
  • defn Add definition (can also be an import)

Bootstrapping

The compiler can compile itself. To replace bin/mhs with a new version, run make bootstrap. This will recompile the compiler twice and compare the outputs to make sure the new compiler still works.

Bootstrapping with Hugs

It is also possible to bootstrap MicroHs using Hugs. That means that MicroHs can be built from scratch in the sense of bootstrappable.org. To compile with Hugs you need a slightly patched version of Hugs and also the hugs branch of MicroHs.

The patched version of Hugs is needed to work around undefined behavior with arithmetic overflow. Hugs provided by Linux distributions or other third party package managers may or may not work.

Targets

The configuration file targets.conf (in the installation directory) defines how to compile for different targets. As distributed, it contains the targets default and emscripten. The first is the normal target to run on the host. The emscripten target uses emcc to generate JavaScript/WASM. If you have emcc and node installed you can do

mhs -temscripten Example -oout.js
node out.js

to compile and run the JavaScript. The generated JavaScript file has some regular JavaScript, and also the WASM code embedded as a blob. Running via JavaScript/WASM is almost as fast as running natively.

Environment variables

  • MHSDIR the directory where lib/ and src/ are expected to be. Defaults to ./.
  • MHSCC command used to compile C files to produce binaries. Look at the source for more information.
  • MHSCPPHS command to use with -XCPP flag. Defaults to cpphs.
  • MHSCONF which runtime to use, defaults to unix-32/64 depending on your host's word size
  • MHSEXTRACCFLAGS extra flags passed to the C compiler
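These variables are read from the environment of the mhs process, so they can be exported before invoking the compiler. The sketch below keeps any existing MHSDIR, falls back to the documented default otherwise, and passes one extra C compiler flag. The value -O2 and the module name Example are illustrative assumptions.

```shell
# Keep an existing MHSDIR if set; otherwise use the documented default (./).
export MHSDIR="${MHSDIR:-.}"
# Extra flags handed to the C compiler; -O2 is only an example value.
export MHSEXTRACCFLAGS="-O2"

echo "MHSDIR=$MHSDIR MHSEXTRACCFLAGS=$MHSEXTRACCFLAGS"

# Compile only when mhs is installed and the example module exists.
if command -v mhs >/dev/null 2>&1 && [ -f Example.hs ]; then
  mhs Example -oExample
fi
```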

Compiler modules

  • Abstract, combinator bracket abstraction and optimization.
  • Compile, top level compiler. Maintains a cache of already compiled modules.
  • CompileCache, cache for compiled modules.
  • Deriving, do deriving for various type classes.
  • Desugar, desugar full expressions to simple expressions.
  • EncodeData, data type encoding.
  • Exp, simple expression type.
  • ExpPrint, serialize Exp for the runtime system.
  • Expr, parsed expression type.
  • FFI, generate C wrappers for FFI.
  • Fixity, resolve operator fixities.
  • Flags, compiler flags.
  • Graph, strongly connected component algorithm.
  • Ident, identifiers and related types.
  • IdentMap, map from identifiers to something.
  • Interactive, top level for the interactive REPL.
  • Lex, lexical analysis and indentation processing.
  • Main, the main module. Decodes flags, compiles, and writes result.
  • MakeCArray, generate a C version of the combinator file.
  • Parse, parse and build an abstract syntax tree.
  • StateIO, state + IO monad.
  • SymTab, symbol table manipulation.
  • TCMonad, type checking monad.
  • Translate, convert an expression tree to its value.
  • TypeCheck, type checker.
