GitHub - patrontech/KIBA-Parsimonious: Kotlin parser combinator library

Parsimonious Parser Combinators

Parsers and parser combinators are means of parsing a stream of some type into some other type. For example, one might start with a string whose contents are a mathematical expression such as 2 + 3. We want to get from this string to an array of tokens which can then be further processed to evaluate the expression itself or display it with syntax highlighting or for some other purpose.

I can’t do justice to this topic here. Wikipedia has the canonical article and your favorite search engine will have lots more information.

For those that already know a bit about parser combinators, a Parsimonious parser is a function with the following type signature:

typealias Parser<E, T> = Context<E>.() -> T

A parser combinator, as opposed to a parser, is a function which takes zero or more parsers as input and returns another parser as output. For example, the infix function or is a combinator:

infix fun <E, T> Parser<E, T>.or(rhs: Parser<E, T>): Parser<E, T> {
    val lhs = this
    return {
        try {
            lhs(this)
        } catch (e: ParseException) {
            rhs(this)
        }
    }
}

This is simpler than it looks. The or combinator returns a parser which first attempts the left-hand parser. If that fails, it tries the right-hard parser, and if that fails, parsing fails.

A note on style

Parser combinators are a technique from functional programming, not object-oriented programming. As such, Parsimonious includes only one object, Context<E> and many functions declared at the top level. There are also many infix functions. This may feel strange and not very "Kotlin", but attempting to write parser combinators in an object-oriented style results in code that is difficult to read and extremely verbose.

The S suffix found on identifiers such as ParserS, countS and manyS is used to indicate that these are special variants that work on a Context<Char> and return Strings rather than Char or List<Char>.

The fundamental combinators

While there are others, the three most fundamental combinators are satisfy, count and or. Almost all other combinators are built on top of these. If you understand these, you can understand everything.

`satisfy`

The satisfy combinator takes (E) -> Boolean as its argument. If the element at the Context's index passes the test, it is returned. If not, a ParseException is thrown.

val e = satisfy<Char> { it == 'e' }
val result = Context.fromString("e") parseWith e

`count`

The count combinator takes three arguments: a minimum count, a maximum count, and a parser and attempts to match that parser a valid number of times.

val e = satisfy<Char> { it == 'e' }
val result = Context.fromString("eeeee") parseWith count(2, Int.MAX_VALUE, e)

This attempts to march the e parser at least 2 times, which obviously succeeds.

`or`

The or combinator has already been described above.

Why use parser combinators? They look difficult.

In simple cases, you don't need them. Take this expression for instance:

x AND y AND NOT z

This can be parsed fairly easily using string splitting and regular expressions, although parser combinators are hundreds of times faster in most cases than regular expressions. But things fall apart quickly as soon as we allow parenthetic expressions and recursion:

x AND y AND NOT (z OR NOT (a AND b))

As soon as recursion and nesting are allowed, regular expressions and string splitting fall flat on their faces. The easiest way — really, the only reliable way — to parse the expression above is to use a formal parsing methodology such as a recursive descent parser. Really, we have two options: write a custom recursive descent parser or use a library that helps us implement a recursive descent parser. Parsimonious is one such library.

To see how to implement a parsing grammar for x AND y AND NOT (z OR NOT (a AND b)), look at ConditionTest, which implements and tests a complete grammar for expressions of this type.

Name		Name	Last commit message	Last commit date
Latest commit History 48 Commits
.idea		.idea
gradle/wrapper		gradle/wrapper
lib		lib
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
build.gradle		build.gradle
gradlew		gradlew
gradlew.bat		gradlew.bat
settings.gradle.kts		settings.gradle.kts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Parsimonious Parser Combinators

A note on style

The fundamental combinators

`satisfy`

`count`

`or`

Why use parser combinators? They look difficult.

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Parsimonious Parser Combinators

A note on style

The fundamental combinators

satisfy

count

or

Why use parser combinators? They look difficult.

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

`satisfy`

`count`

`or`

Packages