Regex equivalence command-line interface #1

Open
JLimperg wants to merge 3 commits into peterthiemann:master from JLimperg:patch-0

@JLimperg JLimperg commented Dec 6, 2015

This PR contains code for the regex-equiv programme I sent earlier. The regex parser is in src/Regexp/Parser.hs; the rest of this pull request consists mainly of build-related boilerplate because the parser requires some external packages.

For now, the modules
  Examples
  Regexp2
  Tests
are not listed in the Cabal file at all and thus won't be built. It would
be desirable to convert Tests into a proper test suite.
@peterthiemann
Owner

Using attoparsec seems like overkill to me. Simply using parsec would have done the job just fine,
don't you agree?

@JLimperg
Author

Certainly. I use Attoparsec over Parsec whenever I can (i.e. whenever I don't need good error messages) because the backtracking semantics are less annoying. But if you prefer Parsec (or the recent fork, Megaparsec), a rewrite should not take long.

(Earley would be another interesting alternative, trading performance for implementation complexity.)
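
To illustrate the difference in backtracking semantics mentioned above, here is a minimal sketch (not taken from the PR; all parser names are hypothetical). It assumes the `parsec`, `attoparsec` and `text` packages are available. In Parsec, a branch that fails after consuming input commits the choice unless it is wrapped in `try`, whereas Attoparsec always backtracks.

```haskell
import Control.Applicative ((<|>))
import qualified Data.Attoparsec.Text as A
import qualified Data.Text as T
import qualified Text.Parsec as P
import qualified Text.Parsec.String as PS

-- Both branches start with 'a'; only the second one matches "ac".

-- Parsec without 'try': the first branch consumes 'a' before failing,
-- so the second branch is never attempted.
parsecNoTry :: PS.Parser Char
parsecNoTry = (P.char 'a' *> P.char 'b') <|> (P.char 'a' *> P.char 'c')

-- Parsec with 'try': the first branch rewinds the input on failure.
parsecTry :: PS.Parser Char
parsecTry = P.try (P.char 'a' *> P.char 'b') <|> (P.char 'a' *> P.char 'c')

-- Attoparsec: no 'try' needed, every failing branch backtracks.
attoNoTry :: A.Parser Char
attoNoTry = (A.char 'a' *> A.char 'b') <|> (A.char 'a' *> A.char 'c')

-- Hypothetical helpers that discard the error message.
runParsec :: PS.Parser Char -> String -> Maybe Char
runParsec p = either (const Nothing) Just . P.parse p ""

-- Attoparsec works on packed Text, so a String has to be packed first.
runAtto :: A.Parser Char -> String -> Maybe Char
runAtto p = either (const Nothing) Just . A.parseOnly p . T.pack

main :: IO ()
main = do
  print (runParsec parsecNoTry "ac") -- Nothing
  print (runParsec parsecTry "ac")   -- Just 'c'
  print (runAtto attoNoTry "ac")     -- Just 'c'
```

The `T.pack` step in `runAtto` also shows the conversion needed when the input starts out as a plain `String`.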

@peterthiemann
Owner

The thing is that Attoparsec is tuned to run on space-efficient text encodings,
whereas Parsec processes plain lists, which is what we have here to start with.

Earley is an algorithm for parsing general CFGs; that generality is not required here.
It's also not an intrinsically functional algorithm.
Not to mention that a combinator parser can deal with non-CFLs.
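
As a small illustration of the last point, the following sketch recognises the language a^n b^n c^n (n >= 1), which is not context-free. It uses `ReadP` from `base` purely to stay dependency-free; the names `anbncn` and `accepts` are hypothetical and not part of the PR. The monadic bind lets later parsers depend on the number of `a`s seen earlier.

```haskell
import Text.ParserCombinators.ReadP (ReadP, char, count, eof, many1, readP_to_S)

-- Parse a^n b^n c^n and return n. The count of 'a's determines how many
-- 'b's and 'c's are required, which a context-free grammar cannot express.
anbncn :: ReadP Int
anbncn = do
  as <- many1 (char 'a')
  let n = length as
  _ <- count n (char 'b')
  _ <- count n (char 'c')
  eof
  pure n

-- True if the whole input is in the language.
accepts :: String -> Bool
accepts = not . null . readP_to_S anbncn

main :: IO ()
main = do
  print (accepts "aabbcc")  -- True
  print (accepts "aabbc")   -- False
  print (accepts "aabbbcc") -- False
```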

On Dec 23, 2015, at 7:06 PM, Jannis Limperg notifications@github.com wrote:

Certainly. I use Attoparsec over Parsec whenever I can (i.e. whenever I don't need good error messages) because the backtracking semantics are less annoying. But if you prefer Parsec (or the recent fork, Megaparsec), a rewrite should not take long.

(Earley would be another interesting alternative, trading performance for implementation complexity.)

@JLimperg
Author

JLimperg commented Jan 3, 2016

I'm not too concerned about efficiency issues because we won't be dealing with particularly big inputs. But as I said, if you like Parsec better, I'll port to Parsec.

(I find Earley appealing because it is able to deal with left-recursive grammars, which would allow for a more natural specification of the grammar for infix operators. I might experiment a bit to see how bad the performance gets.)
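
For comparison, here is a sketch of the usual workaround in a combinator parser, where a directly left-recursive rule would loop forever: parse one operand, then fold further operands to the left with `chainl1`. It uses `ReadP` from `base` only to avoid dependencies; the `Regex` type and all function names are hypothetical and are not the definitions in src/Regexp/Parser.hs.

```haskell
import Data.Char (isAlphaNum)
import Text.ParserCombinators.ReadP
  (ReadP, between, chainl1, char, eof, many, readP_to_S, satisfy, (+++))

-- Hypothetical regex AST, for illustration only.
data Regex
  = Sym Char
  | Cat Regex Regex
  | Alt Regex Regex
  | Star Regex
  deriving (Eq, Show)

-- The left-recursive grammar one would like to write directly:
--   alt ::= alt '|' cat | cat
--   cat ::= cat star    | star
-- With combinators, each rule becomes "operand, then left fold".
alt :: ReadP Regex
alt = cat `chainl1` (Alt <$ char '|')

-- Concatenation has no operator symbol, so the operator parser consumes nothing.
cat :: ReadP Regex
cat = star `chainl1` pure Cat

-- An atom followed by zero or more '*'.
star :: ReadP Regex
star = do
  a <- atom
  stars <- many (char '*')
  pure (foldl (\r _ -> Star r) a stars)

-- A single alphanumeric symbol or a parenthesised expression.
atom :: ReadP Regex
atom = (Sym <$> satisfy isAlphaNum) +++ between (char '(') (char ')') alt

-- Keep only parses that consume the whole input.
parseRegex :: String -> Maybe Regex
parseRegex s = case [r | (r, "") <- readP_to_S (alt <* eof) s] of
  (r : _) -> Just r
  []      -> Nothing

main :: IO ()
main = do
  print (parseRegex "a|bc*")
  print (parseRegex "a|b|c")
```

The grammar in the comment is what an Earley parser could accept as written; the `chainl1` version encodes the same left associativity at the cost of a less direct specification.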
