On the Reddit discussion of this [0], someone mentioned using a signature of
    fn parse(&self, input: &mut &str) -> Option<Output>
instead of the article's
    fn parse(&self, input: &'a str) -> Result<(&'a str, Output), &'a str>
for composability. I found the article fascinating and plan on going back to see what an XML parsing implementation based on the former would look like.
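For concreteness, here's a minimal sketch of the `&mut &str` style (the `literal` helper is my own toy example, not code from that thread): on success the parser advances the input slice in place, and on failure it returns None and leaves the input untouched.

    // A parser here is any Fn(&mut &str) -> Option<Output>.
    fn literal(expected: &'static str) -> impl Fn(&mut &str) -> Option<()> {
        move |input: &mut &str| {
            let s: &str = *input;                 // copy the inner slice out
            let rest = s.strip_prefix(expected)?; // no match -> None, input unchanged
            *input = rest;                        // match -> advance in place
            Some(())
        }
    }

    fn main() {
        let mut input = "<tag>";
        let open = literal("<");
        assert_eq!(open(&mut input), Some(()));
        assert_eq!(input, "tag>"); // the input was advanced past the match
    }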
[0]: https://www.reddit.com/r/rust/comments/bepi63/learning_parse...
This was such a wonderful read! I've been getting into Rust recently, and the sections on dealing with challenges that are specific to Rust were particularly useful. The way they created a new trait to turn `Fn(&str) -> Result<(&str, T), &str>` into `Parser<T>` was insightful, and the discussion of how they dealt with the growing sizes of types was something that I can imagine myself running into in the future.
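From memory, the trait machinery is roughly this (exact names may differ from the article): a type alias for the result, a `Parser` trait, and a blanket impl so that any closure with the right signature automatically is a `Parser`.

    type ParseResult<'a, T> = Result<(&'a str, T), &'a str>;

    trait Parser<'a, T> {
        fn parse(&self, input: &'a str) -> ParseResult<'a, T>;
    }

    // Blanket impl: every matching closure is a Parser for free.
    impl<'a, F, T> Parser<'a, T> for F
    where
        F: Fn(&'a str) -> ParseResult<'a, T>,
    {
        fn parse(&self, input: &'a str) -> ParseResult<'a, T> {
            self(input)
        }
    }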
Most importantly though, when they started writing `and_then`, my eyes lit up and I said "It's a Monad!" I think this is the first time I've really identified a Monad out in the wild, so I enjoyed that immensely.
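For anyone else squinting at the Monad: `and_then` is monadic bind. Using the `Parser` trait sketched above, the free-function version looks something like this (close to, if not exactly, the article's):

    // Bind: run `parser`, then feed its result to `f`, which picks the
    // next parser to run on the remaining input.
    fn and_then<'a, P, F, A, B, NextP>(parser: P, f: F) -> impl Parser<'a, B>
    where
        P: Parser<'a, A>,
        NextP: Parser<'a, B>,
        F: Fn(A) -> NextP,
    {
        move |input| match parser.parse(input) {
            Ok((next_input, result)) => f(result).parse(next_input),
            Err(err) => Err(err),
        }
    }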
It doesn't feel very declarative in Rust. Personally, I'm finding it hard to see the intent (I haven't written a line of Rust in my life, so take that with a pinch of salt, but I am a polyglot programmer).
Really, Haskell's do notation is the big winner when it comes to parser combinators: the direction of flow of the parser is easy to follow, and you can capture variables mid-flight for use later in the expression without obvious nested scope blocks.
It's possible to capture variables with `and_then` by the looks of it, but any suitably complex parser will end up as quite an ugly mess of nested scopes.
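To illustrate with plain std types (`Option::and_then` is the same bind pattern), here's a toy sketch with a made-up `parse_digit` helper. Every variable you want to keep in scope costs another level of closure nesting; do notation flattens exactly this.

    // Haskell: do { a <- digit; b <- digit; pure (a, b) }
    fn parse_digit(s: &str) -> Option<(char, &str)> {
        let mut chars = s.chars();
        let c = chars.next().filter(|c| c.is_ascii_digit())?;
        Some((c, chars.as_str()))
    }

    fn main() {
        // Each `<-` becomes another nested closure so `a` stays in scope.
        let result = parse_digit("42").and_then(|(a, rest)| {
            parse_digit(rest).map(|(b, _rest)| (a, b))
        });
        assert_eq!(result, Some(('4', '2')));
    }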
I ported Haskell's Parsec to C# [1]; C# has LINQ, which is similar to Haskell's do notation. Simple parsers [2] are beautifully declarative, and even complex ones, like this floating-point number parser [3], are trivial to follow.
[1] https://github.com/louthy/language-ext
[2] https://github.com/louthy/language-ext/blob/master/LanguageE...
[3] https://github.com/louthy/language-ext/blob/master/LanguageE...
Nice article. I finally gave Rust a try recently. It's really interesting how new languages evolve, and what "deficiencies" they exhibit. The article, for example, uses closures, but it's currently impossible in stable Rust to accept a closure that itself accepts a closure as an argument (while you can easily rewrite the same pattern with structs; see the sketch at the end of this comment). The borrow checker could still do better at suggesting fixes for common problems (otherwise it's actually quite elegant). What struck me while reading this was the use of assert_eq!(expected, actual), as I've mostly seen the other order. Sure enough, I checked, and the macro does not define an order. That's unfortunate, as testing against a fixed "expected" outcome is very common, and a defined order would make for friendlier testing devx (which, while supported out of the box, in general isn't great).
On the other hand, Rust's IDE support and built-in linting are seriously impressive.
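To spell out the closure limitation (the example names here are mine, not from the article): nested `impl Trait`, as in `impl Fn(impl Fn() -> i32)`, is rejected by the compiler, while the equivalent encoding with a struct and a trait object compiles fine:

    // Not valid Rust: nested `impl Trait` in an `Fn` bound is rejected.
    //     fn call_with_callback(f: impl Fn(impl Fn() -> i32) -> i32) -> i32
    // The same shape, encoded with a trait and a struct, works:
    trait Callback {
        fn call(&self) -> i32;
    }

    struct FortyTwo;

    impl Callback for FortyTwo {
        fn call(&self) -> i32 {
            42
        }
    }

    fn call_with_callback(f: impl Fn(&dyn Callback) -> i32) -> i32 {
        f(&FortyTwo)
    }

    fn main() {
        // The closure receives a callback object instead of a nested closure.
        assert_eq!(call_with_callback(|cb| cb.call() + 1), 43);
    }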
If someone wants to have a look at the code of a cutting-edge parser combinator framework with a focus on features and usability, I'll plug this here (it's in Java):
https://github.com/norswap/autumn4
WIP, but 1.0.0 will land sometime within the next two months, with a full user guide (half of it is already written and available).
Constructive feedback welcome!
Okay, I am interested in this topic. Does anyone know of any good resources for exploring parser combinators further?
What is the class of languages that can be parsed with such parsers, in the sense of [1]?
[1] https://en.wikipedia.org/wiki/Context-free_grammar#Subclasse...
Nice read! I also wrote a SQL dump parser in Rust; here's the code.
I don't like Rust for this purpose. It doesn't have higher-kinded types, and thus no applicatives or monads, which sort of misses the point.
I also object to the idea that parser combinators are an alternative to parser generators; they're each useful in different scenarios. But for something like XML, parser combinators will be slower.
I'd also be curious to see how the efficiency of parser combinators is affected by the absence of laziness in Rust. I seem to recall that laziness makes the analysis more complicated than you'd expect, but I need to find a source...
>> The novice programmer will ask, "what is a parser?"
>> The intermediate programmer will say, "that's easy, I'll write a regular expression."
>> The master programmer will say, "stand back, I know lex and yacc."
The Prolog programmer will write a Definite Clause Grammar [1], which is both a grammar and a parser, two-in-one. So you only have to do the easy and fun bit of writing a parser, which is defining the grammar.
Leaves plenty of time to get online and brag about the awesome power of Prolog or get embroiled in flamewars with functional programming folks [2].
______________
[1] https://en.wikipedia.org/wiki/Definite_clause_grammar
[2] Actually, DCGs are kiiind of like parser combinators. Ish. In the sense that they're executable grammars. But in Prolog you can run your programs backwards so your DCG is both a recogniser and a generator.