When parsing text, Pegger on it's own will not return anything useful. Instead it is up to you to define useful rule actions that get executed when a rule passes; usually when a successful match happens.
Recursive / HTML Parsing
A well known limitation of regular expressions is that they can not match nested patterns, such as HTML. (See StackOverflow for explanation.) Pegger to the rescue!
Below is an example that parses nested XML tags and throws a ParseErr should an incorrect end tag be used. It uses the NamedRules helper class to reference rules to get around inherent problems with recursion.
using xml
using afPegger
class Example {
Void main() {
Parser#.pod.log.level = LogLevel.debug
elem := TagParser().parseTags("<html><title>Pegger Example</title><body>Parsing is Easy!</body></html>")
echo(elem.writeToStr) // -->
// <html>
// <title>Pegger Example</title>
// <body>Parsing is Easy!</body>
// </html>
elem = TagParser().parseTags("<html><title>Pegger Example</oops></html>")
// --> sys::ParseErr: End tag </oops> does not match start tag <title>
}
}
class TagParser : Rules {
XElem? root
XElem? elem
XElem parseTags(Str tags) {
Parser(rules).parse(tags.in)
return root
}
Rule rules() {
// use 'NamedRules' to define rules in any order and to avoid recursion
rules := NamedRules()
element := rules["element"]
startTag := rules["startTag"]
endTag := rules["endTag"]
text := rules["text"]
rules["element"] = sequence([startTag, zeroOrMore(firstOf([element, text])), endTag])
rules["startTag"] = sequence([char('<'), oneOrMore(anyAlphaChar), char('>')]) .withAction { pushStartTag(it) }
rules["endTag"] = sequence([str("</"), oneOrMore(anyAlphaChar), char('>')]) .withAction { pushEndTag(it) }
rules["text"] = oneOrMore(anyCharNot('<')) .withAction { pushText(it) }
return element
}
Void pushStartTag(Str tagName) {
child := XElem(tagName[1..<-1])
if (root == null)
root = child
else
elem.add(child)
elem = child
}
Void pushEndTag(Str tagName) {
if (tagName[2..<-1] != elem.name)
throw ParseErr("End tag ${tagName} does not match start tag <${elem.name}>")
elem = elem.parent
}
Void pushText(Str text) {
elem.add(XText(text))
}
}
Note that only Chuck Norris can parse HTML with regular expressions.
Dynamic Rules
Should you require more dynamic behaviour from the rules, you can always implement your own Rule.
Debug
By enabling debug logging, Pegger will spew out a lot of debug / trace information. (Possiblly more than you can handle!) But note it will only emit debug information for rules with names.
Enable debug logging with the line:
afPegger::Parser#.pod.log.level = LogLevel.debug
Which, for the above tag parsing example, will log content like:
SlimerDude Fri 26 Sep 2014
Pegger Preview Release!
For when Regular Expressions just aren't enough!
Peggeris a Parsing Expression Grammar (PEG) implementation. It lets you build text parsers by building up a tree of simple matching rules.Peggerwas inspired by Mouse and Parboiled.Peggeris different to Dmitry's / dsav's PEG in that (currently) all rules are defined with Fantom code.Quick Start
class Example : Rules { Void main() { nameRule := oneOrMore(anyAlphaChar).withAction |Str match| { echo(match) } parseRules := sequence([char('<'), nameRule, char('>')]) Parser(parseRules).parse("<HelloMum>".in) // --> HelloMum } }Usage
When parsing text,
Peggeron it's own will not return anything useful. Instead it is up to you to define useful rule actions that get executed when a rule passes; usually when a successful match happens.Recursive / HTML Parsing
A well known limitation of regular expressions is that they can not match nested patterns, such as HTML. (See StackOverflow for explanation.)
Peggerto the rescue!Below is an example that parses nested XML tags and throws a
ParseErrshould an incorrect end tag be used. It uses the NamedRules helper class to reference rules to get around inherent problems with recursion.using xml using afPegger class Example { Void main() { Parser#.pod.log.level = LogLevel.debug elem := TagParser().parseTags("<html><title>Pegger Example</title><body>Parsing is Easy!</body></html>") echo(elem.writeToStr) // --> // <html> // <title>Pegger Example</title> // <body>Parsing is Easy!</body> // </html> elem = TagParser().parseTags("<html><title>Pegger Example</oops></html>") // --> sys::ParseErr: End tag </oops> does not match start tag <title> } } class TagParser : Rules { XElem? root XElem? elem XElem parseTags(Str tags) { Parser(rules).parse(tags.in) return root } Rule rules() { // use 'NamedRules' to define rules in any order and to avoid recursion rules := NamedRules() element := rules["element"] startTag := rules["startTag"] endTag := rules["endTag"] text := rules["text"] rules["element"] = sequence([startTag, zeroOrMore(firstOf([element, text])), endTag]) rules["startTag"] = sequence([char('<'), oneOrMore(anyAlphaChar), char('>')]) .withAction { pushStartTag(it) } rules["endTag"] = sequence([str("</"), oneOrMore(anyAlphaChar), char('>')]) .withAction { pushEndTag(it) } rules["text"] = oneOrMore(anyCharNot('<')) .withAction { pushText(it) } return element } Void pushStartTag(Str tagName) { child := XElem(tagName[1..<-1]) if (root == null) root = child else elem.add(child) elem = child } Void pushEndTag(Str tagName) { if (tagName[2..<-1] != elem.name) throw ParseErr("End tag ${tagName} does not match start tag <${elem.name}>") elem = elem.parent } Void pushText(Str text) { elem.add(XText(text)) } }Note that only Chuck Norris can parse HTML with regular expressions.
Dynamic Rules
Should you require more dynamic behaviour from the rules, you can always implement your own Rule.
Debug
By enabling debug logging,
Peggerwill spew out a lot of debug / trace information. (Possiblly more than you can handle!) But note it will only emit debug information for rules with names.Enable debug logging with the line:
Which, for the above tag parsing example, will log content like:
Have fun!