I'm thinking about writing a small NB plugin to support Fan syntax
At first I'm just planning to do the highlighting and basic stuff. If i have time i might try to do more: completion etc... would be nice, not sure i'll find time though .. but maybe.
Anyway here are 2 questions:
What's my best resource (doc) listing the Fan syntax (keywords/structure etc..) - also are any of the files in adm/tools/ considered fairly "complete"?
Are there sources available for one of the currently supported IDEs that I could look at / cheat from?
tcolar Tue 5 May 2009
Found this: http://drorbr.blogspot.com/2009/04/intellij-plugin-for-fan-programming.html that should help a bit :)
brian Tue 5 May 2009
I think that would be awesome!
Even basic syntax color coding for major IDEs would really help Fan grow.
Keywords and grammar are documented in docLang::Grammar
tcolar Tue 5 May 2009
Thanks, i don't guarantee anything, but i'll give it a shot as time allows. I've grown to like NB a lot (since 6.5).
tcolar Wed 6 May 2009
I've done highlighting and most of the grammar (your grammar doc helps lots !) I was wondering, do you have some large piece of code i could use for testing, or something with as much of the different syntax represented.
Maybe a compiler test data source or something like that.
Thanks.
tactics Wed 6 May 2009
Fan itself is the largest repository of Fan code. You might want to check out $FAN_HOME/src/compiler/fan/parser/Parser.fan, the source code for the parser.
freddy33 Wed 6 May 2009
I used the source files of Fan code itself to test the IntelliJ Parser. It gives great confidence in your Parser once you have no errors parsing the 470 fan files in there. Fan4Idea mercurial repo
tcolar Wed 6 May 2009
Thanks. BTW in the grammar page, the syntax for "for" seems to be missing a bracket after <forinit> isn't it ?
http://fandev.org/doc/docLang/Grammar.html
brian Wed 6 May 2009
Thanks. BTW in the grammar page, the syntax for "for" seems to be missing a bracket after <forinit> isn't it ?
fixed, thanks
tcolar Wed 6 May 2009
Another one :-) <bitAndExpr> an extra parenthesis i'll let you know if i find anymore :)
tcolar Wed 6 May 2009
I don't care whether you fix it right away, but you said you fixed the "for" and i reloaded the page and it's still broke.
Maybe the "wiki" parser is getting confused & screwing it up ? i wrote a wiki parser myself and that kind of pages get tricky :)
tcolar Wed 6 May 2009
One more, sorry :-) Seems like ternaryExpr is not referenced anywhere in the grammar
should it be: <expr> := <assignExpr> | <ternaryExpr>
not sure....
tcolar Wed 6 May 2009
Ok two more, last ones i promise :)
The definition of <formal> .. it says <formaTypeOnly> instead of <formalTypeOnly> (missing a "l")
funcType is missing a closing bracket, not 100% sure where it should go, after <type> (i think) or after "|".
Thanks
tcolar Wed 6 May 2009
Also is there grammar somewhere for the literals: Boolean,Int,FloatDecimal,Str,Duration,Uri:
If not i can probably figure it out i guess.
andy Wed 6 May 2009
I don't care whether you fix it right away, but you said you fixed the "for" and i reloaded the page and it's still broke.
These docs actually come from the latest build (1.0.41 right now). So the online docs don't actually get updated until we roll the next build. So its fixed in the hg repo for the next update.
tcolar Wed 6 May 2009
sounds good
brian Wed 6 May 2009
I fixed all these. Thanks for catching those problems and taking the time to report them.
Also is there grammar somewhere for the literals: Boolean,Int,FloatDecimal,Str,Duration,Uri
Take a look at docLang::Literals
cheeser Wed 6 May 2009
Is there a full, formal language definition we could feed to something like antlr? I was playing around a while back with building some stuff in java to play with the syntax...
tcolar Sat 9 May 2009
Damn it, apparently NB is phasing out GLF (lang support) in 6.7, which is what i have been implementing all week. New versions will use a plain java parser ... so i guess i'll have to use ANTLR, which looks interesting to learn.
So i'll probably work on making an ANTLR setup now i guess.
Kinda pissed i have to start over though :(
freddy33 Mon 11 May 2009
Sorry about not reporting all the grammar errors I met on the fly, I was sure I'd have some time before someone implemented another Java Parser :) So, here are some other grammar/literals issues in the docs I encountered:
The modifiers list is missing quite a bunch:
for fields: const native volatile override virtual final
for methods: native
The number literals can start with 0 like "02s" 2 seconds
The curly brackets are optional in try/catch/finally and switch blocks (They are written mandatory in the grammar)
The range expression was missing the ... but now ..< is OK
I also took some shortcuts that may be sensible ones like:
<namedSuper> is only for simpleType not all type. Simplified a lot my parsing: <namedSuper> := <simpleType> "." "super"
I'm forgetting something, hope this little helps, Good luck, I hope to have a new nice version (with code completion) for our IntelliJ Plugin for JavaOne. Hope to see some of you, Fred.
KevinKelley Mon 11 May 2009
A few things with literals, I think, ought to be tightened up someday. For instance I don't see anything that says
ch := '\u0______0______2______0' // a really big space
isn't perfectly fine. Actually for readability you'd almost want leading/trailing underbars sometimes, and I guess it wouldn't hurt anything to allow that, but for now I'm going with underbars as separator as the doc says.
Underbar as separator: true for parts of a float?
1_._2_e_3_ // none of those underbars legal?
Missing parts in a float not legal?
.1 // should be 0.1 ?
1. // should be 1 or 1.0 ?
1e1 // missing fraction okay
.1e1 // should be 0.1e1 ?
Those all actually seem reasonable to me, and convenient, but I can't tell if they're allowed.
Leading/trailing zeros, okay I guess?
001.100e001 // == 11
tcolar Mon 11 May 2009
Thanks much for the infos.
freddy33: Did you write your own parser or did you use ANTLR or javacc ? or maybe IntelliJ has its own language API ?
I'm doing OK, but the built-in Netbeans language grammar tool (schliemann) makes it hard to deal with the linebreaks !
freddy33 Tue 12 May 2009
For the parser IntelliJ proposed a tuned JFlex setup that we used. So, our Fan.flex grammar file is doing the parsing. Now the lexer is not doing much beside giving useful valid tokens, since in an IDE when writing code it permanently has syntax errors (Or you're not writing anything :).
BTW: At the end the parser is quite "hand-made", it's the ASTNode creation, manipulation and validation that is provided by IntelliJ.
brian Wed 13 May 2009
Sorry about not reporting all the grammar errors I met on the fly, was sure I'll have some time before someone will implement another Java Parser
Thanks for reporting all those mistakes Fred - I checked-in corrections.
tcolar Wed 20 May 2009
Update & Question Now that i gave up on schliemann and moved to Antlr i made much progress (though there was a learning curve). I have the whole grammar working .. though i skipped "doc" .. what is "doc"?
Anyway now i'm running parser tests against Fan source code, and i ran into this: in Actor.fan (~line 29) :
new make(ActorGroup group, |Obj?,Context-> Obj?|? receive := null)
OK, the grammar page does not seem to allow for this kind of syntax and i'm actually not sure exactly what it means (haven't really done much Fan coding yet :) ... can you explain what kind of construct is that and what is the deal with all the ? .. is this some sort of a closure or is that the syntax for an "Actor" ?
Thanks.
tcolar Wed 20 May 2009
I guess it's just that the grammar for type should allow for an optional ? (nullable syntax) right ?
andy Wed 20 May 2009
Yeah, those are just nullable types:
new make(ActorGroup group, |Obj?, Context-> Obj?|? receive := null)
first arg to receive may be null
return from receive may be null
receive may be null when passed to make
tcolar Fri 22 May 2009
In Obj.fan, line 73:
virtual Obj? trap(Str name, Obj?[]? args)
is "Obj?[]?" a valid syntax ?? what does it mean ? Thanks
tompalmer Fri 22 May 2009
is "Obj?[]?" a valid syntax ?? what does it mean ?
I believe it's a nullable list of nullable objects.
tcolar Fri 22 May 2009
You must be right, that makes sense now that you mention it. Grammar did not allow for that, will add.
brian Fri 22 May 2009
is "Obj?[]?" a valid syntax ?? what does it mean ? Thanks
Any type can have ? appended to it to make it nullable:
Obj? is reference to an Obj which might be null.
Obj?[] is a list of Obj?
Obj?[]? is a reference to an Obj?[] where the reference to the list itself might be null
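A minimal sketch of how those suffixes compose when read left to right. This is only an illustration in Python, not how the Fan compiler or either plugin does it: `describe_type` is a hypothetical helper, and it only handles a simple name followed by `?` and `[]` suffixes (no maps, function types or qualified names, and it does not reject odd input such as `Obj??`).

```python
import re

def describe_type(sig: str) -> str:
    """Describe a simple type signature: a name followed by `?` / `[]` suffixes."""
    m = re.fullmatch(r"([A-Za-z_]\w*)((?:\?|\[\])*)", sig)
    if m is None:
        raise ValueError(f"unsupported type signature: {sig!r}")
    desc = m.group(1)
    # Each suffix wraps everything parsed so far, which is why
    # Obj?[]? reads as ( Obj?[] )?
    for suffix in re.findall(r"\?|\[\]", m.group(2)):
        if suffix == "?":
            desc = f"nullable({desc})"
        else:
            desc = f"list({desc})"
    return desc

print(describe_type("Obj?[]?"))
```

The point of the loop is that a grammar rule for types can be written as a base type followed by zero or more suffixes, rather than as separate rules for each combination.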
tompalmer Sat 23 May 2009
That's a type reference, like Int.class would be in Java. In Fan, Type#slot references the given slot, and Type# references the given type, much like the trailing # in HTML uris referencing the top of the document.
tcolar Mon 25 May 2009
OK. before i left for the weekend i had it all working except the maps. The grammar says it's basically recursive, which a grammar does not like much, is it truly recursive or can i simplify ?
Would something like this be grammatically correct for example, regardless of ugliness ?
tactics Mon 25 May 2009
Save for the equals sign at the end, using int instead of Int and using String instead of Str, this is valid. It evaluates to a type which maps lists to lists.
brian Tue 26 May 2009
The grammar says it's basically recursive, which a grammar does not like much, is it truly recursive or can i simplify ?
Type declarations can be recursive, in which case map types typically must be wrapped with [K:V] to avoid ambiguity. Although in practice, I really doubt anybody would use a parametrized generic type as the key of a map. I also expect that my parser doesn't do a perfect job with pipes for nested functions.
brian Tue 26 May 2009
Does that mean Fan support "implicit returns" ? (last variable returned ?)
Yes, you can omit the return keyword if a method body contains only one statement.
tcolar Tue 26 May 2009
OK makes sense.
i was trying to make my grammar not need backtracking, i think that's the only thing that does seem to require it, but that's ok, i'll enable it just for this particular rule.
tcolar Tue 26 May 2009
Actually to clarify, recursion is fine, but ANTLR does not like "Left recursion", which is the case with map since the brackets are optional and the left most item can be a map. But anyway there are ways to deal with it so it should be ok.
freddy33 Tue 26 May 2009
For your information, our parser does not support this either. It's the only piece (In SerializationTest ) that does not pass our parser tests.
I pushed a fix for that, and also the missing production for <itAdd> (it is a stmt, not an expr).
tcolar Thu 4 Jun 2009
Another thing that doesn't work right are rules like:
"break" <eos>
same for return, continue etc...
Many places in the fan code there are for example break statements that are followed by neither a semicolon nor a newline (i think it's optional if it's the last/only statement at the end of a block).
Not quite sure yet how to deal with that though. I could just make the eol optional but that's probably too loose.
Or i could try to change the grammar to somehow require the eol except for the last statement of a block but that sounds pretty tricky.
brian Thu 4 Jun 2009
The eos production is semi-colon, newline, or if next char is a }
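That rule can be sketched as a peek-then-consume pair. This is a hedged Python illustration over a plain list of token strings, not the Fan compiler's code: `is_eos` and `consume_eos` are hypothetical names, and treating end of input as a statement end is an added assumption.

```python
def is_eos(tokens, i):
    """True if position i ends a statement: `;`, a newline, or a `}` coming next."""
    if i >= len(tokens):
        return True  # assumption: end of input also ends the statement
    return tokens[i] in (";", "\n", "}")

def consume_eos(tokens, i):
    """Skip an explicit terminator; a `}` is only peeked, never consumed here."""
    if i < len(tokens) and tokens[i] in (";", "\n"):
        return i + 1
    return i

block = ["{", "break", "}"]
print(is_eos(block, 2), consume_eos(block, 2))
```

Leaving the `}` unconsumed is what lets the enclosing block rule still match its own closing brace, so `"break" <eos>` works for the last statement of a block without making the terminator optional everywhere.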
and then the comma-separated items are constructor calls, with no parameters so the () is left off, and with empty it-block initializers {}. (termExpr, termBase, termChain, itBlock)
tcolar Mon 8 Jun 2009
Thanks, it should work then, will have to see why it's getting confused here ... It blocks are parsing fine everywhere else. My grammar is probably trying to do something with the "{" before it tries the it block somehow.
Anyway thanks for the explanation.
tcolar Tue 9 Jun 2009
Also can a facet be on any kind of typeDef ... or where are they allowed in general.
In particular could you have a facet on a fieldDef? If so, the facet and the "direct access field" (@field) have very similar syntax, which could be confusing ?
Thanks
tcolar Tue 9 Jun 2009
sorry that was confusing. I just want to know what elements can have facets.
KevinKelley Tue 9 Jun 2009
564 is a proposal for extending facet syntax to symbols, not sure where that's going though.
Currently (as I read it) facets can be applied to types, and slots. So a field could have a facet.
The direct access field looks like it can only occur inside a block; facets only appear before the type or slot declaration; so yeah, maybe confusing, but I don't think ambiguous.
tcolar Tue 9 Jun 2009
Ok, thanks. It's not that confusing to read as a human, but ANTLR does not like that kind of things much in the grammar(not contextual) ... but i should be able to work something out.
andrey Tue 9 Jun 2009
Thinking of adding Eclipse Fan support
Hi tcolar,
I'd love to spend some time on a simple Eclipse plugin for the Fan language, would you be interested to share efforts on the ANTLR grammar, which I'd be happy to use in the plugin?
Thank you, Andrey
brian Tue 9 Jun 2009
@tcolar - facets can appear before any type or slot (method or field) declaration. The @ symbol is only used in expressions, so there should be no ambiguity.
One thing that is not captured in the grammar you might need to consider is that newlines occasionally have semantics with regard to the return statement and the ( and | chars:
These two statements:
foo
(goo)
are not the same as:
foo(goo)
I think groovy does something similar, so you might want to check out how they handled that.
tcolar Tue 9 Jun 2009
@andrey: sure, once i'm done. i'm getting very close now.
@brian: I got past the linebreak "issue" already ... it was by far the trickiest problem ! that's the only thing i'm using a little bit of ANTLR code for (@members). I usually ignore the linebreaks (hidden channel), but when they are meaningful my function looks them up in the hidden channel and that works well.
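The hidden-channel lookup can be sketched roughly like this. It is an illustration in Python rather than the actual ANTLR @members code from the plugin: the `Token` class, its `hidden` flag and both function names are assumptions, standing in for a token stream where whitespace and newlines are kept but skipped by the parser.

```python
from dataclasses import dataclass

@dataclass
class Token:
    text: str
    hidden: bool = False  # True for whitespace/newline tokens the parser normally skips

def newline_before(tokens, index):
    """Scan back over the hidden tokens just before tokens[index], looking for a newline."""
    j = index - 1
    while j >= 0 and tokens[j].hidden:
        if "\n" in tokens[j].text:
            return True
        j -= 1
    return False

def is_call(tokens, paren_index):
    """A `(` only continues the previous expression when it sits on the same line."""
    return tokens[paren_index].text == "(" and not newline_before(tokens, paren_index)

same_line = [Token("foo"), Token("("), Token("goo"), Token(")")]
two_lines = [Token("foo"), Token("\n", hidden=True), Token("("), Token("goo"), Token(")")]
print(is_call(same_line, 1), is_call(two_lines, 2))
```

With this shape the grammar itself stays newline-free, and only the few rules where a newline matters (the `foo` / `(goo)` case above) ask the token stream the extra question.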
The facet/field issue is basically the same, i can't tell the lexer how to recognize one from the other, so in the lexer i'll just match @ID and in the parser in case of a field i just check @id, but in case of a facet it'll be either
(@id "=" until_newline())
or just
(@id)
... working on this now, does that make sense ?
Note: right now i'm not planning on parsing what's in the facet ... not sure if there's even a grammar for that anyway.
While > < >= <= and <=> should check against expressions, i'm not sure that works with "is", "as" and "isnot"; those 3 should be checking against a type rather than an expression, shouldn't they ?
If that's the case, i don't think the grammar on your page works because it checks against an "elvisExpr", which following that tree does not cover types.
Example(Dialog.fan):
// swizzle details if passed commands
if (details is Command[]) { commands = details; details = null }
Would i be correct to check for a type rather than an elvisExpr for is, as and isnot ?
Or do you have a suggested way to write the grammar to cover this.
Yep that's basically what i had in mind, just wanted to confirm. Thanks.
KevinKelley Tue 9 Jun 2009
Which brings up another question...
That change looks right; I was wondering if a type could reduce to an <expr> through <simple> -> <literal>, but this works better I think.
But, what about the MyClass# syntax for type? I'm not finding anything that covers that, in the grammar. I think it's maybe another form of literal, yielding an expression, so should
<literal> := <type> "#"
be added? Or did I miss something.
KevinKelley Tue 9 Jun 2009
Ah. Looking in compiler/fan/parser/Parser.fan, I find a pound in termBaseExpr, which seems to also parse literals.
brian Tue 9 Jun 2009
you are right, those were missing - here is my fix:
Just hard to parse that stuff :) ... colon can mean so many things :)
Anyway it's getting there.
KevinKelley Wed 10 Jun 2009
I'd have guessed ANTLR could handle that, with its lookahead. Is it the case that it's finding an ambiguity, and you're (I saw the comment) avoiding backtracking?
I'm liking the Earley parser I'm using for this reason; cases where it can't tell which way it should go it adds all the possibilities to the chart, and whichever one ends up "bubbling up" to a complete parse, wins.
I'm giving it an auto-generated lexer, now. Up to recently I was using a separate, custom-built tokenizer on the input source, which is fine for a single use but not so easy to use in the general case of wanting to parse any grammar.
So now I'm allowing literals anywhere in the grammar, so it can read a grammar that's just like it appears on the fandev grammar page, for instance. I've got it working for literals like keywords and operators, now I'm extending a bit to allow regex's. So it'll be able to handle a grammar just the way I want to write it, like
<id> = "[a-zA-Z_][a-zA-Z_0-9]*";
Anyway. Soon as I get that going, I'll be putting the grammar-builder app up and asking for comments.
tcolar Wed 10 Jun 2009
I could enable backtracking yes, but i think the maps definition can be truly ambiguous either way (need to review, maybe i missed something).
But i'll check whatever is done in the Fan compiler parser and try to emulate that, my guess is that it's probably more restrictive than what the grammar page says.
tcolar Wed 10 Jun 2009
Nevermind, i had misread the map stuff. It works.
KevinKelley Wed 10 Jun 2009
That map example above, not completely sure but I think the fan parser handles it by doing a lookup on the id, sees that Str and MethodVar both are in the known types table so therefore can be parsed as <mapType> without the optional brackets.
Without knowing that they're type names, it's harder. :-)
I can't tell for sure if it's ambiguous; I'm guessing not (I think between backtracking and lookahead it can figure out that [:] means map, and then mapType can eat one of the colons and leave the other for the ternaryExpr). But it'll have to work for it.
tcolar Wed 10 Jun 2009
Yeah i had missed the requirement for [:] so it works. The only issues left with that one is that it's left recursive:
type: map | ....
map : type ':' type ...
antlr does not like left recursion, but i think i can deal with that.
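One common way to deal with it is to rewrite the rule so the parser always consumes a non-recursive base first and only then looks for the `:`. The Python sketch below shows that rewrite on a toy grammar; it is not the grammar from this thread or from the Fan compiler. `parse_type` is a hypothetical function, brackets and other type forms are left out, and the right-associative nesting it produces is an assumption of the sketch.

```python
def parse_type(tokens, pos=0):
    """Parse  type := simple (':' type)?

    This is a right-recursive rewrite of the left-recursive pair
        type := map | simple
        map  := type ':' type
    Returns (tree, next_position).
    """
    if pos >= len(tokens) or not tokens[pos].isidentifier():
        raise SyntaxError(f"expected a type name at position {pos}")
    node = tokens[pos]  # the base is consumed first, so no rule calls itself at the same position
    pos += 1
    if pos < len(tokens) and tokens[pos] == ":":
        value, pos = parse_type(tokens, pos + 1)
        node = ("map", node, value)
    return node, pos

print(parse_type(["Str", ":", "Int"]))
```

Because every recursive call happens only after at least one token has been consumed, an LL-style generator has nothing to loop on.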
tcolar Wed 10 Jun 2009
brian Wed 10 Jun 2009
I suspect that the grammar is ambiguous from a syntax only point of view. The way I got things to work in the compiler is:
tokenize into a list
walk the tokens looking for using and class keywords
build up the type namespace from using/class
full parse resolving identifiers to namespace
So at parse time I actually know that Str is a type name and not an arbitrary identifier. The side effect of this is that I must know all the types of a given using statement (something that is a massive PITA in Java).
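A rough sketch of that idea, under heavy simplification and not taken from the Fan compiler: a first pass over the token list collects type names from `class` and `using`, and the later parse uses that set to decide whether `X : Y` can be a map type. The function names and the `pod_types` lookup table (pod name to its type names) are assumptions made for the example.

```python
def build_namespace(tokens, pod_types):
    """Pass 1: walk the tokens and collect known type names from `class` and `using`."""
    names = set()
    for i, tok in enumerate(tokens[:-1]):
        nxt = tokens[i + 1]
        if tok == "class":
            names.add(nxt)                          # a type declared in this file
        elif tok == "using":
            names.update(pod_types.get(nxt, ()))    # every type of the imported pod
    return names

def is_map_type(tokens, i, namespace):
    """Pass 2 helper: `A : B` is a map type only if both sides are known type names."""
    return (i + 2 < len(tokens)
            and tokens[i] in namespace
            and tokens[i + 1] == ":"
            and tokens[i + 2] in namespace)

tokens = ["using", "sys", "class", "Foo", "{", "Str", ":", "Int", "x", "}"]
namespace = build_namespace(tokens, {"sys": {"Str", "Int"}})
print(is_map_type(tokens, 5, namespace))
```

The `pod_types` table is where the cost described above shows up: to resolve a `using`, every type in that pod has to be known before parsing starts.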
The pipe is a huge problem since it is overloaded for bitwise-or. I've floated the idea of changing the bitwise or, but doesn't seem much support for that.
But I think whatever we do, it is critical to ensure Fan can be tooled well (even if we need to change the syntax of the language). So lets see if you guys can make your stuff work.
tcolar Wed 10 Jun 2009
I still have an issue with the ternaryExpr
I don't think the way it's defined now works, because you could have something like:
bar = foo / (x>5?3:2) /4
tcolar Wed 10 Jun 2009
nevermind, that should be OK, i need a break :) It doesn't parse right though for some reason, will debug and see what the deal is.
tcolar Thu 11 Jun 2009
I'm down to ~39 files not cleanly parsing out of the 680+ in the fan distro. At least half of them are because of the way i parse strings (need to fix issues with "\" and "\\" ...)
So it's getting there, not a lot of issues left.
tcolar Thu 11 Jun 2009
The new slotLiteral definition is causing me trouble, i have 2 questions:
what is it for ?
SlotLiteral := [<type>] "#" <id>
Problem is that a statement like this:
// fieldDef, define foo as the type for Obj
foo := Obj#
// fieldDef: define field called "bar"
bar
Could also be interpreted as a slotLiteral (over 2 lines):
([<type>] "#" <id>) - Obj#bar
So i was wondering what to do about this ambiguity, any suggestion ?
tcolar Thu 11 Jun 2009
Is it Ok to say that for a slotLiteral it has to be in "one word" (type#id), no space / linebreak allowed between type, # and id?
brian Thu 11 Jun 2009
It works just like return, (, and [ in that newlines are significant.
If # is followed by a newline, then it assumes end of expression (even if next line begins with an identifier)
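A small sketch of that rule, assuming tokens carry their line number. This is a Python illustration only, not the compiler's logic: `parse_pound_literal` is a hypothetical function, and it covers just the `Type#` and `Type#id` forms.

```python
def parse_pound_literal(tokens, i):
    """tokens is a list of (text, line) pairs; tokens[i] is a type name followed by `#`.

    Returns (tree, next_index). An identifier is attached to the `#` only when it
    sits on the same line; otherwise the expression ends at the `#`.
    """
    type_name, _ = tokens[i]
    pound, pound_line = tokens[i + 1]
    if pound != "#":
        raise SyntaxError("expected '#'")
    if i + 2 < len(tokens):
        text, line = tokens[i + 2]
        if line == pound_line and text.isidentifier():
            return ("slotLiteral", type_name, text), i + 3
    return ("typeLiteral", type_name), i + 2

# `Obj#` on line 1 followed by `bar` on line 2: two separate things
print(parse_pound_literal([("Obj", 1), ("#", 1), ("bar", 2)], 0))
```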
tcolar Thu 11 Jun 2009
Cool, that works. Thanks.
tcolar Thu 11 Jun 2009
I'm down to 6.
In FacetsTest.fan, the last line reads:
@ma="ma" @mb='b' mixin FacetsM {}
Is that supposed to parse to a facet with content called ma with value
"ma" @mb='b' mixin FacetsM {}
or is it a mixin with 2 facets ma and mb ?
Thanks
brian Thu 11 Jun 2009
It is @id=expr, so there should be two facets ma with a Str value, and mb with an Int value of the char b.
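To make the "two facets, then the mixin" reading concrete, here is a hedged Python sketch that peels leading facets off a declaration line. It is a simplification: the real rule is @id=expr with a full expression, while this hypothetical `split_facets` helper only accepts a quoted literal or a run of non-space characters as the value.

```python
import re

# Assumption for the sketch: a facet value is a "..." or '...' literal, or a run of
# non-space characters. The real grammar allows any expression after the `=`.
FACET = re.compile(r"""@(\w+)(?:=("[^"]*"|'[^']*'|\S+))?\s*""")

def split_facets(line):
    """Return ([(name, value_or_None), ...], rest_of_line)."""
    facets = []
    pos = 0
    while True:
        m = FACET.match(line, pos)
        if m is None:
            break
        facets.append((m.group(1), m.group(2)))
        pos = m.end()
    return facets, line[pos:]

print(split_facets("""@ma="ma" @mb='b' mixin FacetsM {}"""))
```

Each facet value stops where its own expression stops, so the second `@` starts a new facet instead of being swallowed into the first value.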
tcolar Sat 13 Jun 2009
OK, i got all the files parsing correctly now, except for this:
flux::Main.exit(frame)
What is that supposed to parse as ?
flux::Main would parse as a simpleType right now, but that's not allowed as a termBase, should it be ? or is that something else ?
Other than that the only other issue i have, which does not occur in your source files, is that sometimes there is confusion on the : between maps and ternaryExpr.
Either way it's working well enough that i will try to get it hooked into NetBeans at this point .... as time allows (very limited right now) .
brian Sat 13 Jun 2009
looking thru the grammar there is definitely something missing there. Term base and type literals should support any type including a fully qualified type. I'll take a look tomorrow at how to update the grammar.
tcolar Sun 14 Jun 2009
OK, let me know. Maybe KevinKelley's suggestion of literal reducing to type is correct ?
Either way let me know, note that even without this, the only file that failed parsing
is Commands.fan so it's not that common i guess.
brian Sun 14 Jun 2009
I reworked the grammar quite a bit and added a <typeBase> production to match how Parser was really working - changeset.
KevinKelley Sun 14 Jun 2009
That looks better, thanks. It's hard to keep a language clean, but if it isn't, it's next to impossible to do anything with the parse trees you get from it.
I'm getting successful parses and parse trees for the Fan grammar, now, from Pfui (PEP (Earley Parser) for Fan, with User Interface).
Still tweaking on the app for usability, and now starting on decorating the parse tree with semantic actions for the fanfold editor.
Lots to do, making progress...
(I've also found a Java grammar, 1.4 or 1.5 level, that is successfully parsing. Might be able to do some things with that, at least in terms of simple transformations of code, sort of a translator-helper or something)
tcolar Sun 14 Jun 2009
Cool, thanks for taking the time to update it. Will update the antlr grammar accordingly. After that it should be "final" (well for now) :)
tcolar Sun 14 Jun 2009
That changelog looks more complex than it is :)
Anyway i updated the antlr grammar and voila! all Fan distro files parsing fully (except gamma.fan ... which is not a fan file ... whatever it is) :)
JohnDG Mon 15 Jun 2009
(I've also found a Java grammar, 1.4 or 1.5 level, that is successfully parsing. Might be able to do some things with that, at least in terms of simple transformations of code, sort of a translator-helper or something)
This is something I have interest in. Fan is great, but it's not going to help you unless your own code is written in Fan. A Java-to-Fan translator should be relatively straightforward (as long as idiomatic Fan is not the goal -- there would be no closures in the converted code, for example).
Anyway i updated the antlr grammar and voila! all Fan distro files parsing fully (except gamma.fan)
That's great tcolar. The next step is error recovery. I understand ANTLR provides some help for this, but it's still a bit of work. Your goal should be to detect all errors in a Fan file simultaneously, rather than stopping at the first one -- when you get this working, it will be immediately useful for an IDE.
tcolar Mon 15 Jun 2009
Yes I'll work on error recovery as well, antlr has facilities for that. At this point i'm gonna be mostly following the same procedure as explained here (just about the only "doc" on writing a NetBeans plugin with the new parsing API):
tcolar Tue 5 May 2009
I'm thinking about writing a small NB plugin to support Fan syntax
At first I'm just planning to do the highlighting and basic stuff. If i have time i might try to do more: completion etc... would be nice, not sure i'll find time though .. but maybe.
Anyway here are 2 questions:
tcolar Tue 5 May 2009
Found this: http://drorbr.blogspot.com/2009/04/intellij-plugin-for-fan-programming.html that should help a bit :)
brian Tue 5 May 2009
I think that would be awesome!
Even basic syntax color coding for major IDEs would really help Fan grow.
Keywords and grammar are documented in docLang::Grammar
tcolar Tue 5 May 2009
Thanks, i don't guarantee anything, but i'll give it a shot as time allows. I've grown to like NB a lot (since 6.5).
tcolar Wed 6 May 2009
I've done highlighting and most of the grammar (your grammar doc helps lots !) I was wondering, do you have some large piece of code i could use for testing, or something with as much of the different syntax represented.
Maybe a compiler test data source or something like that.
Thanks.
tactics Wed 6 May 2009
Fan itself is the largest repository of Fan code. You might want to check out $FAN_HOME/src/compiler/fan/parser/Parser.fan, the source code for the parser.
freddy33 Wed 6 May 2009
I used the source files of Fan code itself to test the IntelliJ Parser. It gives great confidence in your Parser once you have no errors parsing the 470 fan files in there. Fan4Idea mercurial repo
tcolar Wed 6 May 2009
Thanks. BTW in the grammar page, the syntax for "for" seems to be missing a bracket after <forinit> isn't it ?
http://fandev.org/doc/docLang/Grammar.html
brian Wed 6 May 2009
fixed, thanks
tcolar Wed 6 May 2009
Another one :-) <bitAndExpr> an extra parenthesis i'll let you know if i find anymore :)
tcolar Wed 6 May 2009
I don't care whether you fix it right away, but you said you fixed the "for" and i reloaded the page and it's still broke.
Maybe the "wiki" parser is getting confused & screwing it up ? i wrote a wiki parser myself and that kind of pages get tricky :)
tcolar Wed 6 May 2009
One more, sorry :-) Seems like ternaryExpr is not referenced anywhere in the grammar
should it be: <expr> := <assignExpr> | <ternaryExpr>
not sure....
tcolar Wed 6 May 2009
Ok two more, last ones i promise :)
Thanks
tcolar Wed 6 May 2009
Also is there grammar somewhere for the literals: Boolean,Int,FloatDecimal,Str,Duration,Uri:
If not i can probably figure it out i guess.
andy Wed 6 May 2009
These docs actually come from the latest build (1.0.41 right now). So the online docs don't actually get updated until we roll the next build. So its fixed in the hg repo for the next update.
tcolar Wed 6 May 2009
sounds good
brian Wed 6 May 2009
I fixed all these. Thanks for catching those problems and taking the time to report them.
Take a look at docLang::Literals
cheeser Wed 6 May 2009
Is there a full, formal language definition we could feed to something like antlr? I was playing around a while back with building some stuff in java to play with the syntax...
tcolar Sat 9 May 2009
Damn it, apparently NB is phasing out GLF (lang support) in 6.7, which is what i have been implementing all week. New versions will use a plain java parser ... so i guess i'll have to use ANTLR, which looks interesting to learn.
So i'll probably work on making an ANTLR setup now i guess.
Kinda pissed i have to start over though :(
freddy33 Mon 11 May 2009
Sorry about not reporting all the grammar errors I met on the fly, I was sure I'd have some time before someone implemented another Java Parser :) So, here are some other grammar/literals issues in the docs I encountered:
I also took some shortcuts that may be sensible ones like:
I'm forgetting something, hope this little helps, Good luck, I hope to have a new nice version (with code completion) for our IntelliJ Plugin for JavaOne. Hope to see some of you, Fred.
KevinKelley Mon 11 May 2009
A few things with literals, I think, ought to be tightened up someday. For instance I don't see anything that says
isn't perfectly fine. Actually for readability you'd almost want leading/trailing underbars sometimes, and I guess it wouldn't hurt anything to allow that, but for now I'm going with underbars as separator as the doc says.
Underbar as separator: true for parts of a float?
Missing parts in a float not legal?
Those all actually seem reasonable to me, and convenient, but I can't tell if they're allowed.
Leading/trailing zeros, okay I guess?
tcolar Mon 11 May 2009
Thanks much for the infos.
freddy33: Did you write your own parser or did you use ANTLR or javacc ? or maybe IntelliJ has its own language API ?
I'm doing OK, but the built-in Netbeans language grammar tool (schliemann) makes it hard to deal with the linebreaks !
freddy33 Tue 12 May 2009
For the parser IntelliJ proposed a tuned JFlex setup that we used. So, our Fan.flex grammar file is doing the parsing. Now the lexer is not doing much beside giving useful valid tokens, since in an IDE when writing code it permanently has syntax errors (Or you're not writing anything :).
So, maybe our list of tokens can help you also.
BTW: At the end the parser is quite "hand-made", it's the ASTNode creation, manipulation and validation that is provided by IntelliJ.
brian Wed 13 May 2009
Thanks for reporting all those mistakes Fred - I checked-in corrections.
tcolar Wed 20 May 2009
Update & Question Now that i gave up on schliemann and moved to Antlr i made much progress (though there was a learning curve). I have the whole grammar working .. though i skipped "doc" .. what is "doc"?
Anyway now i'm running parser tests against Fan source code, and i ran into this: in Actor.fan (~line 29) :
OK, the grammar page does not seem to allow for this kind of syntax and i'm actually not sure exactly what it means (haven't really done much Fan coding yet :) ... can you explain what kind of construct is that and what is the deal with all the ? .. is this some sort of a closure or is that the syntax for an "Actor" ?
Thanks.
tcolar Wed 20 May 2009
I guess it's just that the grammar for type should allow for an optional ? (nullable syntax) right ?
andy Wed 20 May 2009
Yeah, those are just nullable types:
first arg to receive may be null
return from receive may be null
receive may be null when passed to make
tcolar Fri 22 May 2009
In Obj.fan, line 73:
is "Obj?[]?" a valid syntax ?? what does it mean ? Thanks
tompalmer Fri 22 May 2009
I believe it's a nullable list of nullable objects.
tcolar Fri 22 May 2009
You must be right, that makes sense now that you mention it. Grammar did not allow for that, will add.
brian Fri 22 May 2009
Any type can have ? appended to it to make it nullable:
Obj? is reference to an Obj which might be null.
Obj?[] is a list of Obj?
Obj?[]? is a reference to an Obj?[] where the reference to the list itself might be null
You can think of it like ( Obj?[] )?
tcolar Fri 22 May 2009
Thanks. One more: In WebStep.fan, line 24:
Does that mean Fan support "implicit returns" ? (last variable returned ?)
Thanks.
tcolar Sat 23 May 2009
Made great progress today, about 70% of the fan source code files parse fully and i only have a few items to work on now.
it seems like the comma might be optional in the function formals as seen in Actor.fan (~54):
>new makeCoalescing(ActorGroup group,
Is that correct ?
tcolar Sat 23 May 2009
Nevermind, please ignore previous question
tcolar Sat 23 May 2009
One more:
what's the sharp about ?
Thanks
tompalmer Sat 23 May 2009
That's a type reference, like Int.class would be in Java. In Fan, Type#slot references the given slot, and Type# references the given type, much like the trailing # in HTML uris referencing the top of the document.
tcolar Mon 25 May 2009
OK. before i left for the weekend i had it all working except the maps. The grammar says it's basically recursive, which a grammar does not like much, is it truly recursive or can i simplify ?
Would something like this be grammatically correct for example, regardless of ugliness ?
tactics Mon 25 May 2009
Save for the equals sign at the end, using int instead of Int and using String instead of Str, this is valid. It evaluates to a type which maps lists to lists.
brian Tue 26 May 2009
Type declarations can be recursive, in which case map types typically must be wrapped with [K:V] to avoid ambiguity. Although in practice, I really doubt anybody would use a parametrized generic type as the key of a map. I also expect that my parser doesn't do a perfect job with pipes for nested functions.
brian Tue 26 May 2009
Yes, you can omit the return keyword if a method body contains only one statement.
tcolar Tue 26 May 2009
OK makes sense.
i was trying to make my grammar not need backtracking, i think that's the only thing that does seem to require it, but that's ok, i'll enable it just for this particular rule.
tcolar Tue 26 May 2009
Actually to clarify, recursion is fine, but ANTLR does not like "Left recursion", which is the case with map since the brackets are optional and the left most item can be a map. But anyway there are ways to deal with it so it should be ok.
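One common way around left recursion is to parse a non-recursive atom first and then loop over suffixes. The Python sketch below shows that idea on a deliberately simplified type grammar (type := atom ('?' | '[]' | ':' type)*, atom := ID | '[' type ']'). That grammar, the tuple-based tree and all function names are assumptions for illustration, not the grammar page or the ANTLR file.

```python
import re

# Identifier, the "[]" list suffix, or one of the single characters [ ] : ?
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\[\]|[\[\]:?])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_type(tokens, pos=0):
    """type := atom ('?' | '[]' | ':' type)*  (a suffix loop, no left recursion)."""
    node, pos = parse_atom(tokens, pos)
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == "?":
            node, pos = ("nullable", node), pos + 1
        elif tok == "[]":
            node, pos = ("list", node), pos + 1
        elif tok == ":":
            value, pos = parse_type(tokens, pos + 1)
            node = ("map", node, value)
        else:
            break
    return node, pos


def parse_atom(tokens, pos):
    """atom := ID | '[' type ']'"""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "[":
        node, pos = parse_type(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != "]":
            raise SyntaxError("expected ']'")
        return node, pos + 1
    if tok[0].isalpha() or tok[0] == "_":
        return tok, pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    tokens = tokenize(text)
    node, pos = parse_type(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return node


print(parse("Str:Int"))        # ('map', 'Str', 'Int')
print(parse("[Str:Int][]?"))   # ('nullable', ('list', ('map', 'Str', 'Int')))
```

How a trailing ? binds after an unbracketed map is a choice this sketch makes arbitrarily; the real parser may decide differently, which is one reason the brackets help.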
freddy33 Tue 26 May 2009
For your information, our parser does not support this either. It's the only piece (in SerializationTest) that does not pass our parser tests.
tcolar Fri 29 May 2009
Is this correct:
should the first elvisExpr be optional maybe ?
i can't see what in the grammar allows a simple expression like "3 > 2" otherwise.
tcolar Fri 29 May 2009
nevermind.
tcolar Tue 2 Jun 2009
Token.java: 257
is that a valid construct... i can understand a static block, but what for ? the method is static to start with.
Am i missing something ?
Thanks
tcolar Tue 2 Jun 2009
Oh i see, just a field def followed by a static init block.
OK will add that to the grammar
tcolar Wed 3 Jun 2009
OK, one more: In some FWT/ui demo (demo.fan) you have code like this:
I understand what that does and how it's supposed to be parsed for the most part.
but i'm not sure where this fits in the grammar:
I mean what's with the commas ? are they optional or do they mean something i missed?
Thanks.
brian Wed 3 Jun 2009
the commas mean to call add with the expr on it, see collections
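A loose Python analogue of that sugar, assuming only what is said above (each comma-separated item is handed to add on the enclosing object). Widget and with_children are hypothetical names, and Fan's actual desugaring may differ in detail.

```python
class Widget:
    """Stand-in for any object that exposes an add method."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add(self, child):
        self.children.append(child)
        return self


def with_children(target, *items):
    """Mimic a block of comma-separated items: each item is passed to target.add."""
    for item in items:
        target.add(item)
    return target


menu = with_children(Widget("menu"), Widget("open"), Widget("save"))
print([child.name for child in menu.children])  # ['open', 'save']
```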
tcolar Wed 3 Jun 2009
Ha, ok. Cool feature .. once you know what it is :)
Will add that to my grammar then.
Thanks
tcolar Thu 4 Jun 2009
Also the grammar says:
This can't be right, cause this would make the ternaryExpr required.
brian Thu 4 Jun 2009
It should be:
I pushed a fix for that, and also the missing production for <itAdd> (it is a stmt, not an expr).
tcolar Thu 4 Jun 2009
Another thing that doesn't work right are rules like:
same for return, continue etc...
In many places in the fan code there are, for example, break statements that are followed by neither a semicolon nor a newline (i think it's optional if it's the last/only statement at the end of a block).
Not quite sure yet how to deal with that though. I could just make the eol optional but that's probably too loose.
Or i could try to change the grammar to somehow require the eol except for the last statement of a block but that sounds pretty tricky.
brian Thu 4 Jun 2009
The eos production is semi-colon, newline, or if next char is a }
See Parser.endOfStmt
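A minimal Python sketch of that rule over a flat token list. at_end_of_statement is a hypothetical helper, not Parser.endOfStmt; treating end of input as a terminator and leaving the } unconsumed are assumptions.

```python
def at_end_of_statement(tokens, pos):
    """Return (matched, new_pos) for an eos rule: ';', newline, or a following '}'."""
    if pos >= len(tokens):
        return True, pos          # assumption: end of input also ends the statement
    tok = tokens[pos]
    if tok in (";", "\n"):
        return True, pos + 1      # the terminator itself is consumed
    if tok == "}":
        return True, pos          # '}' is only peeked at; the block rule consumes it
    return False, pos


print(at_end_of_statement(["break", "}"], 1))       # (True, 1)
print(at_end_of_statement(["break", ";", "x"], 1))  # (True, 2)
print(at_end_of_statement(["break", "x"], 1))       # (False, 1)
```

This covers the break-before-} case without making the terminator optional everywhere.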
tcolar Thu 4 Jun 2009
Cool, that's what i did in my grammar too.
tcolar Fri 5 Jun 2009
also formals in functiontype/closure need to allow for just a comma. ex: formals : COMMA | (formal (COMMA formal)*);
brian Fri 5 Jun 2009
Not sure I follow that:
tcolar Fri 5 Jun 2009
Maybe it's more of a functional call, anyway i'm trying to figure out/parse this(KeyTest.fan):
Maybe i'm missing something, can you explain what that does in case i'm misunderstanding.
I don't think the grammar allows the |,| right now.
brian Fri 5 Jun 2009
OK, I see - |,| is sugar for |->Void|, but wasn't in the grammar. I updated the grammar to be:
tcolar Fri 5 Jun 2009
Ok cool, that's what i thought.
tcolar Mon 8 Jun 2009
I'm making good progress, though i have very limited time right now.
I have another construct i'm not sure what to do about:
So pipeline is a list ... but of what ? what is FindResourceStep {} ? (in particular what are the brackets about).
what does it mean and what is it grammatically ?
Thanks
brian Mon 8 Jun 2009
It is basically just an empty it-block:
KevinKelley Mon 8 Jun 2009
Should be a literal list... in the grammar
and then the comma-separated items are constructor calls, with no parameters so the () is left off, and with empty it-block initializers {}. (termExpr, termBase, termChain, itBlock)
tcolar Mon 8 Jun 2009
Thanks, it should work then, will have to see why it's getting confused here ... It blocks are parsing fine everywhere else. My grammar is probably trying to do something with the "{" before it tries the it block somehow.
Anyway thanks for the explanation.
tcolar Tue 9 Jun 2009
Also can a facet be on any kind of typeDef ... or where are they allowed in general.
In particular could you have a facet on a fieldDef? If so, the facet and the "direct access field" (@field) have very similar syntax and could be confusing.
Thanks
tcolar Tue 9 Jun 2009
sorry that was confusing. I just want to know what elements can have facets.
KevinKelley Tue 9 Jun 2009
564 is a proposal for extending facet syntax to symbols, not sure where that's going though.
Currently (as I read it) facets can be applied to types, and slots. So a field could have a facet.
The direct access field looks like it can only occur inside a block; facets only appear before the type or slot declaration; so yeah, maybe confusing, but I don't think ambiguous.
tcolar Tue 9 Jun 2009
Ok, thanks. It's not that confusing to read as a human, but ANTLR does not like that kind of things much in the grammar(not contextual) ... but i should be able to work something out.
andrey Tue 9 Jun 2009
Thinking of adding Eclipse Fan support
Hi tcolar,
I'd love to spend some time on a simple Eclipse plugin for the Fan language. Would you be interested in sharing efforts on the ANTLR grammar, which I'd be happy to use in the plugin?
Thank you, Andrey
brian Tue 9 Jun 2009
@tcolar - facets can appear before any type or slot (method or field) declaration. The @ symbol is only used in expressions, so there should be no ambiguity.
One thing that is not captured in the grammar you might need to consider is that newlines occasionally have semantics with regard to the return statement and the ( and | chars:
These two statements:
are not the same as:
I think groovy does something similar, so you might want to check out how they handled that.
tcolar Tue 9 Jun 2009
@andrey: sure, once i'm done. i'm getting very close now.
@brian: I got past the linebreak "issue" already ... it was by far the trickiest problem ! that's the only thing i'm using a little bit of ANTLR code for (@members). I usually ignore the linebreaks (hidden channel), but when they are meaningful my function looks them up in the hidden channel and that works well.
The facet/field issue is basically the same, i can't tell the lexer how to recognize one from the other, so in the lexer i'll just match @ID and in the parser in case of a field i just check @id, but in case of a facet it'll be either
or just
... working on this now, does that make sense ?
Note: right now i'm not planning on parsing what's in the facet ... not sure if there's even a grammar for that anyway.
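The hidden-channel lookup can be sketched roughly like this in Python. Token, the channel constants and newline_before are made-up names for illustration and are not the ANTLR API or the code in the grammar's @members section.

```python
from dataclasses import dataclass

DEFAULT, HIDDEN = 0, 1  # the parser only sees DEFAULT-channel tokens


@dataclass
class Token:
    text: str
    channel: int = DEFAULT


def newline_before(tokens, index):
    """True if a newline sits in the hidden tokens directly before tokens[index]."""
    i = index - 1
    while i >= 0 and tokens[i].channel == HIDDEN:
        if "\n" in tokens[i].text:
            return True
        i -= 1
    return False


same_line = [Token("return"), Token(" ", HIDDEN), Token("foo")]
next_line = [Token("return"), Token("\n  ", HIDDEN), Token("foo")]
print(newline_before(same_line, 2))  # False
print(newline_before(next_line, 2))  # True
```

A rule where the newline matters can call such a check as a predicate, while every other rule keeps ignoring whitespace.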
tcolar Tue 9 Jun 2009
For reference here is what I have now:
http://svn.colar.net/Fan_antlr/src/Fan.g
But at this point it's just for reference, it's not finished / cleaned up / perfect.
brian Tue 9 Jun 2009
From grammar point of view it is an arbitrary expression. From language point of view it must be a serializable object literal.
tcolar Tue 9 Jun 2009
All right, the facets were a pain but i finally got them figured out. Another question (sorry):
while > < >= <= and <=> should check against expressions, i'm not sure that works with "is", "as" and "isnot"; those 3 should be checking against a type rather than an expression, shouldn't they?
If that's the case, i don't think the grammar on your page works because it checks against an "elvisExpr", which following that tree does not cover types.
Example(Dialog.fan):
Would i be correct to check for a type rather than an elvisExpr for is, as and isnot ?
Or do you have a suggested way to write the grammar to cover this.
Thanks.
brian Tue 9 Jun 2009
How about this change:
tcolar Tue 9 Jun 2009
Yep that's basically what i had in mind, just wanted to confirm. Thanks.
KevinKelley Tue 9 Jun 2009
Which brings up another question...
That change looks right; I was wondering if a type could reduce to an <expr> through <simple> -> <literal>, but this works better I think.
But, what about the MyClass# syntax for type? I'm not finding anything that covers that, in the grammar. I think it's maybe another form of literal, yielding an expression, so should
be added? Or did I miss something.
KevinKelley Tue 9 Jun 2009
Ah. Looking in compiler/fan/parser/Parser.fan, I find a pound in termBaseExpr, which seems to also parse literals.
brian Tue 9 Jun 2009
you are right, those were missing - here is my fix:
tcolar Tue 9 Jun 2009
Yeah i had that already as well, done the same way, except for the name:
But i didn't have slotLiteral so i'll add that.
tcolar Tue 9 Jun 2009
BTW, i have a bunch of other things i had to fix, was planning on listing them all once done.
tcolar Tue 9 Jun 2009
Another one that's missing is the curry. I added it to termBase
Does that sound right ?
KevinKelley Tue 9 Jun 2009
It shows up in the parser source under prefixExpr
tcolar Tue 9 Jun 2009
I gotta look at that parser code some more :)
tcolar Wed 10 Jun 2009
I'm almost done, still have to deal with:
Just hard to parse that stuff :) ... colon can mean so many things :)
Anyway it's getting there.
KevinKelley Wed 10 Jun 2009
I'd have guessed ANTLR could handle that, with its lookahead. Is it the case that it's finding an ambiguity, and you're (I saw the comment) avoiding backtracking?
I'm liking the Earley parser I'm using for this reason; cases where it can't tell which way it should go it adds all the possibilities to the chart, and whichever one ends up "bubbling up" to a complete parse, wins.
I'm giving it an auto-generated lexer, now. Up to recently I was using a separate, custom-built tokenizer on the input source, which is fine for a single use but not so easy to use in the general case of wanting to parse any grammar.
So now I'm allowing literals anywhere in the grammar, so it can read a grammar that's just like it appears on the fandev grammar page, for instance. I've got it working for literals like keywords and operators, now I'm extending a bit to allow regex's. So it'll be able to handle a grammar just the way I want to write it, like
Anyway. Soon as I get that going, I'll be putting the grammar-builder app up and asking for comments.
tcolar Wed 10 Jun 2009
I could enable backtracking yes, but i think the maps definition can be truly ambiguous either way (need to review, maybe i missed something).
But i'll check whatever is done in the Fan compiler parser and try to emulate that, my guess is that it's probably more restrictive than what the grammar page says.
tcolar Wed 10 Jun 2009
Nevermind, i had misread the map stuff. It works.
KevinKelley Wed 10 Jun 2009
That map example above, not completely sure but I think the fan parser handles it by doing a lookup on the id, sees that Str and MethodVar both are in the known types table so therefore can be parsed as <mapType> without the optional brackets.
Without knowing that they're type names, it's harder. :-)
I can't tell for sure if it's ambiguous; I'm guessing not (I think between backtracking and lookahead it can figure out that [:] means map, and then mapType can eat one of the colons and leave the other for the ternaryExpr). But it'll have to work for it.
tcolar Wed 10 Jun 2009
Yeah i had missed the requirement for [:] so it works. The only issues left with that one is that it's left recursive:
antlr does not like left recursion, but i think i can deal with that.
tcolar Wed 10 Jun 2009
brian Wed 10 Jun 2009
I suspect that the grammar is ambiguous from a syntax only point of view. The way I got things to work in the compiler is:
using and class keywords
So at parse time I actually know that Str is a type name and not an arbitrary identifier. The side effect of this is that I must know all the types of a given using statement (something that is a massive PITA in Java).
The pipe is a huge problem since it is overloaded for bitwise-or. I've floated the idea of changing the bitwise or, but doesn't seem much support for that.
But I think whatever we do, it is critical to ensure Fan can be tooled well (even if we need to change the syntax of the language). So lets see if you guys can make your stuff work.
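For tool authors, the idea of knowing the type names before parsing might look something like the Python sketch below. It is a guess at the general shape only: the line-based scan is naive, and pod_types is a hypothetical lookup table standing in for whatever the compiler knows about each pod.

```python
import re


def collect_type_names(source, pod_types):
    """First pass: gather type names from 'using' and 'class' lines.

    pod_types is a hypothetical table {pod name: [type names]}.
    """
    names = set()
    for line in source.splitlines():
        words = line.split()
        if len(words) >= 2 and words[0] == "using":
            # Every type exported by the imported pod becomes a known type name.
            names.update(pod_types.get(words[1], []))
        m = re.search(r"\bclass\s+([A-Za-z_]\w*)", line)
        if m:
            names.add(m.group(1))
    return names


def is_type_name(identifier, known_types):
    """Second pass helper: decide whether an identifier should start a type."""
    return identifier in known_types


source = "using sys\nclass Foo\n{\n  Str:Int counts\n}\n"
known = collect_type_names(source, {"sys": ["Str", "Int", "Obj"]})
print(sorted(known))                   # ['Foo', 'Int', 'Obj', 'Str']
print(is_type_name("Str", known))      # True
print(is_type_name("counts", known))   # False
```

With such a table, Str:Int can be committed to as a map type as soon as Str is seen, instead of backtracking over the colon.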
tcolar Wed 10 Jun 2009
I still have an issue with the ternaryExpr
I don't think the way it's defined now works, because you could have something like:
tcolar Wed 10 Jun 2009
nevermind, that should be OK, i need a break :) It doesn't parse right though for some reason, will debug and see what the deal is.
tcolar Thu 11 Jun 2009
I'm down to ~39 files not cleanly parsing out of the 680+ in the fan distro. At least half of them are because of the way i parse strings (need to fix issues with "\" and "\\" ...)
So it's getting there, not a lot of issues left.
tcolar Thu 11 Jun 2009
The new slotLiteral definition is causing me trouble, i have 2 questions:
Could also be interpreted as a slotLiteral (over 2 lines):
([<type>] "#" <id>)) - Obj#bar
So i was wondering what to do about this ambiguity, any suggestion ?
tcolar Thu 11 Jun 2009
Is it OK to say that a slotLiteral has to be in "one word" (type#id), with no space / linebreak allowed between type, # and id?
brian Thu 11 Jun 2009
It works just like return, (, and [ in that newlines are significant.
If # is followed by a newline, then it assumes end of expression (even if the next line begins with an identifier)
tcolar Thu 11 Jun 2009
Cool, that works. Thanks.
tcolar Thu 11 Jun 2009
I'm down to 6.
In FacetsTest.fan, the last line reads:
Is that supposed to parse to a facet with content called ma with value
or is it a mixin with 2 facets ma and mb ?
Thanks
brian Thu 11 Jun 2009
It is @id=expr, so there should be two facets: ma with a Str value, and mb with an Int value of the char b.
tcolar Sat 13 Jun 2009
OK, i got all the files parsing correctly now, except for this:
What is that supposed to parse as ?
flux::Main would parse as a simpleType right now, but that's not allowed as a termBase, should it be ? or is that something else ?
Other than that the only other issue i have, which does not occur in your source files, is that sometimes there is confusion on the : between maps and ternaryExpr.
Either way it's working well enough that i will try to get it hooked into NetBeans at this point .... as time allows (very limited right now) .
brian Sat 13 Jun 2009
looking thru the grammar there is definitely something missing there. Term base and type literals should support any type including a fully qualified type. I'll take a look tomorrow at how to update the grammar.
tcolar Sun 14 Jun 2009
OK, let me know. Maybe KevinKelley's suggestion of literal reducing to type is correct ?
Either way let me know, note that even without this, the only file that failed parsing
is Commands.fan so it's not that common i guess.
brian Sun 14 Jun 2009
I reworked the grammar quite a bit and added a <typeBase> production to match how Parser was really working - changeset.
KevinKelley Sun 14 Jun 2009
That looks better, thanks. It's hard to keep a language clean, but if it isn't, it's next to impossible to do anything with the parse trees you get from it.
I'm getting successful parses and parse trees for the Fan grammar, now, from Pfui (PEP (Earley Parser) for Fan, with User Interface).
Still tweaking on the app for usability, and now starting on decorating the parse tree with semantic actions for the fanfold editor.
Lots to do, making progress...
(I've also found a Java grammar, 1.4 or 1.5 level, that is successfully parsing. Might be able to do some things with that, at least in terms of simple transformations of code, sort of a translator-helper or something)
tcolar Sun 14 Jun 2009
Cool, thanks for taking the time to update it. Will update the antlr grammar accordingly. After that it should be "final" (well for now) :)
tcolar Sun 14 Jun 2009
That changelog looks more complex than it is :)
Anyway i updated the ANTLR grammar and voila! all Fan distro files parsing fully (except gamma.fan ... which is not a fan file ... whatever it is) :)
JohnDG Mon 15 Jun 2009
This is something I have interest in. Fan is great, but it's not going to help you unless your own code is written in Fan. A Java-to-Fan translator should be relatively straightforward (as long as idiomatic Fan is not the goal -- there would be no closures in the converted code, for example).
That's great tcolar. The next step is error recovery. I understand ANTLR provides some help for this, but it's still a bit of work. Your goal should be to detect all errors in a Fan file simultaneously, rather than stopping at the first one -- when you get this working, it will be immediately useful for an IDE.
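To show what "detect all errors rather than stopping at the first one" can mean in practice, here is a small Python sketch of generic panic-mode recovery: on an error, record it, skip to the next statement boundary and carry on. The toy statement form (name = number ;) and the function names are assumptions; this is not ANTLR's built-in recovery mechanism.

```python
def expect(tokens, pos, predicate, what):
    """Return tokens[pos] if it satisfies predicate, else raise SyntaxError."""
    if pos >= len(tokens) or not predicate(tokens[pos]):
        found = tokens[pos] if pos < len(tokens) else "end of input"
        raise SyntaxError(f"token {pos}: expected {what}, found {found!r}")
    return tokens[pos]


def parse_statements(tokens):
    """Parse 'name = number ;' statements, collecting every error found."""
    parsed, errors, pos = [], [], 0
    while pos < len(tokens):
        try:
            name = expect(tokens, pos, str.isidentifier, "identifier")
            expect(tokens, pos + 1, lambda t: t == "=", "'='")
            value = expect(tokens, pos + 2, str.isdigit, "number")
            expect(tokens, pos + 3, lambda t: t == ";", "';'")
            parsed.append((name, int(value)))
            pos += 4
        except SyntaxError as err:
            errors.append(str(err))
            # Panic mode: skip to just past the next ';' and resume parsing there.
            while pos < len(tokens) and tokens[pos] != ";":
                pos += 1
            pos += 1
    return parsed, errors


parsed, errors = parse_statements("a = 1 ; b = = ; c 3 ; d = 4 ;".split())
print(parsed)       # [('a', 1), ('d', 4)]
print(len(errors))  # 2
```

Both bad statements are reported in one pass, and the valid statement after them still parses, which is the behavior an editor needs for marking errors.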
tcolar Mon 15 Jun 2009
Yes I'll work on error recovery as well, antlr has facilities for that. At this point i'm gonna be mostly following the same procedure as explained here (just about the only "doc" on writing a NetBeans plugin with the new parsing API):
http://blogtrader.net/dcaoyuan/entry/erlang_plugin_for_netbeans_in2
Overall it's quite a bit more work than i thought it would be !
tcolar Sun 28 Jun 2009
Just wanted to give an update. I have been very busy at work in the last 2 weeks so haven't had much time at all.
However i worked on it today, and i now have the ANTLR generated lexer used by Netbeans (Fan NB plugin) So that means:
This is just about all it does at this time (step1) but at least it's now usable.
Next step is to integrate the Parser ... which is what is needed for all the neat stuff (error checking, code folding, and so on).
Hopefully i'll have more free time by the end of next week and be able to get moving on this.