freddy33 Mon 6 Apr 2009
While writing the full expression parser for the IntelliJ plugin, I bumped into a single limitation of the grammar. Apart from this issue, I managed to identify the types of all the AST nodes (identifier, class reference, slots, ...). The one problem is with list literals. The type of a list literal can be inferred or declared. When it is declared, you don't actually need to declare it as a List type, since the last open bracket starting the literal is already used to "declare" it as a list. So we end up with Widget[n, m] actually meaning Widget[][n, m], and the parser cannot tell this expression apart from an indexed expression like widget[n] where n is an integer! We therefore cannot provide contextual help inside the brackets until we know exactly what Widget and widget are. Usually numbers and ranges can get IDE assistance just from knowing we are inside the brackets of an indexed expression. In the entire Fan code base this issue appears in only 9 files. I know that in development I directly use [1,2], ["a","b"] or Num[][1,2], but Num[1,2] not that much?
Anyway, not very critical, but I think it's an interesting finding about this very nice grammar.
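To make the ambiguity concrete, here is a small Python sketch. It is not the plugin's lexer; the token names and the `token_kinds` helper are made up for illustration. It only shows that, at the token level, a typed list literal and an indexed expression have exactly the same shape.

```python
import re

# Hypothetical token set, just large enough for the expressions in this post.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<IDENT>[A-Za-z_]\w*)"
    r"|(?P<NUM>\d+)"
    r"|(?P<LBRACKET>\[)"
    r"|(?P<RBRACKET>\])"
    r"|(?P<COMMA>,))"
)


def token_kinds(expr):
    """Return the list of token kinds for expr, ignoring the token text."""
    expr = expr.rstrip()
    pos, kinds = 0, []
    while pos < len(expr):
        m = TOKEN_RE.match(expr, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        kinds.append(m.lastgroup)  # name of the alternative that matched
        pos = m.end()
    return kinds


# Same token shape, so syntax alone cannot separate the two readings.
same_shape = token_kinds("Widget[n]") == token_kinds("widget[n]")
```

Both calls yield IDENT, LBRACKET, IDENT, RBRACKET, so whatever separates the two cases has to come from knowing what the leading identifier refers to.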
brian Mon 6 Apr 2009
Fred,
The way I handle that is with a two-pass parser. The first pass looks at the using statements to build a map of all the types imported by the compilation unit. After that first pass we know every identifier that represents a type name. Once you know whether an identifier is a type or not, the grammar is unambiguous.
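A minimal Python sketch of that two-pass idea follows. It is not the Fan compiler's code. It assumes only the two simplest forms, `using pod` and `using pod::Type`, and it takes a hypothetical `pod_types` dictionary standing in for whatever index tells you which types a pod declares.

```python
import re

# Assumption: only `using pod` and `using pod::Type` are handled here.
USING_RE = re.compile(r"^\s*using\s+(\w+)(?:::(\w+))?\s*$")


def collect_type_names(source, pod_types):
    """First pass: gather every type name imported by the using statements.

    pod_types is a hypothetical mapping of pod name -> declared type names.
    """
    names = set()
    for line in source.splitlines():
        m = USING_RE.match(line)
        if m is None:
            continue
        pod, type_name = m.groups()
        if type_name is not None:
            names.add(type_name)                  # using pod::Type
        else:
            names.update(pod_types.get(pod, ()))  # using pod
    return names


def classify_bracket_expr(expr, type_names):
    """Second pass: decide how to read `name[...]` once type names are known."""
    head = expr.split("[", 1)[0].strip()
    return "list-literal" if head in type_names else "index-expression"
```

With the set returned by `collect_type_names` in hand, `Widget[n, m]` is read as a list literal as soon as `Widget` is in the set, and `widget[n]` falls through to an indexed expression.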
One downside to this design is that you can't use an identifier which maps to an imported type for anything else. For example, in Java you can do stuff like:
String String = "hello";
You can't do that in Fan due to the two-pass compiler design.
The other downside is that you need to know all the types declared within a given pod name. Of course in Fan this is easy. For the Java FFI it is a bit trickier.
jodastephen Mon 6 Apr 2009
I know this might seem like heresy, but why not force all types to start with a capital letter, and all identifiers to start with anything except a capital letter (typically a lowercase letter)? This matches the vast majority of code written, and must make a big difference to the overall parser. It also avoids code like the above becoming invalid just because an import statement is added.
brian Mon 6 Apr 2009
> I know this might seem like heresy, but why not force all types to start with a capital letter, and all identifiers to start with anything except a capital letter (typically a lowercase letter)?
I considered that approach a couple of years ago when this came up, but it was a non-starter. Today the problem is that it would totally break the Java FFI, since Java uses the screaming-caps convention for constant fields.
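A short Python sketch of why the capitalization rule clashes with Java constants. The `looks_like_type` and `misclassified` helpers and the symbol table are illustrative assumptions; `MAX_VALUE` and `PI` stand for Java constant fields such as `Integer.MAX_VALUE` and `Math.PI`.

```python
def looks_like_type(identifier):
    """Proposed rule: an identifier names a type iff it starts with a capital."""
    return identifier[:1].isupper()


def misclassified(symbols):
    """Return the names for which the rule disagrees with the real kind.

    symbols maps identifier -> "type", "field" or "local" (illustrative kinds).
    """
    return [
        name
        for name, kind in symbols.items()
        if looks_like_type(name) != (kind == "type")
    ]


# Illustrative symbol table mixing Fan-style names and Java constants.
symbols = {
    "Widget": "type",
    "widget": "local",
    "MAX_VALUE": "field",  # e.g. Integer.MAX_VALUE reached through the FFI
    "PI": "field",         # e.g. Math.PI
}
```

Under this rule `Widget` and `widget` are classified correctly, but both constant fields would be taken for type names, which is the breakage described above.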