Yuri Strot Wed 19 Jun 2013
I have found an interesting character that can't be parsed by Fantom streams.
There is a simple Java program:
Which works correctly and prints this:
Now if I try to use this character in Fantom I will get this:
The same problem occurs when this symbol is in a file.
P.S. Obviously I couldn't use this character in the post, so I replaced it with a <magic> :-)
KevinKelley Wed 19 Jun 2013
That 4-byte sequence [-16, -97, -116, -128] is 0xF09F8C80 (U+1F300), which looks like (from the UTF-8 Wikipedia article) 21 bits of data; UniSearcher calls it "cyclone" in Miscellaneous Symbols and Pictographs; I guess it's outside of the basic plane anyway.
I guess that's why Java's treating it as 2 chars (a surrogate pair).
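A quick way to check the byte arithmetic is to let Java decode the sequence and inspect the result. This is only a minimal sketch: the class name `CycloneBytes` and the helper `hex` are made up for illustration and are not from the original program in this thread.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: class and method names are hypothetical.
final class CycloneBytes {
    // The four signed bytes quoted in the thread, i.e. 0xF0 0x9F 0x8C 0x80.
    static final byte[] BYTES = { -16, -97, -116, -128 };

    // Formats each byte as two upper-case hex digits (unsigned view).
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = new String(BYTES, StandardCharsets.UTF_8);
        System.out.println(hex(BYTES));                            // F09F8C80
        System.out.println(s.length());                            // 2 (UTF-16 chars)
        System.out.println(s.codePointCount(0, s.length()));       // 1 (code point)
        System.out.println(Integer.toHexString(s.codePointAt(0))); // 1f300
    }
}
```

The string holds one code point stored as two UTF-16 chars, which is why `length()` reports 2.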
Quick look in Fantom source, src/sys/java/fan/sys/InStream.java at the readUtf method, appears only to recognize up to 3-byte encodings, and reports that error for anything else.
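For comparison, here is a rough sketch of what recognizing 4-byte encodings involves. It is not Fantom's `readUtf`; the class `Utf8Sketch` and method `decodeAt` are hypothetical, and the code works on a byte array rather than a stream to keep it short. It also does not reject overlong encodings or encoded surrogates, which a real decoder should.

```java
// Hypothetical decoder for illustration; NOT the Fantom implementation.
final class Utf8Sketch {
    // Decodes the UTF-8 sequence starting at buf[pos] and returns its code point.
    static int decodeAt(byte[] buf, int pos) {
        int b0 = buf[pos] & 0xFF;
        if (b0 < 0x80) return b0;                        // 0xxxxxxx: 1 byte
        int extra;  // number of continuation bytes expected
        int cp;     // payload bits collected so far
        if ((b0 & 0xE0) == 0xC0)      { extra = 1; cp = b0 & 0x1F; } // 110xxxxx
        else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; } // 1110xxxx
        else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; } // 11110xxx
        else throw new IllegalArgumentException(
                "Invalid UTF-8 lead byte: 0x" + Integer.toHexString(b0));
        if (pos + extra >= buf.length) {
            throw new IllegalArgumentException("Truncated UTF-8 sequence");
        }
        for (int i = 1; i <= extra; i++) {
            int b = buf[pos + i] & 0xFF;
            if ((b & 0xC0) != 0x80) {                    // must be 10xxxxxx
                throw new IllegalArgumentException("Invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (b & 0x3F);                 // append 6 payload bits
        }
        return cp;
    }

    public static void main(String[] args) {
        byte[] bytes = { -16, -97, -116, -128 };
        int cp = decodeAt(bytes, 0);
        System.out.println("U+" + Integer.toHexString(cp).toUpperCase()); // U+1F300
        // Code points above U+FFFF need a surrogate pair in a Java String.
        System.out.println(Character.toChars(cp).length);                 // 2
    }
}
```

The 4-byte branch contributes 3 + 6 + 6 + 6 = 21 payload bits, matching the "21 bits of data" noted above.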