ivan Thu 1 Oct 2009
Hello, it looks like there is a problem with sys::InStream.readObj. Here is a dummy example: it successfully reads the first object, but then fails with IOErr: Unexpected symbol: / (0x2f). However, everything works fine when there is an EOS after the first object.
The Fandoc for sys::InStream.readObj says:
This method may consume bytes/chars past the end of the serialized object (we may want to add a "full stop" token at some point to support compound object streams).
Is there any news regarding adding such a token?
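A minimal sketch of the kind of dummy example being described (an assumed reconstruction, not ivan's original code):

  buf := Buf()
  out := buf.out
  out.writeObj(`/home/ivan/`)   // first object
  out.writeObj(`/home/ivan/`)   // second object, straight after the first

  in := buf.flip.in
  echo(in.readObj)   // the first Uri reads fine...
  echo(in.readObj)   // ...but the second read fails, because the first readObj
                     // already consumed chars past the end of the first object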
ivan Thu 1 Oct 2009
Found a temporary workaround:
out := `/home/ivan/output`.toFile.out
3.times
{
  buf := StrBuf()
  buf.out.writeObj(`/home/ivan/`)   // serialize each object to its own string
  out.writeUtf(buf.toStr)           // writeUtf length-prefixes it on the stream
}
out.flush.close

in := `/home/ivan/output`.toFile.in
3.times { echo(in.readUtf().in.readObj) }   // read each chunk back and deserialize it
brian Thu 1 Oct 2009
Yeah, it is difficult to parse text off a stream without some sort of look ahead (which tends to be on a token basis, not necessarily a char basis).
The way I typically handle it is to create some record separator myself in the stream, then use that to chunk the stream into serialized objects.
Although I am open to trying to improve the current design with some "stop token".
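A minimal sketch of the record-separator approach brian describes (assumed code, using a NUL char as the separator on the assumption that it never appears in the serialized text):

  sep := '\u0000'
  buf := Buf()
  out := buf.out

  // write each object followed by the separator char
  [`/a/`, `/b/`, `/c/`].each |obj|
  {
    out.writeObj(obj)
    out.writeChar(sep)
  }

  // read up to each separator, then deserialize each chunk on its own
  in := buf.flip.in
  3.times
  {
    chunk := StrBuf()
    c := in.readChar
    while (c != null && c != sep)
    {
      chunk.addChar(c)
      c = in.readChar
    }
    echo(chunk.toStr.in.readObj)
  }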
SlimerDude Tue 17 Jun 2014
Related to the above, it still seems to be the case that sys::InStream.readObj can only read one Obj from a stream.
I ran into this when writing the Binary object in BSON. Wanting to serialise it, I thought "Easy! Just provide a toStr() and a fromStr() to write / read the values and mark it as @Serializable { simple = true }".
Essentially all I had was an Int and a Str, so I tried this:
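A guess at the kind of code being tried here (the original snippet isn't shown, so the class layout below is an assumption):

  @Serializable { simple = true }
  class Binary
  {
    Int myInt
    Str myStr

    new make(Int myInt, Str myStr) { this.myInt = myInt; this.myStr = myStr }

    // simple serialization: writeObj calls toStr, readObj calls fromStr
    override Str toStr()
    {
      buf := Buf()
      buf.writeObj(myInt)
      buf.writeObj(myStr)
      return buf.flip.readAllStr
    }

    static new fromStr(Str str)
    {
      buf   := str.toBuf
      myInt := (Int) buf.readObj
      myStr := (Str) buf.readObj   // EOS Err here: the first readObj consumed
                                   // chars past the end of the first object
      return Binary(myInt, myStr)
    }
  }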
But then I got an EOS Err when reading myStr, presumably due to readObj():
This method may consume bytes/chars past the end of the serialized object
My workaround was to seek() to the end of the first Obj and continue reading. It seems to work fine:
static new fromStr(Str str) {
  buf := str.toBuf
  myInt := buf.readObj
  // this next line is horrible, but works! re-encode the first object
  // to find out how many bytes it took, then seek to just past it
  buf.seek(Buf().writeObj(myInt).pos)
  myStr := (Str) buf.readObj
  return Binary(myInt, myStr)
}
I was just wondering if this method of seeking to the end of an Object could be utilised by InStream.readObj() so it can read multiple objects from the same stream. (It would be really useful!)
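For what it's worth, generalising that seek trick to read any number of objects from an in-memory (or otherwise seekable) Buf might look like the sketch below; it leans on the same assumption as the code above, namely that re-encoding an object yields the same number of bytes the original encoding used:

  class ObjChunker
  {
    static Obj?[] readAllObjs(Buf buf)
    {
      objs := Obj?[,]
      pos  := 0
      while (pos < buf.size)
      {
        buf.seek(pos)                     // jump to the start of the next object
        obj := buf.readObj
        objs.add(obj)
        pos += Buf().writeObj(obj).pos    // re-encode to measure how many bytes it took
      }
      return objs
    }
  }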
brian Tue 17 Jun 2014
It's really just a text tokenizing thing: you are typically looking ahead at a few tokens. So that code was all designed to suck in the entire stream, or else use some other breaking mechanism to combine multiple objects together. The seek trick only works off a random access file (it wouldn't work off a socket stream, say).
SlimerDude Tue 2 Sep 2014
I've been looking at and thinking about this again.
I can see that all the code is in the Java fanx.serial package and that the tokenising you talk about is in the aptly named Tokenizer class. I was trying to understand why, when reading an Obj, you would need to read beyond its end.
Complex objects seem easy enough - they end with a } - so just don't read beyond the last }!
I guess the problem is with (simple!) literals, especially numerical ones. For example, is 42 one number or is it two numbers, a 4 followed by a 2?
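To make that concrete (a sketch of assumed behaviour, not a claim about ObjEncoder's exact output):

  buf := Buf()
  buf.writeObj(4)
  buf.writeObj(2)
  echo(buf.flip.readAllStr)   // the two literals end up back-to-back, e.g. "42",
                              // with nothing to say where the first one ends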
The idea of a stop token seems a bit kludgy to me, but with the above literal problem I don't see a way around it. Unless ObjEncoder.java, when encoding top-level Objs, wrote out literals longhand, say 42 became sys::Int("42") or similar.
Going back to the stop token, would a ; char work? It's readable and understandable. It already represents end of statement, so for it to further represent end of object isn't such a big leap.
rasa Wed 3 Sep 2014
@brian
It's really just a text tokenizing thing: you are typically looking ahead at a few tokens. So that code was all designed to suck in the entire stream, or else use some other breaking mechanism to combine multiple objects together. The seek trick only works off a random access file (it wouldn't work off a socket stream, say).
You made so many pros-and-cons compromises during the Fantom design, so I don't understand why you bother with things like deserialization over sockets. I find it so rare that the solution might be to copy the stream to a temp file and then use random access to deserialize objects from it. If someone wants deserialization over a stream, they can switch to Java's serialization capabilities. Besides, what's the benefit of having text-serialized files at the remote end?
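A sketch of that copy-to-a-temp-file idea (assumed code; socketIn stands in for an InStream obtained from a socket):

  file := File.createTemp("objs", ".fog")
  socketIn.pipe(file.out)     // spool the whole stream to disk first
  buf := file.open("r")       // random-access Buf over the temp file
  echo(buf.readObj)           // seek-based tricks like the one above now apply
  buf.close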