SlimerDude Wed 30 Mar 2016
I discovered recently, whilst trying to read and parse a SQL file, that Buf.eachLine() and Buf.readAllLines() both truncate lines at 4096 chars.
str := "".padl(5000, 'a') + "\n" + "".padl(5000, 'b') + "\n"
echo("\n eachLine:")
str.toBuf.eachLine {
    echo("Line is ${it.size} chars")
}
echo("\n readAllLines:")
str.toBuf.readAllLines.each {
    echo("Line is ${it.size} chars")
}
eachLine:
Line is 4096 chars
Line is 904 chars
Line is 4096 chars
Line is 904 chars
readAllLines:
Line is 4096 chars
Line is 904 chars
Line is 4096 chars
Line is 904 chars
There is no mention of this in the docs, and no means to increase the limit.
Only by looking at the Java source did I find they both made a call to in.readLine() with the default max length of 4096.
While I understand the need for a limit when dealing with streams, it seems unnecessary when using Bufs when everything is already in memory. Though it would be nice to keep the Buf and Stream APIs the same.
A compromise may be to propagate the truncation limit down to eachLine() and readAllLines(), as in:
Str[] readAllLines(Int? max := 4096)
Void eachLine(|Str line| f, Int? max := 4096)
The change should also be backwards compatible.
Work Around
For now you can use readLine() instead. It's a lot more code and finicky to use, but you are able to specify a truncation limit.
echo("\n readLine:")
buf := str.toBuf
line := null as Str
while ((line = buf.in.readLine(Int.maxVal)) != null) {
    echo("Line is ${line.size} chars")
}
readLine:
Line is 5000 chars
Line is 5000 chars
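If the loop is needed in more than one place, it can be wrapped in a small helper so that callers keep the closure style of eachLine(). The following is only a sketch: LineUtil and its eachLine() method are hypothetical names, not part of the Fantom API, and max is a required parameter here because it comes before the closure.

```fantom
class LineUtil {

    // Hypothetical helper, not part of sys::Buf.
    // Calls 'f' once per line, truncating each line at 'max' chars.
    static Void eachLine(Buf buf, Int? max, |Str line| f) {
        in := buf.in
        Str? line := null
        while ((line = in.readLine(max)) != null) {
            f(line)
        }
    }

    static Void main() {
        str := "".padl(5000, 'a') + "\n" + "".padl(5000, 'b') + "\n"

        // Same limit as the workaround above
        eachLine(str.toBuf, Int.maxVal) |line| {
            echo("Line is ${line.size} chars")
        }
    }
}
```

Because the body is still a closure, a return inside it only ends the current line's callback, just as it does with Buf.eachLine().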
Side Note
A lot of the docs for Buf simply say:
Convenience for InStream.XXX
Making you navigate to InStream to read about it! It'd be really nice if those few lines of documentation from InStream could be copied over to Buf!
SlimerDude Wed 30 Mar 2016
P.S. Here's a Top Tip when converting existing code from using eachLine() to readLine()...
Make sure you change any return statements to continue! Otherwise the return will now exit the method, not the closure!
str.toBuf.eachLine |line| {
    if (...) {
        return // <--- from this
    }
}

buf := str.toBuf
line := null as Str
while ((line = buf.in.readLine(Int.maxVal)) != null) {
    if (...) {
        continue // <--- to this
    }
}
Doh!
brian Wed 6 Apr 2016
Definitely not good to have that omitted from the docs - I pushed a fix for that.
Adding maxLine to eachLine won't work though, because the closure param needs to be at the end - that is why I originally designed it the way it is
SlimerDude Wed 6 Apr 2016
eachLine() won't work because the closure param needs to be at the end
Oh yeah, good point. Nothing stopping maxLine from being added to readAllLines() though! :)
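As a rough illustration of what that could look like, here is a sketch written as a standalone helper instead of a change to Buf itself. BufUtil is a hypothetical class name; the default of 4096 simply mirrors the readLine() default discussed above.

```fantom
class BufUtil {

    // Hypothetical helper, not part of sys::Buf.
    // Reads every remaining line from 'buf', truncating each at 'max' chars.
    static Str[] readAllLines(Buf buf, Int? max := 4096) {
        lines := Str[,]
        in := buf.in
        Str? line := null
        while ((line = in.readLine(max)) != null) {
            lines.add(line)
        }
        return lines
    }

    static Void main() {
        str := "".padl(5000, 'a') + "\n" + "".padl(5000, 'b') + "\n"

        // Existing callers could omit 'max' and keep the 4096 default
        readAllLines(str.toBuf, Int.maxVal).each {
            echo("Line is ${it.size} chars")
        }
    }
}
```

Since max is the last parameter and has a default, there is no clash with a trailing closure, which is the problem eachLine() runs into.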
brian Fri 16 Sep 2016
After a bit of internal discussion (this has been a lingering issue), we changed the default behavior to a max of null for readLine and associated helper methods like eachLine. The limit always seems to cause odd bugs, which seem to outweigh the benefits of trying to be safe in memory consumption (we do have methods like readAllStr and readAllBuf anyhow).
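For anyone who still wants a cap after this change, passing one to readLine() explicitly should do it. This is a minimal sketch that assumes a Fantom build including the change described above; the class name Main and the 4096 figure are just illustrative.

```fantom
class Main {
    static Void main() {
        str := "".padl(5000, 'a') + "\n"

        // No explicit max: uses whatever default the installed build has
        str.toBuf.readAllLines.each {
            echo("Line is ${it.size} chars")
        }

        // Explicit max: lines are truncated at 4096 chars regardless of the default
        in := str.toBuf.in
        Str? line := null
        while ((line = in.readLine(4096)) != null) {
            echo("Capped line is ${line.size} chars")
        }
    }
}
```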