I keep finding myself wanting a method on List which iterates from index 1 to the end (skipping the first item) - kind of like cdr in Lisp.
Dummy example (if we didn't have List.join):
s = list.first.toStr
list.something |x| { s += ",$x" }
What should we call such a method?
Another option would be eachRange:
s = list.first.toStr
list.eachRange(1..-1) |x| { s += ",$x" }
andyThu 7 May 2009
Yeah I've wanted to iterate a subrange a few times myself. I think eachRange is what I'd like to have.
qualidafialThu 7 May 2009
Functionally this would be equivalent to list.slice(range).each, but without the copy overhead, correct?
brianThu 7 May 2009
Functionally this would be equivalent to list.slice(range).each, but without the copy overhead, correct?
Correct. Right now you have to use a for loop which seems a little un-Fan.
qualidafialThu 7 May 2009
I was just thinking that slice on a read-only list should share the same backing array as the original read-only list, the same way java.lang.String.substring shares its char[] array with the original string.
JohnDGFri 8 May 2009
What you really want, qualidafial, is probably something akin to the JFC's subList, which is merely a view on the original collection. This works for any collection.
s = list.first.toStr
list.sub(1..-1).each |x| { s += ",$x" }
Of course, slice on an immutable list (not just read-only -- that's insufficient) should indeed share the same backing.
KevinKelleyFri 8 May 2009
@Brian - I've ran into that thing too, where you want to hit all but the first (or last) of a list. Usually it's in a situation where I need both an element and its preceding (or following) element. Which is a bit awkward; each acts as an iterator on the list, but doesn't give you its neighbors. So, I've done
list.each |item, index| { if (index>0) doSomething(item, list[index-1]) }
more or less.
So I'm ambivalent on the Lisp cdr thing; On the one hand, when I found Lisp and its minimalist variants I really got stoked about how everything could be expressed so elegantly. But on the other hand, when I tried to put it to practical use, I kept finding that it was hard to turn practical problems into elegant answers. At least in reasonable time.
eachRange could be useful; but it's not complete. It helps sometimes but doesn't fix the problem.
I guess eachRange is just an inversion of something like:
...so maybe I'm suggesting, subsume this into a request for a Range.do method. or something. Range.each gives you:
(1..<list.size).each |i| { echo(list[i]) }
which to me anyway, seems to generalize better.
brianFri 8 May 2009
Regarding slice - today it does not share the same backing array (since I don't maintain an offset). Although it is a certainly an optimization we can add in the future, it doesn't effect the API semantics. Read-only lists do support the same backing store, and then copy on write to the original mutable list. That tends to be the really common case, which I optimized heavily for.
But I think that is really orthogonal to this discussion, I don't think you should be required to do slice[...].each. There should be a dedicated method for that. So I'm kind of thinking eachRange is the right way to go (which seems more OO than new methods on Range).
JohnDGFri 8 May 2009
What do you think about List.sub or List.range, retrieving a view of the list, ala java.util.List.subList? This would give you much more than just eachRange.
Helper methods are great, but even greater still are generic methods that can be used to do a lot of different things (and my experience with the JCF suggests that sublists are a very powerful abstraction).
// Loop over everything after first
list.range(1..-1).each |x| { ... }
// Clear everything after first
list.range(1..-1).clear
brianFri 8 May 2009
What do you think about List.sub or List.range, retrieving a view of the list, ala java.util.List.subList? This would give you much more than just eachRange.
How is that different than slice? Are you suggesting some expected semantics that it tracks modifications to the backing store? That seems kind of confusing to me.
JohnDGFri 8 May 2009
A sublist is a view of a list backed by the original list.
If you clear a sublist, you clear the portion of the original list that it represents. If you add items to a sublist, they are added to place in the original list that corresponds to the end of the sublist. Etc.
From the outside, a sublist appears to be an ordinary list, and all methods have normal semantics, it's just that the modifications actually occur to a region of the original list.
It's a very powerful construct that allows you to write much cleaner code (instead of keeping track of start/end indices, you just operate on lists, which are sublists of other lists).
JohnDGFri 8 May 2009
A slice is a copy of a region. A sublist is a view of a region. Both are lists, and were you to discard the original, a sublist would be semantically equivalent to a slice. But the whole point of sublists is that whatever you do to them affects the original list.
JohnDGFri 8 May 2009
Oh and one more thing: a sublist has virtually no overhead to construct (i.e. no copying of data).
brianFri 8 May 2009
A sublist is a view of a list backed by the original list.
Are you proposing a mutable sub-view over a mutable list? That seems really confusing to me, and I can't ever imagine a situation where I'd want that.
However, having an efficient way to get an read-only sub-view on a read-only list seems really useful - and doesn't cause any weird side-effects. In which case that is just an optimization on slice for read-only/immutable lists:
subList := list.ro[0..10]
But regardless, I still think we need a eachRange - that seems common enough that it shouldn't warrant multiple methods.
andyFri 8 May 2009
I think I side with Brian's comments on the sub-view concept. But regardless, eachRange is definitely something common enough to include in the API.
KevinKelleyFri 8 May 2009
The book Beautiful Code has a nice chapter on doing multi-dimensional array iterators for NumPy -- an N-dimensional array package for Python.
The author of that came up with a pretty well-generalized solution, apparently, with the concept of slicing to get access to a subset of the original array, without copying. One example use case that pops to mind is doing image processing -- iterate over an image array, taking 3x3 slices for blur/sharpen/whatever.
So, I think it's a powerful and useful idea. But it's not like it's a quick API change, it's deeper than just sugar.
I would suggest that, if somebody thinks it's useful, start with NumPy as an example and make a pod.
JohnDGFri 8 May 2009
Are you proposing a mutable sub-view over a mutable list? That seems really confusing to me, and I can't ever imagine a situation where I'd want that.
There's a reason it's in the JCF.
But regardless, I still think we need a eachRange
Fine with me, as long as it stops somewhere. Wouldn't want each sys class to have 50 special purpose methods when 5 generic ones used combinatorially would suffice.
brianFri 8 May 2009
I added an eachRange method - I think no matter what else we do that is useful.
qualidafialFri 8 May 2009
There's a reason it's in the JCF.
There's also some problems with the way it's done in JCF. Removals on the sublist are supported but additions are not. Structural changes on the superlist at or before the upper end of the sublist range do not shift the range accordingly on the sublist like you'd hope--it's just a shifted view with no correlation to events that happen in the original. Put in Fan terms:
Int[] a = [1,2,3,4,5]
Int[] b = a.sub(1..3)
echo(b) => [2,3,4]
a.removeAt(0)
echo(b) => [3,4,5]
Even if you could make the list track the original in a light-weight manner, there's still a question of whether additions on the superlist should be included in the sublist at either edge of the range:
Int[] a = [1,2,3,4,5]
Int[] b = a.sub(1..3)
echo(b) => [2,3,4]
a.insert(2, 0)
echo(b) => [2,0,3,4]
a.insert(1, -1)
echo(b) => [-1,2,0,3,4] or [2,0,3,4] ??
Fine with me, as long as it stops somewhere. Wouldn't want each sys class to have 50 special purpose methods when 5 generic ones used combinatorially would suffice.
I totally agree here--it feels like API bloat is starting to creep in. There's a more elegant solution to this somewhere, we just haven't found it yet.
brianFri 8 May 2009
I totally agree here--it feels like API bloat is starting to creep in. There's a more elegant solution to this somewhere, we just haven't found it yet.
I should note I have a philosophy of a few classes, but as many convenience methods on a given class as makes sense. So I have no problem with 50 methods on List if they are useful - I think that is way better than having methods which operator on Lists spread out across multiple classes (ie I hate the way Java's collection APIs do this). I subscribe to the "HumaneInferace" school of thought. What belongs and what doesn't is really a matter of taste.
But obviously combinatorial methods is better when it makes sense - which is why I tend to have convenience methods on all the key built-in types to translate into other types.
alexlamslFri 8 May 2009
Just make sure that it doesn't end up like Perl whereby no two programmer can understand each other's writings ;-)
brian Thu 7 May 2009
I keep finding myself wanting a method on List which iterates from index 1 to the end (skipping the first item) - kind of like cdr in Lisp.
Dummy example (if we didn't have List.join):
What should we call such a method?
Another option would be
eachRange
:andy Thu 7 May 2009
Yeah I've wanted to iterate a subrange a few times myself. I think
eachRange
is what I'd like to have.qualidafial Thu 7 May 2009
Functionally this would be equivalent to
list.slice(range).each
, but without the copy overhead, correct?brian Thu 7 May 2009
Correct. Right now you have to use a
for
loop which seems a little un-Fan.qualidafial Thu 7 May 2009
I was just thinking that
slice
on a read-only list should share the same backing array as the original read-only list, the same wayjava.lang.String.substring
shares itschar[]
array with the original string.JohnDG Fri 8 May 2009
What you really want, qualidafial, is probably something akin to the JFC's subList, which is merely a view on the original collection. This works for any collection.
Of course,
slice
on an immutable list (not just read-only -- that's insufficient) should indeed share the same backing.KevinKelley Fri 8 May 2009
@Brian - I've ran into that thing too, where you want to hit all but the first (or last) of a list. Usually it's in a situation where I need both an element and its preceding (or following) element. Which is a bit awkward;
each
acts as an iterator on the list, but doesn't give you its neighbors. So, I've donemore or less.
So I'm ambivalent on the Lisp cdr thing; On the one hand, when I found Lisp and its minimalist variants I really got stoked about how everything could be expressed so elegantly. But on the other hand, when I tried to put it to practical use, I kept finding that it was hard to turn practical problems into elegant answers. At least in reasonable time.
eachRange
could be useful; but it's not complete. It helps sometimes but doesn't fix the problem.I guess
eachRange
is just an inversion of something like:...so maybe I'm suggesting, subsume this into a request for a
Range.do
method. or something.Range.each
gives you:which to me anyway, seems to generalize better.
brian Fri 8 May 2009
Regarding slice - today it does not share the same backing array (since I don't maintain an offset). Although it is a certainly an optimization we can add in the future, it doesn't effect the API semantics. Read-only lists do support the same backing store, and then copy on write to the original mutable list. That tends to be the really common case, which I optimized heavily for.
But I think that is really orthogonal to this discussion, I don't think you should be required to do
slice[...].each
. There should be a dedicated method for that. So I'm kind of thinkingeachRange
is the right way to go (which seems more OO than new methods on Range).JohnDG Fri 8 May 2009
What do you think about
List.sub
orList.range
, retrieving a view of the list, alajava.util.List.subList
? This would give you much more than justeachRange
.Helper methods are great, but even greater still are generic methods that can be used to do a lot of different things (and my experience with the JCF suggests that sublists are a very powerful abstraction).
brian Fri 8 May 2009
How is that different than
slice
? Are you suggesting some expected semantics that it tracks modifications to the backing store? That seems kind of confusing to me.JohnDG Fri 8 May 2009
A sublist is a view of a list backed by the original list.
If you clear a sublist, you clear the portion of the original list that it represents. If you add items to a sublist, they are added to place in the original list that corresponds to the end of the sublist. Etc.
From the outside, a sublist appears to be an ordinary list, and all methods have normal semantics, it's just that the modifications actually occur to a region of the original list.
It's a very powerful construct that allows you to write much cleaner code (instead of keeping track of start/end indices, you just operate on lists, which are sublists of other lists).
JohnDG Fri 8 May 2009
A slice is a copy of a region. A sublist is a view of a region. Both are lists, and were you to discard the original, a sublist would be semantically equivalent to a slice. But the whole point of sublists is that whatever you do to them affects the original list.
JohnDG Fri 8 May 2009
Oh and one more thing: a sublist has virtually no overhead to construct (i.e. no copying of data).
brian Fri 8 May 2009
Are you proposing a mutable sub-view over a mutable list? That seems really confusing to me, and I can't ever imagine a situation where I'd want that.
However, having an efficient way to get an read-only sub-view on a read-only list seems really useful - and doesn't cause any weird side-effects. In which case that is just an optimization on
slice
for read-only/immutable lists:But regardless, I still think we need a
eachRange
- that seems common enough that it shouldn't warrant multiple methods.andy Fri 8 May 2009
I think I side with Brian's comments on the sub-view concept. But regardless,
eachRange
is definitely something common enough to include in the API.KevinKelley Fri 8 May 2009
The book
Beautiful Code
has a nice chapter on doing multi-dimensional array iterators for NumPy -- an N-dimensional array package for Python.The author of that came up with a pretty well-generalized solution, apparently, with the concept of slicing to get access to a subset of the original array, without copying. One example use case that pops to mind is doing image processing -- iterate over an image array, taking 3x3 slices for blur/sharpen/whatever.
So, I think it's a powerful and useful idea. But it's not like it's a quick API change, it's deeper than just sugar.
I would suggest that, if somebody thinks it's useful, start with NumPy as an example and make a pod.
JohnDG Fri 8 May 2009
There's a reason it's in the JCF.
Fine with me, as long as it stops somewhere. Wouldn't want each
sys
class to have 50 special purpose methods when 5 generic ones used combinatorially would suffice.brian Fri 8 May 2009
I added an
eachRange
method - I think no matter what else we do that is useful.qualidafial Fri 8 May 2009
There's also some problems with the way it's done in JCF. Removals on the sublist are supported but additions are not. Structural changes on the superlist at or before the upper end of the sublist range do not shift the range accordingly on the sublist like you'd hope--it's just a shifted view with no correlation to events that happen in the original. Put in Fan terms:
Even if you could make the list track the original in a light-weight manner, there's still a question of whether additions on the superlist should be included in the sublist at either edge of the range:
I totally agree here--it feels like API bloat is starting to creep in. There's a more elegant solution to this somewhere, we just haven't found it yet.
brian Fri 8 May 2009
I should note I have a philosophy of a few classes, but as many convenience methods on a given class as makes sense. So I have no problem with 50 methods on List if they are useful - I think that is way better than having methods which operator on Lists spread out across multiple classes (ie I hate the way Java's collection APIs do this). I subscribe to the "HumaneInferace" school of thought. What belongs and what doesn't is really a matter of taste.
But obviously combinatorial methods is better when it makes sense - which is why I tend to have convenience methods on all the key built-in types to translate into other types.
alexlamsl Fri 8 May 2009
Just make sure that it doesn't end up like Perl whereby no two programmer can understand each other's writings ;-)