#399 Tuples and Multiple Return Values

katox Thu 20 Nov 2008

Continuing discussion from Quick composites and pattern matching and python tuples.

jodastephen:

Perhaps we are at a point where the diverging forces pull Fan in one direction or another. I don't believe Fan is a functional language. I see it as an OO language that takes the best ideas and what we've learnt from Java and C#.

Multiple return values have always come up as a solution looking for a problem in my world. Values can be returned as a map, array, or dedicated little class. But allowing tuples opens a door that many, many developers will abuse like nobody's business. Like all language design it's about the tradeoff, and I think tuples are very rarely needed, and very easily abused. Hence we should leave them out.

This thread shouldn't be interpreted as just being about simple one or two field classes. We should consider easy creation of larger domain objects as a goal. The fields of Fan greatly help this over Java, but we need some way to get equals/hashCode/constructor behaviours too. The examples I started with demonstrate this - for simple classes I should not need to write an equals method.

The need to handle multiple return values comes up often. It has been widely discussed even for procedural languages. There are different solutions to that - usually the languages allow in/out parameters or suggest returning a map, a list or an array (mostly untyped). I find those solutions quite weak.

This is a new refactoring feature introduced in IDEA 8 - introducing a parameter object. It works but it also clutters your code with utilitarian classes that exist for no other reason than to work around the MVR problem. IMHO type safe tuples capture the programmer's intent much more cleanly (and they are also way shorter).
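To make the comparison concrete, here is a minimal sketch in Python, used only because it already has tuples. The function names and sample data are made up for illustration and do not come from Fan. One version returns a map, the other a typed pair.

```python
from typing import Dict, List, Tuple


def min_max_untyped(values: List[int]) -> Dict[str, int]:
    """Map-based workaround: callers must know the key names."""
    return {"min": min(values), "max": max(values)}


def min_max(values: List[int]) -> Tuple[int, int]:
    """Tuple-based version: arity and element types are in the signature."""
    return min(values), max(values)


# Hypothetical sample data.
readings = [7, 3, 9, 4]

result = min_max_untyped(readings)
# A mistyped key such as "mim" would only fail at run time.
lo_untyped, hi_untyped = result["min"], result["max"]

# The pair is unpacked directly into two names.
lo, hi = min_max(readings)
```

With the tuple signature a static checker can verify both the number of values and their types, which is the "type safe" part of the argument.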

Stephen, what kind of abuse do you have in mind when it comes to tuples?

jodastephen Thu 20 Nov 2008

Should a method returning a point x and y return a tuple? or a Point object:

x,y := line.startPoint

point := line.startPoint

Which provides more real semantic information? The kind that will last the lifetime of a large enterprise project.

More specifically, how do you document the meaning of the return values if you use a tuple? You would need to write it in the method startPoint() and in every similar method. With a Point class, you define and document it once.
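The documentation argument can be sketched as follows. This is Python rather than Fan, and the Line class and its method names are hypothetical, chosen only to mirror the two call styles above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Point:
    """A position in pixels. The meaning of x and y is documented once, here."""
    x: int
    y: int


class Line:
    """Hypothetical line type, used only for this illustration."""

    def __init__(self, x1: int, y1: int, x2: int, y2: int) -> None:
        self._coords = (x1, y1, x2, y2)

    def start_point_tuple(self) -> Tuple[int, int]:
        """Return (x, y) of the start; this has to be restated on every similar method."""
        return self._coords[0], self._coords[1]

    def start_point(self) -> Point:
        """Return the start of the line."""
        return Point(self._coords[0], self._coords[1])


line = Line(1, 2, 5, 8)
x, y = line.start_point_tuple()   # tuple style
point = line.start_point()        # named-class style
```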

Basically, I'm arguing that a desire to return two values is driven by one of two things:

return two related values - in this case, the values have some meaning together, so good OO design practice means they should be represented as a class
return two unrelated values - in this case your API is almost certainly wrong. Perhaps the two return values are really the state of a processor or builder object on which the method you are calling should actually reside.

Allowing a tuple in either case would allow developers to be lazy, and worse quality systems to be produced.

I'd bat the question back. Why do you think the use case comes up often? Do you perhaps code in an FP style, which makes tuples a higher requirement?

katox Thu 20 Nov 2008

What about this

class Canvas {
  ...
  Void drawPoint(Int x, Int y, Colour colour) { gc.putPixel(x,y, colour); }
}

..
line.draw (10, 20, c)

and this

class Point {
   Int x
   Int y

   new make(Int x, Int y) {
       this.x = x
       this.y = y
   }

   /* + equals, hash, compare */
 }

class Canvas {
   ...
   Void drawPoint(Point p, Colour colour) { gc.putPixel(p.x, p.y, colour) }
}
...
line.draw(Point(10,10), c)

now back to your original question

Which provides more real semantic information?

Neither. It depends on the context. If the complete API is built around a Point object then it makes sense to define it that way, disregarding that there is a lot of boilerplate. It just pays off later. If the other APIs are built mostly using primitive types, then the first way may be better. Which one do you find more readable BTW?

You introduce a type Point to put x and y logically together. Fine. Do you also add a ColouredPoint subtype to carry the colour (plus redefinitions of equals, hash, compare and make)? It is often more convenient not to do so.

The same holds for return types. You can mess up the API - nothing can stop you from that either way.

I see the following problems with ad-hoc classes:

lots of boilerplate - quite unnecessary if you don't reuse the class + you can mess up the equals,compare,hash functions
ad-hoc classes have no behaviour - how is that OO?
there are always very similar ad-hoc classes which are used for similar purposes - but not quite - there are subtle differences and they usually don't implement any common interface so you have to go with reflection or even more boilerplate code - this leads to brittle code
fallback to Obj[] is horrifying
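The boilerplate point can be illustrated with a small Python sketch. PixelPos is a hypothetical hand-written value class; the plain tuple next to it gets equality and hashing without any code.

```python
class PixelPos:
    """Hand-written value class: equality and hashing must be kept in sync by hand."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, PixelPos):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self) -> int:
        # Forgetting this method, or hashing different fields than __eq__
        # compares, is the kind of mistake mentioned above.
        return hash((self.x, self.y))


seen_objects = {PixelPos(10, 20)}
seen_tuples = {(10, 20)}

object_found = PixelPos(10, 20) in seen_objects
tuple_found = (10, 20) in seen_tuples
```

Both lookups succeed, but only the class version needed code to make that happen.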

This was about putPixel. What about getPixel? Do you feel a major difference because you want to read a pixel instead of writing it? Why?

I admit I seldom use tuples in public APIs but they are very useful internally. It's a sort of convenience feature (saves quite a lot of typing) but programmers are lazy. There is nothing wrong with being too lazy to type inexpressive clutter instead of terse simple code.

I also try to stick with primitive types in APIs (or already declared classes) because of the reasons outlined above. Tuples work just fine for string parsing, expressing immediate state information and representing not very related values which are always used separately in existing APIs.

Tuples can also save you some trouble if you need to return a value together with some return code information. You don't need to save the state to the object itself and then query it again. You just accept it in the return tuple or you can ignore it (that's very similar to how Fan handles unnecessary input parameters). It's type safe and you can document it in exactly the same way you document your input parameters.
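A minimal Python sketch of the value-plus-status pattern follows. The function parse_port and its status strings are invented for this example.

```python
from typing import Optional, Tuple


def parse_port(text: str) -> Tuple[Optional[int], str]:
    """Return (port, status); port is None when parsing fails."""
    try:
        port = int(text)
    except ValueError:
        return None, "not a number"
    if not 0 < port < 65536:
        return None, "out of range"
    return port, "ok"


# Caller that wants both pieces of information.
port, status = parse_port("8080")

# Caller that inspects the failure reason.
bad_port, bad_status = parse_port("http")

# Caller that ignores the status, using the underscore convention.
port_only, _ = parse_port("443")
```

No state is stored on an object between the call and the status check, so nothing has to be queried afterwards.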

Allowing a tuple in either case would allow developers to be lazy, and worse quality systems to be produced.

That's sheer speculation. Frankly, I had no problems with this - unlike in our Java code where lazy developers don't implement equals correctly. Or they insert (in seemingly random places) "exception handling" by {} to avoid checked exceptions.

Maybe someone with experience with larger Python or Ruby projects could share a view on this.

Why do you think the use case comes up often?

It does. I think it depends on what you are used to. I think it is very similar to closures btw. You can express the same in Java - but it is longer and making a wrapper class around it adds no value. And a combination of closures and tuples is also very interesting and useful.

helium Thu 20 Nov 2008

There arise some questions when you integrate tuples.

() is the empty tuple, the unit tuple, i.e. Void. Can we now denote its value? Does it ever make sense in a language with such an unexpressive type system?

How do we treat singleton tuples? Are they just what they are: single values or is there some special (useless?) type for singleton tuples (Python has (42,) for a singleton tuple consisting of 42).

Are tuples immutable? Then they can be covariant.

Void foo(x : (Base, Base)) { ... }

tuple := (DerivedFromBase(), DerivedFromBase())

foo(tuple)  // OK, if tuples are immutable, unsafe otherwise
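Why mutability makes covariance unsafe can be sketched in Python. The classes and functions below are hypothetical. Python does not enforce the annotations at run time, so the unsafe call actually executes and shows the damage a static checker would prevent.

```python
from typing import List, Tuple


class Base:
    pass


class Derived(Base):
    def derived_only(self) -> str:
        return "derived"


def overwrite_first(pair: List[Base]) -> None:
    """Legal for a mutable pair of Base: store a plain Base in slot 0."""
    pair[0] = Base()


def read_first(pair: Tuple[Base, Base]) -> Base:
    """An immutable pair can only be read, so passing Derived elements is safe."""
    return pair[0]


mutable_pair: List[Derived] = [Derived(), Derived()]
# A static checker would reject this call; at run time it succeeds.
overwrite_first(mutable_pair)
# Slot 0 is declared as Derived but no longer has the Derived method.
broken = not hasattr(mutable_pair[0], "derived_only")

immutable_pair: Tuple[Derived, Derived] = (Derived(), Derived())
first = read_first(immutable_pair)
```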

Immutable tuples would also allow bigger tuples to be subtypes of smaller tuples, i.e. (T, U, V) <: (T, U) and (T, U) <: T (or (T, U) <: (T) if you tag singleton tuples). <: here means "is subtype of".

Recap:

unit tuple denotable?
singleton tuples just normal single values?
immutable?
if immutable covariant?
if immutable are bigger tuples subtypes of smaller tuples?

BTW, this last rule would imply that everything is a subtype of the empty tuple, i.e. Void.

helium Thu 20 Nov 2008

Maybe someone with experience with larger Python or Ruby projects could share a view on this.

Someone with experience with larger Ruby projects wouldn't be of much help as Ruby does not have tuples.

katox Thu 20 Nov 2008

Someone with experience with larger Ruby projects wouldn't be of much help as Ruby does not have tuples.

No, but it does allow MVR. See this MVR example.

jodastephen Fri 21 Nov 2008

> Which provides more real semantic information? > Neither. It depends on the context. If the complete API is built around a Point object then it makes sense to define it that way, disregarding that there is a lot of boilerplate. It just pays off later. If the other APIs are built mostly using primitive types, then the first way may be better. Which one do you find more readable BTW?

I find the version with Point to be a more readable API. When I'm scanning the docs or outline view it's obvious what the method is expecting. Now the API writer may choose to allow an Int, Int input as well, but that's for convenience, not the main API.

I see the following problems with ad-hoc classes. Lots of boilerplate - quite unnecessary if you don't reuse the class + you can mess up the equals,compare,hash functions

That's where the thread started. I want to remove that boilerplate. Very much.

ad-hoc classes have no behaviour - how is that OO?

Because they are grouped together as an object. I'm not an OO zealot BTW. I'm not nearly as bothered by lack of behaviour as many people are. I believe that the mere act of grouping related things together in the same object is a big win for long term understanding and readability.

I admit I seldom use tuples in public APIs but they are very useful internally. It's a sort of convenience feature (saves quite a lot of typing) but programmers are lazy. There is nothing wrong with being too lazy to type inexpressive clutter instead of terse simple code.

That's fine, but I don't have faith that this wouldn't be abused by typical developers by adding it to public APIs.

Immutable tuples would also allow bigger tuples to be subtypes of smaller tuples, i.e. (T, U, V) <: (T, U) and (T, U) <: T (or (T, U) <: (T) if you tag singleton tuples). <: here means "is subtype of".

Another reason for me to oppose tuples. IMO, most developers wouldn't appreciate the subclassing going on here. In addition, I suspect that this would break Fan's generics (not enough letters in the alphabet).

brian Fri 21 Nov 2008

My own observations...

Most of my work tends towards public APIs designed for a very long lifespan. As such I agree that taking the time to design a named class for structures tends to be the best choice. Although certainly I've found that when creating scripts in Python that tuples tend to be very handy. I wouldn't wish to withhold a productivity feature because it can be abused - good designers create good APIs, and bad designers create bad APIs no matter what tools they are given.

I personally don't mind living without tuples. Lack of tuples isn't something I've considered a pressing need to fix in Java or C#. So they aren't on my short term radar.

I agree that we should try to make these "struct" classes easier to code. However, although I see the problem and would love a solution, I'm very skeptical of a solution that is not general purpose or seems auto-magical in any way. I'm not sure we've actually seen any concrete proposals other than potentially a new keyword for fields or tuples.

There arise some questions when you integrate tuples.

All good points - I think we'd tend towards something like (,) for empty tuples. Although we need to first agree that we actually want tuples.

helium Fri 21 Nov 2008

Another reason for me to oppose tuples. IMO, most developers wouldn't appreciate the subclassing going on here. In addition, I suspect that this would break Fan's generics (not enough letters in the alphabet).

The alphabet isn't the problem, IMO: T0, T1, T2, T3, T4, ..., T100, ...

I don't really understand this generics system anyway. Normally you just have type-level variables. And like for any variable you can choose any name you want. Value-level variables can hold values and have a type, type-level variables hold types and have a kind.

And the subtyping is just something that is possible to implement, not something that has to be implemented.

helium Fri 21 Nov 2008

And Fan's generics system has other problems, too. For example, Fan's list seems to lack something like takeWhile and dropWhile. But I can't write them on my own in some list utility class until the Fan list implements these methods (if it ever will), can I?

brian Fri 21 Nov 2008

But I can't write them on my own in some list utility class until the Fan list implements these methods (if they ever will), can I?

Why can't you write them yourself?

And Fan's generics system has other problems, too. For example, Fan's list seems to lack something like takeWhile and dropWhile.

I'm not familiar with those methods - but does takeWhile do the same thing as eachBreak?

I'm open to adding any commonly used methods to List - can you propose a specification and some examples for those two methods in another thread?

alexlamsl Fri 21 Nov 2008

And Fan's generics system has other problems, too. For example, Fan's list seems to lack something like takeWhile and dropWhile.

takeWhile - findAll

dropWhile - exclude
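For readers unfamiliar with these methods, the following Python sketch compares the usual prefix-based meaning of takeWhile/dropWhile with a filter over every element. It assumes that findAll keeps all matching items and exclude drops all matching items. Under that assumption the proposed pairs give the same result only when the matching elements form a prefix of the list.

```python
from itertools import dropwhile, takewhile

# Hypothetical sample data: small values appear both before and after a large one.
numbers = [1, 2, 5, 1, 2]


def is_small(n: int) -> bool:
    return n < 3


# Prefix-based: stop (or start) at the first element that fails the test.
taken = list(takewhile(is_small, numbers))
dropped = list(dropwhile(is_small, numbers))

# Filter-based: test every element independently.
kept_all = [n for n in numbers if is_small(n)]
excluded_all = [n for n in numbers if not is_small(n)]
```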

Edit: please do not escape my "'" when they are part of the inline code style...

helium Fri 21 Nov 2008

see other thread

katox Fri 21 Nov 2008

* unit tuple denotable?

Void

* singleton tuples just normal single values?

yes

* immutable?

yes

* if immutable covariant?

yes

* if immutable are bigger tuples subtypes of smaller tuples?

could be, I don't see a nasty case

Some pro-tuple arguments:

very lightweight, a developer can avoid creating classes which are never reused
syntactically fits. There is no confusion about using () - you already do that when passing parameters. Even better, a function gets a tuple and returns a tuple (a singleton tuple having a shortcut with no parentheses).
tuples allow simple function chaining. While this is possible in Java there is a limitation when you want to use more than one single object as a parameter. In Fan, there are many functions with more than one parameter.
it can be done in a type safe manner
tuples allow APIs where there is no need to cache some status information (it can be returned along with the result and thrown away when unnecessary, using the underscore notation (a, _, b))
99% of usages are pairs or triples so it is usually very comprehensible
they may be very convenient when used with Maps:

Scala example:

val tastiness = Map("Apple" -> 5, "Pear" -> 3, "Orange" -> 8, "Mango" -> 7, "Pineapple" -> 8)

println("On a scale from 1-10:")
tastiness.foreach { tuple:(String, Int) =>
  val (fruit, value) = tuple

  println("    " + fruit + " : " + value)
}

Of course it can be replaced by a Pair class. But show me a language where such a Pair class is used consistently in all APIs. How many Pair classes do you need?

helium Fri 21 Nov 2008

> * unit tuple denotable?

Void

You suggest to denote the value as Void? I'd suggest ().

katox Fri 21 Nov 2008

Ah, I misunderstood that. I mean type Void and notation ().

katox Sun 23 Nov 2008

Regarding the Point object - I'm now toying with SWT StyledText - I found the following API excerpts quite significant:

1. Point computeSize(int wHint, int hHint, boolean changed): Computes the preferred size of this StyledText.
2. Point getLocationAtOffset(int offset): Returns the upper-left corner of the character at the zero-based offset specified by offset.
3. int getOffsetAtLocation(Point point): Returns the zero-based offset into the text of the character at the location specified by point.
4. Point getSelection(): Returns the current selection. The returned Point's x member contains the offset of the first selected character, and the y member contains the offset after the last selected character.
5. Point getSelectionRange(): Returns the selection as the offset of the first selected character, contained in the returned Point's x member, and the length of the selection, contained in the y member.
6. String getTextRange(int start, int length): Returns a copy of the text in this StyledText starting at the offset specified by start and continuing for length characters.
7. void replaceTextRange(int start, int length, String text): Replaces the text from the zero-based offset specified by start and continuing length characters with the text specified by text.
8. void setSelection(int start, int end): Sets the selection beginning at the character at the zero-based index specified by start and ending at the character at the zero-based index specified by end, and scrolls the selection into view.
9. void setSelection(Point point): Sets the selection beginning at the character at the zero-based index specified by point.x and ending at the character at the zero-based index specified by point.y, and scrolls the selection into view.

So what do we have? A Point class which is used for the following:

width and height in pixels of the component (1)
x and y pixel coordinates within the widget (2,3)
start offset and length of the text selection in the widget (5)
start and end offsets of the text selection in the widget (4,9)

It is basically a tuple, a pair, which happens to have the name Point (and a defined class in Java). Notice that the usage is even more inconsistent because

computeSize returns a Point for width and height but accepts only a pair for a preferred size.
Because getSelection returns a Point there has to be a method setSelection(Point point) to allow function chaining, but there is also setSelection(int start, int end) which is used in 90% of the cases in the source code.
getSelectionRange returns a Point but replaceTextRange doesn't accept it - it is paired only with getTextRange which uses an input pair.

I agree that good API design doesn't come for free. It is a hard thing to do, but as you can see declaring a Point class - quite long in Java, see org.eclipse.swt.graphics.Point - didn't help at all. It is even worse now because the documentation for the Point class is misleading.

Done properly, there should have been at least 4 such classes grouping x and y with the proper semantics. Of course, then you would have a problem because you can't use them interchangeably. Also, you should define a number of convenience methods or drop multiparameter inputs completely.
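As a rough Python sketch of what "proper semantics" could mean for two of those classes, the names Selection and SelectionRange below are hypothetical. They hold the same pair of integers but cannot be mixed up, at the cost of an explicit conversion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    """Start offset and end offset (one past the last selected character)."""
    start: int
    end: int


@dataclass(frozen=True)
class SelectionRange:
    """Start offset and length of a selection."""
    start: int
    length: int

    def to_selection(self) -> Selection:
        # Conversion helper: the kind of extra convenience method mentioned above.
        return Selection(self.start, self.start + self.length)


selection_range = SelectionRange(start=4, length=6)
selection = selection_range.to_selection()

# Same numbers, different meaning: the two types never compare equal.
same_numbers_different_meaning = Selection(4, 6) != SelectionRange(4, 6)
```

This shows both sides of the trade-off: the types prevent a start/length pair from being read as start/end, but every boundary between them needs conversion code.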

This is one of the cases where I would really think of using tuples even in a public API. Tuples actually increase the consistency of the language!

helium Sun 23 Nov 2008

OK, this usage of a class called Point can only be described as WTF?!?

brian Sun 23 Nov 2008

I'd have to say that overall I find SWT very well designed. But the way they use the Point class to represent locations and sizes is just crazy. Forcing developers to use x and y to denote width and height to save a class doesn't seem like a reasonable design trade off.

You can see the fwt APIs actually define a proper Point, Size and even a Hints class. So I don't think this is a use case for tuples - I think I would have defined those classes the same even if we had tuples.

katox Sun 23 Nov 2008

But the way they use the Point class to represent locations and sizes is just crazy.

It's not just to avoid creating a class. Even creating those four classes could be reasonable. The problem is elsewhere - you have to consistently use those classes in all APIs.

If you have the input parameters defined as (startOffset, endOffset) tuples I see no point (or Point :-)?) to define a return value of Offset. If you do that you'll be stuck with patterns like

offset := getTextSelection()
setTextSelection(offset.x, offset.y)
range := Range(offset.x, offset.y-offset.x)
setStyle(range, newStyle)

having to declare two methods instead of one to be able to chain

b.setTextSelection(a.getTextSelection)
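With tuples and argument unpacking, one setter would be enough for both call styles. The sketch below uses Python's star-unpacking to show the idea; TextWidget and its methods are hypothetical and model only the selection state.

```python
from typing import Tuple


class TextWidget:
    """Hypothetical widget; only the selection state is modelled."""

    def __init__(self) -> None:
        self._start = 0
        self._end = 0

    def set_text_selection(self, start: int, end: int) -> None:
        self._start, self._end = start, end

    def get_text_selection(self) -> Tuple[int, int]:
        return self._start, self._end


a = TextWidget()
b = TextWidget()
a.set_text_selection(3, 9)

# The returned pair is spread over the two parameters, so no second
# overload taking a wrapper object is needed for chaining.
b.set_text_selection(*a.get_text_selection())
```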

The worst thing is that there is little chance that unrelated APIs - Str, Text, TextEditor, File etc. - would use the same Offset class. There would be an impedance mismatch even if you managed to define everything using correct wrappers for your part of the API. Would you use reflection to shovel data from one type of Offset class to another?

It is often better to stick with basic types because it reduces the boilerplate. Of course, the more reuse the better reason to introduce a new class. Declaring a new class for each tiny bit just hurts interoperability and slows down everything.

I'd say the various range classes are candidates for tuples. You mostly operate on the lower or upper bound, and you use them separately. There is usually no use of the introduced class by itself (other than to pass it somewhere as a parameter or get it back).

There is a chance that someone will abuse tuples. This (real life!) example has shown how you can abuse little classes. The tuple syntax would be natural for this type of range API. I'd argue that the SWT developers would have messed up the API much less with tuples.

katox Sun 23 Nov 2008

You can see the fwt APIs actually define a proper Point, Size and even a Hints class.

That's fine. On the other hand Fan also uses (start, length) pairs in TextWidget, RichTextModel, RichText. I'd expect the return type in the corresponding pair functions to be the same - it is only logical to do so.

See GTK+ for instance. It has a quite sane API, very similar to fwt in the end. It returns values through *x, *y out pointer parameters (a worse notation than tuples - you have to document it perfectly).

tomcl Sun 16 Mar 2014

I thought I might revive this discussion (and a few others) about tuples and pattern matching. I've been puzzling about how such things could fit in with Fantom's otherwise very nice structure.

I come from a background where I value them. Thinking about how/why they would fit into Fantom, and where I've found them useful, I'm inclined to agree that tuples are often a poor man's record (I know formally they are a Cartesian product, but practically in use cases the n elements in a tuple should usually be documented by names).

Tuples are so nice partly because their use has zero overhead, and often the lack of documentation is balanced by the lack of clutter.

There is another powerful reason for liking tuples: they allow type inference across packing and unpacking of data, without the high overhead of a class definition. The example everyone sees is return of multiple values from a function.

So they are a lightweight equivalent of an anonymous const class with lots of type inference, if you don't mind adding field names (which I don't).

I was thinking that this could be shoehorned into Fantom's type system as type inference across (type elaborated) Maps with compile-constant keys, in which case existing syntax gives you what you want. But maybe it makes more sense as an anonymous const class with syntactic sugar for creating type definition and instance at same time, with type inference for fields. You'd want these classes to be duck typed, and maybe restricted in some ways.

The other thing that would be nice, pattern matching with switch on pattern match, could fit into this structure properly only by an extension of the type system to allow types that are unions of these restricted classes. That is maybe a step too far.

You can do pattern matching dynamically, but the whole point here is that mis-spelt field names in a pattern can only be caught by the compiler - as is needed - if the compiler has a static idea of such a union and so knows what are the possible names.

I don't know if this is naive? Or if this could be implemented using a DSL...

Tom

brian Mon 17 Mar 2014

Hi Tom, welcome to Fantom!

I think implementing tuples properly in the type system would be a big undertaking since typically the expectation is that each position has its own unique type such as:

<Int,Str> foo() { return <3, "str"> }

If you just supported 8 positions though like we do Func, then we could make it work pretty similar.

If we didn't care about typing, then implementing a destructured assignment from a list would be fairly easy:

Obj[] foo() { return [3, "str"] }

num, str := foo()

But that would sort of suck without good type inference. Using maps has the same problem, the value has one fixed type. I agree they would be nice, but it has never inflicted enough pain that I've felt the need for them. There are bigger pain points I'd rather address first.