#331 Derived types

jodastephen Sun 10 Aug 2008

I'd like to float an idea that removes a language feature, duration literals, and replaces it with a more powerful feature which includes field validation (another requested feature). It may also affect facets.

The proposal is to allow what I'll call, for now, derived types. A derived type is a special kind of type that can be easily constructed and used as though it were another type, specifically a literal type.

// today
class Person {
  Str surname
  Str forename
  Int age
}

// with derived types
class Person {
  Surname surname
  Forename forename
  Age age
}
derived Surname : Str {
  static new make(Str str) {
    if (str == null) throw NullErr()
    if (str.size == 0) throw EmptyErr()
    if (str.size > 32) throw TooLongErr()
    derived = str
  }
}
derived Forename : Str {
  static new make(Str str) {
    if (str == null) throw NullErr()
    if (str.size == 0) throw EmptyErr()
    if (str.size > 32) throw TooLongErr()
    derived = str
  }
}
derived Age : Int {
  static new make(Int val) {
    if (val == null) throw NullErr()
    if (val < 0 || val > 125) throw InvalidErr()
    derived = val
  }
  Bool isAdult() {
    return derived > 16
  }
  Bool isChild() {
    return derived <= 16
  }
}

Obviously, the proposal involves writing a lot more code, but the benefit is in sharing rules and conceptual knowledge across a large application. Think data dictionary or XSD types. The real benefit comes with having a real place to have coded rules for types, and the ability to reuse those same rules on other owning types:

class FindPersonRequirements {
  Surname surname
  Age age
}

In terms of implementation, it might not be necessary to compile these to real classes in the bytecode. They might become static methods on a helper (avoiding excessive object creation by embedding the underlying value in the owning object). In particular, the derived type "constructor" gets run whenever the parent object is created. Derived types are const if they wrap a const type.

The key to this proposal is that constructing and extracting the value is easy - simply use the literal:

person := Person {surname="Smith"; forename="John"; age=42}
Str surname = person.surname
person.forename = "Stephen"

So, there is auto conversion (a request from yet another thread) to and from a derived type. However there is no auto conversion between two derived types:

person.surname = person.forename  // does not compile
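
The conversion rules described above can be approximated at runtime. The following is only a rough Python sketch, not Fan and not the proposed compiler feature: `BoundedStr`, `DerivedField` and the use of `ValueError`/`TypeError` are hypothetical stand-ins, and the check that the proposal would do at compile time happens here on assignment.

```python
class BoundedStr(str):
    """Hypothetical base for derived string types: a str validated on construction."""
    max_size = 32

    def __new__(cls, value):
        if len(value) == 0:
            raise ValueError(f"{cls.__name__} must not be empty")
        if len(value) > cls.max_size:
            raise ValueError(f"{cls.__name__} is longer than {cls.max_size}")
        return super().__new__(cls, value)


class Surname(BoundedStr):
    """Stand-in for 'derived Surname : Str'."""


class Forename(BoundedStr):
    """Stand-in for 'derived Forename : Str'."""


class DerivedField:
    """Descriptor that wraps plain literals and rejects other derived types."""

    def __init__(self, derived, base):
        self.derived = derived  # e.g. Surname
        self.base = base        # the literal type it derives from, e.g. str

    def __set_name__(self, owner, name):
        self.slot = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.slot)

    def __set__(self, obj, value):
        if type(value) is self.base:
            value = self.derived(value)  # literal -> derived, runs the validation
        elif type(value) is not self.derived:
            raise TypeError(
                f"cannot assign {type(value).__name__} to {self.derived.__name__}"
            )
        setattr(obj, self.slot, value)


class Person:
    surname = DerivedField(Surname, str)
    forename = DerivedField(Forename, str)

    def __init__(self, surname, forename):
        self.surname = surname
        self.forename = forename


person = Person(surname="Smith", forename="John")
plain: str = person.surname   # a Surname is still usable wherever a str is
person.forename = "Stephen"   # the plain literal is validated and wrapped
```

In this sketch `person.surname = person.forename` raises `TypeError`, which is the runtime analogue of the "does not compile" case.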

This feature could remove the need for duration literals, by adding derived types based on an Int for Seconds, Minutes and Hours. It would also be possible to add derived types for distance, weight, voltage, day of month, apples... And these don't need to be language features with new literals - they just reuse this general feature.

This might affect facets. If a facet could be considered to be a representation of a derived type, then the thread on namespaced facets has a solution here.

And of course, derived types are entirely optional in any system.

I think this could really help take Fan to the next level of OO where properties on objects are properly defined and usable as objects in their own right, if desired. For example, see the isChild() method on Age.

alexlamsl Mon 11 Aug 2008

The example can be written today as:

// with derived types
class Person {
  Surname surname
  Forename forename
  Age age
}

class Surname : Str {
  const Str str;
  new make(Str str) {
    if (str == null) throw NullErr()
    if (str.size == 0) throw EmptyErr()
    if (str.size > 32) throw TooLongErr()
    this.str = str
  }
}

class Forename : Str {
  const Str str;
  new make(Str str) {
    if (str == null) throw NullErr()
    if (str.size == 0) throw EmptyErr()
    if (str.size > 32) throw TooLongErr()
    this.str = str
  }
}

class Age : Int {
  const Int val;
  new make(Int val) {
    if (val == null) throw NullErr()
    if (val < 0 || val > 125) throw InvalidErr()
    this.val = val
  }
  Bool isAdult() {
    return val > 16
  }
  Bool isChild() {
    return val <= 16
  }
}

sharing rules and conceptual knowledge across a large application

Sounds like the kind of code reuse that can be achieved by existing OO constructs (even in Java or .NET). IMHO people tend to do this all the time:

class Person {
  Str surname
  Str forename
  Int age
}

This is simply a product of laziness: they would like to get the right code structure without worrying about validation and schematics. The proposed language feature does not seem to tackle such behaviour.

And if a derived type is simply exploded into a Str with a bunch of static helper methods, it would be a bit too magical for me, in the sense that it seems more like a way to circumvent final class restrictions on inheritance.

jodastephen Mon 11 Aug 2008

I think you've kind of missed the key point. Yes, the definition of the derived types is easy, and can be done using existing classes - that's a Good Thing, and keeps the type system simple.

The key aspect is the auto-conversion:

Str surname := person.surname
person.surname = "Smith"

That is the key to encouraging developers to make the right choices wrt the data in their application.

alexlamsl Mon 11 Aug 2008

Sorry for not stating that clearly enough - I feel that would be a bit too magical.

Assuming Str is not final, and Surname : Str:

Str surname := person.surname

That you can do today.

person.surname = "Smith"

Would seem like syntactic sugar for:

person.surname = new Surname("Smith");

Which is why it feels more like a way to circumvent final class restrictions on inheritance to me.

That is the key to encouraging developers to make the right choices wrt the data in their application.

I think I understand that point of view (and the accompanying frustration and rant...) - however I think there are 2 underlying issues:

1. Implementors do not think about validation when they first code their solution.
2. Library users do not maximise code reuse and reinvent the wheel time and time again.

What we can do today cannot help (1) much - and I would say the same about your proposal.

However, given a careful implementor, the use of say Person would force users to reuse Surname et al. However, we cannot expect them to discover and reuse Surname if they have not started from Person.

In this scenario today's solution would bring light to users about the existence of Surname, whilst your proposal seems to make it less likely to happen.

Just my 2 cents really :-/

brian Mon 11 Aug 2008

Interesting proposal. I don't really want a new extension mechanism when we already have subclassing and composition. But I think the heart of this proposal is about the auto-conversion with literals.

Personally I prefer JohnDG's proposal as a more general purpose feature which would also solve this issue.

For example using composition (assuming I left Int final):

const class Age
{
  new make(Int int) { this.int = int }

  static Age fromInt(Int int) { return Age(int) }
  Int toInt() { return int }

  private const Int int
}

Then assignment between Int and Age:

i := 45
a := Age(35)

a = i   =>  a = Age.fromInt(i)
i = a   =>  i = a.toInt

That seems quite elegant to me and doesn't complicate the type system.
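
To make the rewrite rule concrete, here is a small Python sketch of the lookup a compiler might perform for such an assignment. It is an illustration under assumptions, not Fan behaviour: the `convert` helper and the `from_<type>` / `to_<type>` naming are hypothetical Python spellings of the `fromX` / `toX` convention.

```python
class Age:
    """Composition-based wrapper, mirroring the Age class above."""

    def __init__(self, value: int):
        self._value = value

    @staticmethod
    def from_int(value: int) -> "Age":
        return Age(value)

    def to_int(self) -> int:
        return self._value


def convert(value, target):
    """Return `value` as `target`, using the call an auto-converting compiler could insert."""
    if isinstance(value, target):
        return value  # already the right type, nothing to insert
    source_name = type(value).__name__.lower()
    target_name = target.__name__.lower()
    # Rule 1: the target type offers from_<source>
    factory = getattr(target, "from_" + source_name, None)
    if factory is not None:
        return factory(value)
    # Rule 2: the value offers to_<target>
    extractor = getattr(value, "to_" + target_name, None)
    if extractor is not None:
        return extractor()
    raise TypeError(f"no conversion from {source_name} to {target_name}")


i = 45
a = Age(35)
a = convert(i, Age)   # a = i  =>  a = Age.from_int(i)
i = convert(a, int)   # i = a  =>  i = a.to_int()
```

Because the lookup is driven only by method names, any pair of types can opt in without a new kind of type declaration, which is the generality being argued for here.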

However, as I said previously - although I like this feature, I'm not ready to add it until we get more experience with the current feature set.

JohnDG Mon 11 Aug 2008

I agree it does not make sense to support "derived types" when there's a much simpler, more general, and more unifying concept lurking beneath it -- that of user-managed auto-conversion between disparate types (toX, fromX).

However, I do think that if auto-conversion were supported some day, it would be nice to make duration literals unmagical. e.g. introduce a class of the type:

class IntWithUnit {
   new make(Int int, Str unit) {
      ...
   }
   ...
}

Declarations such as "1day" would be automatically compiled to instances of this new class type (IntWithUnit(1, "day")). Assigning to a literal such as 1megaday would have the potential to throw a runtime exception.

Then Date would simply have toIntWithUnit and fromIntWithUnit methods that facilitate assignment to and from duration "literals".

The beauty of this approach is that it allows extensible, user-defined literals such as meters, kilometers, miles, feet, ounces, pounds, joules, watts, and so on.

(Note that any static solution would likely fail to easily capture the combinatorial explosion of suffixes that occur in the metric system, but such cases are handled easily with a more dynamic solution, such as representing the unit with a string.)
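
A minimal Python sketch of this idea follows. The parsing rule, the `Duration` class, its `from_int_with_unit` method and the table of unit names are all assumptions made for illustration; only `IntWithUnit(1, "day")` comes from the post itself.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class IntWithUnit:
    """What a literal such as 1day would compile to: IntWithUnit(1, "day")."""
    value: int
    unit: str


_LITERAL = re.compile(r"(\d+)([A-Za-z]+)")


def parse_literal(text: str) -> IntWithUnit:
    """Split a literal into its integer part and its unit suffix."""
    match = _LITERAL.fullmatch(text)
    if match is None:
        raise ValueError(f"not a unit literal: {text!r}")
    return IntWithUnit(int(match.group(1)), match.group(2))


# Assumed unit table; a real library would choose its own names.
SECONDS_PER_UNIT = {"sec": 1, "min": 60, "hr": 3600, "day": 86400}


@dataclass(frozen=True)
class Duration:
    seconds: int

    @staticmethod
    def from_int_with_unit(literal: IntWithUnit) -> "Duration":
        # Unknown units are only detected here, at runtime.
        try:
            factor = SECONDS_PER_UNIT[literal.unit]
        except KeyError:
            raise ValueError(f"unknown duration unit: {literal.unit!r}") from None
        return Duration(literal.value * factor)


one_day = Duration.from_int_with_unit(parse_literal("1day"))
```

Converting `parse_literal("1megaday")` raises `ValueError` in this sketch, matching the remark that an unsupported suffix fails at runtime rather than at compile time.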

brian Mon 11 Aug 2008

The beauty of this approach is that it allows extensible, user-defined literals such as meters, kilometers, miles, feet, ounces, pounds, joules, watts, and so on.

As I've said before, I don't believe in using the type system for units of measurement. I don't know about scientific applications, but monitoring and control systems never use units in this way - they are always out-of-band metadata and rarely hardcoded directly into algorithms. So I don't consider the problem a typing problem, but rather a metadata or relationship problem.

jodastephen Mon 11 Aug 2008

I think that this proposal tackles a slightly different topic to general purpose conversion (which I'm dubious about).

Another way to think about this is to ask why Fan supports Enums? What is so special about an enum that means it gets a dedicated language feature?

Dates are the classic example of this. A month is represented as an enum, constraining its value to 1 to 12. But day of month is left as a plain Int, allowing the caller to pass in any old value unless validation is repeatedly made, despite the fact that we know its constraints of 1 to 31. What is the actual difference between these two cases? Surely not that one happens to have a pretty name?

Once you start to see enums as a weird special case of this proposal, I think it becomes a lot more appealing. Or maybe, we should define day of month as an enum with 31 values - _1, _2, _3 etc.?

The example can be written today...

BTW, the use of a derived keyword, or similar syntax, would also result in the creation of equals and hashcode methods, thus the current version of a Surname class is actually a lot longer than the proposal's shorthand form.

I don't believe in using the type system for units of measurement. I don't know about scientific applications, but monitoring and control systems never use units in this way

Yes, you've mentioned this before, and that control systems are a particular focus of yours. But this is an area that has been raised by a few of us here. Fan appeals to a market beyond that of science and control systems. It appeals to the general business marketplace. And in those applications that I build I need a representation for miles, kilometers, kilograms and so forth. I have no requirement for complex out-of-band metadata, I just need to know that a value is miles as opposed to kilometres. And I really, really don't want to use an Int protected only by documentation.

JohnDG Mon 11 Aug 2008

As I've said before, I don't believe in using the type system for units of measurement.

You already do, at least with time literals.

In any case, lots of people DO believe in using types for units of measurement (see here, for example), and from category theory I can see exactly why: the concept is natural and prevents the same kinds of errors that typing prevents.

brian Tue 12 Aug 2008

But day of month is left as a plain Int, allowing the caller to pass in any old value unless validation is repeatedly made, despite the fact that we know its constraints of 1 to 31. What is the actual difference between these two cases?

The difference is that a valid day is not a stand alone type between 1 and 31, but rather is dependent on contextual information like the month and year. I do get your point, but in practice this sort of thing happens very often - it is the container class that decides validity (probably by trapping sets). Although I do agree that sometimes we can pull out reusable validation into a class (and you can do that with the existing type system). But in my experience it is more like 90/10 where the container is/isn't responsible. So I'm not inclined to complicate the type system when it can be done today (maybe with a little extra pain). And I still prefer the general purpose auto-conversion versus a more specialized feature.
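
The point about context can be shown with a short Python sketch in which the container does the validation. `SimpleDate` is a hypothetical class written for this illustration; `calendar.monthrange` is the standard-library call that returns the number of days in a given month and year.

```python
import calendar


class SimpleDate:
    """Hypothetical container that validates the day against its month and year."""

    def __init__(self, year: int, month: int, day: int):
        if not 1 <= month <= 12:
            raise ValueError(f"invalid month: {month}")
        # monthrange returns (weekday of the 1st, number of days in the month)
        days_in_month = calendar.monthrange(year, month)[1]
        if not 1 <= day <= days_in_month:
            raise ValueError(f"invalid day {day} for {year}-{month:02d}")
        self.year = year
        self.month = month
        self.day = day


leap_day = SimpleDate(2008, 2, 29)  # valid: 2008 is a leap year
```

A standalone day-of-month type limited to 1..31 would accept both 29 February 2007 and 31 April 2008, while the container above rejects them.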

And in those applications that I build I need a representation for miles, kilometers, kilograms and so forth

As I've said before Fan is definitely going to have representations for units with a built-in database of units which support conversion between units of like dimension (based on oBIX).

In any case, lots of people DO believe in using types for units of measurement

I agree you could use a type system for units, but that doesn't mean you should. To implement a proper type system for units of measurement is extremely complicated because of issues like:

type compatibility is really based on dimension (length), not units (km, miles)
expressions such as 5meter * 10sec yield new types
logarithmic and radian units throw a wrench into everything

You don't have to tackle those issues, but in that case you don't really need any extensions to the type system either. Although your proposal to somehow map number literals such as 5foo to a constructor like (5, "foo") is not quite as ambitious and is a pretty cool idea.
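
The first two issues in the list can be illustrated with a small runtime model in Python. This is only a sketch under assumptions: `Quantity`, the two-element dimension tuple and the helper functions are hypothetical, and a static type system would have to express the same bookkeeping in types, which is where the complexity comes from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float      # magnitude in base units (meters, seconds)
    dimension: tuple  # exponents as (length, time)

    def __add__(self, other):
        # Compatibility depends on dimension, not on the unit used to write the value.
        if self.dimension != other.dimension:
            raise TypeError("cannot add quantities of different dimension")
        return Quantity(self.value + other.value, self.dimension)

    def __mul__(self, other):
        # Multiplying adds exponents, producing a dimension not declared anywhere.
        combined = tuple(a + b for a, b in zip(self.dimension, other.dimension))
        return Quantity(self.value * other.value, combined)


LENGTH = (1, 0)
TIME = (0, 1)


def meters(v):
    return Quantity(float(v), LENGTH)


def kilometers(v):
    return Quantity(v * 1000.0, LENGTH)


def seconds(v):
    return Quantity(float(v), TIME)


total = kilometers(1) + meters(500)  # same dimension, different units
product = meters(5) * seconds(10)    # dimension becomes (1, 1)
```

Adding `meters(1)` to `seconds(1)` raises `TypeError` here; logarithmic and angular units are deliberately left out of the sketch.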

jodastephen Tue 12 Aug 2008

The more general extensible literals could build on the underscore that is already valid as a separator:

days := 3_Days
metres := 5.2_Metres
dayOfMonth := DayOfMonth_12

Now, I would map these onto classes. You appear to have a different mapping using strings in mind. Let's ignore that for the moment, and just consider whether extensible unit literals like the above would be sweet. It definitely has the benefit of removing one highly specific language feature (duration literals).

JohnDG Tue 12 Aug 2008

type compatibility is really based on dimension (length), not units (km, miles)

Sure. You and I know that, but the average programmer doesn't care.

expressions such as 5meter * 10sec yield new types

I don't think anyone was proposing we support that. You need the ability to do abstract algebra and dynamic types -- likely of use in a scientific context but of no use for the average Java developer, who just wants to write expressions like, Length length := 10meters.

logarithmic and radian units throw a wrench into everything

I don't see how, in the limited case discussed above. The only context in which unit literals could be used is as the RHS of an assignment (where the type is known) or when passed as a parameter. e.g. Angle theta = 3.14rad or Angle theta = 180deg. The actual mathematics are handled by the class in question. Angle would likely only provide access to the angle via toDeg and toRad methods, to make the representation explicit to the callee.

Now, I would map these onto classes. You appear to have a different mapping using strings in mind.

Not sure I like that. Imagine the possibilities for meters:

yottameter
zettameter
exameter
petameter
terameter
gigameter
megameter
kilometer
hectometer
decameter
decimeter
meter
centimeter
millimeter
micrometer
nanometer
picometer
femtometer
attometer
zeptometer
yoctometer

Then if you want to get fancy and support more than meters for your Length type, then you need to add the British system. Not sure how many classes in total, but likely dozens.

It would be much easier to simply parse a string and lookup the scaling factor in a map.
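
As a rough illustration of the string-based lookup, the Python sketch below handles every name in the list above with one table instead of one class per unit. The function name `to_meters` and the choice of `Fraction` for exact scaling are assumptions made for this example.

```python
from fractions import Fraction

# Power of ten for each metric prefix; the empty prefix is the plain meter.
SI_PREFIXES = {
    "yotta": 24, "zetta": 21, "exa": 18, "peta": 15, "tera": 12,
    "giga": 9, "mega": 6, "kilo": 3, "hecto": 2, "deca": 1,
    "": 0,
    "deci": -1, "centi": -2, "milli": -3, "micro": -6, "nano": -9,
    "pico": -12, "femto": -15, "atto": -18, "zepto": -21, "yocto": -24,
}


def to_meters(value, unit: str):
    """Scale `value` from a prefixed meter unit (e.g. "kilometer") to meters."""
    if not unit.endswith("meter"):
        raise ValueError(f"not a length unit: {unit!r}")
    prefix = unit[: -len("meter")]
    if prefix not in SI_PREFIXES:
        raise ValueError(f"unknown prefix: {prefix!r}")
    return value * Fraction(10) ** SI_PREFIXES[prefix]


five_km = to_meters(5, "kilometer")     # Fraction(5000, 1)
three_mm = to_meters(3, "millimeter")   # Fraction(3, 1000)
```

Supporting non-metric units would mean adding entries to a second table of scaling factors rather than adding classes.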