#447 New-Verify & Static-New - enhanced construction proposal

jodastephen Sun 1 Feb 2009

This is a proposal to tackle object construction based partly on the previous one and partly on a broader understanding from many other posts.

Today

Let's consider a Rectangle class extending a Shape. Today in Fan:

class Shape {
  const Str name

  new make(Str name) {
    this.name = name
  }
}

class Rectangle : Shape {
  const Point botLeft
  const Point topRight

  new make(Point botLeft := null, Point topRight := null) : super("Rect") {
    this.botLeft = botLeft
    this.topRight = topRight
  }

  static Rectangle makeBySize(Point botLeft, Size size) {
    return make(botLeft, Point(botLeft.x + size.w, botLeft.y + size.h))
  }
}

Rectangle(point1, point2)
Rectangle {botLeft=point1; topRight=point2}
Rectangle.makeBySize(point1, Size(20, 40))

So, the rectangle can be constructed by constructor, by construction-with-block, or by an additional factory. This is fine until we want to validate that the rectangle has an area no larger than 1000. Now we have to make the constructor private and rework:

class Rectangle : Shape {
  // fields and makeBySize unchanged

  static Rectangle make(Point botLeft, Point topRight) {
    if (topRight.x - botLeft.x * topRight.y - botLeft.y > 1000) throw ArgErr()
    return internalMake(botLeft, topRight)
  }

  private new internalMake(Point botLeft, Point topRight) : super.make("Rect") {
    this.botLeft = botLeft
    this.topRight = topRight
  }
}

Rectangle(point1, point2)
//Rectangle {botLeft=point1; topRight=point2}  // no longer compiles
Rectangle.makeBySize(point1, Size(20, 40))

We've had to write more code - an extra private method which manually references super.make rather than just super. This extra code isn't ideal. But we also can't use a with-block now.
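For readers who know Java better than Fan, the workaround above corresponds roughly to the familiar "private constructor plus validating static factory" pattern. The sketch below is only an analogy: the `Point` and `Rectangle` classes are hypothetical Java stand-ins, not Fan APIs, and the area is computed as width times height with explicit parentheses.

```java
// Hypothetical Java stand-ins for the Fan classes in the post.
final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class Rectangle {
    final Point botLeft;
    final Point topRight;

    // Private: callers cannot bypass the validating factory.
    private Rectangle(Point botLeft, Point topRight) {
        this.botLeft = botLeft;
        this.topRight = topRight;
    }

    // Public factory: validate first, then delegate to the private constructor.
    static Rectangle make(Point botLeft, Point topRight) {
        int area = (topRight.x - botLeft.x) * (topRight.y - botLeft.y);
        if (area > 1000) {
            throw new IllegalArgumentException("area " + area + " exceeds 1000");
        }
        return new Rectangle(botLeft, topRight);
    }
}
```

As in the Fan version, the price is an extra private member, and there is no way for a caller to supply further fields after the factory has run.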

Proposal

The proposal is based around the noted concept that "a constructor is static on the outside and instance on the inside". The aim is to leave existing Fan constructors alone, so they still work exactly as is (thus all existing code works unchanged). I propose the addition of static new "construction factories":

class Rectangle : Shape {
  // fields and makeBySize unchanged

  static new make(Point botLeft := null, Point topRight := null) {
    return new : super("Rect") {
      this.botLeft = botLeft
      this.topRight = topRight
    }
  }
}

Rectangle(point1, point2)
Rectangle {botLeft=point1; topRight=point2}  // compiles again!
Rectangle.makeBySize(point1, Size(20, 40))

The static new marks the method as being a "static method that performs construction". It enables construction-with-blocks and acts as the return type (an instance of the class in question). Unlike standard constructors, the body of a static new is statically scoped. The conversion to instance scope is done explicitly, by calling new {...}.

The embedded new{...} calls super in a similar manner to current Fan constructors. Within the embedded new{...} the scope is no longer static; it is the scope of the new instance.

Note: An alternate way of thinking of this approach is simply embedding the internalMake method within the static method.

So how do a standard new constructor and a static new constructor relate? Well, the standard new constructor is merely a shorthand form. The following two are identical in meaning:

new make(Point botLeft := null, Point topRight := null) : super("Rect") {
  this.botLeft = botLeft
  this.topRight = topRight
}

static new make(Point botLeft := null, Point topRight := null) {
  return new : super("Rect") {
    this.botLeft = botLeft
    this.topRight = topRight
  }
}

This is a simple mental model - easy to explain and consistent.

Some more detail:

1) Validating construction-with-blocks now becomes possible and obvious:

static new make(Point botLeft := null, Point topRight := null) {

  return new : super("Rect") {
    this.botLeft = botLeft
    this.topRight = topRight
  } verify {
    if (this.topRight.x - this.botLeft.x * this.topRight.y - this.botLeft.y > 1000) throw ArgErr()
  }
}

Once the new {} block has completed, the verify block (optional) is called. This design (a linked block) is necessary to ensure that there is a separation point where the instance is fully validated, and can then be safely published. The scope within verify is that of the new instance.

2) Safe publishing is easy:

static new make(Point botLeft := null, Point topRight := null) {
  created := new : super("Rect") {
    this.botLeft = botLeft
    this.topRight = topRight
  } verify {
    if (this.topRight.x - this.botLeft.x * this.topRight.y - this.botLeft.y > 1000) throw ArgErr()
  }
  // created instance is now fully locked down and safe to publish
  SomeExternalService.addInstance(created)
  return created
}

Once the closing brace of the new {..} verify {...} expression has completed the instance is valid. At that point const is locked down and any Java Memory Model issues addressed. Simply by assigning the result from new {..} to a variable, we can now safely publish the instance.
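The construct, verify, publish ordering can be approximated in Java today, which may help make the intent concrete. This is a minimal sketch under stated assumptions: `Box` and `ShapeRegistry` are hypothetical names (the registry stands in for SomeExternalService), and the separate `verify()` method only imitates the proposed verify block.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical registry standing in for SomeExternalService.
final class ShapeRegistry {
    static final List<Box> INSTANCES = new CopyOnWriteArrayList<>();

    static void addInstance(Box box) {
        INSTANCES.add(box);
    }
}

final class Box {
    final int width;
    final int height;

    private Box(int width, int height) {
        this.width = width;
        this.height = height;
        // 'this' is deliberately not handed to anyone from inside the constructor.
    }

    // Imitates the proposed verify block: runs on the fully assigned instance.
    private void verify() {
        if (width * height > 1000) {
            throw new IllegalArgumentException("area exceeds 1000");
        }
    }

    static Box make(int width, int height) {
        Box created = new Box(width, height); // 1. construct
        created.verify();                     // 2. verify
        ShapeRegistry.addInstance(created);   // 3. publish only after both succeed
        return created;
    }
}
```

If verification throws, the instance never reaches the registry, which is the property the proposal wants the language to guarantee rather than leave to convention.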

3) The necessary parameters for the superclass can be calculated if they are complex:

static new make(Point botLeft := null, Point topRight := null) {
  superParam := "Rect"
  if (botLeft == null) {
    superParam += "TBD"
  } else {
    superParam += " ${botLeft.toStr}"
  }
  return new : super(superParam) {
    this.botLeft = botLeft
    this.topRight = topRight
  }
}

This is tricky to achieve today, as the super() call has to go first, preventing subclasses from calculating superclass parameters.
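Java has traditionally had the same "super call goes first" restriction, and the usual workaround there is to push the calculation into a static helper that is invoked inside the super(...) argument list. The sketch below shows that workaround for comparison; `Shape`, `LabeledRect` and `superName` are hypothetical names, and the corner is simplified to a string.

```java
class Shape {
    final String name;

    Shape(String name) {
        this.name = name;
    }
}

class LabeledRect extends Shape {
    final String corner; // may be null

    LabeledRect(String corner) {
        // No statements may precede super(...), so the branching logic
        // lives in a static helper called from the argument list.
        super(superName(corner));
        this.corner = corner;
    }

    // Mirrors the if/else in the Fan example above.
    private static String superName(String corner) {
        return corner == null ? "RectTBD" : "Rect " + corner;
    }
}
```

The helper works, but it separates the calculation from the constructor body, which is the kind of scattering the static new form is meant to avoid.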

4) Truly locked down classes that want to block construction-with-blocks are easier to write. Today you need a public factory method and a private internal constructor. With this proposal, you simply change from static new to static:

// today
static Rectangle make(Point botLeft, Point topRight) {
  return internalMake(botLeft, topRight)
}
private new internalMake(Point botLeft, Point topRight) : super("Rect") {
  this.botLeft = botLeft
  this.topRight = topRight
}

// with the proposal, no construction-with-block allowed
static Rectangle make(Point botLeft, Point topRight) {
  return new : super("Rect") {
    this.botLeft = botLeft
    this.topRight = topRight
  }
}

Again, this can be viewed as embedding the "internal make" within the factory.

5) The overall sequence of events is as follows:

static new make(<params>) {
  ...                         // perform any processing in static scope
  created :=                  // assign new instance to a local variable, or just return
    new                       // creates a new instance
      : super(<superParams>){ // calls superclass factory
     this.<field> = <field>   // assign fields using new instance scope
   }                          // calls construction-with-block
   verify                     // null fields validated (Str vs Str?)
   {                          // superclass verify block called
     if ...                   // verify the instance is valid, instance scope
   }                          // new instance locked down
                              // remainder of superclass factory method called
   ...                        // remainder of this factory method called in static scope
                              // alternate instance (singleton) can be returned
}

So, the process is fully defined wrt superclasses, nulls and field verification. const fields can be changed by new {...}, the construction-with-block or verify{...}. Once new {...} verify {...} is complete, the instance is completely safe to use.

Summary

  • Add new expression form new : super(superParams) {...} verify {...}
    • super and verify are optional
  • Add static new factories
    • Allow construction-with-blocks
    • Are statically scoped
    • Can embed a new {...} verify {...} expression
  • Treat current Fan constructors as a shorthand for static new
  • All current Fan code continues to work
    • A few cases could probably be adjusted to remove "internalMake"

Benefits:

  • Ability to alter parameters in a "constructor" before calling super()
  • Ability to validate result from construction-with-block
  • No need to write "internalMake"
  • Existing constructors are logical subset of full functionality
  • Everything captured in one method (share code by writing helper methods if needed)

I suspect the only hard part is translating it to bytecode! Otherwise, it seems to actually match the goals we've been looking for.

brian Sun 1 Feb 2009

Stephen,

Lots of good ideas in there. It will take a little while to digest. Couple questions:

  • in order to use a with-block with a new static method it must be guaranteed that a newly created object is returned (so you can't cache or intern). If that is the case, then what are the advantages compared to a normal constructor?
  • I think a key idea is the notion of chaining a verify block to the end of the new block. In implementation I think that this would have to be pulled out into a synthetic method so that call sites could invoke it after the with-block right? I think that the verify is serving a very similar purpose as some sort of onNew callback right? What do you see as the advantages of this proposal over an onNew callback?

I am getting ready to head out for the week. Not sure if I'll have Internet access. Hopefully we can still get a good discussion going.

jodastephen Sun 1 Feb 2009

in order to use a with-block with a new static method it must be guaranteed that a newly created object is returned (so you can't cache or intern).

Not sure that's true. The with-block applies to the new {...} expression, so it is the presence of that which actually matters. (We may need a rule that a static new must contain one new {...} on each code path.) Once that is completed, there is no reason that the static factory cannot replace the instance with a cached instance.

I think a key idea is the notion of chaining a verify block to the end of the new block. In implementation I think that this would have to be pulled out into a synthetic method so that call sites could invoke it after the with-block right? I think that the verify is serving a very similar purpose as some sort of onNew callback right? What do you see as the advantages of this proposal over an onNew callback?

The implementation would probably involve two calls from the client, one before and one after the construction-with-block.

This approach provides more than an onNew because:

  • it is easier to read all the creation logic in one place
  • it provides a place to handle the instance once it is fully and safely created (something that might vary depending on which static method is used)
  • it allows superclass parameters to be set up in a complex way
  • it avoids the need to write annoying internalMake methods

JohnDG Mon 2 Feb 2009

I think it's too soon to decide on this very specific proposal, because it may be solved by more general features:

  1. Aspect-oriented programming. Brian indicated he was receptive to the idea of embedding some level of AOP into Fan.
  2. Some subset of AOP, such as is required to receive notification before or after particular methods are invoked (this was discussed in the context of FWT).
  3. Design-by-contract. For example, class invariants would solve the verification problem (and more besides).
  4. Constraints. For example, constraints on fields. Constraints are particularly interesting in Fan because post-with block is a natural place to verify them, allowing multi-valued constraints, where one field is constrained based on the value of another field (in such cases, it would not be possible to verify after each field is set, because more than one must be set before the constraints are satisfied).

freddy33 Wed 4 Feb 2009

First, having AOP as a first class citizen in Fan is the main reason why I'm interested in it. I started working with it because Brian hinted at the idea of a Python decorator in the future (on Java Posse google group).

So, I'm for AOP 100%, but AOP on pure constructors is always a problem. Having only static factory methods for construction makes the writing, usage and understanding of AOP on constructors a LOT easier. So, solving Stephen's issue with AOP is actually cleaner with the "static new" factory methods. My 2 cts on this:

  • Having more flexibility and readability in factory methods will increase the long term quality of a code base.
  • AOP and constraints are a pain when constructors can be extended (the super.make() will be decorated).

brian Fri 6 Feb 2009

I think John and Freddy33 both bring the conversation back to an interesting point - what are the key concepts in Stephen's original post which can (and should) be handled with more general purpose features?

One thing that struck me about setting const fields only during construction is that the problem has some of the same tones as a previous discussion about how to pass a reference off to another thread and then ensure it isn't used. Setting const fields is really about tracking references too to ensure that once a reference is "out in the wild" it is truly immutable. Just a random train of thought.

A key idea in Stephen's original post was the ability to declare a verify block which is run at the call site after the original method is called. That indeed sounds a lot like a decorator or AOP technique.

jodastephen Fri 6 Feb 2009

I think the interesting part about the proposal is that it makes Fan more general purpose - there cease to be constructors in the classic sense - you only have static methods and instance methods, with the ability to create an instance on demand.

(Now reflection and fcode/bytecode might see a different picture, and use invokespecial, but that is mostly hidden).

The verify aspect isn't one I originally had in the proposal, and I was just verifying the created instance in the part of the static new after the new {}. But this doesn't provide for any point where the instance is truly locked down.

That lockdown is vital for safe usage. For example, today in Fan, you can create an instance which has only const fields, yet the only place during initialisation where that instance can be passed to another service is in the constructor. This means that another object (or thread?) could see an instance change despite it having const fields.
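The same hazard exists in Java, where it is easy to demonstrate. In the sketch below (all names hypothetical), a constructor hands `this` to an observer before assigning a final field, so the observer sees the default value and the caller later sees the assigned one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical observer that records what it sees at registration time.
final class Observer {
    static final List<Integer> SEEN = new ArrayList<>();

    static void register(Leaky leaky) {
        SEEN.add(leaky.size);
    }
}

final class Leaky {
    final int size;

    Leaky(int size) {
        Observer.register(this); // 'this' escapes before the field is assigned
        this.size = size;
    }
}
```

After `new Leaky(42)`, the observer has recorded 0 while the finished object reports 42, so a field declared final has visibly changed from the observer's point of view.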

With the new verify approach, that bug becomes harder to make (because the reference is being used in a particular way within a tightly constrained block):

static new make(Point botLeft := null, Point topRight := null) {
  return new : super("Rect") {
    this.botLeft = botLeft
    this.topRight = topRight
  } verify {
    if (this.topRight.x - this.botLeft.x * this.topRight.y - this.botLeft.y > 1000) throw ArgErr()
    SomeExternalService.addInstance(this)  // BUG!!!
  }
}

Any use of this within new verify that exposes the instance is probably a bug. In fact, there is the possibility to enforce that by using the keyword new instead of this:

static new make(Point botLeft := null, Point topRight := null) {
  return new : super("Rect") {
    new.botLeft = botLeft
    new.topRight = topRight
  } verify {
    if (new.topRight.x - new.botLeft.x * new.topRight.y - new.botLeft.y > 1000) throw ArgErr()
    SomeExternalService.addInstance(new)  // DOESN'T COMPILE
  }
}

(The rule is that you can't pass new to a method)

EDIT: the following is a bad idea. Please ignore

In addition, it might make sense to treat the new {...} block as a with-block (no code allowed). To make this readable, I'd suggest the colon-assign operator we've talked about before:

static new make(Point botLeft := null, Point topRight := null) {
  return new : super("Rect") {
    botLeft: botLeft
    topRight: topRight
  }
}

EDIT: this is a bad idea, as it breaks existing Fan constructors being a shorthand of the longhand above. Please ignore the code block above

On AOP, I understand why that might seem appealing, but the steps of prepare, create, setup-data, construction-with-block, verify, publish are core elements of the language, not of something "additional" like AOP. Each step is required. None are optional.

When learning the language, it shouldn't be necessary to understand the AOP side (which few developers understand today) just to understand object construction.

A key idea in Stephen's original post was the ability to declare a verify block which is run at the call site after the original method is called. That indeed sounds a lot like a decorator or AOP technique.

I'd suggest it's more easily explained as a split constructor. Because that is what new verify really is.

JohnDG Fri 6 Feb 2009

I like the idea of making all constructors static methods that use a new construct -- that's pretty similar to what has been proposed before. Although instead of using the new keyword as a return value, I'd prefer using either a new keyword for the type of the enclosing class (This has been suggested previously), or the ability to use type literals for types (e.g. this.type or more generally x.type if x is a class literal or statically known).

static This make(Point botLeft := null, Point topRight := null) {
  return new : super("Rect") {
    this.botLeft = botLeft
    this.topRight = topRight
  } 
}

This brings Fan back to an old idea: There are no constructors, per se, just static factory methods, with make being the default factory method (in which case, super("Rect") could be encoded equally well as super.make("Rect")).

I'm not sure I like the idea of a verify block, because it's so specific to this feature, yet is subsumed by more general and powerful constructs.

I do feel there should be a distinct point beyond which setting const fields is illegal. Like Andy, I don't like the idea of readonly const (smacks of redundancy and hackery). But again, I'm not sure a verify block is the way to do it.

Here's an idea that might be worth exploring: unification of named parameters and with blocks.

For invoking a method with named parameters, the left hand side of each assignment refers to the parameter name, while the right hand side refers to some variable in scope. This bears a strong superficial similarity to with blocks.

Suppose we allowed calling methods with named parameters, but instead of using parentheses, we required curly braces:

exponent := 10
result := pow { base = 10; exponent = exponent; }

where pow is a method accepting two parameters with the names base and exponent.

Looks a lot like a with block, doesn't it?

Now suppose that we re-introduce the colon operator : as a synonym for a method called with. Here's the important part: the compiler automatically generates the with method in such a way that all non-'const' parameters are passed to it, and stored in the fields of the class.

How does this solve the const problem? Because now the user is forced to choose between two different styles of invoking the constructor:

// Tuple invocation
pt := Point(x1, y1)

// Named parameter invocation
pt := Point { x = x1, y = y1 }

In either case, this is simply an invocation of the default make factory method. const fields can only be set in the constructor, but now you can call the constructor with named parameters (i.e. with-block style).

If x and y were const, then the following would be illegal:

pt := Point { x = x1, y = y1 } : { x = x2, y = y2 }

You know where construction ends and normal with blocks begin because of the special colon (with) operator. Moreover, construction isn't bifurcated because you have a choice of either tuple construction or named parameter construction, but not both.

JohnDG Fri 6 Feb 2009

As for implicit add, I'm not really sure how that would generalize to named parameters, but it would likely be syntax sugar involving an add operator (such as the comma , or Stephen's <<<), and a list in the formal parameter list.

helium Fri 6 Feb 2009

@JohnDG: I like the way you've expanded on my with method idea I posted elsewhere. So +1

@jodastephen:

if (this.topRight.x - this.botLeft.x * this.topRight.y - this.botLeft.y > 1000) throw ArgErr()

This code does not do what you think it does.
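The issue is presumably operator precedence: multiplication binds tighter than subtraction, so the expression is not width times height. A small Java sketch of the difference, assuming Fan follows the same arithmetic precedence as Java here (the method names are hypothetical):

```java
final class AreaCheck {
    // The expression as written in the posts: '*' binds tighter than '-',
    // so this evaluates trX - (blX * trY) - blY.
    static int asWritten(int blX, int blY, int trX, int trY) {
        return trX - blX * trY - blY;
    }

    // Presumably intended: width times height.
    static int area(int blX, int blY, int trX, int trY) {
        return (trX - blX) * (trY - blY);
    }

    public static void main(String[] args) {
        // Rectangle from (10, 10) to (60, 60): 50 wide, 50 high.
        System.out.println(asWritten(10, 10, 60, 60)); // -550
        System.out.println(area(10, 10, 60, 60));      // 2500
    }
}
```

With these inputs the check as written would accept a rectangle whose real area of 2500 is well over the 1000 limit.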

jodastephen Fri 6 Feb 2009

I like the idea of making all constructors static methods that use a new construct -- that's pretty similar to what has been proposed before

But making existing Fan constructors a shorthand hasn't been suggested before IIRC.

instead of using the new keyword as a return value, I'd prefer using either a new keyword for the type of the enclosing class

The new keyword is also used to enable the construction-with-block. The rest of your post is about removing the construction-with-block, and in that case This might make more sense.

Here's an idea that might be worth exploring: unification of named parameters and with blocks.

We have to choose whether we want split construction or not. The advantage of a named parameters approach is that it is more efficient. Only one method needs to be generated in fcode from the client passing the parameters (named or not).

The problem with this is Brian's requirements don't allow it, and Fan's codebase isn't written that way. There are lots of classes which rely on being initialised in two parts (constructor and construction-with-block) where the c-w-b provides extra info.

I should note that if named parameters were to be chosen, I'd prefer to see it using round brackets not curly.

As such, I'm arguing that the new verify approach is the minimum possible set of syntax to meet the currently stated goals.

So, here is the key question - can we restrict setting of const to either constructor or c-w-b but not both? If we do restrict it, then the language is a lot simpler. But a little less powerful.

helium: This code does not do what you think it does.

It's only example code. If I've got the maths wrong, I'm not fussed, but if the semantics of the reference to this or new are wrong I'd like to know.

brian Sat 7 Feb 2009

I think the interesting part about the proposal is that it makes Fan more general purpose - there cease to be constructors in the classic sense - you only have static methods and instance methods

I don't understand this then. If you go back to the original arguments, remember that you must have real constructors if you want to deal with subclasses. Factories don't work because they don't work with subclassing. How constructors and verifiers work with subclasses is kind of the crux of the whole problem. One thing we do not want is to have ctor call sites generating calls to every synthetic verification method in the class hierarchy (which gets into how you chain superclass verifies).

Remember this key point - constructors must be callable as both static factories for client code and instance methods by subclasses who allocate a different type.

Here's an idea that might be worth exploring: unification of named parameters and with blocks.

I really like that idea - I'm not sure where it leads, but the thought of turning with-blocks into named parameters has some interesting possibilities. One thing it might be able to do is solve the immutable setter problem (the Scala mailing list has had some discussions regarding using named params to help with the immutable setter issue).

jodastephen Sat 7 Feb 2009

I don't understand this then. If you go back to the original arguments, remember that you must have real constructors if you want to deal with subclasses.

What this proposal says is that you have a constructor expression block (new verify) not a constructor method.

In the bytecode new verify might well use invokespecial (twice).

new Foo
invokespecial Foo.new
process construction-with-block
invokespecial Foo.verify

(I haven't tested/checked this, but don't see why it wouldn't work.)

BTW, I like named parameters, but I'm unsure that with-blocks are exactly the same. Try working out the method signature in bytecode - does the client pass a map?

brian Sat 7 Feb 2009

What this proposal says is that you have a constructor expression block (new verify) not a constructor method.

How does a subclass call it then?

BTW, I like named parameters, but I'm unsure that with-blocks are exactly the same. Try working out the method signature in bytecode - does the client pass a map?

I haven't looked at how other languages do it (especially a performant language like C#). I was thinking of using bitmasks:

Void foo(Str? a := null, Str? b := null)

Today generates something like this:

void foo() { foo(null, null) }
void foo(Str a) { foo(a, null) }
void foo(Str a, Str b) { /* body */ }

In addition we would generate something like:

void foo(Obj x, int mask)
{
  foo( (mask & A) != 0 ? (Str)x : null,
       (mask & B) != 0 ? (Str)x : null ) 
}

Although this would multiply the complexity of reflection, curry, etc. I am still finding boundary conditions regarding how default params interact with other features.
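One way the bitmask idea could be made concrete in Java is sketched below. This is an assumption-laden illustration, not what the Fan compiler generates: it passes the supplied arguments as an array in declaration order (rather than a single Obj), and the names `NamedArgs`, `A` and `B` are hypothetical.

```java
final class NamedArgs {
    static final int A = 1 << 0; // bit set when 'a' is supplied
    static final int B = 1 << 1; // bit set when 'b' is supplied

    // The "real" method with every parameter explicit.
    static String foo(String a, String b) {
        return "a=" + a + ", b=" + b;
    }

    // Hypothetical generated overload: 'values' holds only the supplied
    // arguments, in declaration order; 'mask' says which ones they are.
    static String foo(Object[] values, int mask) {
        int next = 0;
        String a = (mask & A) != 0 ? (String) values[next++] : null;
        String b = (mask & B) != 0 ? (String) values[next++] : null;
        return foo(a, b);
    }
}
```

A call site naming only b would then compile to something like `NamedArgs.foo(new Object[] {"bee"}, NamedArgs.B)`, leaving a at its default.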

jodastephen Tue 10 Feb 2009

How does a subclass call it then?

Consider three actors, Application, Target, Super

Application creates object using new bytecode
Application calls Target.make(...) via invokespecial
    Target prepares data for parent
    Target calls Super.make(...) via invokespecial
        Super prepares data for parent
        Super calls any superclass it has via invokespecial
        ...on return...
        Super assigns state from new {...} block
    ...on return...
    Target assigns state from new {...} block
...on return...
Application assigns state from construction-with-block
Application calls Target.verify(...) via invokespecial
    Target calls Super.verify(...) via invokespecial
        Super calls any superclass it has via invokespecial
        ...on return...
        Super verifies state using verify {...} block
    ...on return...
    Target verifies state using verify {...} block
    ...state locked...
    Target calls Super.complete(...) via invokespecial
        Super calls any superclass it has via invokespecial
        ...on return...
        Super completes processing with code after verify {...} block
    ...on return...
    Target completes processing with code after verify {...} block
...on return...
fully initialised and locked object

So, we need three invokespecial methods

  • one which prepares for the superclass and assigns the state from new {...}
  • one which verifies the state from verify {...}
  • one which completes the processing

As far as I can tell, this is the minimum necessary to meet all the goals (setting const in all places, verifying, safe publishing). I'd definitely say that this is complex...but it should be remembered that Java cannot do safe publishing today from a constructor for example.
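The ordering can be simulated in plain Java to check that it hangs together. This is a simplified sketch, not the proposed bytecode: ordinary instance methods stand in for the three invokespecial methods, the call site drives all three phases itself, and `Base`, `Derived` and `Construction` are hypothetical names for Super, Target and Application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class Base {
    final List<String> log = new ArrayList<>();

    void init()     { log.add("Base.init"); }
    void verify()   { log.add("Base.verify"); }
    void complete() { log.add("Base.complete"); }
}

class Derived extends Base {
    // Each phase runs the superclass part first, then the subclass part.
    @Override void init()     { super.init();     log.add("Derived.init"); }
    @Override void verify()   { super.verify();   log.add("Derived.verify"); }
    @Override void complete() { super.complete(); log.add("Derived.complete"); }
}

final class Construction {
    // Plays the role of the call site ("Application" in the post).
    static Derived create(Consumer<Derived> withBlock) {
        Derived d = new Derived(); // allocate
        d.init();                  // phase 1: assign state, superclass first
        withBlock.accept(d);       // construction-with-block at the call site
        d.verify();                // phase 2: verify state, superclass first
        d.complete();              // phase 3: code after verify; object now "locked"
        return d;
    }
}
```

Calling `Construction.create(d -> d.log.add("with-block"))` records the phases in the order init, with-block, verify, complete, with the superclass entry preceding the subclass entry in each phase.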
