JohnDG Tue 17 Feb 2009
Let's say type.builder is a compiler-generated custom builder class for type. Further, suppose that it has fields for all fields of the class type, and that it also replicates the Void-returning methods of type. Finally, the builder class supplies a method called apply, which takes an object of type type, and replicates all the set fields and invoked methods to the supplied object.
Now this isn't enough. We still need a few more things: first, we need a syntax for invoking a builder. That's simple enough, since we've already decided on a syntax for with blocks:
pt := Point
pt {
  x = 5
  y = 2
}
The with block constructs a builder with the specified settings.
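Now we need to modify the syntax a bit so we have a hook into application of the with block. To do this, we can use the : operator as syntax sugar for a method called with; the default implementation of with simply applies the builder to this. Thus the application above becomes pt: { x = 5; y = 2 }, which is translated into pt.with { x = 5; y = 2 }.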
Every type's builder supplies some utility methods. One such method is: isSet, which determines if a specified field has been set. This allows one to query to make sure certain fields are or are not set in the with block:
class Point {
  ...
  Obj with(Builder builder) {
    if (!builder.isSet(Point#x)) throw new NoArgException()
    if (!builder.isSet(Point#y)) throw new NoArgException()
    return super.with(builder)
  }
  ...
}
The natural method of chaining to super allows each superclass to perform its own validation.
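The chaining idea can be emulated outside Fan. Below is a minimal sketch in Python (chosen only because the proposed Fan syntax does not exist yet); Builder, is_set, with_ and Point3D are hypothetical names standing in for type.builder, isSet, with and a subclass of Point.

```python
class Builder:
    """Hypothetical stand-in for type.builder: records field settings."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def is_set(self, name):
        # True only if the with block assigned this field.
        return name in self._fields

    def apply(self, obj):
        # Replay every recorded setting onto the target object.
        for name, value in self._fields.items():
            setattr(obj, name, value)
        return obj


class Base:
    def with_(self, builder):
        # Default behaviour: just apply the builder to this object.
        return builder.apply(self)


class Point(Base):
    def with_(self, builder):
        # Validate first, then chain to the superclass.
        for required in ("x", "y"):
            if not builder.is_set(required):
                raise ValueError(f"missing field: {required}")
        return super().with_(builder)


class Point3D(Point):
    def with_(self, builder):
        if not builder.is_set("z"):
            raise ValueError("missing field: z")
        return super().with_(builder)  # Point still checks x and y


pt = Point3D().with_(Builder(x=5, y=2, z=1))
```

Because each level validates and then delegates upward, nothing is applied to the object until every class in the chain has accepted the builder.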
Now we're almost done, but we still need to do a bit more work. First, we need to add a new postfix operator: the comma operator. The postfix operator is syntax sugar for a method called add that returns Void.
From the outside, you would use the comma operator like this:
myList a, b, c, d,
which would compile down to:
myList.add(a); myList.add(b); myList.add(c); myList.add(d)
Now recall that the type.builder for type automatically replicates all Void returning methods, so they can be invoked directly from the builder (because the invocation may or may not be applied, this replication can happen only for Void returning methods). In particular, this means that if a class has a method called add that returns Void, it will be replicated in the builder.
This means that you can now use syntax like this:
Menu:
{
  Menu:
  {
    text = "File"
    MenuItem { text = "Open"; onSelect=&open },
    MenuItem { text = "Save"; onSelect=&save },
  },
  Menu:
  {
    text = "Help"
    MenuItem { text = "About"; onSelect=&about },
  },
}
which is nearly identical to today's syntax for with blocks.
Now what else can we do with this? Well, if we want, we can validate the Void returning methods invoked inside the builder, simply by passing a custom visitor to the apply method to verify that certain methods are or are not called (perhaps with certain parameters). If things check out, then we can call the apply method on this. Thus, we can validate both fields and method invocations.
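One way to picture the visitor variant is a two-pass apply: first with a checking visitor that touches nothing, then with a visitor that replays onto the object. This is only a Python sketch under assumptions; the visitor signature (method name, args) and the invoke helper are invented for illustration and are not part of the proposal.

```python
class Builder:
    """Hypothetical builder that records method invocations in order."""

    def __init__(self):
        self._calls = []  # list of (method name, positional args)

    def invoke(self, method, *args):
        self._calls.append((method, args))
        return self

    def apply(self, visitor):
        # Hand each recorded invocation to the visitor, preserving order.
        for method, args in self._calls:
            visitor(method, args)


class Menu:
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def with_(self, builder):
        # Pass 1: a validating visitor that touches nothing.
        def check(method, args):
            if method != "add":
                raise ValueError(f"{method} is not allowed in this block")
            if not args or not isinstance(args[0], str):
                raise TypeError("add expects a single string label")

        builder.apply(check)

        # Pass 2: everything checked out, so replay onto this object.
        builder.apply(lambda method, args: getattr(self, method)(*args))
        return self


menu = Menu().with_(Builder().invoke("add", "Open").invoke("add", "Save"))
```

Since validation finishes before the replay starts, a rejected block leaves the object untouched.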
We can also handle immutable structures, if we like. The with method need not return this, but could return an entirely new copy of the object, if the instance was immutable (this is the reason why with is typed to return Obj). Here's a sketch of how the method might be implemented in Obj:
class Obj {
  ...
  Obj with(Builder builder) {
    if (isImmutable()) {
      for (/* all fields*/) {
        if (!builder.isSet(field)) {
          thisFieldValue := ??? // Get the value of this field
          builder.??? // set thisFieldValue
        }
      }
      newInstance := ???
      return builder.apply(newInstance)
    }
    else {
      return builder.apply(this)
    }
  }
  ...
}
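For the immutable branch, the merge of "builder value if set, otherwise this object's value" might look like the following Python sketch. It uses a frozen dataclass as the stand-in for a const class; Builder, get, is_set and Person are assumptions made for the example, and the ??? placeholders in the Fan sketch are deliberately not filled in.

```python
from dataclasses import dataclass, fields


class Builder:
    """Hypothetical builder holding only the fields the block assigned."""

    def __init__(self, **settings):
        self._settings = dict(settings)

    def is_set(self, name):
        return name in self._settings

    def get(self, name):
        return self._settings[name]


@dataclass(frozen=True)
class Person:
    first_name: str
    last_name: str

    def with_(self, builder):
        # Take each field from the builder when set, else from this object.
        merged = {}
        for f in fields(self):
            if builder.is_set(f.name):
                merged[f.name] = builder.get(f.name)
            else:
                merged[f.name] = getattr(self, f.name)
        # The original is never mutated; a new instance is returned.
        return Person(**merged)


person = Person("john", "smith")
renamed = person.with_(Builder(first_name="joe"))
```

The caller cannot tell from the call site whether it got back the same object or a copy, which is why with is typed to return Obj in the proposal.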
JohnDG Tue 17 Feb 2009
Now there's still the problem of construction & validation. At what point is an object considered to be constructed?
I don't like the current system of bifurcated construction. I never have liked it. So instead, I suggest that privileged with blocks must be explicitly denoted.
Here's an example of what this could look like:
const class Email {
  const Str address
  new make(Builder builder) {
    ...
    builder.apply(this)
  }
}
The make constructor accepts a builder as an explicit argument. Therefore, the builder is allowed to set const fields and perform other privileged operations.
If instead the constructor had been written like so:
class Point {
  new make(Float x, Float y) {
    ...
  }
}
then this signifies that the author of the API does not wish to allow privileged building.
Of course, you're still free to use a with-block after construction, like so:
p := Point(4, 2): { x = 5; y = 2; }
but the operations in the with block are no longer privileged, because they take place following construction of the object.
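The privileged/unprivileged distinction can be emulated with a construction flag. The Python sketch below is an assumption-laden illustration, not how Fan enforces const: CONST_FIELDS and _constructed are invented names, and the check happens at run time rather than compile time.

```python
class Builder:
    """Hypothetical builder that replays field settings onto an object."""

    def __init__(self, **settings):
        self._settings = dict(settings)

    def apply(self, obj):
        for name, value in self._settings.items():
            setattr(obj, name, value)
        return obj


class Email:
    CONST_FIELDS = {"address"}

    def __init__(self, builder):
        # Still constructing: the builder may set const fields here.
        object.__setattr__(self, "_constructed", False)
        builder.apply(self)
        object.__setattr__(self, "_constructed", True)

    def __setattr__(self, name, value):
        # After construction, const fields reject any further writes.
        if self._constructed and name in self.CONST_FIELDS:
            raise AttributeError(f"{name} is const after construction")
        object.__setattr__(self, name, value)


email = Email(Builder(address="a@example.com"))
```

Applying a second builder to email after construction would raise for address, while non-const fields remain settable.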
Now in some cases, it's convenient (if usually a design smell indicating merging of concerns) to pass into a constructor some core essential information, and then based on that information, do core-dependent initialization in the subsequent with block. This usage pattern can still be accommodated:
conn := Connection { init("localhost", 8880) external = false }
This is possible because the init method is a Void returning method that is therefore replicated in the Connection.type.builder object.
Perhaps the compiler's auto-supplied make constructor could always accept a builder, allowing privileged construction by default.
brian Wed 18 Feb 2009
This feels pretty good to me. I think it sits somewhere between what Stephen was proposing and what I was thinking of for first-class with-blocks.
I am not sure I have fully digested either proposal yet; I really need to spend time thinking about it.
At first I was hoping we might be able to delay this issue to 1.1 and just force people to use constructors like Java. I definitely don't want to add more features right now. But the more Fan code I write, the more I think we are going to have to really tackle this problem in the next month or two and nail it down.
I think the key thing you spelled out that I was already thinking is that to make this work we have to generate a mirror builder class with all the same fields and methods. The problem is I really do not want to be doing that for every single class. How can we avoid that?
JohnDG Wed 18 Feb 2009
I think the key thing you spelled out that I was already thinking is that to make this work we have to generate a mirror builder class with all the same fields and methods. The problem is I really do not want to be doing that for every single class. How can we avoid that?
Dynamics or subclassing. I'm not sure how fast dynamics are in Fan, but it's been made pretty fast in some languages like pnuts (to be made faster still with JVM7+).
Subclassing would require all virtual methods which seems nasty.
Another approach is to emulate the existence of a builder class but not actually create it. Instead, you would store closures that record setting/invocation and which can be played back later (similar to how mocking frameworks work, but you can make it more efficient).
A key question is whether the type of the builder must be known statically. If there is no such restriction, dynamics is a natural method to implement this, counting on method caching and/or JVM7+ to make it more efficient. If the type of builder must be known statically, you can do pure emulation (awkward but arbitrarily fast; I haven't taken a look yet to see how you do this for primitives).
jodastephen Wed 18 Feb 2009
I think the key thing you spelled out that I was already thinking is that to make this work we have to generate a mirror builder class with all the same fields and methods. The problem is I really do not want to be doing that for every single class. How can we avoid that?
I had the thought that perhaps the builder class is actually the same type as the class it's building for.
This wouldn't work if this were Java, and the fields of the main class were actually marked as final. But maybe Fan can compile all its classes with non-final instance variables? That way, the builder might appear to be a different type in the Fan type system, but is implemented as the same type in bytecode.
Of course this all depends on what is necessary to safely publish an object at the bytecode level. In the Java Memory Model, this requires final instance variables, but I suspect that is not what is actually needed at the bytecode level.
Overall, I think John has taken the basics of what I came up with and made something that looks good. I don't like the comma suffix (I'd prefer something more obvious, such as <<< as mentioned elsewhere). I quite like the isSet idea too.
As with John, I don't like the split constructor concept. I've pointed out lots of problems it has. Perhaps we could simplify this:
// today in Fan
conn := Connection("localhost", 8880) { external = false }
// John suggested (option A)
conn := Connection { init("localhost", 8880) external = false }
// perhaps we could do one of these?
// option B
conn := Connection { ("localhost", 8880) external = false }
// option C
conn := Connection ("localhost", 8880, { external = false })
Option C seems quite friendly, because it can be refactored like methods that take a closure as the last argument if desired:
// option C plus with-block
conn := Connection ("localhost", 8880) : { external = false }
Of course, there is a difference, as this version creates the object and then changes it after the constructor, whereas option C performs all the work in the constructor.
brian Wed 18 Feb 2009
I think there are three options for "what is a builder":
it is the class itself
it is a mirror class with all the same fields and dummy methods (first class builder)
it is a class per call site (first class with-block)
The simplest thing by far is that the "builder" is just the object itself and you are just setting its fields and calling its methods, which is pretty much what we do today.
This is what I am thinking - the whole notion of first class with-blocks, builders, etc is very cool and very powerful. But I really do not want to add any more complex features to Fan 1.0. I am still trying to ensure the features we have now all play well together.
So I come back around to this premise: what is the simplest thing we can do that solves the common use cases, yet doesn't shut the door to adding these more advanced features in the next release?
I think the simplest thing is to just add a callback for validation:
run constructor
apply with-block
call validation method
The validation method implicitly calls the super class versions so that you couldn't by-pass your super class validation.
That solution doesn't require any new syntax, and I think it can be made backward compatible with first class with-blocks or builders.
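The three steps above can be sketched as follows. This is a Python emulation under assumptions: construct and validate are hypothetical names, and walking the class hierarchy from base to subclass is just one way to make sure superclass validation cannot be bypassed.

```python
def construct(cls, *args, **with_block):
    obj = cls(*args)                      # 1. run constructor
    for name, value in with_block.items():
        setattr(obj, name, value)         # 2. apply with-block
    # 3. call validation, base class first, so no subclass can skip it
    for klass in reversed(cls.__mro__):
        validate = klass.__dict__.get("validate")
        if validate is not None:
            validate(obj)
    return obj


class Connection:
    def __init__(self, host, port):
        self.host = host
        self.port = port
        self.external = True

    def validate(self):
        if not (0 < self.port < 65536):
            raise ValueError("port out of range")


class SecureConnection(Connection):
    def __init__(self, host, port):
        super().__init__(host, port)
        self.cert = None

    def validate(self):
        # Deliberately does not call super: construct() handles that.
        if self.cert is None:
            raise ValueError("cert is required")


conn = construct(SecureConnection, "localhost", 8880, external=False, cert="dev.pem")
```

Note that this approach validates after the fields have been set, so unlike the builder proposal the object can briefly exist in an invalid state.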
JohnDG Wed 18 Feb 2009
The problem with Option B or C is that it requires the builder to be aware of what parentheses mean -- i.e. what does it mean to say, ("localhost", 8880), and what should be done when such a notation is encountered???
Option A is very non-magical and we get it for free because all Void-returning methods are replicated in the builder object.
I should mention there are a number of correct, object-oriented ways to deal with this problem. The reason they're not used is that they involve intermediate objects.
But such clean abstractions make refactoring and code maintenance a lot easier.
So I don't view the core-dependent initialization as important enough to merit any special syntax, because in my experience, it's always indicative of a missing abstraction or merging of concerns. So that it can be solved "for free" by a non-magical builder solution is sufficient (in my view).
As for a different operator for add, I wouldn't have a large problem with it; it would be more visible, but the reason I like the comma operator is because it's already used for list and map construction. And the similarity there would go a long way toward making third-party collection libraries feel more like first-class citizens. e.g.:
tree := Node { child1, child2, Node { subchild1, subchild2 }, }
set := Set { a, b, c, d, }
So to summarize the discussion so far.
New Builder Object
Every type contains a new method called builder which constructs a type-dependent subclass of Builder. This subclass replicates all fields and Void-returning methods of the type itself.
Point.type.builder
It is possible to directly instantiate a builder object.
In addition to replication of all fields and Void-returning methods, every Builder object contains some utility methods.
Void apply(T obj). This method plays back field setting and method invocation on the specified object. The order of settings and invocations is preserved.
Bool isSet(Field f). This method determines if the specified field was set. Note that this is purely a convenience method as it would be possible to discern this information merely by passing a visitor to the apply() method.
Builder objects can be implicitly constructed using a builder block, which consists of a series of field initializations and method invocations enclosed by curly braces.
Point.type.builder builder = {
  x = 0
  y = 0
}
When a builder block is used to construct a Builder object, the type must be statically known by the compiler. This means the Builder object must be assigned to a typed variable, or passed to a typed parameter of a method or constructor.
Point { x = 0; y = 0; } // LEGAL, assuming constructor accepts Builder
builder := { x = 0; y = 0; } // ILLEGAL -- type unknown
pt.with { x = 0; y = 0; } // LEGAL, assuming 'with' accepts Builder
With Operator & Method
A new operator is introduced, :, pronounced with. This operator is a shortcut to a method called with. Obj supplies a default implementation of this method, which accepts a Builder object and returns an Obj.
The default implementation provides the following functionality:
If the object is mutable, then the with method simply invokes the builder's apply method to the this object, thus committing the changes to the object. The method then returns the this object.
If the object is immutable, then the with method creates a new object, and uses the builder's set fields whenever they have been specified, and the this object's fields whenever the builder does not have a value for a field. The method then returns the new object, which is also immutable.
Optionally, the implementation of with could provide additional functionality:
Verifying that already initialized, const fields are not set.
Verifying that non-nullable fields are not initialized to null.
Looking at facets on fields to enforce constraints.
pt := Point()
pt: { x = 0; y = 0; }
// Assuming Person is immutable
person = person.with { firstName = "joe" }
// OR:
person = person: { firstName = "joe" }
Add Operator & Method
A new postfix operator is introduced, ,, pronounced add. This operator is a shortcut to a method called add that accepts a single argument and returns Void.
It is possible to use the operator on any object that has such an add method.
list a,b,c,d, // Adds a,b,c,d
From inside the class with the add method, the object may be omitted, like normal.
a,b,c,d,
If a class has an add method, then because it returns Void, it will be replicated in the Builder object for the class. This means it is possible to use the add method in builder blocks.
Menu:
{
  Menu:
  {
    text = "File"
    MenuItem { text = "Open"; onSelect=&open },
    MenuItem { text = "Save"; onSelect=&save },
  },
  Menu:
  {
    text = "Help"
    MenuItem { text = "About"; onSelect=&about },
  },
}
Constructor Syntax
(This section needs more work.)
A constructor may return a value of the exact same class. This allows substituting existing classes for "new" objects.
new make(Settings s) {
  int hash = hashSettings(s)
  if (cache[hash] != null) return cache[hash]
  ...
}
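Python already allows a constructor to hand back an existing instance, so the idea can be tried out there. The sketch below is illustrative only: the Connection class and its (host, port) cache key are assumptions, not the hashSettings scheme from the snippet above.

```python
class Connection:
    _cache = {}

    def __new__(cls, host, port):
        key = (host, port)
        cached = cls._cache.get(key)
        if cached is not None:
            return cached          # hand back the existing instance
        obj = super().__new__(cls)
        obj.host = host
        obj.port = port
        cls._cache[key] = obj
        return obj


a = Connection("localhost", 8880)
b = Connection("localhost", 8880)
c = Connection("localhost", 9090)
```

Here a and b are the same object, while c is a fresh one, which is the substitution of existing objects for "new" ones described above.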
Construction & Verification
If a constructor accepts a Builder argument, then if the builder is applied inside the constructor, all operations are privileged: the builder may set const fields, for example. Once constructed, it is not possible to use a builder to perform privileged operations.
// Assuming Point is const object & has builder constructor
pt := Point { x = 4; y = 1; } // LEGAL
pt := Point : { x = 4; y = 1; } // ILLEGAL because Point already constructed
The Fan idiom is to perform field validation inside an overridden with method that delegates to super after performing validation.
class User {
  Str name
  ...
  override Obj with(Builder builder) {
    if (name == null && !builder.isSet(User#name)) throw new ArgError()
    else if (builder.isSet(User#name) &&
             builder.name == null) throw new ArgError()
    ...
    return super.with(builder)
  }
}
This idiom has many advantages:
It works seamlessly with immutable or mutable objects. That is, the verification logic does not need to care whether super.with will return a new object or the existing one. Since any object can be made immutable, this is a very important property.
It ensures every superclass has a chance to perform its own verification.
It allows atomic changes to objects, which is not possible in languages such as C# and Java (many times, if you set fields individually, the object will pass through an invalid state, but with a builder block, it can be ensured that all related fields can be set "simultaneously"). The verification logic can therefore be used to prevent invalid states.
Logic that appears frequently in verifiers is factored out into methods of Builder (the above pattern, for example, might be common enough to justify a new method in Builder, or at least a static method of BuilderUtils).
For constructors, the Fan idiom is generally to delegate to the with method. In situations where this is not desirable, common verification logic is factored out and invoked both in with and in the constructor.
class User {
  new make(Builder builder) { return with(builder) }
}
class User {
  new make(Builder builder) { verify(builder); builder.apply(this) }
  override Obj with(Builder builder) {
    verify(builder)
    return builder.apply(this)
  }
  private Void verify(Builder builder) {
    ...
  }
}
Alternate Syntax for Builder Blocks
An alternate syntax for builder blocks uses square brackets instead of curly braces.
Menu:
[
  Menu:
  [
    text = "File"
    MenuItem [ text = "Open"; onSelect=&open ],
    MenuItem [ text = "Save"; onSelect=&save ],
  ],
  Menu:
  [
    text = "Help"
    MenuItem [ text = "About"; onSelect=&about ],
  ],
]
This alternate syntax has two advantages over curly braces:
It makes it very clear a builder block is being used, as opposed to a closure.
It makes strides toward unification with List and Map.
list := MyList[a, b, c, d,]
// Or with some syntax sugar:
MyList list := [a, b, c, d,]
JohnDG Wed 18 Feb 2009
But I really do not want to add any more complex features to Fan 1.0.
This means Fan will go out the door with no way to modify immutable structures, and no backwards-compatible way to intercept with blocks or do implicit adds, because no operators are used for either.
Fan in its current state cannot evolve in a backwards-compatible way with the notion of builders as expressed in this proposal. Moreover, I think there was some consensus in the "what's needed for Fan 1.0" that the existing bifurcated construction is troublesome and that we need a way to modify immutable data.
I'm fine with Fan not changing for a 1.0 release, but I do think that kills this proposal and is going to impose an awkward grammar on future support for these concepts.
There's clearly a more general concept behind with blocks and it seems after many months of trying to find it, we're pretty close. It seems a shame to throw that away.
That said, in my opinion, and with the exception of fine-grained concurrent applications, Fan is already a better Java/C#.
brian Wed 18 Feb 2009
I'm fine with Fan not changing for a 1.0 release, but I do think that kills this proposal and is going to impose an awkward grammar on future support for these concepts.
I think that was what I was trying to ask. What can we do now to leave the door open for 1.1, so that we don't end up with an awkward grammar?
At the least I think we really need to change add to be postfix comma operator (although I am not sure if/how that works outside of a with-block).
What other syntax changes do we need now so that we can evolve?
I am not giving up this discussion though, I just want to make sure we have a back up plan.
I have really started thinking about first class with-blocks where we generate a class at the call site instead of a mirror builder class.
Suppose the "with-block" was really a LINQ-like block of exprs? I could either apply the exprs or actually access the AST at runtime. One key thing LINQ does is change what gets generated based on the target of the expression:
pt { x = 4; y = 6 } // generate efficient builder for pt target
expr := { x = 4; y = 6 } // generate "AST" with list of 2 set exprs
JohnDG Wed 18 Feb 2009
If you're looking for a minimal set of changes, I'd recommend the following:
Use : to prefix a with block. You can always delete this symbol and retain backward compatibility (if it turns out not to be needed).
Reserve the method name with so that no one can use it.
Require , as a postfix operator for adding items inside a with block.
Use an @onConstruction facet or similar to hook into verification. A future Obj.with method could always call methods so annotated.
I'd have to think about it more, but I believe the above changes would keep the door open to this proposal and quite a few others. More information is better, and you can always allow omitting symbols if they turn out not to be necessary.
Suppose the "with-block" was really a LINQ like block of exprs?
The thing I like about builder is that you get compile-time safety: you can only set fields and invoke Void-returning methods that actually exist on the object.
I'd have to think more about the implications for treating a builder block as a dynamic class with runtime access to the AST, but it seems to me you lose compile-time safety.
We should also think about how that feature would overlap/interact with plug-ins.
jodastephen Sat 21 Feb 2009
I've given this some thought, and I really like it. This overall proposal is way better than what Fan has at present, and much better than what Java/C# can achieve. I do have a few tweaks.
Construction
This approach can allow "split construction" if tweaked:
// today in Fan
conn := Connection("localhost", 8880) { external = false }
// John suggested (option A)
conn := Connection { init("localhost", 8880) external = false }
// John also proposed this where external is not const (option D)
conn := Connection("localhost", 8880) : { external = false }
// I'm proposing (option E)
conn := Connection("localhost", 8880) { external = false }
ie. I'm tweaking the proposal such that we allow Fan construction-with-blocks to be coded exactly as they are today. It seems to me that this is no longer a problem (my previous concerns seem solved). This is because normal-with-blocks are different to construction-with-blocks, because of the extra colon:
// today, meaning depends on whether factory or ctor
Point() {x=6; y=10}
// proposed, compiles if ctor taking Builder, doesn't if no builder allowed
Point() {x=6; y=10}
// today, confusing, meaning varies (wrt const) depending on Point internals
Point() {x=6; y=10} {x=7; y=9}
// proposed, clear, we know the first is part of construction
Point() {x=6; y=10} : {x=7; y=9}
What's interesting is that this last piece of code above will work even if Point is fully const. Why? Well, the x/y get set by the construction-with-block, and then a new instance is returned by the second normal-with-block. Inefficient, but entirely readable and sound.
So, this all works well from the caller's point of view. The problem is the right syntax for the factory/ctor. I've played with some syntax, but am not happy yet. The key point is that split construction is valid now from the caller, so the ctor should be relatively easy.
Auto-add
I'm not a huge fan of the postfix comma (I dislike postfix in general, I suspect). Basically, it's easy to miss (we've got rid of semicolon line endings for a similar reason). Here are three alternatives:
// today
Menu {
  text = "File"
  MenuItem { text = "Open"; onSelect=&open }
  MenuItem { text = "Save"; onSelect=&save }
}
// John's proposal
Menu {
  text = "File"
  MenuItem { text = "Open"; onSelect=&open },
  MenuItem { text = "Save"; onSelect=&save },
}
// Alternative A
Menu {
  text = "File"
  add MenuItem { text = "Open"; onSelect=&open }
  add MenuItem { text = "Save"; onSelect=&save }
}
// Alternative B
Menu {
  text = "File"
  <<< MenuItem { text = "Open"; onSelect=&open }
  <<< MenuItem { text = "Save"; onSelect=&save }
}
// Alternative C
Menu {
  text = "File"
  [
    MenuItem { text = "Open"; onSelect=&open },
    MenuItem { text = "Save"; onSelect=&save }
  ]
}
The nice thing about alternative C is how it links in to custom lists/sets:
That looks pretty nice. Removing the curly brackets is possible, but might be confusing.
BTW, I suspect that builders should support put() as well as add() to allow map style additions if we go with alternative C.
First-class with blocks
These feel a bit magical at the moment. I can get my head around a builder, and how to generate it (mirror class with public fields). I can't picture a FCWB except as a hash-map.
brian Sat 21 Feb 2009
The key issue I am trying to get my head around is what is the "builder" type. I think John's proposal requires generating a mirror class with a mirror slot for every class. That amount of overhead doesn't sit well with me.
The two alternatives are to try to reuse the original class or to generate a class per call site.
I can't figure out how we use the original class and get a first class builder.
If we use a class per call site, it is very difficult to get static typing and its efficiency without introducing some notion of first-class tuple/record types.
I looked into using square brackets previously, and I think I had some grammar issues trying to make it work.
jodastephen Sat 21 Feb 2009
what is the "builder" type. I think John's proposal requires generating a mirror class with a mirror slot for every class.
I wonder how bad it really is. A builder only needs to be generated if the class declares that it can use builders. If it does that, then there is a cost involved. Given how many closure classes will be generated, size shouldn't be a concern. It's difficult to invest too much time in the builder design until you're willing to go the route of a mirror class.
(I believe that the main class will need final instance variables, not set after the constructor. Googling fails to turn up any clear references on this, so we should go with the classic Java Memory Model and make fields final. This makes it very hard for any single class design to work.)
JohnDG Sun 22 Feb 2009
The key issue I am trying to get my head around is what is the "builder" type. I think John's proposal requires generating a mirror class with a mirror slot for every class. That amount of overhead doesn't sit well with me.
It doesn't require a mirror class, but that's the proper way to think of it (and the way you would describe the feature in documentation).
I think when you do analysis on the AST, to perform validation (i.e. to check x really exists as referenced in the builder block { x = 4 } ), you simply check the class for which the builder is being used. Since this class is known statically, at compile time, it's pretty easy.
Now, you still have a Builder class somewhere, but it's completely generic, and looks something like this:
Whenever setField or invokeMethod are called, they simply store (in a list) an anonymous function that will take its argument and perform the specified operation on it.
At the use site, you simply translate the builder block into usage of a generic builder.
// BEFORE
pt := Point : { x = 0; y = 2 }
// AFTER
temp := Point
builder := Builder(Point.type)
builder.setField(Point#x, 0)
builder.setField(Point#y, 2)
pt := temp.with(builder)
This should be pretty efficient (as efficient as you can be, given that builders can be played back an arbitrary number of times), and achieves all the requirements of the proposal.
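A generic, closure-recording builder along these lines might look like the Python sketch below. It mirrors the setField/invokeMethod description, but the method names, the is_set bookkeeping and the with_ method on Point are assumptions made for the example rather than a definitive design.

```python
class Builder:
    """Generic builder: stores closures and replays them on demand."""

    def __init__(self, target_type):
        self.target_type = target_type
        self._ops = []       # closures, in the order they were recorded
        self._set = set()    # names of fields that were assigned

    def set_field(self, name, value):
        self._set.add(name)
        self._ops.append(lambda obj: setattr(obj, name, value))

    def invoke_method(self, name, *args):
        self._ops.append(lambda obj: getattr(obj, name)(*args))

    def is_set(self, name):
        return name in self._set

    def apply(self, obj):
        # Play back every recorded operation in its original order.
        for op in self._ops:
            op(obj)
        return obj


class Point:
    x = 0
    y = 0

    def with_(self, builder):
        return builder.apply(self)


# Hand-written equivalent of: pt := Point : { x = 0; y = 2 }
builder = Builder(Point)
builder.set_field("x", 0)
builder.set_field("y", 2)
pt = Point().with_(builder)
other = builder.apply(Point())   # the same builder can be replayed
```

No per-class mirror type is generated here; the compiler would only need to check the field and method names against the target class before emitting the set_field and invoke_method calls.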
BTW, I suspect that builders should support put() as well as add() to allow map style additions if we go with alternative C.
Again, I prefer the comma operator because it's already used to create lists and maps. You're right, it's easier to miss, but I think the unification gains are too much to overlook.
In any case, I do like the idea of supporting map-like syntax, too. I think what I'd suggest is a combination of tuples, syntax sugar for tuple construction, and an add function on Map that accepts a pair. e.g.:
// TUPLES
(a, b)
(a, b, c)
...
// SYNTAX SUGAR FOR TUPLE CONSTRUCTION
a => b // (a, b)
a => b => c // (a, b, c) or (a, (b, c))
...
// ADD METHOD ON MAP
map.add((key, value))
map.add(key => value)
Then the syntax for map would be unified with lists:
list := MyList["foo", "bar",]
map := MyMap["key" => "value",]
JohnDG Sun 22 Feb 2009
The generic Builder shown above would actually be a good use case for generics. Builder<T> would allow you to constrain apply to accept only objects of type T, which would get you type checking "for free". Otherwise, I suppose, you just carry around type information in the AST and manually disallow using a builder of one type on an object of another type.
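To make the Builder<T> idea concrete, here is a hedged Python sketch using typing.Generic. A static checker can use the annotations, and the isinstance guard stands in for the "manually disallow" fallback; all names are hypothetical.

```python
from typing import Callable, Generic, List, Type, TypeVar

T = TypeVar("T")


class Builder(Generic[T]):
    def __init__(self, target_type: Type[T]) -> None:
        self._target_type = target_type
        self._ops: List[Callable[[T], None]] = []

    def set_field(self, name: str, value: object) -> "Builder[T]":
        self._ops.append(lambda obj: setattr(obj, name, value))
        return self

    def apply(self, obj: T) -> T:
        # Runtime guard standing in for the compile-time check.
        if not isinstance(obj, self._target_type):
            raise TypeError(
                f"builder for {self._target_type.__name__} "
                f"cannot be applied to {type(obj).__name__}"
            )
        for op in self._ops:
            op(obj)
        return obj


class Point:
    x = 0
    y = 0


class Email:
    address = ""


point_builder: Builder[Point] = Builder(Point).set_field("x", 4)
pt = point_builder.apply(Point())
```

Applying point_builder to an Email instance is rejected at run time here; with real generics in the language the same mistake would be caught by the compiler.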
brian Tue 24 Feb 2009
tuples, syntax sugar for tuple construction, adding an add function to Map that accepts a pair
This is what Scala did, and I think it is pretty elegant. But it requires the notion of first class tuples (which Scala has and we don't).
BTW, I just ran across a bit of Scala code that looked something like:
def f = new MainFrame
{
  title = "foo"
  contents = new Button { text = "foo" }
}
It looks a lot like a with-block, but I don't understand what it is really doing. Can someone explain how it works?
JohnDG Tue 17 Feb 2009
Let's say
type.builder
is a compiler-generated custom builder class fortype
. Further, suppose that it has fields for all fields of the classtype
, and that it also replicates theVoid
-returning methods oftype
. Finally, the builder class supplies a method calledapply
, which takes an object of typetype
, and replicates all the set fields and invoked methods to the supplied object.Now this isn't enough. We still need a few more things: first, we need a syntax for invoking a builder. That's simple enough, since we've already decided on a syntax for with blocks:
The
with
block constructs a builder with the specified settings.Now we need to modify the syntax a bit so we have a hook into application of the
with
block. To do this, we can use the:
method as syntax sugar for a method calledwith
. Thus the syntax of the above application becomes:which is translated into:
which is further translated into:
Meanwhile, the default implementation of
with
simply applies the builder tothis
:Every type's builder supplies some utility methods. One such method is:
isSet
, which determines if a specified field has been set. This allows one to query to make sure certain fields are or are not set in thewith
block:The natural method of chaining to
super
allows each superclass to perform its own validation.Now we're almost done, but we still need to do a bit more work. First, we need to add a new postfix operator: the comma operator. The postfix operator is syntax sugar for a method called
add
that returnsVoid
.From the outside, you would use the comma operator like this:
which would compile down to:
Now recall that the
type.builder
fortype
automatically replicates allVoid
returning methods, so they can be invoked directly from the builder (because the invocation may or may not be applied, this replication can happen only forVoid
returning methods). In particular, this means that if a class has a method calledadd
that returnsVoid
, it will be replicated in the builder.This means that you can now use syntax like this:
which is nearly identical to today's syntax for with blocks.
Now what else can we do with this? Well, if we want, we can validate the Void-returning methods invoked inside the builder, simply by passing a custom visitor to the apply method to verify that certain methods are or are not called (perhaps with certain parameters). If things check out, then we can call the apply method on this. Thus, we can validate both fields and method invocations.
We can also handle immutable structures, if we like. The with method need not return this, but could return an entirely new copy of the object, if the instance was immutable (this is the reason why with is typed to return Obj). Here's a sketch of how the method might be implemented in Obj:
JohnDG Tue 17 Feb 2009
Now there's still the problem of construction & validation. At what point is an object considered to be constructed?
I don't like the current system of bifurcated construction. I never have liked it. So instead, I suggest that privileged with blocks must be explicitly denoted.
Here's an example of what this could look like:
The make constructor accepts a builder as an explicit argument. Therefore, the builder is allowed to set const fields and perform other privileged operations.
If instead the constructor had been written like so:
then this signifies that the author of the API does not wish to allow privileged building.
Of course, you're still free to use a with-block after construction, like so:
but the operations in the with block are no longer privileged, because they take place following construction of the object.
Now in some cases, it's convenient (if usually a design smell indicating merging of concerns) to pass into a constructor some core essential information, and then based on that information, do core-dependent initialization in the subsequent with block. This usage pattern can still be accommodated:
This is possible because the init method is a Void-returning method that is therefore replicated in the Connection.type.builder object.
Perhaps the compiler's auto-supplied make function would always accept a builder, allowing privileged construction by default.
brian Wed 18 Feb 2009
This seems to feel pretty good to me - I think it is somewhere between what Stephan was proposing and what I was thinking of for first-class with-blocks.
I am not sure I have fully digested either proposal yet; I really need to spend time thinking about it.
At first I was hoping we might be able to delay this issue to 1.1 and just force people to use constructors like Java. I definitely don't want to add more features right now. But the more Fan code I write, the more I think we are going to have to really tackle this problem in the next month or two and nail it down.
I think the key thing you spelled out that I was already thinking is that to make this work we have to generate a mirror builder class with all the same fields and methods. The problem is I really do not want to be doing that for every single class. How can we avoid that?
JohnDG Wed 18 Feb 2009
Dynamics or subclassing. I'm not sure how fast dynamics are in Fan, but it's been made pretty fast in some languages like pnuts (to be made faster still with JVM7+).
Subclassing would require all methods to be virtual, which seems nasty.
Another approach is to emulate the existence of a builder class but not actually create it. Instead, you would store closures that record setting/invocation and which can be played back later (similar to how mocking frameworks work, but you can make it more efficient).
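The closure-recording approach described above can be sketched in Python. This is only an illustration of the idea, not Fan code: the names set_field, invoke_method, is_set and apply, and the Point class, are assumptions made up for the example.

```python
from typing import Any, Callable, List


class Builder:
    """Hypothetical generic builder: records operations, replays them later."""

    def __init__(self) -> None:
        self._ops: List[Callable[[Any], None]] = []
        self._fields: set = set()

    def set_field(self, name: str, value: Any) -> "Builder":
        # Record a closure instead of touching any object now.
        self._fields.add(name)
        self._ops.append(lambda target: setattr(target, name, value))
        return self

    def invoke_method(self, name: str, *args: Any) -> "Builder":
        # Only sensible for methods whose result is ignored (Void-returning).
        self._ops.append(lambda target: getattr(target, name)(*args))
        return self

    def is_set(self, name: str) -> bool:
        # True if a field assignment with this name was recorded.
        return name in self._fields

    def apply(self, target: Any) -> Any:
        # Replay every recorded operation, in order, on the target.
        for op in self._ops:
            op(target)
        return target


class Point:
    """Illustrative target class with two fields and one Void-style method."""

    def __init__(self) -> None:
        self.x = 0
        self.y = 0
        self.log: List[str] = []

    def note(self, text: str) -> None:
        self.log.append(text)


builder = Builder().set_field("x", 5).set_field("y", 2).invoke_method("note", "built")
first = builder.apply(Point())
second = builder.apply(Point())  # the same recording can be played back again
```

Nothing is applied until apply is called, which is why only methods whose return value is ignored can be recorded this way.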
A key question is whether the type of the builder must be known statically. If there is no such restriction, dynamics is a natural method to implement this, counting on method caching and/or JVM7+ to make it more efficient. If the type of builder must be known statically, you can do pure emulation (awkward but arbitrarily fast; I haven't taken a look yet to see how you do this for primitives).
jodastephen Wed 18 Feb 2009
I had the thought that perhaps the builder class is actually the same type as the class it's building for.
This wouldn't work if this were Java, and the fields of the main class were actually marked as final. But maybe Fan can compile all its classes with non-final instance variables? That way, the builder might appear to be a different type in the Fan type system, but is implemented as the same type in bytecode.
Of course this all depends on what is necessary to safely publish an object at the bytecode level. In the Java Memory Model, this requires final instance variables, but I suspect that is not what is actually needed at the bytecode level.
Overall, I think John has taken the basics of what I came up with and made something that looks good. I don't like the comma suffix (I'd prefer something more obvious, such as <<< as mentioned elsewhere). I quite like the isSet idea too.
As with John, I don't like the split constructor concept. I've pointed out lots of problems it has. Perhaps we could simplify this:
Option C seems quite friendly, because it can be refactored like methods that take a closure as the last argument if desired:
Of course, there is a difference, as this version creates the object and then changes it after the constructor, whereas option C performs all the work in the constructor.
brian Wed 18 Feb 2009
I think there are three options for "what is a builder":
The simplest thing by far is that the "builder" is just the object itself and you are just setting its fields and calling its methods. Which is pretty much what we do today.
This is what I am thinking - the whole notion of first class with-blocks, builders, etc is very cool and very powerful. But I really do not want to add any more complex features to Fan 1.0. I am still trying to ensure the features we have now all play well together.
So I come back around to this premise: what is the simplest thing we can do that solves the common use cases, yet doesn't shut the door to adding these more advanced features in the next release?
I think the simplest thing is to just add a callback for validation:
The validation method implicitly calls the super class versions so that you couldn't bypass your super class validation.
That solution doesn't require any new syntax, and I think it can be made to be backward compatible with first class with-blocks or builders.
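One way to picture a validation callback whose superclass versions run implicitly is the following Python sketch. It is an assumption-laden illustration rather than the Fan design: the names with_block, on_validate, Base, Shape and Circle are all hypothetical.

```python
class ValidationError(Exception):
    pass


class Base:
    def with_block(self, **fields):
        # Stand-in for a with-block: set the fields, then validate.
        for name, value in fields.items():
            setattr(self, name, value)
        self._run_validation()
        return self

    def _run_validation(self):
        # Call every class's own on_validate, base class first, so a
        # subclass cannot skip its superclass's checks by overriding.
        for cls in reversed(type(self).__mro__):
            hook = cls.__dict__.get("on_validate")
            if hook is not None:
                hook(self)


class Shape(Base):
    name = None

    def on_validate(self):
        if not self.name:
            raise ValidationError("name is required")


class Circle(Shape):
    radius = 0

    def on_validate(self):  # no explicit super call is needed
        if self.radius <= 0:
            raise ValidationError("radius must be positive")
```

Because the framework walks the class hierarchy itself, Circle's override cannot suppress the check declared on Shape.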
JohnDG Wed 18 Feb 2009
The problem with Option B or C is that it requires the builder to be aware of what parentheses mean -- i.e. what does it mean to say, ("localhost", 8880), and what should be done when such a notation is encountered???
Option A is very non-magical and we get it for free because all Void-returning methods are replicated in the builder object.
I should mention there are a number of correct, object-oriented ways to deal with this problem. The reason they're not used is because they involve intermediate objects. e.g.:
But such clean abstractions make refactoring and code maintenance a lot easier.
So I don't view the core-dependent initialization as important enough to merit any special syntax, because in my experience, it's always indicative of a missing abstraction or merging of concerns. That it can be solved "for free" by a non-magical builder solution is sufficient (in my view).
As for a different operator for add, I wouldn't have a large problem with it; it would be more visible, but the reason I like the comma operator is because it's already used for list and map construction. And the similarity there would go a long way toward making third-party collection libraries feel more like first-class citizens. e.g.:
So to summarize the discussion so far.
New Builder Object
Every type contains a new method called builder which constructs a type-dependent subclass of Builder. This subclass replicates all fields and Void-returning methods of the type itself.
It is possible to directly instantiate a builder object.
In addition to replication of all fields and Void-returning methods, every Builder object contains some utility methods.
apply() method.
Builder objects can be implicitly constructed using a builder block, which consists of a series of field initializations and method invocations enclosed by curly braces.
When a builder block is used to construct a Builder object, the type must be statically known by the compiler. This means the Builder object must be assigned to a typed variable, or passed to a typed parameter of a method or constructor.
With Operator & Method
A new operator is introduced, :, pronounced with. This operator is a shortcut to a method called with.
Obj supplies a default implementation of this method, which accepts a Builder object and returns an Obj.
The default implementation provides the following functionality:
with method simply invokes the builder's apply method to the this object, thus committing the changes to the object. The method then returns the this object.
with method creates a new object, and uses the builder's set fields whenever they have been specified, and the this object's fields whenever the builder does not have a value for a field. The method then returns the new object, which is also immutable.
Optionally, the implementation of with could provide additional functionality:
const fields are not set.
null.
Add Operator & Method
A new postfix operator is introduced, ,, pronounced add. This operator is a shortcut to a method called add that accepts a single argument and returns Void.
It is possible to use the operator on any object that has such an add method.
From inside the class with the add method, the object may be omitted, like normal.
If a class has an add method, then because it returns Void, it will be replicated in the Builder object for the class. This means it is possible to use the add method in builder blocks.
Constructor Syntax
(This section needs more work.)
A constructor may return a value of the exact same class. This allows substituting existing classes for "new" objects.
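As a rough illustration of a constructor handing back an existing instance of the same class, here is a Python sketch. The Color class and its cache are hypothetical, and Python's __new__ merely stands in for whatever mechanism Fan would use.

```python
class Color:
    """Hypothetical class whose constructor may return an existing instance."""

    _cache = {}

    def __new__(cls, name):
        # Hand back the existing instance when one was already made.
        existing = cls._cache.get(name)
        if existing is not None:
            return existing
        instance = super().__new__(cls)
        instance.name = name
        cls._cache[name] = instance
        return instance


red_a = Color("red")
red_b = Color("red")   # same object as red_a, not a fresh allocation
blue = Color("blue")
```

From the caller's side both calls look like ordinary construction, yet the second one substitutes the cached object.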
Construction & Verification
If a constructor accepts a Builder argument, then if the builder is applied inside the constructor, all operations are privileged: the builder may set const fields, for example. Once constructed, it is not possible to use a builder to perform privileged operations.
The Fan idiom is to perform field validation inside an overridden with method that delegates to super after performing validation.
This idiom has many advantages:
super.with will return a new object or the existing one. Since any object can be made immutable, this is a very important property.
Logic that appears frequently in verifiers is factored out into methods of Builder (the above pattern, for example, might be common enough to justify a new method in Builder, or at least a static method of BuilderUtils).
For constructors, the Fan idiom is generally to delegate to the with method. In situations where this is not desirable, common verification logic is factored out and invoked both in with and in the constructor.
Alternate Syntax for Builder Blocks
An alternate syntax for builder blocks uses square brackets instead of curly braces.
This alternate syntax has two advantages over curly braces:
List and Map.
JohnDG Wed 18 Feb 2009
This means Fan will go out of the door with no way to modify immutable structures, and no backwards-compatible way to intercept with blocks or do implicit adds because no operators are used for either.
Fan in its current state cannot evolve in a backwards-compatible way with the notion of builders as expressed in this proposal. Moreover, I think there was some consensus in the "what's needed for Fan 1.0" that the existing bifurcated construction is troublesome and that we need a way to modify immutable data.
I'm fine with Fan not changing for a 1.0 release, but I do think that kills this proposal and is going to impose an awkward grammar on future support for these concepts.
There's clearly a more general concept behind with blocks and it seems after many months of trying to find it, we're pretty close. It seems a shame to throw that away.
That said, in my opinion, and with the exception of fine-grained concurrent applications, Fan is already a better Java/C#.
brian Wed 18 Feb 2009
I think that was what I was trying to ask. What can we do now so as to leave the door open for 1.1, where we don't have an awkward grammar?
At the least I think we really need to change add to be a postfix comma operator (although I am not sure if/how that works outside of a with-block).
What other syntax changes do we need now so that we can evolve?
I am not giving up this discussion though, I just want to make sure we have a backup plan.
I have really started thinking about first class with-blocks where we generate a class at the call site instead of a mirror builder class.
Suppose the "with-block" was really a LINQ-like block of exprs? I could either apply the exprs or actually access the AST at runtime? One key thing LINQ does is change what gets generated based on the target of the expression:
JohnDG Wed 18 Feb 2009
If you're looking for a minimal set of changes, I'd recommend the following:
: to prefix a with block. You can always delete this symbol and retain backward compatibility (if it turns out not to be needed).
with so that no one can use it.
, as a postfix operator for adding items inside a with block.
@onConstruction facet or similar to hook into verification. A future Obj.with method could always call methods so annotated.
I'd have to think about it more, but I believe the above changes would keep the door open to this proposal and quite a few others. More information is better, and you can always allow omitting symbols if they turn out not to be necessary.
The thing I like about builder is that you get compile-time safety: you can only set fields and invoke Void-returning methods that actually exist on the object.
I'd have to think more about the implications for treating a builder block as a dynamic class with runtime access to the AST, but it seems to me you lose compile-time safety.
We should also think about how that feature would overlap/interact with plug-ins.
jodastephen Sat 21 Feb 2009
I've given this some thought, and I really like it. This overall proposal is way better than what Fan has at present, and much better than what Java/C# can achieve. I do have a few tweaks.
Construction
This approach can allow "split construction" if tweaked:
i.e. I'm tweaking the proposal such that we allow Fan construction-with-blocks to be coded exactly as they are today. It seems to me that this is no longer a problem (my previous concerns seem solved). This is because normal-with-blocks are different to construction-with-blocks, because of the extra colon:
What's interesting is that this last piece of code above will work even if Point is fully const. Why? Well, the x/y get set by the construction-with-block, and then a new instance is returned by the second normal-with-block. Inefficient, but entirely readable and sound.
So, this all works well from the caller's point of view. The problem is the right syntax for the factory/ctor. I've played with some syntax, but I'm not happy yet. The key point is that split construction is valid now from the caller, so the ctor should be relatively easy.
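The behaviour described for a fully const Point, where the later with-block yields a new instance instead of mutating the old one, can be approximated in Python with a frozen dataclass. This is only a sketch: the with_ method name and the field values are illustrative assumptions, and dataclasses.replace stands in for a builder being applied to a copy.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Point:
    x: int = 0
    y: int = 0

    def with_(self, **changes):
        # A const object cannot be modified, so return a changed copy.
        return replace(self, **changes)


p = Point(x=3, y=4)   # stands in for the construction-with-block
q = p.with_(x=5)      # stands in for the later normal-with-block
```

The original p is left untouched, and fields not mentioned in the second step are carried over from it.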
Auto-add
I'm not a huge fan of the postfix comma (I dislike postfix in general I suspect). Basically, it's easy to miss (we've got rid of semicolon line endings for a similar reason). Here are three alternatives:
The nice thing about alternative C is how it links in to custom lists/sets:
That looks pretty nice. Removing the curly brackets is possible, but might be confusing.
BTW, I suspect that builders should support put() as well as add() to allow map style additions if we go with alternative C.
First-class with blocks
These feel a bit magical at the moment. I can get my head around a builder, and how to generate it (mirror class with public fields). I can't picture a FCWB except as a hash-map.
brian Sat 21 Feb 2009
The key issue I am trying to get my head around is what is the "builder" type. I think John's proposal requires generating a mirror class with a mirror slot for every class. That amount of overhead doesn't sit well with me.
The two alternatives are to try to re-use the original class or to generate a class per call site.
I can't figure out how we use the original class and get a first class builder.
If we use a class per call site, it is very difficult to get static typing and its efficiency without introducing some notion of first class tuple/record types.
I looked into using square brackets previously, and I think I had some grammar issues trying to make it work.
jodastephen Sat 21 Feb 2009
I wonder how bad it really is. A builder only needs to be generated if the class declares that it can use builders. If it does that, then there is a cost involved. Given how many closure classes will be generated, size shouldn't be a concern. It's difficult to invest too much time in the builder design until you're willing to go the route of a mirror class.
(I believe that the main class will need final instance variables, not set after the constructor. Googling fails to turn up any clear references on this, so we should go with the classic Java Memory Model and make fields final. This makes it very hard for any single class design to work.)
JohnDG Sun 22 Feb 2009
It doesn't require a mirror class, but that's the proper way to think of it (and the way you would describe the feature in documentation).
I think when you do analysis on the AST, to perform validation (i.e. to check x really exists as referenced in the builder block { x = 4 }), you simply check the class for which the builder is being used. Since this class is known statically, at compile time, it's pretty easy.
Now, you still have a Builder class somewhere, but it's completely generic, and looks something like this:
Whenever setField or invokeMethod are called, they simply store (in a list) an anonymous function that will take its argument and perform the specified operation on it.
The apply() method simply runs through the functions and invokes them on the object.
At the use site, you simply translate the builder block into usage of a generic builder.
This should be pretty efficient (as efficient as you can be, given that builders can be played back an arbitrary number of times), and achieves all the requirements of the proposal.
Again, I like the comma operator more because it's already used to create lists and maps. You're right, it's easier to miss, but I think the unification gains are too much to overlook.
In any case, I do like the idea of supporting map-like syntax, too. I think what I'd suggest is a combination of tuples, syntax sugar for tuple construction, and adding an add function to Map that accepts a pair. e.g.:
Then the syntax for map would be unified with lists:
or with alternate builder block syntax:
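Setting the Fan syntax aside, the underlying idea that a map's add accepts a pair, so that lists and maps can be filled through the same add calls, can be sketched in Python. AddList, AddMap and build are hypothetical names for this illustration, and a Python tuple plays the role of the proposed pair.

```python
class AddList(list):
    def add(self, item):
        # Lists add a single item.
        self.append(item)


class AddMap(dict):
    def add(self, pair):
        # A pair is a two-element tuple: (key, value).
        key, value = pair
        self[key] = value


def build(target, *items):
    # Stand-in for the comma sugar: one add call per listed item.
    for item in items:
        target.add(item)
    return target


names = build(AddList(), "a", "b", "c")
ages = build(AddMap(), ("ann", 31), ("bob", 27))
```

The build helper never needs to know which kind of collection it is filling, which is the unification being argued for.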
JohnDG Sun 22 Feb 2009
The generic Builder shown above would actually be a good use case for generics. Builder<T> would allow you to constrain apply to accept only objects of type T, which would get you type checking "for free". Otherwise, I suppose, you just carry around type information in the AST and manually disallow using a builder of one type on an object of another type.
brian Tue 24 Feb 2009
This is what Scala did, and I think it is pretty elegant. But it requires the notion of first class tuples (which Scala has and we don't).
BTW, I just ran across a bit of Scala code that looked something like:
It looks a lot like a with-block, but I don't understand what it is really doing. Can someone explain how it works?