#933 Extract List.*() vs List.*Same() methods into comparison strategy

qualidafial Tue 26 Jan 2010

I think my comment got lost in the Set vs List holy war on #927 so I thought I'd break out this suggestion into a separate topic. :-)

I've noticed that when using a List I usually want to use either compare-by-reference or compare-by-equals, never a mixture.

There are currently four methods in List that are just compare-by-reference duplicates of their compare-by-equals counterparts:

  • containsAll vs containsAllSame
  • contains vs containsSame
  • index vs indexSame
  • remove vs removeSame

I will add another: equals uses compare-by-equals semantics but I have wished many, many times in Java that I could compare elements by reference or by some configurable methods instead of just equals. I could go on for hours about the problems this has caused in my work in Eclipse Data Binding...

At any rate, here is my proposal:

Introduce a mixin Comparer as a strategy for object equality comparison:

mixin Comparer
{
  Bool equals(Obj? a, Obj? b)
  Int hash(Obj? o)
}

class Comparers
{
  const Comparer eq = EqComparer.make;
  const Comparer ref = RefComparer.make;
}

private class EqComparer : Comparer
{
  Boolean equals(Obj? a, Obj? b) { return a === b ? true : a?.equals(b) ?: false; }
  Int hash(Obj? o) { return a?.hash ?: 0; }
}

private class RefComparer : Comparer
{
  Boolean equals(Obj? a, Obj? b) { return a === b; }
  Int hash(Obj? o) { return Sys.idHash(o); }
}

Add a new field List.comparer with a default value Comparers.eq and let this strategy field become the touch point for all equality comparisons.

Now remove the four *Same duplicate methods and modify the original methods to take a strategy object:

Bool containsAll(L list, Comparer cmp := this.comparer);
Bool contains(V item, Comparer cmp := this.comparer);
Bool index(V item, Int offset := 0, Comparer cmp := this.comparer);
V? remove(V item, Comparer cmp := this.comparer);

A major benefit to this approach is that when you instantiate a list you can declare up front what comparison strategy you intend to use, and just use the methods from that point forward. This will reduce the number of touch points where typing contains when you really meant containsSame could introduce a bug, to just one touch point at construction. And with the optional cmp arguments you can always override the default behavior.

If we go a step further with the Comparer interface we have the possibility of determining our own comparison with map keys:

mixin Comparer
{
  Bool equals(Obj? a, Obj? b)
  Int hash(Obj? o)

  **
  ** Returns an object which wraps the argument, and delegates to this 'Comparer'
  ** for equals and hash computation
  ** 
  Obj? wrap(Obj? o)
  {
    return o == null ? null : ComparerWrapper.make(o, this);
  }

  **
  ** Returns the object previously wrapped in `#wrap(Obj?)`
  **
  Obj? unwrap(Obj? o)
  {
    return ((ComparerWrapper)o)?.unwrap
  }
}

private class ComparerWrapper
{
  Obj? obj;
  Comparer cmp;

  new make(Obj? obj, Comparer cmp)
  {
    this.obj = obj;
    this.cmp = cmp;
  }

  Bool equals(Obj? o) { return cmp.equals(obj, o); }
  Int hash() { return cmp.hash(obj); }
}

private class EqComparer
{
  // ...

  // These objects already compare by equals, so no need to wrap
  Obj? wrap(Obj? o) { return o; }
  Obj? unwrap(Obj? o) { return o; }
}

Looking forward to everyone's comments.

Edit: Correct EqComparer.equals(Obj?, Obj?) implementation, clarify some ambiguous wording

qualidafial Tue 26 Jan 2010

Just realized I didn't finish my thought with maps:

Introduce Map.keyComparer, also with a default value of Comparers.eq. Inside the Map implementation every key would be wrapped using Comparer.wrap before storing to the underlying data structure. This allows the map data structure to use compare-by-equals regardless of the key being stored.

When iterating, every key would be unwrapped before passing the element to the iterator callback function.

And since the Comparers.eq.wrap doesn't actually wrap the object (it just returns the passed in object) this structure would not incur any object overhead for the default, compare-by-equals use case.

However there would have to be a restriction that you cannot change a Map's comparer while is it non-empty.

brian Tue 26 Jan 2010

I like the basic idea, but I don't like the idea of adding callback references to List. It gets into all sorts of ugliness regarding readonly and immutability.

Although one thing that might work ok is just having a equals/same mode - that wouldn't be quite as flexible, but would handle the majority of cases (plus we could share the readonly bitmask and wouldn't add any memory overhead).

jodastephen Tue 26 Jan 2010

I should point out that I've already given one additional use case - matching based on an id (coded once, rather than coded every time in the closure).

There will be another case at some point when weak references are added (as list items and map keys/values may need to be weak.

The basic idea here is sound.

qualidafial Tue 26 Jan 2010

brian: I like the basic idea, but I don't like the idea of adding callback references to List. It gets into all sorts of ugliness regarding readonly and immutability.

This dovetails nicely into the pure/'const' debate:

mixin Comparer {
  pure Bool equals(Obj? a, Obj? b)
  pure Int hash(Obj? o)
}

Problem solved! ;-)

While we're at it we could maybe enforce this on some core methods e.g. sys::Obj#equals or sys::Obj#hash.

jodastephen: I should point out that I've already given one additional use case - matching based on an id (coded once, rather than coded every time in the closure).

I've run across that case in Java, particularly with Hibernate.

Login or Signup to reply.