#1356 getOrAdd used with null value

vkuzkokov Tue 7 Dec 2010

In Map.java:

public final Object getOrAdd(Object key, Func valFunc)
{
  Object val = map.get(key);
  if (val != null) return val;
  val = valFunc.call(key);
  add(key, val);
  return val;
}

Which is semantically:

if (map[key] == null) map[key] = valFunc()

Actually, I expected:

if (!map.containsKey(key)) map[key] = valFunc()

DanielFath Tue 7 Dec 2010

Aren't map[key]==null and map.containsKey(key) same?

vkuzkokov Tue 7 Dec 2010

Common misconception:

m := [0:null]
echo(m[0] == null) // true
echo(!m.containsKey(0)) // false

perp Tue 7 Dec 2010

I agree, checking for a null value just seems wrong.

brian Tue 7 Dec 2010

I agree technically it should probably be implemented with containsKey. But at the same time that requires two hash lookups. So is it really worth that performance hit just to use null values with this one method?

So my vote would be just update docs to reflect actual implementation rather than change it to use containsKey.

DanielFath Tue 7 Dec 2010

@vkuzkokov: facepalm Forgot Values can be null.

It was probably a temp hack to make it work or something. Either case brian needs to beef up the test suite and change that. Speaking off how come map doesn't have the equivalent of put in Java(add or replace)?

@brian: Sounds good enough. Also can you give any reasonable estimate how long in Big O notations would get/set/add require (depending on platform if necessary)?

ivan Tue 7 Dec 2010

Probably it is fine that hash is calculated twice, since it's getOrAdd, not getOrPut, because sys::Map#add calls hash twice in worst case

brian Tue 7 Dec 2010

peaking off how come map doesn't have the equivalent of put in Java(add or replace)?

That is sys::Map.set, which typically is written using operator a[b] = c

perp Tue 7 Dec 2010

@brian

I agree technically it should probably be implemented with containsKey. But at the same time that requires two hash lookups. So is it really worth that performance hit just to use null values with this one method?

So my vote would be just update docs to reflect actual implementation rather than change it to use containsKey.

I would personally worry more about correctness and principle of least surprise than performance in this case. I would personally expect the following:

class Test
{
  Void main()   
  {
    m := ["key":null]
    m.getOrAdd("key", |->Str| { "not null" })
    echo(m["key"])
  }
}

...to print "null" rather than "not null".

DanielFath Tue 7 Dec 2010

@brian: Oh, right. Sorry bout that. I was convinced put was the equivalent of add. BTW any chance you add Big O notation for set/get/add? I don't mind that much but others might want some guarantee about informational complexity (even if per platform).

@ivan: Well null is also used to denote a missing value so in a way getOrAdd has a point of changing it to a funcVal. It's just the documentation that needs rectifying. OTOH this IMO isn't one of frequently used methods so performance impact (how much of an impact is two hash lookup anyway?) shouldn't matter much.

vkuzkokov Tue 7 Dec 2010

@DanielFath Default implementation is O(1). For collections it's usually O(n)+time to calculate element hashes. For custom implementations: O(as long as it takes)

@brian

I agree technically it should probably be implemented with containsKey. But at the same time that requires two hash lookups.

It requires 2 hash lookups because we use java.util.HashMap which doesn't have getOrAdd functionality. Keeping things as they are will result in inconsistency driven by current implementation.

In Java there is ConcurrentMap.putIfAbsent but both ConcurrentHashMap and ConcurrentSkipListMap forbid use of null as key or value.

That leads me to the next question: do we really need nullable values in Map?

brian Tue 7 Dec 2010

Well looking at code, it looks like Map.add didn't handle null values correctly either. So probably makes sense to tighten up add and getOrAdd to handle null values correctly even at the expense of double lookup. I pushed a fix

BTW we could potentially avoid double lookup by handling null using a token value ourselves or implement our own map from scratch. I use small sized maps so heavily, that I am considering optimizing Fantom's implementation for small sizes (0-5) to use fixed sized structures which can have huge memory savings.

That leads me to the next question: do we really need nullable values in Map?

This is a great question which deserves some discussion. Null values are notoriously fraught with problems. We disallow null keys, and I could live without null values. But I know people use them, and they are useful sometimes.

DanielFath Tue 7 Dec 2010

@vkuzkokov: Thanks for that info but it would be way better if it was specified in the docs where everyone can read it than in some random thread.

Also sometimes you need to add an Object for which you can't guarantee it is not null. If brian implements it despite the two hash-lookup, any idea where or how much more inconsistency can we expect?

vkuzkokov Tue 7 Dec 2010

I think map.entrySet().get(key) can eliminate necessity for extra lookup.

jodastephen Wed 8 Dec 2010

For small maps, Apache Commons Collections Flat3Map is an optimised small implementation that could be used.

I do believe that nulls should be supported in maps. Fantom already forces a lot of design decision on users in the list/map space (no user implementations, strange choice of which methods are mutating and which retuen a new instance). Adding null to the list simply decreases the usability of Fantom, which I think is already marginal in this area in places.

andy Wed 8 Dec 2010

I do believe that nulls should be supported in maps

Not saying I am for or against changing this feature - but I'm curious to how people use nulls as meaningful values in Maps. Anyone have any examples?

rfeldman Wed 8 Dec 2010

Agreed that null should continue to be supported as a value in Map. It doesn't come up often, but when it does, it's extremely clunky to work around.

Consider a UI for a numerical properties map ( [Str:Int?] ). There could easily be a meaningful distinction between 0 and null in such a case, with 0 indicating a preference for 0 and null indicating a preference for whatever the system default value happens to be.

If you can put nulls in maps, setting up the UI is as simple as iterating over the key set and displaying a label and textbox for each key. If null is encountered, display a blank text field instead of one containing an Int.

If nulls are not allowed, you have to enact some clunky workaround in which you either have a separate Str[] list of all the keys with null values, which you then have to merge into the original key set to keep the ordering consistent, or you have to sacrifice type safety and go to [Str:Str] as a hack to represent null as the empty string, or...I can't really think of an appealing option, honestly.

Like I said, it's a pretty rare case, but it's a pretty horrendous limitation to try and work around when it comes up.

DanielFath Wed 8 Dec 2010

@joda: Yeah, some design decisions are rather limiting - like no user implementation but on the other hand does Python or Ruby allow custom map implementations (not a rhetoric question)? Could you be more specific on which methods behave inconsistently.

andy Wed 8 Dec 2010

Like I said, it's a pretty rare case, but it's a pretty horrendous limitation to try and work around when it comes up.

Don't think thats really all that bad:

props := ["a", "b", "c"]
vals  := [:]
props.each |p| { v := vals[p] }

I would consider the common use case of maps that null values are semantically the same as containsKey. If we removed null value support - you're right that it disallows some use cases - but I guess I'd be more interested in would that change increase the usability and consistency of the "common" case? And even if it did is worth enough to limit the edge cases?

tcolar Wed 8 Dec 2010

At first I was gonna say null in maps are not necessary and you can just not store the item in the map if null .... but when you use a map as a lazy cache(common) for a heavy computation result, it's totally useful to be able to store null - If the result of the long computation is null(maybe it failed or not valid result), you don't want to recalculate it each time.

yachris Wed 8 Dec 2010

Ruby "allows" a custom map implementation, in that you can implement the same API as the existing map, and (since variables don't have types) use it anywhere you'd use the standard map. You'd have to say:

map = MyMap.new

instead of

map = {}

but you already have to use the standard constructor if you want a default value, so it wouldn't be a shock.

rfeldman Wed 8 Dec 2010

@andy I hadn't thought of that workaround...it's pretty quick and easy, although it means you aren't dealing with a single object anymore; you'd have to package that up into a new class if you wanted to be able to pass around a single instance like you could with a regular Map.

That said, that approach would work for any case where you wanted to store nulls in a "map".

So the real question becomes: is it worth it to make the corner case more annoying to implement (though not significantly more complex - as shown it wouldn't take more than 5 minutes to throw together a null-value-accepting Map lookalike that would support at least get and put) for the sake of eliminating confusion around containsKey in all cases?

qualidafial Wed 8 Dec 2010

I'd say null values in maps are a legitimate use case. Like tcolar I immediately thought of caching expensive computations.

If you don't want to allow null in the map, then use the type signature to say so: [Str:Obj] vs [Str:Obj?]

DanielFath Wed 8 Dec 2010

@yachris: Nice. That seems rather elegant. If Fantom ever implements generics it would probably the best model to follow (though I'm not sure if it would work with Fantom's type system).

jodastephen Thu 9 Dec 2010

@DanielFath, see `http://fantom.org/sidewalk/topic/884`, notably my comments.

DanielFath Thu 9 Dec 2010

Thanks.

The more I think about the more some kind of Scala-esque OpaqueTrait that could be applied to list to make it a Stack/Set/Bag is preferable than current Jack-of-All trades list.

Login or Signup to reply.