All Topics

#1141 Series of questions/remarks about the API and its implementation

MoOm Thu 8 Jul 2010

As I'm currently writing a native implementation of the sys pod in C for the LLVM backend I'm working on, I happen to spend time reading the sys API and its Java implementation. Therefore, I have several small questions/remarks about it, ans as those questions are not important enough to deserve an individual topic for each of them, I'm going to regroup them in this topic, and add new one each time I have a new one.

So let's begin with the first series of questions/remarks.

sys::list

* In my opinion, insert() is not intuitive when used with negative indices. For example, [1, 2, 3].insert(-1, 4) results in [1, 2, 4, 3] while I would expect the result to be [1, 2, 3, 4]. I think the API would be more consistent if list.insert(index, val).get(index) == val for any index.

* I think containsSame() could be useful to have. It would also make the API more symetrical: index() has its contains() counterpart, but indexSame() misses its containsSame() counterpart.

* The return-type in the sys::List API seems a bit inconsistent to me. Sometimes this is This and sometimes, this is L, and I can't see any logic in it. For example, add() returns L while removeRange() returns This. I think one possible rule could be: if the method returns this, then the return-type should be This, otherwise it should be L.

* In the Java implementation of map(), I don't understand the line if (r == Sys.VoidType) r = Sys.ObjType.toNullable();. When could the function passed to this method return Void?

sys::Method

* Shouldn't callOn(Obj? target, Obj[]? args) be callOn(Obj? target, Obj?[]? args)? Otherwise, it wouldn't be possible to pass null parameters when using callOn()

sys::Param

* toStr() is overriden in Param.java but is not declared with override in Param.fan

sys::Range

* In the Java implementation of Range.start(long) and Range.end(long), there is no check to verify if index is negative. I think the check should be if (x < 0 || x > size) throw IndexErr.make(this).val;

sys::Int

* pow() currently performs a for loop on the exponent, so it's complexity is linear. I could be O(ln) by using exponentiation by squaring. This is not critical as for high exponent, the integer would overflow anyway. But it would prevent programs to hang for a long time if the exponent is a high value.

* isDigit(): I think there is a bug in the Java implementation: x = val - 10 should probably be x = radix - 10.

* toHex(): 1.toHex(17) throws an IndexErr as it tries to prepend 16 zeros while the zeros array contains maximum 15 zeros.

* fromStr(): in any other radix than 16, negative numbers prefixed by "-" are supported. I think it might be good for consistency if it would be the case for hexadecimal too. For example, Int.fromStr("-123", 8) works while Int.fromStr("-123", 16) throws a ParseErr.

* equalsIgnoreCase() does not work correctly for other characters than letters. For example, 0.equalsIgnoreCase(20) returns true.

sys::Obj

* why is typeof() declared as virtual? It seems dangerous to me to be able to override typeof() and to make it return another type than the normal one.

* isImmutable() and toImmutable() are not declared as virtual but are still overriden in the bytecode (for example, each time there is a closure). That makes things hard on my side: as they are not virtual, I don't register them in the virtual-table of sys::Obj's C implementation. Therefore, I have no way to override them. Would it be possible to declare them as virtual in Obj.fan?

I'm done with the questions for now. :) Sorry to send so many points in one time, I know this will make this topic hard to read and answer. For the future points, I'll try to post them more frequently.

Anyway, thanks for your amazing work, Fantom is definitely a truly amazing language and API!! :)

brian Fri 9 Jul 2010

Wow - thats a long list of comments! These are all great comments/bug reports. I pushed a changeset for the bug fixes. But let me try to respond to each issue:

sys::List

In my opinion, insert() is not intuitive when used with negative indices.

In the original design I did go back and forth on that one. I can't remember all the reasons, but this was the design that was more consistent. I'd hate to make a breaking change like this. But if everyone was supportive we could consider it.

I think containsSame() could be useful to have.

I think we used to have this and decided to remove it (there was a previous discussion about methods to add/remove)

The return-type in the sys::List API seems a bit inconsistent to me.

My intention was to use generic parameter L everywhere. The This was just a mistake, I fixed it.

In the Java implementation of map(), when could the function passed to this method return Void?

Don't remember to tell the truth. In general the compiler will disallow this, but I probably just put it in there as a safety guard. If you did pass a void function you would essentially map everything to null.

sys::Method

Shouldn't callOn(Obj? target, Obj[]? args) be callOn(Obj? target, Obj?[]? args)?

Fixed

sys::Param

toStr() is overriden in Param.java but is not declared with override in Param.fan

fixed

sys::Range

In the Java implementation of Range.start(long) and Range.end(long), there is no check to verify if index is negative.

You actually need to support negative indexes, that is how expressions like str[1..-1] work.

sys::Int

pow() currently performs a for loop on the exponent, so it's complexity is linear.

I think that might be a nice optimization, but not sure it is really worth complicating an obscure method like this one. I am surprised there isn't an integer/long based version in java.lang.Math.

isDigit(): I think there is a bug in the Java implementation: x = val - 10 should probably be x = radix - 10.

That code is should be correct because it only happens if the radix is over 10, in which case we assume 0-9 are the decimal digits, and every digit over 10 maps to a letter 10=a, 11=b, etc

toHex(): 1.toHex(17) throws an IndexErr as it tries to prepend 16 zeros while the zeros array contains maximum 15 zeros.

Good catch, fixed and beefed up test suite

fromStr(): in any other radix than 16, negative numbers prefixed by "-" are supported.

That is a tough one. But my gut feel is that hex to should always be absolute on both to and from string side.

equalsIgnoreCase() does not work correctly for other characters than letters.

Another good catch, fixed and beefed up test suite

sys::Obj

typeof and isImmutable/toImmutable

So these are special cases that I did this way like this:

I made typeof virtual because the emitter does actually generate an override for each class. It could probably be non-virtual too since it is such a special case (like Java did). However even though it is virtual you cannot override - it is handled specially by the compiler and if you try it, then you get a compiler error.

The immutability functions are non-virtual on the Fantom side because otherwise it would be easy to break the immutability semantics of Fantom (just override isImmutable to return false in your mutable class and hell breaks loose!). But for performance reasons in the Java runtime I left them virtual so that List/Map/Func can do their special thing.

MoOm Fri 9 Jul 2010

Thanks Brian for these quick answers and bugfixes. Let me respond to some specific points that are still unclear to me.

sys::Range

You actually need to support negative indexes, that is how expressions like str[1..-1] work.

Agreed with this. But what I meant is, when we have a negative index, we add size to make it positive. But this is not enough to ensure that the index will be positive: if the index is less -size, it'll still be negative after the addition.

That's how I'll fix this:

final long start(long size)
{
  long x = start;
  if (x < 0) x = size + x;
  if (x > size) throw IndexErr.make(this).val; // Should be 'if (x < 0 || x > size)'
  return x;
}

This is actually the same for any code that deals with negative indexes (not only Range but also StrBuf, Str, List, ...)

sys::Int

isDigit(): That code is should be correct because it only happens if the radix is over 10, in which case we assume 0-9 are the decimal digits, and every digit over 10 maps to a letter 10=a, 11=b, etc

Actually if the radix is over 10 and if val is not a number (that is, between ‘0‘ and ‘9‘), it should be between ‘a‘ and ‘a‘ + (radix - 10) (e.g. if radix is 17, val should be between ‘a‘ and ‘g‘).

But current code check that val is between ‘a‘ and ‘a‘ + (val - 10), which seems incorrect. For example, ‘h‘.isDigit(17) returns true which is incorrect.

fromStr()

Agreed with you that hex numbers prefixed by ‘-‘ does not make much sense. But it does not make more sense for octal numbers or for any radix other than 10. So for consistency, I think we should either allow it all the time, or forbid it for non-decimal numbers. For info, Java supports numbers prefixed by ‘-‘ for any radix, so I'll be in favor of supporting them too. toStr() is not really a problem as we have the choice to format the output string the way we want.

sys::Obj

Ok about not declaring isImmutable and toImmutable as virtual, as it may indeed be dangerous to override them (not more than using Unsafe but using Unsafe makes it more explicit that what we are doing is actually unsafe :))

In this case, wouldn't it be possible to do the same for typeof? i.e. not declaring it as virtual in the .fan but only in the Java runtime? This way, there would be no need for a warning in the documentation and for a special-error in the compiler.

brian Fri 9 Jul 2010

Ok, I posted a second changeset ...

Range.start/end

Oh ok, I misunderstood. Although I'm not sure what you are recommending, that we move the checks in Str, Buf, etc into Range?

Int.isDigit

My fault, I was looking at the wrong code. Yeah that was a bug, I pushed a fix and beefed up test suite.

Int.fromStr

Actually testing that code in .NET showed that we inconsistent behavior with non-decimal negative numbers. So I agree we should disallow a minus sign for any radix other than 10. I fixed this and added a check in the tests.

Ok about not declaring isImmutable and toImmutable as virtual,

I have pretty grave concerns about opening up toImmutable to developers without some checks and balances (although I do some ideas how to let developers implement mutable to immutable conversions for future release).

In this case, wouldn't it be possible to do the same for typeof? i.e. not declaring it as virtual in the .fan but only in the Java runtime?

If I remember correctly it was a matter of whether the special check was in compiler or in the runtime. I'll go back and see what happens if we make it non-virtual.

brian Fri 9 Jul 2010

Ok, I tested out making Obj.typeof non-virtual and that all seems to work fine. I remember way back several years ago going back and forth on that issue. But I agree it is better to make it non-virtual in public Fantom API (even if runtime uses a virtual method for implementation). So I pushed this change.

BTW: how is the LLVM backend going? That could be a really great runtime for Fantom! And I can see all sorts of uses for a C implementation of the sys library.

MoOm Wed 14 Jul 2010

Thanks Brian for all these fixes.

About the Range thing, here is what I meant:

If I do, "slice"[1..10], I get the following exception:

sys::IndexErr: 1..10
  fan.sys.Range.end (Range.java:254)
  fan.sys.FanStr.slice (FanStr.java:160)

but if I do, "slice"[-10..-1], I get:

sys::IndexErr: java.lang.StringIndexOutOfBoundsException: String index out of range: -5
  java.lang.String.substring (String.java:1943)
  fan.sys.FanStr.slice (FanStr.java:163)

which is not very consistent with the previous exception. This is because Range.start does not check whether or not the index is in the range when it is negative.

Forget what I said about StrBuf, Str and List, it was a mistake.

About the LLVM backend, it's still in a very early stage. I'm still working on porting the sys pod to C as I cannot do much without it. I almost have the minimal set of classes (Obj, Int, Float, Str, List, Map, Type, Method, Field, ...) to be able to start working on the LLVM part, which should hopefully not be the hardest part. I'll post a topic here as soon as it can run some basic sample codes! :)

vkuzkokov Sat 28 Aug 2010

I think that might be a nice optimization, but not sure it is really worth complicating an obscure method like this one.

Just to make sure I understand what kind of complication do you mean:

--- a/src/sys/java/fan/sys/FanInt.java  Wed Aug 25 09:08:32 2010 -0400
+++ b/src/sys/java/fan/sys/FanInt.java  Sun Aug 29 05:59:11 2010 +0700
@@ -202,7 +202,11 @@
   {
     if (pow < 0) throw ArgErr.make("pow < 0").val;
     long result = 1;
-    for (int i=0; i<pow; ++i) result *= self;
+    for (; pow>0; pow>>=1)
+    {
+      if ((pow&1) == 1) result *= self;
+      self *= self;
+    }
     return result;
   }

The construct itself is also valid C# with same behavior.

brian Sat 28 Aug 2010

That code seems clever and pretty tight. Not sure I've thought thru the math myself, but it seems to work for everything I tested. I pushed that changeset. Thanks.

vkuzkokov Sat 28 Aug 2010

The code works so (I'll use ^ for power):

Let pow = 2*a + b, where a := pow >> 1 and b := pow & 1
self ^ pow == self ^ (2*a+b) == (self^2)^a * self^b
We multiply result to self ^ b.
Then we come to the same task for self * self as new base and pow >> 1 as new power

I think it belongs to src comments in case of adding another platform.