#1989 Pattern matching in switch proposal

Jens Sat 4 Aug 2012

Intro

A very useful feature in many functional languages is pattern matching. This makes it very easy and concise to write some kind of logic that can be pretty hairy to express with normal if:s and switch:s.

I propose adding pattern matching capabilities to the Fantom switch statement in a way that resemble how it looks in Scala.

UPDATE: This proposal tries to introduce a unified way of switching on the type and contents of objects, instead of making smaller and more directed changes.

For a description of the Scala model have a look at the Programming in Scala book.

Pattern matching would be added for lists, maps, and the type and fields of objects using the it-block style constructor syntax.

Example

This is an example of how it might look in Fantom:

class Animal{Animal? parent := null}

class Turtle : Animal{Int field1 := 1}

class Frog : Animal{Int field2 := 2}

Void patternExample(Animal o)
{
  switch (o) {
    case Turtle{field2 = 2; parent = Frog{} as par}:
        echo("Object pattern. Parent: " + par)
    case [1, _ as snd]:
        echo("List pattern. Second list elem: " + snd)
    case ["key1": 1, "key2": _ as val2] if (val2 > 7):
        echo("Map pattern. key2: " + key2)
    case 17:
        echo("Simple literal")
    default:
        echo("Default")
  }
}

The first case branch is chosen if o is a object of type Turtle, its field2 has value 2, and its field parent has an object of type Frog as its value. The par variable introduced with the as keyword can be used only in that branch.

If that pattern doesn't match, the list patterns in the second case is tried. It matches if o is a two elem list, with 1 as its first elem and any second elem. _ is the wildcard, "don't care" pattern. It is bound to the variable snd.

The third case is a map pattern, checking for the presence of keys "key1" and "key2" in o, that "key1" is mapped to 1, and binding the value of "key2" to the variable key2.

The fourth case is a simple expression pattern, which is compared to o with equals.

Details of proposal

The current switch works by comparing the value of an expression in each case with the objects equals method. This would change so that after each case a pattern is expected instead of an expression. The object that is switched on will be checked for a match against each pattern in turn.

Simple expressions are valid patterns. They will get evaluated and their value compared with equals in the switch.

But after the case keyword, list literals, map literals and it-block constructors will be interpreted as patterns, not as expressions creating objects.

The value that is switched on is checked to see if it matches the structure of the pattern. The case branch associated with the first matching pattern is executed.

A pattern can be either:

  • A normal expression. This will work like the current switch, the value will be compared by equals.
  • A literal object, it-block constructor style. This will be checked for a matching type, and can check the value of fields of that type with nested patterns.
  • A literal list, with nested patterns for the elements.
  • A literal map, with expressions for the keys and nested patterns for values.
  • The wildcard identifier, _, which matches all values. This is useful when one wants to introduce new variables without checking their value.

A pattern can also introduce new variables. That is done with the as keyword followed by a free identifier, written after the pattern.

A top level pattern can have a guard. It consists of an if followed by a boolean expression. The pattern match if the expression evaluates to true.

Scala

The Scala match statement is extendible in a very flexible way with custom extractors. This rimes well with Scalas goal of being extendible with new syntax, but it is complex and potentially bad for readability. The Fantom solution should probably aim for something simpler which solves the large majority of cases.

Implementation

The semantics of a switch in which only simple literals are used remain the same as in the current switch. It can be compiled in the same way as now.

A possible implementation for a switch with complex patterns is to translate it into a number of nested if statements. No fcode changes would be needed.

This file demonstrates what the translation might look:

https://github.com/jensli/FantomPatternMatching/blob/master/F4_Projects/FantomPatternMatching/src/PatternsTranslation.fan

Open problems

Change of semantics

In this proposal the patterns are written as normal literals in the case clauses. This will change the semantics of the switch, from comparing with the equals method to comparing with patterns. For list and map literals pattern matching and equals comparison would mean the same thing but for object patterns the meaning would change.

A change in semantics is never good, however this case is probably extremely rare in code. In the Fantom distribution there doesn't seem to be any code which would have its behavior changed.

Another possibility would be to keep the equals semantics of switch, and introduce special syntax for the patterns. Perhaps using a match keyword instead of case when matching is wanted. The disadvantage with this is that there would be two different behaviors in the switch, it might be cleaner if it only does pattern matching branching.

Ambiguity between patterns and expressions

Normal expressions would be allowed as patterns, but it-block constructor, list and map literals would be interpreted as patterns instead of expressions. This might lead to confusion, for people and compilers. My feeling is that both would be able to handle that. But maybe some syntax could be used to explicitly mark out patterns, like preceding them with #, or why not ยง.

Reuse ofas keyword

Might be undesirable.

Further development

  • Statically check that switches cover all possible cases
  • Patterns in more places, for instance for destructive assignments
  • Matching on tail of list
  • UPDATE: Matching on the type of value types, like Int
  • UPDATE: Matching on more types, such as Range and Time

Prototype

As a project in a programming language course I am working on a prototype implementation of this proposal. It will do a syntax tree transformation, turning any switch that contains object creation in the cases into a series of nested if statements. It will not implements guards or variable bindings. In this way I can avoid adding any new syntax.

My initial attempt can be viewed here:

https://github.com/jensli/FantomPatternMatching/blob/master/F4_Projects/Compiler/fan/steps/ExpandPatterns.fan

It currently translates a very simple switch into executable code, as seen in this test case:

https://github.com/jensli/FantomPatternMatching/blob/master/F4_Projects/FantomPatternMatching/src/PatternsTest.fan

Grammar draft

Remove the rule:

<case> := "case" <expr> ":" <stmts>

New rules:

<case>           := "case" <casePattern> ":" <stmts>
<casePattern>    := <patternExpr> ["if" "(" <expr> ")"]
<patternExpr>    := <patternElem> ["as" <id>]
<patternElem>    := <objectPattern> | <listPattern> | <mapPattern> | <expr> | "_"
<objectPattern>  := <type> "{" (<memberPattern> <eos>)* "}"
<memberPattern>  := <id> "=" <patternExpr> <eos>

<listPattern>    := Similar to <list>, substitute <expr> for <patternExpr> in items 
<mapPattern>     := Similar to <map>, substitute <expr> for <patternExpr> in values

Conclusion

This proposal brings pretty big changes that probably isn't trivial to implement. But the usefulness of pattern matching in languages like Erlang and Haskell makes me believe it would be worth it.

It might be little tricky to wrap one's head around this when first confronted with it, but I think the idea of writing logic by matching values against patterns is a really simple, beautiful and expressive one, which is why I think it would fit in nicely in Fantom.

brian Mon 6 Aug 2012

Thanks for such an awesome and complete write-up!

When you get down to it, what pattern matching does is two main things:

  • lets you match against more complicated "something" than just equality
  • lets you create and bind new variables using that pattern

I'm still struggling with the readability though

// if this
case ["key1": 1, "key2": _ as val2] if (val2 > 7):
   echo("Map pattern. key2: " + key2)

// really more readable than this
if (map["key1"] == 1 && map["key2"] > 7)
{
  val2 := key["key2"]
  echo("Map pattern. key2: " + val2)
}

When I look at a lot of the pattern matching code in other languages, I find the actual matching isn't that much more readable than the constructs we already have. What it does often do is save that annoying field/collection access and assignment to a variable to use. That exact line in my second snippet of code sort of illustrates that problem (and it happens a lot).

The other direction for "pattern matching" which I think can provide inspiration is .NET LINQ. I haven't actually spent much time investigating how F# does functional style pattern matching and LINQ. Are the two integrated nicely?

Akcelisto Mon 6 Aug 2012

Fantom key feature is readability. This proposral has no value for readability.

I suggest to return to old proporsal about autocasting in if for is-operator.

// old way
if(obj is Map){
  map := obj as Map
  map["key"] = "value"
}else if(obj is List){
  list := obj as Map
  list[0] = "value"
} 

// new way
if(obj is Map){
  obj["key"] = "value"
}else if(obj is List){
  obj[0] = "value"
} 

brian Mon 6 Aug 2012

I suggest to return to old proporsal about autocasting in if for is-operator.

I am planning on adding that no matter what. Just going to require a bit of compiler rework, so been putting it off

Jens Tue 7 Aug 2012

This proposral has no value for readability.

Has too!

One of the main points is that instead making several different small additions for fixing different small problems, this is a attempt to solve several problems in a unified and coherent way, with a switch that matches values against patterns.

I really think this will be more readable that one special feature for is in if, one for matching objects in the switch, one for extracting fields from objects... and so on.

And writing logic with pattern matching is many times much more readable that with a bunch of if:s. When the logic is about extracting data from members and nested members, and choosing different branches depending on the values, it is often superior in that aspect.

Consider this first clause in the expression simplification example from the Programming in Scala book. This is how it looks with pattern matching and with if:s (using the proposed is in if addition):

Expr simplify( Expr obj )
{
  switch (obj) {
    case UnOp{opr = "-", arg = UnOp{opr = "-"} as nested}}:
        return nested.arg

    default: return e
  }
}    

Expr simplify( Expr obj )
{
  if (obj is UnOp) {
    if (obj.opr == "-" ) {
      nested := obj.opr
      if (nested is UnOp) {
        return nested.arg
      }
    }
  } else {
    return obj
  }
}

The pattern specifies the structure of what we are looking for in a declarative way. Much more readable than the imperative style if:s. Especially if there are more branches.

Jens Tue 7 Aug 2012

autocasting in if for is-operator

The Ceylon language has a similar feature.

What strikes me as weird about it is that the same variable has different types in different locations. It just seems illogical.

And the example above shows a drawback in that you might have to introduce a new local variable to use it, as with nested.

Jens Tue 7 Aug 2012

Thanks for such an awesome and complete write-up!

Thank you. I had fun thinking about this :)

tcolar Tue 7 Aug 2012

Hum ... I don't know, maybe a DSL could come handy for that, not sure i like to "pollute" the language with that.

That seem like the kind of "language feature" that Scala likes to advertise ... powerful for sure, but the kind of thing that you would probably use twice a year, in a good year, and that when somebody new will look at it, will go "wtf is that? I need a book".

Personally I'm glad there is no need for a 1000 pages "Programming in Fantom" book. I've read the Scala one and learned a lot reading it but it has not made me want to use(and more importantly maintain) Scala.

I do agree with Akcelisto that the "obj is UnOp" was useful and very readable so that would be nice to have again.

Jens Tue 7 Aug 2012

like the kind of "language feature" that Scala likes to advertise ... powerful for sure ...

I agree that this is the problem of Scala, and a danger for Fantom.

but the kind of thing that you would probably use twice a year

I learned to love this programming style with Erlang. There you do everything with pattern matching all the time, and it's really convenient when you get into it. They don't even have a real if, it's just some tacked on syntax sugar over the pattern matching. I think it could be just as convenient in Fantom.

But the biggest problem with this might be that there is already an established paradigm on using if-logic and variable declarations in Fantom. Using both models might lead to too much complexity.

Jens Tue 7 Aug 2012

Ups, I forgot my main point about the "autocasting in if for is-operator"!

In the pattern matching switch the casting issue is solved because there is a nice way of introducing a new variable in the case clauses (the syntax could be different).

// new way
swich (obj) {
   case Map{} as map:
     map["key"] = "value"
   case List{} as list:
     list[0] = "value"
}

Alternatively, wouldn't it be cleaner to have a way to introduce a new variable with the desired type in an if? Something like:

// new way
if (map := obj as Map){
  map["key"] = "value"
} else if (list := obj as List){
  list[0] = "value"
}

helium Tue 7 Aug 2012

What strikes me as weird about it is that the same variable has different types in different locations. It just seems illogical.

How is Flow Typing illogical?

Jens Tue 7 Aug 2012

How is Flow Typing illogical?

Oh, interesting, a new typing discipline to explore! That lead to a new language to read up on: whiley.org

No, not illogical in itself, but a little weird tucked on as an exception in only one place of a language.

Login or Signup to reply.