scala akka-actor

Is creating a props object by calling a constructor manually safe and recommended?


I am trying to get familiar with Akka actors and I cannot get my head around these two issues. First, as explained here, closures can cause serialization problems. The example below contains a Props object that is not serializable because it closes over a non-serializable object:

case class Helper(name: String)

object MyNonserializableObject {

   val helper = Helper("the helper")

   val props7 = Props(new MyActor(helper))
}

So it is suggested not to create an actor like this. The answer above relates to the "dangerous variants" described in the Akka docs. On the other hand, when dealing with value classes as constructor arguments, the Akka docs recommend creating Props by calling the constructor manually, as props3 in the code below does:

class Argument(val value: String) extends AnyVal

class ValueClassActor(arg: Argument) extends Actor {
  def receive = { case _ => () }
}

object ValueClassActor {
  def props1(arg: Argument) = Props(classOf[ValueClassActor], arg) // fails at runtime
  def props2(arg: Argument) = Props(classOf[ValueClassActor], arg.value) // ok
  def props3(arg: Argument) = Props(new ValueClassActor(arg)) // ok
}

These two concepts appear paradoxical to me. By the way, because of my rank I could not post this question as a comment.


Solution

  • This is easier to understand if you know how the JVM works. If you instantiate an object from classOf[ValueClassActor] and a list of args, the Constructor has to be looked up on the Class object and the instance created through the Java reflection API.

    Meanwhile, if you look at what AnyVals are, you'll see that a class taking an AnyVal

    class Argument(val value: String) extends AnyVal
    
    class ValueClassActor(arg: Argument)
    

    compiles to:

    Compiled from "test.scala"
    public class ValueClassActor {
      public ValueClassActor(java.lang.String);
        Code:
           0: aload_0
           1: invokespecial #14                 // Method java/lang/Object."<init>":()V
           4: return
        LineNumberTable:
          line 3: 0
        LocalVariableTable:
          Start  Length  Slot  Name   Signature
              0       5     0  this   LValueClassActor;
              0       5     1   arg   Ljava/lang/String;
    }
    

    so the Argument type exists only at compile time (well, mostly; sometimes Scala does instantiate it), and if you want to call the constructor that the JVM actually sees, you need to pass a String instead of an Argument. That's why you observe this behavior:

      def props1(arg: Argument) = Props(classOf[ValueClassActor], arg) // fails at runtime
      def props2(arg: Argument) = Props(classOf[ValueClassActor], arg.value) // ok
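
    The mismatch can be reproduced without Akka. The following is a minimal sketch using plain Java reflection, assuming Scala 2.12 or later. Holder and ReflectionDemo are hypothetical names introduced only for illustration, and Akka's own constructor lookup is more elaborate than this:

    ```scala
    import scala.util.Try

    // Same value class as above; a plain class stands in for the actor.
    class Argument(val value: String) extends AnyVal

    class Holder(arg: Argument) {
      def describe: String = s"Holder(${arg.value})"
    }

    object ReflectionDemo {
      // Parameter types of the only constructor, as seen through reflection.
      def constructorParamTypes: List[String] =
        classOf[Holder].getConstructors.head.getParameterTypes.toList.map(_.getName)

      // Reflective instantiation, roughly what a Class-plus-args factory has to do.
      // Passing a value class as Any boxes it into a real Argument instance.
      def instantiate(args: Any*): Try[Holder] = Try {
        val ctor = classOf[Holder].getConstructors.head
        ctor.newInstance(args.map(_.asInstanceOf[AnyRef]): _*).asInstanceOf[Holder]
      }
    }
    ```

    With these definitions, ReflectionDemo.constructorParamTypes should report java.lang.String rather than Argument, ReflectionDemo.instantiate(new Argument("x")) should come back as a Failure (the boxed Argument does not match the String parameter), and ReflectionDemo.instantiate("x") should succeed.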
    

    To avoid dealing with this issue, you can use the Props creator that doesn't rely on runtime reflection:

    def apply[T <: Actor](creator: => T)(implicit arg0: ClassTag[T]): Props
    

    Is it dangerous? The documentation says:

    CAVEAT: Required mailbox type cannot be detected when using anonymous mixin composition when creating the instance. For example, the following will not detect the need for DequeBasedMessageQueueSemantics as defined in Stash:

    Props(new Actor with Stash { ... })

    Instead you must create a named class that mixin the trait, e.g. class MyActor extends Actor with Stash.

    which means that as long as you use a named class and just pass arguments to it, without any mixins on anonymous subclasses, you remove one potential issue. To avoid the closure issue, you can do exactly what the documentation says and construct the Props in the companion object.

    The problem is that Props may get serialized, e.g. when it is sent over the network to another part of your application in an Akka Cluster. If you serialize a function (here: the anonymous function wrapping new ValueClassActor(arg)), its whole closure is serialized with it. Because of how Java works, this function may hold a pointer to the parent object within which it was created.

    If you have

    class Foo(s: => String)
    
    object Foo {
      def hello: Foo = new Foo("test") // "test" is by-name so it has closure
    }
    

    and you look at the generated bytecode, you'll see

    Compiled from "foo.scala"
    public class Foo {
      public static Foo hello();
        Code:
           0: getstatic     #16                 // Field Foo$.MODULE$:LFoo$;
           3: invokevirtual #18                 // Method Foo$.hello:()LFoo;
           6: areturn
    
      public Foo(scala.Function0<java.lang.String>);
        Code:
           0: aload_0
           1: invokespecial #25                 // Method java/lang/Object."<init>":()V
           4: return
        LineNumberTable:
          line 3: 0
          line 1: 4
        LocalVariableTable:
          Start  Length  Slot  Name   Signature
              0       5     0  this   LFoo;
              0       5     1     s   Lscala/Function0;
    }
    

    and

    Compiled from "foo.scala"
    public final class Foo$ {
      public static final Foo$ MODULE$;
    
      public static {};
        Code:
           0: new           #2                  // class Foo$
           3: dup
           4: invokespecial #17                 // Method "<init>":()V
           7: putstatic     #19                 // Field MODULE$:LFoo$;
          10: return
        LineNumberTable:
          line 3: 0
    
      public Foo hello();
        Code:
           0: new           #23                 // class Foo
           3: dup
           4: invokedynamic #44,  0             // InvokeDynamic #0:apply:()Lscala/Function0;
           9: invokespecial #47                 // Method Foo."<init>":(Lscala/Function0;)V
          12: areturn
        LineNumberTable:
          line 4: 0
        LocalVariableTable:
          Start  Length  Slot  Name   Signature
              0      13     0  this   LFoo$;
    
      public static final java.lang.String $anonfun$hello$1();
        Code:
           0: ldc           #50                 // String test
           2: areturn
        LineNumberTable:
          line 4: 0
    }
    

    Which means that:

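    object MyNonserializableObject is kind of a shortcut in the explanation: out of the box, a closure created in a top-level object usually has nothing non-serializable to capture (above, $anonfun$hello$1 is a static method and the invokedynamic call takes no captured arguments), so you would have to do something weird to make it non-serializable. E.g. if you did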

    trait Bar {
    
      object Baz {
        def hello: Foo = new Foo("test")  // "test" is by-name so it has closure
      }
    }
    

    the closure would hold a reference to Baz, which would hold a reference to Bar, and if whatever extends Bar were not serializable, neither would the closure be. But if you create your lambda inside a top-level object (one that isn't nested in some other class, etc.), there is no enclosing instance to drag along: as the bytecode above shows, the lambda body becomes a static method that captures nothing, so the closure depends only on what you pass in and is serializable as long as that is.
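
    The capture rule can be checked with plain Java serialization. This is a minimal sketch, assuming Scala 2.12 or later (where function literals are compiled to serializable lambdas). It uses explicit () => String functions rather than by-name parameters, and NestedOwner, TopLevelFactory and SerializationCheck are hypothetical names:

    ```scala
    import java.io.{ByteArrayOutputStream, ObjectOutputStream}
    import scala.util.Try

    // Deliberately not Serializable, like "whatever extends Bar" above.
    class NestedOwner {
      val greeting = "test"
      // Reads a member of the enclosing instance, so the function captures `this`.
      def thunk: () => String = () => greeting
    }

    object TopLevelFactory {
      // Uses only its argument, so nothing but the String is captured.
      def thunk(s: String): () => String = () => s
    }

    object SerializationCheck {
      // Attempts Java serialization into an in-memory buffer.
      def trySerialize(obj: AnyRef): Try[Unit] = Try {
        val out = new ObjectOutputStream(new ByteArrayOutputStream())
        try out.writeObject(obj) finally out.close()
      }
    }
    ```

    Under those assumptions, SerializationCheck.trySerialize(new NestedOwner().thunk) should fail because the captured NestedOwner is not serializable, while SerializationCheck.trySerialize(TopLevelFactory.thunk("test")) should succeed.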

    The same principle applies to Props and by-name params. If you create Props using a by-name param within a companion object that is top-level (or is otherwise guaranteed to be serializable), then the closure will be serializable as well and the usage will be safe, just as the docs recommend.

    So long story short:

    class ValueClassActor(arg: Argument) extends Actor {
      def receive = { case _ => () }
    }
    
    object ValueClassActor {
      def props(arg: Argument) = Props(new ValueClassActor(arg))
    }
    

    is safe.