scala akka akka-supervision

Supervisor strategy RESTART, doesn't actually restart my child actors?


I have the following supervisor strategy:

  override val supervisorStrategy = OneForOneStrategy(10, 10.seconds) {
    case e: JedisConnectionException => Restart
    case e: Exception => Restart
  }

From what I've read (which I may be misunderstanding), whenever a child actor throws an exception that isn't caught, the failure is escalated to the parent actor. Based on the rule above, if my child actor always throws an exception on receive, shouldn't it be restarted 10 times?

For some reason, judging from my logs, it restarts only once and that's it. I put log statements in the prestart and poststart.

EDIT:

I realized one mistake I was making:

I was using "context.system.actorOf()", which is why none of the child actors were reacting to the strategy. Now I'm using "context.actorOf()", and I do see the exceptions being "caught" by the strategy.
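A minimal sketch of the difference, assuming the classic Akka actor API. Parent and Worker are hypothetical names used only for illustration, not the actual classes from this project:

```scala
import akka.actor.{Actor, ActorRef, OneForOneStrategy, Props}
import akka.actor.SupervisorStrategy.Restart
import scala.concurrent.duration._

// Hypothetical child that fails on every message, standing in for the Redis-backed actor.
class Worker extends Actor {
  def receive = {
    case msg => throw new RuntimeException(s"failed on $msg")
  }
}

// Hypothetical parent that owns the strategy.
class Parent extends Actor {
  override val supervisorStrategy = OneForOneStrategy(10, 10.seconds) {
    case _: Exception => Restart
  }

  // Created through this actor's context, so it lives under /user/.../supervised
  // and its failures are handled by the strategy above.
  val supervised: ActorRef = context.actorOf(Props[Worker], "supervised")

  // By contrast, context.system.actorOf(Props[Worker]) would create a top-level
  // actor under /user, supervised by the user guardian instead of this actor.

  def receive = {
    case msg => supervised forward msg
  }
}
```

The actor path makes the relationship visible: a child created with context.actorOf has the creating actor's path as its parent path.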

My child actor needs to talk to Redis for info, and I purposely shut down Redis so that the child actor fails. If I set my supervisor strategy to restart up to 10 times, should I see the same stack trace 10 times?

Am I correct in assuming that when a child actor is restarted, the same message that was sent is sent to it again?

    2015-10-22 15:31:17,747 - [error] a.a.OneForOneStrategy - Error occurred trying to check for item existing in Redis:
java.lang.RuntimeException: Error occurred trying to check for item existing in Redis:
    at services.impl.RedisStatusServiceImpl.exists(RedisStatusServiceImpl.scala:62) ~[classes/:na]
    at w.c.Poller$$anonfun$process$1.apply(Poller.scala:64) ~[classes/:na]
    at w.c.Poller$$anonfun$process$1.apply(Poller.scala:58) ~[classes/:na]
    at scala.collection.immutable.List.foreach(List.scala:381) ~[scala-library-2.11.7.jar:na]
    at w.c.Poller.process(Poller.scala:58) ~[classes/:na]
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    at redis.clients.util.Pool.getResource(Pool.java:50) ~[jedis-2.7.3.jar:na]
    at redis.clients.jedis.JedisPool.getResource(JedisPool.java:99) ~[jedis-2.7.3.jar:na]
    at services.impl.RedisStatusServiceImpl.exists(RedisStatusServiceImpl.scala:58) ~[classes/:na]
    at w.c.Poller$$anonfun$process$1.apply(Poller.scala:64) ~[classes/:na]
    at w.c.Poller$$anonfun$process$1.apply(Poller.scala:58) ~[classes/:na]
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: java.net.ConnectException: Connection refused
    at redis.clients.jedis.Connection.connect(Connection.java:164) ~[jedis-2.7.3.jar:na]
    at redis.clients.jedis.BinaryClient.connect(BinaryClient.java:82) ~[jedis-2.7.3.jar:na]
    at redis.clients.jedis.BinaryJedis.connect(BinaryJedis.java:1641) ~[jedis-2.7.3.jar:na]
    at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:85) ~[jedis-2.7.3.jar:na]
    at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861) ~[commons-pool2-2.3.jar:2.3]
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method) ~[na:1.8.0_05]
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345) ~[na:1.8.0_05]
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[na:1.8.0_05]
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[na:1.8.0_05]
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[na:1.8.0_05]

Solution

  • Yes. If a child actor reaches maxNrOfRetries restarts (10 in your case) within the time window (10 seconds in your example), it is stopped.

    You can define a global supervisorStrategy, for example in a base trait or abstract Actor class, and have all your actor classes extend that BaseActor.
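The base-trait idea could be sketched roughly as follows, assuming the classic Akka actor API. Only the name BaseActor comes from the answer; sharedStrategy, PollerSupervisor and Worker are hypothetical names introduced for illustration:

```scala
import akka.actor.{Actor, OneForOneStrategy, Props, SupervisorStrategy}
import akka.actor.SupervisorStrategy.Restart
import scala.concurrent.duration._

object BaseActor {
  // Shared strategy (hypothetical name): allow up to 10 restarts of a failing
  // child within a 10 second window; a further failure in that window stops it.
  val sharedStrategy: OneForOneStrategy =
    OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 10.seconds) {
      case _: Exception => Restart
    }
}

// Every actor that mixes this in supervises its children with the same strategy.
trait BaseActor extends Actor {
  override val supervisorStrategy: SupervisorStrategy = BaseActor.sharedStrategy
}

// Hypothetical child that always fails, standing in for the Redis-backed actor.
class Worker extends Actor {
  def receive = {
    case msg => throw new RuntimeException(s"failed on $msg")
  }
}

// Hypothetical parent: it inherits the strategy and creates its child via context.actorOf.
class PollerSupervisor extends BaseActor {
  private val worker = context.actorOf(Props[Worker], "worker")

  def receive = {
    case msg => worker forward msg
  }
}
```

Note that a restarted actor continues with the next message in its mailbox. The message that triggered the failure is not delivered again unless you re-send it yourself (for example from preRestart), so expect one stack trace per failing message rather than 10 for a single message.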