amazon-web-servicesamazon-kinesisamazon-kcl

Lease table is not updated when new shards are added causing stale workers in KCL


I am using the amazon-kclpy version "2.1.3" (the python library that uses the MultiLangDaemon).

According to aws docs (here) when I add new shards for my stream or when a new worker (pod in Kubernetes in my case) starts up a resharding event should occur.

Following the resharding I should see new records in my lease table.

However, in both cases, I do not see any new records in the lease table making the new workers added to be stale and go into a sleep cycle...

2023-12-24 17:46:34,750 [multi-lang-daemon-0000] INFO  s.a.kinesis.coordinator.Scheduler [NONE] - Initializing LeaseCoordinator attempt 1 
2023-12-24 17:46:37,860 [multi-lang-daemon-0000] INFO  s.a.k.leases.LeaseCleanupManager [NONE] - Starting lease cleanup thread. 
2023-12-24 17:46:37,861 [multi-lang-daemon-0000] INFO  s.a.kinesis.coordinator.Scheduler [NONE] - Starting LeaseCoordinator 
2023-12-24 17:46:37,861 [pool-15-thread-1] INFO  s.a.k.leases.LeaseCleanupManager [NONE] - Number of pending leases to clean before the scan : 0 
2023-12-24 17:46:37,926 [multi-lang-daemon-0000] INFO  s.a.kinesis.coordinator.Scheduler [NONE] - Scheduling periodicShardSync 
2023-12-24 17:46:37,928 [multi-lang-daemon-0000] INFO  s.a.kinesis.coordinator.Scheduler [NONE] - Initialization complete. Starting worker loop. 
2023-12-24 17:46:38,026 [multi-lang-daemon-0000] INFO  s.a.k.c.DeterministicShuffleShardSyncLeaderDecider [NONE] - Elected leaders: fe86945c-96d7-423e-a157-97c5c6ad6f79 
2023-12-24 17:47:05,036 [multi-lang-daemon-0000] INFO  s.a.k.c.DiagnosticEventLogger [NONE] - Current thread pool executor state: ExecutorStateEvent(executorName=SchedulerThreadPoolExecutor, currentQueueSize=0, activeThreads=0, coreThreads=0, leasesOwned=0, largestPoolSize=0, maximumPoolSize=2147483647) 
2023-12-24 17:47:35,044 [multi-lang-daemon-0000] INFO  s.a.kinesis.coordinator.Scheduler [NONE] - No activities assigned 
2023-12-24 17:47:35,045 [multi-lang-daemon-0000] INFO  s.a.k.c.DiagnosticEventLogger [NONE] - Current thread pool executor state: ExecutorStateEvent(executorName=SchedulerThreadPoolExecutor, currentQueueSize=0, activeThreads=0, coreThreads=0, leasesOwned=0, largestPoolSize=0, maximumPoolSize=2147483647) 
2023-12-24 17:47:35,045 [multi-lang-daemon-0000] INFO  s.a.kinesis.coordinator.Scheduler [NONE] - Sleeping ... 

I am not sure what I am missing in order to cause new pods to be assigned with those new shards so the pods will actually process records.

Is this a permission issue? maybe something with the python code?


Solution

  • I cannot say the source of the issue, but I saw the issue stopped once I replaced the worker_id attribute in the properties file with the pod name (the properties file that is read as part of the KCL run command).