druid

Data ingestion from Kafka is not happening as expected


I wanted to overwrite data in Druid. Since there is no direct option to do this, I first terminated the supervisor corresponding to the datasource, which automatically kills the associated tasks. Then I sent a DELETE request via the REST API to remove the datasource, intending to create a new datasource with the same name so that, effectively, I could overwrite the data.
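For reference, the workflow above maps onto these Druid REST endpoints (a sketch, assuming a Router at localhost:8888 and the datasource name from this question; adjust both for your cluster):

```shell
# Hypothetical endpoint and datasource name -- change for your setup.
ROUTER="http://localhost:8888"
DATASOURCE="test_ingestion"

# 1. Terminate the supervisor; this also stops its indexing tasks.
curl -X POST "$ROUTER/druid/indexer/v1/supervisor/$DATASOURCE/terminate"

# 2. Mark all segments of the datasource as unused.
#    Note: this does NOT physically delete the data from deep storage;
#    permanent deletion additionally requires a kill task.
curl -X DELETE "$ROUTER/druid/coordinator/v1/datasources/$DATASOURCE"
```

One caveat with this approach: terminating a supervisor does not clear the consumer offsets Druid has stored for that datasource, which matters for the offset errors described below.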

I then submitted a request to create a datasource named "test_ingestion" and the response was 200. But I can't see "test_ingestion" in the Druid datasources list, although its supervisor and task are in the RUNNING state.
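Note that the coordinator's datasource list only shows datasources that have published, loadable segments, so a freshly created streaming datasource will not appear there until its first task publishes a segment; the supervisor status is the more reliable health check in the meantime. Both can be queried like this (same hypothetical Router address as above):

```shell
ROUTER="http://localhost:8888"

# Datasources with available segments -- empty for a brand-new stream.
curl "$ROUTER/druid/coordinator/v1/datasources"

# Supervisor health and any ingestion errors for this datasource.
curl "$ROUTER/druid/indexer/v1/supervisor/test_ingestion/status"
```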

I encountered three different problems when I tried this for three different datasources.

  1. No datasource was created, and no error message was shown in the task logs.
  2. A Java NullPointerException was thrown:
     2024-03-18T11:14:54,602 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Encountered exception while running task. java.lang.NullPointerException: timestamp
  3. An offset that is not present in the Kafka topic is being requested, and the resulting error states that the offset cannot be located:
     2024-03-18T08:33:02,556 WARN [task-runner-0-priority-0] org.apache.druid.indexing.kafka.IncrementalPublishingKafkaIndexTaskRunner - OffsetOutOfRangeException with message [Fetch position FetchPosition{offset=560, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[192.168.144.68:9092 (id: 0 rack: null)], epoch=absent}} is out of range for partition test_ingestion-0]
     2024-03-18T08:33:02,557 WARN [task-runner-0-priority-0] org.apache.druid.indexing.kafka.IncrementalPublishingKafkaIndexTaskRunner - Retrying in 30000ms
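For problems 2 and 3, two parts of the supervisor spec are worth checking. A `NullPointerException: timestamp` typically means incoming records have no parseable value in the column named by the `timestampSpec`; and the `OffsetOutOfRangeException` suggests Druid is resuming from offsets stored for the previous (deleted) datasource that no longer exist in the topic. A trimmed supervisor-spec sketch showing the relevant fields (the field names are real Druid options; the datasource/topic names come from the question, and the `missingValue` date is a hypothetical placeholder):

```json
{
  "type": "kafka",
  "spec": {
    "dataSchema": {
      "dataSource": "test_ingestion",
      "timestampSpec": {
        "column": "timestamp",
        "format": "iso",
        "missingValue": "2024-01-01T00:00:00Z"
      }
    },
    "ioConfig": {
      "topic": "test_ingestion",
      "useEarliestOffset": true
    },
    "tuningConfig": {
      "type": "kafka",
      "resetOffsetAutomatically": true
    }
  }
}
```

Here `missingValue` supplies a default timestamp for records lacking one, `useEarliestOffset` controls where a supervisor with no stored offsets starts, and `resetOffsetAutomatically` lets tasks recover from out-of-range offsets instead of retrying forever.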

Solution

  • I resolved this error. It turned out to be an issue on the Kafka server side.
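When diagnosing a Kafka-side issue like this, it can help to compare the offsets Druid is requesting against what the topic actually contains, and to clear Druid's stored offsets if they are stale. A sketch, assuming the broker address from the logs and a Router at localhost:8888 (the CLI tool name varies by Kafka version; older releases use `kafka-run-class.sh kafka.tools.GetOffsetShell` instead):

```shell
# Show the topic's actual earliest/latest offsets per partition.
bin/kafka-get-offsets.sh --bootstrap-server 192.168.144.68:9092 \
  --topic test_ingestion

# If Druid's stored offsets are stale (e.g. the topic was recreated),
# hard-reset the supervisor so it rediscovers offsets from the topic.
curl -X POST "http://localhost:8888/druid/indexer/v1/supervisor/test_ingestion/reset"
```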