I am using the transform API of DStream (Spark Streaming) to sort the data. I am reading from a TCP socket using netcat. This is the line of code I use: myDStream.transform(rdd => rdd.sortByKey())
The compiler cannot find the function sortByKey. Could anyone explain what the issue is in this step?
If you use netcat as an input, you're likely to use socketTextStream, which returns ReceiverInputDStream[String]. In that case transform will take a function:
(RDD[String]) => RDD[U]
Only an RDD[(T, U)], where T has a corresponding Ordering, can be sorted with sortByKey. An RDD[String] is not a pair RDD, which is why sortByKey is not found. For other RDDs you can use sortBy:
myDStream.transform(rdd => rdd.sortBy(x => x))
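
To show both options side by side, here is a minimal sketch rather than a definitive implementation. It assumes netcat is listening on localhost port 9999 (for example `nc -lk 9999`), a 5-second batch interval, and a local master. The object name SortSocketStream and the choice of line length as the key are hypothetical, picked only for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
// On Spark versions before 1.3 you may also need:
// import org.apache.spark.SparkContext._

object SortSocketStream {
  def main(args: Array[String]): Unit = {
    // local[2]: one thread for the socket receiver, one for processing
    val conf = new SparkConf().setMaster("local[2]").setAppName("SortSocketStream")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Assumed host and port; the stream has element type String
    val lines = ssc.socketTextStream("localhost", 9999)

    // Option 1: sort the raw lines of each batch with sortBy
    val sortedLines = lines.transform(rdd => rdd.sortBy(x => x))

    // Option 2: map to (key, value) pairs first, which makes sortByKey available.
    // Using the line length as the key is an arbitrary, illustrative choice.
    val pairs = lines.map(line => (line.length, line))
    val sortedPairs = pairs.transform(rdd => rdd.sortByKey())

    sortedLines.print()
    sortedPairs.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that the sort is applied to each batch RDD separately, so the ordering holds within a batch, not across the whole stream.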