I know Ganglia can be used to monitor resource utilization in a cluster running Spark, but it only gives an overall report for my whole application.
Is there any way to find out how much resource a specific portion of my code is using?
A
My code
B
For example, I want to know the CPU/RAM utilization from A to B. I can measure the runtime inside the code (a Java application for Spark), but I don't know how to get the resource utilization for just that portion. One idea: if I could somehow generate a report at B (for example, by calling an API for a Ganglia report), it would show me the resources used up to B. That would not exclude anything that ran before A, but it would be good enough for now if such a solution exists.
Thank you in advance.
Apparently a new project, sparkoscope, is working on this, i.e. monitoring at the source-code level. However, the project is not well documented, so I am having trouble getting it to work. Nevertheless, it is a start. Hope it helps someone like me.
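In the meantime, a rough measurement between A and B can be taken from inside the Java code using only the standard JVM management beans. The sketch below is not part of sparkoscope or Ganglia; `SectionProfiler`, `Snapshot`, `take` and `report` are names I made up for illustration. It assumes you only care about the JVM and thread that run the code between A and B (typically the driver), so work done by tasks on the executors is not included.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class SectionProfiler {

    /** Counters read at one point in the code (hypothetical helper type). */
    static final class Snapshot {
        final long wallNanos;     // monotonic wall-clock reading
        final long cpuNanos;      // CPU time of the current thread, -1 if unavailable
        final long usedHeapBytes; // heap in use in this JVM at the moment of the snapshot

        Snapshot(long wallNanos, long cpuNanos, long usedHeapBytes) {
            this.wallNanos = wallNanos;
            this.cpuNanos = cpuNanos;
            this.usedHeapBytes = usedHeapBytes;
        }
    }

    /** Reads the counters for the calling thread and the local JVM. */
    static Snapshot take() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long cpu = threads.isCurrentThreadCpuTimeSupported()
                ? threads.getCurrentThreadCpuTime()
                : -1L;
        Runtime rt = Runtime.getRuntime();
        long usedHeap = rt.totalMemory() - rt.freeMemory();
        return new Snapshot(System.nanoTime(), cpu, usedHeap);
    }

    /** Formats the difference between two snapshots (A first, B second). */
    static String report(Snapshot a, Snapshot b) {
        long wallMs = (b.wallNanos - a.wallNanos) / 1_000_000;
        String cpuMs = (a.cpuNanos < 0 || b.cpuNanos < 0)
                ? "n/a"
                : String.valueOf((b.cpuNanos - a.cpuNanos) / 1_000_000);
        // The heap delta can be negative if a garbage collection ran between A and B.
        long heapKiB = (b.usedHeapBytes - a.usedHeapBytes) / 1024;
        return String.format("wall=%d ms, cpu=%s ms, heapDelta=%d KiB", wallMs, cpuMs, heapKiB);
    }

    public static void main(String[] args) {
        Snapshot a = take();                       // point A

        long sum = 0;                              // placeholder for "my code"
        for (int i = 0; i < 50_000_000; i++) {
            sum += i % 7;
        }

        Snapshot b = take();                       // point B
        System.out.println(report(a, b) + " (checksum " + sum + ")");
    }
}
```

The CPU figure covers only the thread that calls `take()`, and the heap delta is just the change in used heap between the two readings, so treat both as approximations rather than exact utilization for that portion.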