I wrote the same XML parsing algorithm in Java using two different parsers: Parser X (XOM) and Parser Y (DOM). I ran the code in a loop of 2 million iterations to imitate the number of operations I need to carry out, and used a Java profiler to monitor performance. The measurements are shown below.
Metric              Parser X (XOM)             Parser Y (DOM)
Heap memory         6.82                       7.9
Non-heap memory     14                         15
Garbage collector   617 collections / 2 sec    523 collections / 1 sec
Up time             1 m 53 s                   1 m 54 s
CPU time            1 m 2 s                    44.8 s
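For reference, here is a minimal sketch of how up time (wall-clock) and CPU time could be read from inside the JVM instead of from a profiler. It is not the code I profiled: it uses only the JDK's built-in DOM parser, and the names `ParseTiming` and `timeDomParsing` are made up for illustration. It reports CPU time for the calling thread only, so work done by garbage-collector threads is not counted, which a profiler's process-wide figure would likely include.

```java
import java.io.ByteArrayInputStream;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParseTiming {

    /**
     * Parses the same document `iterations` times with the JDK DOM parser.
     * Returns {wallNanos, cpuNanos, checksum}; cpuNanos is -1 if the JVM
     * cannot report per-thread CPU time.
     */
    static long[] timeDomParsing(byte[] xml, int iterations) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();

        long wallStart = System.nanoTime();
        long cpuStart = cpuSupported ? threads.getCurrentThreadCpuTime() : 0;

        // The checksum keeps the parse result in use so the loop body
        // cannot be optimized away.
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            Document doc = builder.parse(new ByteArrayInputStream(xml));
            checksum += doc.getDocumentElement().getChildNodes().getLength();
        }

        long cpuNanos = cpuSupported ? threads.getCurrentThreadCpuTime() - cpuStart : -1;
        long wallNanos = System.nanoTime() - wallStart;
        return new long[] {wallNanos, cpuNanos, checksum};
    }

    public static void main(String[] args) throws Exception {
        // Tiny hypothetical document; a real run would load actual files.
        byte[] xml = "<root><item>1</item><item>2</item></root>"
                .getBytes(StandardCharsets.UTF_8);
        long[] result = timeDomParsing(xml, 10_000);
        System.out.printf("wall: %d ms, cpu: %d ms%n",
                result[0] / 1_000_000, result[1] / 1_000_000);
    }
}
```

If wall time is noticeably larger than the thread's CPU time, the thread spent part of the run waiting (I/O, GC pauses, scheduling) rather than computing.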
I have a few questions.
What if I want to process about 2 million XML files with sizes reaching 100 MB? Which parser gives better performance? Performance is measured by time: the one that finishes processing all the XML files faster wins, regardless of machine utilization, as I have a dedicated machine for this process. In short, which one is better in terms of memory vs. CPU time vs. up time?
Is it feasible to utilize the full CPU power to finish faster, for example with multi-threading?
If I want to measure performance, should I use CPU time or up time? As I understand it, CPU time is the time the CPU spends executing the process, while up time is the total elapsed (wall-clock) time the machine takes to finish the process.
Why does Parser Y take the same up time as Parser X but with much lower CPU time, given that these measurements are means and not the result of a single run?
Is it feasible to make Parser Y's up time shorter, so that the difference in CPU time is reflected in real life?
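To illustrate what I mean by multi-threading, here is a rough sketch that spreads the documents over a fixed thread pool. It is only an assumption about how this could be done, not something I have measured: it uses the JDK DOM parser, the names `ParallelParse`, `countChildren` and `processAll` are hypothetical, and it holds all documents in memory, which would not be realistic for 2 million files of up to 100 MB each.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParallelParse {

    // DocumentBuilder is not guaranteed to be thread-safe,
    // so each worker thread gets its own instance.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
            ThreadLocal.withInitial(() -> {
                try {
                    return DocumentBuilderFactory.newInstance().newDocumentBuilder();
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            });

    /** Stand-in for the real per-document algorithm: counts the root's child nodes. */
    static int countChildren(byte[] xml) throws Exception {
        Document doc = BUILDER.get().parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    /** Parses every document on a pool of `threads` workers and sums the results. */
    static long processAll(List<byte[]> documents, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (byte[] xml : documents) {
                tasks.add(() -> countChildren(xml));
            }
            long total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<root><item>1</item><item>2</item></root>"
                .getBytes(StandardCharsets.UTF_8);
        List<byte[]> documents = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) {
            documents.add(xml);
        }
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("children counted: " + processAll(documents, cores));
    }
}
```

With a tree-building parser, every worker holds a whole document tree in memory at once, so for large files the heap size rather than the core count may end up limiting how many threads are usable.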
After expanding the code of both algorithms to cover a variety of operations, it turned out that the XOM parser was much faster in up time, with the same CPU time and a lower memory footprint. The XOM parser wins for me.