PerfMonMT: Update Event Loop Monitoring and add various improvements
Hi,
This MR includes changes implemented according to the feedback received in one of the Athena Core Software Meetings in August 2019. Development is done on top of this MR. Here is the list of changes:
- Event Loop Monitoring: The previous version assumed that each event runs on a single thread for its entire lifetime. This version drops that assumption. The service now captures CPU and memory measurements at checkpoints defined by event number. The measurements are read at the process level rather than per thread, so they do not depend on which thread handles an event and are thread-safe. This gives an overall picture of the event loop.
- Updated the plots to make them more readable.
- Added peak values for vmem, rss, and pss to the summary result.
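
For reviewers who want a feel for the checkpoint idea, here is a minimal Python sketch. It is not the PerfMonMT code (the service itself is C++), and `EventLoopMonitor`, `parse_status_kb`, `check_interval` and `read_status` are hypothetical names made up for this illustration. It assumes Linux, reads `VmSize`, `VmRSS` and `VmSwap` from `/proc/self/status`, and leaves out Pss, which would have to come from smaps. The first checkpoint is stored as the offset, so later values are relative to it.

```python
import threading
import time


def parse_status_kb(status_text, keys=("VmSize", "VmRSS", "VmSwap")):
    """Extract selected kB fields from /proc/<pid>/status-style text."""
    values = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in keys:
            # Lines look like "VmRSS:     1100404 kB"
            values[name] = int(rest.split()[0])
    return values


def _read_proc_status():
    # Linux-only: counters for the whole process, not for one thread.
    with open("/proc/self/status") as f:
        return f.read()


class EventLoopMonitor:
    """Hypothetical checkpoint recorder, for illustration only."""

    def __init__(self, check_interval=5, read_status=_read_proc_status):
        self.check_interval = check_interval
        self._read_status = read_status
        self._lock = threading.Lock()
        self._offset = None
        self.checkpoints = {}  # event number -> values relative to offset

    def _snapshot(self):
        snap = {
            # process_time() sums user and system CPU time over all threads.
            "cpu_ms": time.process_time() * 1000.0,
            "wall_ms": time.monotonic() * 1000.0,
        }
        snap.update(parse_status_kb(self._read_status()))
        return snap

    def on_event(self, event_number):
        """Call from any worker thread when it picks up an event."""
        if event_number % self.check_interval != 0:
            return
        snap = self._snapshot()
        with self._lock:  # protects the shared dictionary only
            if self._offset is None:
                self._offset = snap  # first checkpoint becomes the offset
            self.checkpoints[event_number] = {
                key: snap[key] - self._offset[key] for key in snap
            }
```

Because every value comes from a process-wide counter, it does not matter which thread calls `on_event`; the lock is there only to guard the shared dictionary.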
Here is a portion of the output for event loop monitoring. The job was run with 1000 events on 5 threads. Measurements are captured every 5 events, and the beginning of the event loop is used as the offset, so all values are relative to the first checkpoint:
INFO =======================================================================================
INFO CPU & Wall Time Monitoring
INFO (Event Loop)
INFO =======================================================================================
INFO Event CheckPoint    CPU Time [ms]    Wall Time [ms]
INFO 0                            0.00                 0
INFO 5                        11970.00             11997
INFO 10                       37460.00             17103
INFO 15                       43980.00             19397
INFO 20                       52640.00             21124
INFO 25                       63700.00             23341
INFO ...                           ...
INFO 975                    2835070.00            583306
INFO 980                    2842570.00            584807
INFO 985                    2855510.00            587405
INFO 990                    2864330.00            589169
INFO 995                    2880150.00            592350
INFO =======================================================================================
INFO =======================================================================================
INFO Memory Monitoring
INFO (Event Loop)
INFO Unit: KB
INFO =======================================================================================
INFO Event CheckPoint      Vmem       Rss       Pss      Swap
INFO 0                        0         0         0         0
INFO 5                  1075748   1100404   1100404         0
INFO 10                 1075748   1103136   1103136         0
INFO 15                 1075748   1127732   1127732         0
INFO 20                 1075748   1127928   1127928         0
INFO 25                 1075748   1127956   1127956         0
INFO ...                    ...
INFO 975                1318096   1359096   1359096         0
INFO 980                1318096   1358584   1358584         0
INFO 985                1318096   1358588   1358588         0
INFO 990                1318096   1358600   1358600         0
INFO 995                1318096   1358612   1358612         0
INFO =======================================================================================
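
One way to read the CPU & Wall Time table is to divide cumulative CPU time by cumulative wall time at the last checkpoint. This gives a rough indication of how many threads were kept busy on average. The Python sketch below does this for a few rows copied from the output above. `parse_time_rows` and `cpu_wall_ratio` are hypothetical helpers written for this note and are not part of PerfMonMT, and the parser assumes each data row has the form `INFO <event> <cpu_ms> <wall_ms>`.

```python
SAMPLE = """\
INFO Event CheckPoint CPU Time [ms] Wall Time [ms]
INFO 0 0.00 0
INFO 5 11970.00 11997
INFO ... ...
INFO 995 2880150.00 592350
INFO =======================================
"""


def parse_time_rows(log_text):
    """Return (event, cpu_ms, wall_ms) tuples for the numeric rows."""
    rows = []
    for line in log_text.splitlines():
        fields = line.split()
        # Keep only lines shaped like: INFO <event> <cpu_ms> <wall_ms>
        if len(fields) != 4 or fields[0] != "INFO":
            continue
        try:
            rows.append((int(fields[1]), float(fields[2]), float(fields[3])))
        except ValueError:
            continue  # header, separator, or "..." line
    return rows


def cpu_wall_ratio(rows):
    """Cumulative CPU time over cumulative wall time at the last checkpoint."""
    _, cpu_ms, wall_ms = rows[-1]
    if wall_ms == 0:
        raise ValueError("wall time is zero at the last checkpoint")
    return cpu_ms / wall_ms


if __name__ == "__main__":
    print(f"CPU/wall ratio: {cpu_wall_ratio(parse_time_rows(SAMPLE)):.2f}")
```

For the last checkpoint shown (2880150.00 ms CPU, 592350 ms wall) the ratio is about 4.86, which is in line with a job running on 5 threads. This is only a coarse indicator, since it averages over the whole event loop.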
cc: @amete
Best, Hasan
Edited by Hasan Ozturk