I want to track the elapsed time and memory usage of a process (a bioinformatics tool) that I execute from a Python script. I run the process on a Unix cluster and save the monitoring parameters in report_file.txt. To measure the elapsed time I use the resource library, and to monitor memory usage I use the psutil library.
My main objective is to compare the performance of different tools, so I don't want to restrict memory or time in any way.
import sys
import os
import subprocess, resource
import psutil
import time
def get_memory_info():
    return {
        "total_memory": psutil.virtual_memory().total / (1024.0 ** 3),
        "available_memory": psutil.virtual_memory().available / (1024.0 ** 3),
        "used_memory": psutil.virtual_memory().used / (1024.0 ** 3),
        "memory_percentage": psutil.virtual_memory().percent
    }
# Open file to capture process parameters
outrepfp = open(tbl_rep_file, "w")
### Start measuring the process parameters
SLICE_IN_SECONDS = 1
# Start measuring time
usage_start = resource.getrusage(resource.RUSAGE_CHILDREN)
# Create the line for process execution
cmd = '{0} {1} --tblout {2} {3}'.format(bioinformatics_tool, setups, resultdir, inputs)
# Execute the process
r = subprocess.Popen(cmd.split(), stdout=subprocess.DEVNULL, stderr=subprocess.PIPE, encoding='utf-8')
# End measuring time
usage_end = resource.getrusage(resource.RUSAGE_CHILDREN) # end measuring resources
# Save memory measures
resultTable = []
while r.poll() is None:
    resultTable.append(get_memory_info())
    time.sleep(SLICE_IN_SECONDS)
# In case the process fails
if r.returncode: sys.exit('FAILED: {}\n{}'.format(cmd, r.stderr.read()))
# Extract used memory
memory = [m['used_memory'] for m in resultTable]
# Count the elapsed time
cpu_time_user = usage_end.ru_utime - usage_start.ru_utime
cpu_time_system = usage_end.ru_stime - usage_start.ru_stime
# Write measurement to report_file.txt
outrepfp.write('{0} {1} {2} {3}\n'.format(bioinformatics_tool, cpu_time_user, cpu_time_system, memory))
For a given process, I received my report_file.txt:
bioinformatics_tool 0.0 0.0 [48.16242980957031, 47.76295852661133]
Could you please help me understand why the elapsed time is showing as 0, even though memory usage was monitored for 2 seconds and two values were captured?
Previously, I had implemented a time-capturing mechanism that reported around 4 seconds of elapsed time for the same process, which seems inconsistent with my current memory usage measurement.
***** EDIT *****
When I moved usage_end after the r.poll() loop, I got a non-zero time measurement, but also more memory samples:
bioinformatics_tool 1.699341 0.063338 [18.01854705810547, 18.022377014160156, 17.966495513916016, 18.160659790039062, 18.281261444091797, 18.449081420898438, 18.343822479248047]
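If the objective is to measure the running time of the process launched by subprocess.Popen, then usage_end = resource.getrusage(resource.RUSAGE_CHILDREN) should probably be after the loop which polls for process termination. subprocess.Popen returns immediately without waiting for the child, and RUSAGE_CHILDREN only accounts for children that have terminated and been waited for, so a reading taken right after Popen does not yet include the tool's CPU time.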
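A minimal sketch of that ordering, assuming a Unix system and using only the standard library. The helper name measure_child, the demo command, and the extra wall-clock and ru_maxrss fields are illustrative assumptions, not part of the original script. It also records wall-clock time separately, since ru_utime and ru_stime are CPU times and can be smaller than the elapsed time when the process waits on I/O.

```python
import resource
import subprocess
import sys
import time


def measure_child(cmd, slice_seconds=0.1):
    """Run cmd (a list of arguments); return CPU and wall-clock times in seconds."""
    usage_start = resource.getrusage(resource.RUSAGE_CHILDREN)
    wall_start = time.perf_counter()

    # Popen returns as soon as the child is started, not when it finishes.
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

    samples = 0
    # Poll until the child has exited; poll() also reaps it once it is done.
    while proc.poll() is None:
        samples += 1  # a memory sample could be taken here, as in the question
        time.sleep(slice_seconds)

    # Only now has the child been waited for, so its usage is included.
    usage_end = resource.getrusage(resource.RUSAGE_CHILDREN)
    wall_end = time.perf_counter()

    return {
        "returncode": proc.returncode,
        "cpu_user": usage_end.ru_utime - usage_start.ru_utime,
        "cpu_system": usage_end.ru_stime - usage_start.ru_stime,
        "wall": wall_end - wall_start,
        "samples": samples,
        # Peak resident set size of the largest waited-for child so far
        # (kilobytes on Linux); this is a high-water mark, not a difference.
        "max_rss": usage_end.ru_maxrss,
    }


if __name__ == "__main__":
    # Hypothetical stand-in for the bioinformatics tool: a short CPU-bound loop.
    demo_cmd = [sys.executable, "-c", "sum(i * i for i in range(2_000_000))"]
    print(measure_child(demo_cmd))
```

The number of memory samples simply follows from how long the loop runs: one sample per slice_seconds of wall-clock time, so a longer run yields more values.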