I want to write a Python program that analyzes the execution of other arbitrary Python programs.
For example, suppose I have a Python script called main.py that calls a function func a certain number of times. I want to create another script called analyzer.py that can "look inside" main.py while it's running and record how many times func was called. I also want to record the list of input arguments passed to func, and the return value of func each time it was called.
I cannot modify the source code of main.py or func in any way. Ideally analyzer.py would work for any Python program, and for any function.
The best way I have found to accomplish this is to have analyzer.py run main.py as a subprocess using pdb.
import subprocess

script = "main.py"
process = subprocess.Popen(['python', '-m', 'pdb', script], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
I can then send pdb commands to the program via the process' stdin and then read the output via stdout.
To retrieve the input parameters and return values of func, I need to:
- Find the line number of the first line of func by analyzing its file
- Send a breakpoint command for this file/lineno
- Send continue command
- Import pickle, serialize locals(), and print to stdout (to get input parameters)
- Send return command (go to end of function)
- Serialize __return__ and print to stdout
- Send continue command
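
The first step above (locating func in its file) can be sketched with the standard ast module. This is only an illustration: the helper name find_function_lines and the sample source string are hypothetical, and the sketch returns both the line of the def statement and the line of the first statement in the body, leaving the choice of breakpoint line to the caller.

```python
import ast

def find_function_lines(source, func_name):
    """Return (def_line, first_body_line) for func_name, or None if absent.

    Hypothetical helper: walks the parsed module and matches on name only,
    so the first definition found with that name wins.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_def = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_def and node.name == func_name:
            # node.lineno is the line of the `def` keyword (1-based);
            # node.body[0] is the first statement inside the function.
            return node.lineno, node.body[0].lineno
    return None

# Stand-in for the contents of main.py (an assumption for this example).
sample_source = "import os\n\ndef func(a, b):\n    return a + b\n"
lines = find_function_lines(sample_source, "func")
print(lines)
```

In real use the source would be read from the target file, for example with open(script).read(), instead of the inline string.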
I'm wondering if there is a better way to accomplish this.
Instead of controlling pdb with pipes, you can just configure your own trace function using sys.settrace before doing import main. (Of course you can also do importlib.import_module("main") or runpy.run_module() or runpy.run_path().) For instance,
prints out
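
A minimal sketch of the sys.settrace approach follows. It is not the exact example from the answer above: the helper name make_tracer, the records list, and the inline func (a stand-in for the function defined in main.py) are all assumptions made for illustration. The tracer matches frames by function name, copies the frame's locals on the "call" event (at that point they hold the bound arguments), and stores the value passed with the "return" event.

```python
import sys

def make_tracer(target_name, records):
    """Build a global trace function that records calls to target_name.

    Hypothetical helper: each call appends a dict with the arguments and,
    once the call returns, the return value.
    """
    def global_trace(frame, event, arg):
        # The global trace function is invoked with "call" for each new frame.
        if event != "call" or frame.f_code.co_name != target_name:
            return None  # no local tracing for frames we do not care about
        record = {"args": dict(frame.f_locals), "return": None}
        records.append(record)

        def local_trace(frame, event, arg):
            if event == "return":
                # arg is the value being returned (None if the frame is
                # exiting because of an exception).
                record["return"] = arg
            return local_trace

        return local_trace

    return global_trace

def func(x, y=1):  # stand-in for func in main.py (assumption)
    return x + y

records = []
previous = sys.gettrace()           # preserve any tracer already installed
sys.settrace(make_tracer("func", records))
try:
    for i in range(3):              # stand-in for importing or running main.py
        func(i, y=10)
finally:
    sys.settrace(previous)

for record in records:
    print(record)
```

To analyze a real script, the loop in the try block could be replaced with runpy.run_path("main.py", run_name="__main__"). Matching on co_name alone also catches any other function that happens to share the name; comparing frame.f_code.co_filename as well would narrow it down.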