I would like to find a way to release a Python thread Lock using GDB on Linux. I am using Ubuntu 18.04, Python 3.6.9, and gdb 8.1.1. I am also willing to use the gdb package in Python.
This is for personal research and not intended for a production system.
Suppose I have this Python script named "m4.py", which produces a deadlock:
import threading
import time
import os

lock1 = threading.Lock()
lock2 = threading.Lock()

def func1(name):
    print('Thread',name,'before acquire lock1')
    with lock1:
        print('Thread',name,'acquired lock1')
        time.sleep(0.3)
        print('Thread',name,'before acquire lock2')
        with lock2:
            print('Thread',name,'DEADLOCK: This line will never run.')

def func2(name):
    print('Thread',name,'before acquire lock2')
    with lock2:
        print('Thread',name,'acquired lock2')
        time.sleep(0.3)
        print('Thread',name,'before acquire lock1')
        with lock1:
            print('Thread',name,'DEADLOCK: This line will never run.')

if __name__ == '__main__':
    print(os.getpid())
    thread1 = threading.Thread(target=func1, args=['thread1',])
    thread2 = threading.Thread(target=func2, args=['thread2',])
    thread1.start()
    thread2.start()
My goal is to use gdb to release either lock1 or lock2 or both, so that the "DEADLOCK: This line will never run" message is displayed.
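For comparison, the effect I am after can be reproduced from inside Python, because a threading.Lock has no owner and any thread may release it. The following is only a sketch of that idea, not a GDB solution: the helper names (run_demo, worker) and the one-second wait are my own choices for illustration.

```python
import threading
import time

def run_demo():
    """Deadlock two threads, then break the deadlock from the main thread."""
    lock1 = threading.Lock()
    lock2 = threading.Lock()
    events = []  # messages recorded by the worker threads

    def worker(name, first, second):
        try:
            with first:
                time.sleep(0.3)  # let the other thread take its first lock
                with second:
                    events.append(name + ': acquired both locks')
        except RuntimeError as exc:
            # Raised when a 'with' block tries to release an unlocked lock.
            events.append(name + ': RuntimeError: ' + str(exc))

    t1 = threading.Thread(target=worker, args=('thread1', lock1, lock2))
    t2 = threading.Thread(target=worker, args=('thread2', lock2, lock1))
    t1.start()
    t2.start()
    time.sleep(1.0)      # by now both threads are blocked on their second lock
    lock1.release()      # released by the main thread, not by thread1
    t1.join(timeout=5)
    t2.join(timeout=5)
    return events

if __name__ == '__main__':
    for line in run_demo():
        print(line)
```

Forcing the release lets both threads reach their inner block, but thread1 later tries to release lock1 a second time when it leaves its outer with block, so a RuntimeError is expected there.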
I think the first obstacle is that the program reaches the deadlock almost immediately, so there is no time to set a breakpoint in gdb. Is a breakpoint necessary?
Suppose I attach gdb by PID like this:
sudo gdb -p 121408
I can see that all threads are blocked with a futex.
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7f56b324f740 (LWP 121408) "python3" 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f56ac000e70) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
2 Thread 0x7f56b1b8d700 (LWP 121409) "python3" 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1bc3fc0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
3 Thread 0x7f56b138c700 (LWP 121410) "python3" 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1bc3f90) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
The top five frames of the backtrace show the C function calls.
(gdb) thread 1
[Switching to thread 1 (Thread 0x7f56b324f740 (LWP 121408))]
#0 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f56ac000e70) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
205 in ../sysdeps/unix/sysv/linux/futex-internal.h
(gdb) bt
#0 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f56ac000e70) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 do_futex_wait (sem=sem@entry=0x7f56ac000e70, abstime=0x0) at sem_waitcommon.c:111
#2 0x00007f56b2a378b8 in __new_sem_wait_slow (sem=0x7f56ac000e70, abstime=0x0) at sem_waitcommon.c:181
#3 0x00000000005aac15 in PyThread_acquire_lock_timed () at ../Python/thread_pthread.h:386
#4 0x00000000004d0ade in acquire_timed (timeout=<optimized out>, lock=0x7f56ac000e70) at ../Modules/_threadmodule.c:68
#5 lock_PyThread_acquire_lock () at ../Modules/_threadmodule.c:151
#6 0x000000000050a335 in _PyCFunction_FastCallDict (kwargs=<optimized out>, nargs=<optimized out>, args=<optimized out>, func_obj=<built-in method acquire of _thread.lock object at remote 0x7f56b1c289e0>)
at ../Objects/methodobject.c:231
#7 _PyCFunction_FastCallKeywords (kwnames=<optimized out>, nargs=<optimized out>, stack=<optimized out>, func=<optimized out>) at ../Objects/methodobject.c:294
#8 call_function.lto_priv () at ../Python/ceval.c:4851
Here are some of the things I have tried:
Return
"When you use return, GDB discards the selected stack frame (and all frames within it)" (GDB documentation).
(gdb) return
Can not force return from an inlined function.
Access Python release function.
In this example, Frame 7 is the last frame where py-locals works. I tried accessing the release() method of Lock. As far as I know, it is not possible to invoke a method that is a member of a Python object.
(gdb) frame 7
#7 _PyCFunction_FastCallKeywords (kwnames=<optimized out>, nargs=<optimized out>, stack=<optimized out>, func=<optimized out>) at ../Objects/methodobject.c:294
294 in ../Objects/methodobject.c
(gdb) print lock
$7 = 0
(gdb) print lock.release
Attempt to extract a component of a value that is not a structure.
Interpret Lock as PyThread_type_lock
I am not sure that interpreting the object as an opaque pointer is useful.
(gdb) print *((PyThread_type_lock *) 0x7f56ac000e70)
$8 = (PyThread_type_lock) 0x100000000
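The value 0x100000000 may be easier to read when split into two 32-bit halves. The sketch below assumes the lock is a glibc semaphore stored as one 64-bit word, with the semaphore value in the low 32 bits and the number of waiters in the high 32 bits. That layout is an assumption about glibc internals on a 64-bit system, not something the output above confirms, and decode_sem_word is a hypothetical helper name.

```python
def decode_sem_word(data):
    """Split a 64-bit semaphore word into (value, nwaiters).

    Assumed layout: low 32 bits hold the semaphore value,
    high 32 bits hold the count of waiting threads.
    """
    value = data & 0xFFFFFFFF
    nwaiters = data >> 32
    return value, nwaiters

if __name__ == '__main__':
    value, nwaiters = decode_sem_word(0x100000000)
    print('value:', value, 'waiters:', nwaiters)
```

Under that assumption, the word printed by GDB would mean a semaphore value of 0 with one thread waiting on it.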
Call void PyThread_release_lock(PyThread_type_lock);
This attempt produces a segmentation fault.
(gdb) print (void)PyThread_release_lock (lock)
Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on".
Evaluation of the expression containing the function
(PyThread_release_lock) will be abandoned.
When the function is done executing, GDB will silently stop.
Make System Call
I reran the script because the SIGSEGV killed it. I then adapted code from this Gist to make a syscall using the ctypes library in a Python script. In part, the code is this:
import ctypes

# Note: timespec (a ctypes.Structure) and _is_ctypes_obj() are used below
# but are not shown in this excerpt.

def _is_ctypes_obj_pointer(obj):
    return hasattr(obj, '_type_') and hasattr(obj, 'contents')

def _coerce_to_pointer(obj):
    print("obj", obj)
    if obj is None:
        return None
    if _is_ctypes_obj(obj):
        if _is_ctypes_obj_pointer(obj):
            return obj
        return ctypes.pointer(obj)
    return (obj[0].__class__ * len(obj))(*obj)

def _get_futex_syscall():
    futex_syscall = ctypes.CDLL(None, use_errno=True).syscall
    futex_syscall.argtypes = (ctypes.c_long, ctypes.c_void_p, ctypes.c_int,
                              ctypes.c_int, ctypes.POINTER(timespec),
                              ctypes.c_void_p, ctypes.c_int)
    futex_syscall.restype = ctypes.c_int
    futex_syscall_nr = ctypes.c_long(202)  # futex syscall number on x86-64

    # pylint: disable=too-many-arguments
    def _futex_syscall(uaddr, futex_op, val, timeout, uaddr2, val3):
        uaddr = ctypes.c_int(uaddr)
        error = futex_syscall(
            futex_syscall_nr,
            _coerce_to_pointer(uaddr),
            ctypes.c_int(futex_op),
            ctypes.c_int(val),
            _coerce_to_pointer(timeout or timespec()),
            _coerce_to_pointer(ctypes.c_int(uaddr2)),
            ctypes.c_int(val3)
        )
        res2 = error, (ctypes.get_errno() if error == -1 else 0)
        print(res2)

    # _futex_syscall.__doc__ = getattr(futex, '__doc__', None)
    res = _futex_syscall(0x7f5ca8000e70, 1, 99, 0, 0, 0)  # futex_op 1 is FUTEX_WAKE
    print(res)
I do not know whether it is possible to unlock a futex with GDB. If it is, I would like to understand how.
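One likely reason the ctypes attempt had no effect is that a futex address is only meaningful inside the process that makes the call, so a separate Python script operates on its own memory rather than on the deadlocked program. The sketch below issues FUTEX_WAKE on a word owned by the calling process to illustrate this. The syscall number 202 is valid only on x86-64 Linux, and wake_own_futex_word is a hypothetical helper name.

```python
import ctypes
import platform
import sys

SYS_FUTEX = 202  # futex syscall number on x86-64 Linux only
FUTEX_WAKE = 1   # wake up to 'val' threads waiting on the word

def wake_own_futex_word():
    """Call FUTEX_WAKE on a word in this process; return the waiters woken."""
    if not (sys.platform.startswith('linux') and platform.machine() == 'x86_64'):
        return None  # the syscall number above does not apply elsewhere
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    word = ctypes.c_int(0)  # lives in THIS process's address space
    woken = libc.syscall(
        ctypes.c_long(SYS_FUTEX),
        ctypes.byref(word),        # address of our own word
        ctypes.c_int(FUTEX_WAKE),
        ctypes.c_int(1),           # wake at most one waiter
        None, None, ctypes.c_int(0),
    )
    if woken == -1:
        raise OSError(ctypes.get_errno(), 'futex call failed')
    return woken

if __name__ == '__main__':
    print(wake_own_futex_word())
```

Nothing in this process waits on that word, so no thread is woken, and the threads of the other process are never involved.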
In a subsequent run, this procedure worked.
Start gdb
GDB attached to the process and put it in a paused state.
Set a Catchpoint and Continue Execution
A catchpoint is a breakpoint that breaks whenever the specified system call is made. In the x64 architecture, FUTEX is 202. See Set Catchpoint, Syscalls
View Threads
Both child threads are blocked at a futex.
Access One Thread
Get, Set, and Get the Value at the Address of the Futex
0 is locked, and 1 is unlocked. For information about assignment, see Assignment
Remove Catchpoint and Continue
Application Prints Messages with Exceptions
Threads Exit
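The get, set, and get step could probably also be scripted with GDB's Python API. The following is a minimal sketch under stated assumptions, not the exact commands used in the procedure above: futex_word_expr and unlock_futex are hypothetical helper names, the futex word is assumed to be a 32-bit int, and the address must be taken from the futex_word value reported by info threads.

```python
try:
    import gdb  # only importable inside a GDB session
except ImportError:
    gdb = None

def futex_word_expr(address):
    """Return a GDB expression that reads the futex word as a 32-bit int."""
    return '*(int *) {:#x}'.format(address)

def unlock_futex(address):
    """Read the futex word, set it to 1 (unlocked), and read it back."""
    if gdb is None:
        raise RuntimeError('run this inside GDB, for example via "source"')
    expr = futex_word_expr(address)
    before = int(gdb.parse_and_eval(expr))      # expected 0 while locked
    gdb.execute('set var {} = 1'.format(expr))  # assumption: 1 means unlocked
    after = int(gdb.parse_and_eval(expr))
    return before, after
```

Inside GDB, after loading the file with source, a call such as python print(unlock_futex(0x7f56ac000e70)) would perform the step for the address shown in the earlier session. The address differs on every run.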