Checking that a static library does not contain undefined symbols except ones from libc and stdlib

I build a C++ software module that is delivered as header files (.h) containing the API and a static library (.a) containing the implementation.

The module depends only on the standard libraries, so I want to check that all undefined symbols in static_lib.a are actually present in libc and stdlib; otherwise a function implementation is missing.

The module is cross-built for aarch64 on an x86_64 Linux computer.

A possible solution would be to link a test executable with static_lib.a and rely on the linker to find undefined references, but such an executable would need to call every function provided by the API and be manually updated whenever functions are added or removed.

The best solution I have so far is:

  • Getting libc.so and libstdc++.so path using
gcc [cflags] --print-file-name=libc.so
gcc [cflags] --print-file-name=libstdc++.so
  • Getting the list of symbols provided by libc and stdlib using
nm --format=posix --dynamic $LIBC_PATH $LIBSTD_PATH | awk '{print $1}' | grep -v ':$' > stdsyms
  • Getting the list of undefined symbols in my library using
nm --format=posix --undefined-only static_lib.a | awk '{print $1}' | grep -v ':$' > undefined
  • Checking that all symbols in undefined are present in stdsyms
while read -r symbol; do grep -qxF -- "$symbol" stdsyms || echo "$symbol" >> missing; done < undefined
if [ -s missing ]; then echo "missing symbols:"; cat missing; false; fi
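
The last step can also be written as a set difference on sorted lists instead of one grep per symbol. This is a minimal sketch, assuming the 'undefined' and 'stdsyms' files have the one-symbol-per-line layout produced above; the sample symbol names (including my_missing_fn) are made up for illustration.

```bash
# Work in a scratch directory so the sample files do not clobber real ones.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data standing in for the real nm output.
printf '%s\n' memcpy malloc my_missing_fn _Znwm > undefined
printf '%s\n' malloc free memcpy _Znwm _ZdlPv > stdsyms

# comm requires both inputs to be sorted with the same collation order.
LC_ALL=C sort -u undefined > undefined.sorted
LC_ALL=C sort -u stdsyms > stdsyms.sorted

# -23 keeps only the lines unique to the first file:
# symbols that are undefined but not provided by the standard libraries.
LC_ALL=C comm -23 undefined.sorted stdsyms.sorted > missing

if [ -s missing ]; then
    echo "missing symbols:"
    cat missing
fi
```

In a build script, the final test would typically end with false (as above) so that a non-empty 'missing' file fails the build.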

The issue is that libc.so is actually a text file:

/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily.  */
OUTPUT_FORMAT(elf64-littleaarch64)
GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a  AS_NEEDED ( /lib/ld-linux-aarch64.so.1 ) )

so nm cannot parse it. I considered parsing this file to extract /lib/libc.so.6, and extracting the --sysroot parameter from the gcc cflags to build the actual path of the libc shared library, but this seems very brittle...
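
For reference, the parsing considered here can be sketched in a few lines. It assumes the script contains a single GROUP ( ... ) statement on one line, as in the excerpt above, and that the absolute paths are to be resolved under a sysroot held in a hypothetical SYSROOT variable (/opt/sysroot is a placeholder). It is the brittle approach described, not a robust linker script parser.

```bash
workdir=$(mktemp -d)

# Sample linker script, copied from the excerpt above.
cat > "$workdir/libc.so" <<'EOF'
/* GNU ld script
   Use the shared library, but some functions are only in
   the static library, so try that secondarily.  */
OUTPUT_FORMAT(elf64-littleaarch64)
GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a  AS_NEEDED ( /lib/ld-linux-aarch64.so.1 ) )
EOF

# Hypothetical sysroot; in practice it would come from the --sysroot cflag.
SYSROOT=/opt/sysroot

# Take the GROUP line, split it into one token per line,
# keep only absolute paths and prefix each with the sysroot.
libs=$(grep '^GROUP' "$workdir/libc.so" \
    | tr -s ' ()' '\n\n\n' \
    | grep '^/' \
    | sed "s|^|$SYSROOT|")

printf '%s\n' "$libs"
```

The resulting paths could then be passed to nm in place of $LIBC_PATH.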

I tried gcc [cflags] --print-file-name=libc.a instead, but it gives no result.

Does anybody have a better idea for checking that no functions are missing from the implementation? Either a reliable way to export the symbols from libc and stdlib, or another method altogether.

Edit following Employed Russian's answer:

Actually, the library already uses partial linking (with the -r -nostdlib flags).

then link main.o with lib.o. If the link succeeds, then there are no unresolved symbols.

This requires that the main.c used to create main.o calls every function of the library API, and I see no easy way to automate this.

It's actually a linker script. But it tells you exactly which libc.so.6 and libc_nonshared.a it will use, so you could scan these.

I may end up doing this. I was hoping for a solution that avoids manually parsing this file (maybe calling the linker in a special mode? I will do some tests).

Solution:

See https://stackoverflow.com/a/76605971/12251948 for a solution that avoids the nm issue. Note that this only detects 'undefined' symbols, not 'missing' ones, for which linking an executable that calls every function provided by the API seems to be the only solution.

There are 2 answers

Answer by Employed Russian:

A possible solution would be to link a test executable with static_lib.a and rely on the linker to find undefined references, but such an executable would need to call every function provided by the API and be manually updated whenever functions are added or removed.

Another possible solution is to use ld -r ${OBJS} -o lib.o (where ${OBJS} is all object files you archive into your static_lib.a), and then link main.o with lib.o. If the link succeeds, then there are no unresolved symbols.

The issue is that libc.so is actually a text file

It's actually a linker script. But it tells you exactly which libc.so.6 and libc_nonshared.a it will use, so you could scan these.

Answer by many-sigsegv:

To avoid the issue of nm being unable to read a linker script, it is possible to use the linker directly instead.

New steps:

Still use gcc [cflags] --print-file-name to get the paths of libc, stdlib (and pthread), and nm --format=posix --undefined-only static_lib.a to dump the list of undefined symbols in the library into the 'undefined' file.

Then call the linker with the paths of libc, stdlib and the pthread library, requiring that every symbol listed in 'undefined' be resolved:

readarray -t symbols < undefined && gcc [ldflags] -Wl,--no-as-needed $LIBC_PATH $LIBSTD_PATH $LIBPTHREAD_PATH ${symbols[@]/#/ -Wl,--require-defined -Wl,}  -Wl,--ignore-unresolved-symbol -Wl,main

The readarray -t symbols and ${symbols[@]/#/ -Wl,--require-defined -Wl,} parts generate one --require-defined linker argument for each undefined symbol in the library. If the link fails, there are undefined symbols that do not belong to libc, stdlib or pthread.
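
To see what the expansion produces, it can be run on its own without calling gcc. This is a small bash sketch; the two symbol names are sample data, and the args array is only there to make the result inspectable.

```bash
# Hypothetical sample standing in for the real 'undefined' file.
workdir=$(mktemp -d)
printf '%s\n' memcpy _Znwm > "$workdir/undefined"

# One array element per line; -t strips the trailing newline of each line.
readarray -t symbols < "$workdir/undefined"

# ${array[@]/#/prefix} prepends the prefix to every element. Because the
# expansion is left unquoted, the space inside the prefix splits each
# result into two separate words.
args=( ${symbols[@]/#/ -Wl,--require-defined -Wl,} )

# Print one generated argument per line.
printf '%s\n' "${args[@]}"
```

Each symbol therefore yields the pair -Wl,--require-defined -Wl,<symbol>, which gcc forwards to the linker as --require-defined <symbol>.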