Provide symbols to the Visual Studio debugger for custom code


Provided one generates custom JIT-generated code (x64) in C++ on Windows (using VirtualAlloc with an executable page protection such as PAGE_EXECUTE_READWRITE), is there any way to provide symbols to the Visual Studio debugger so that it recognizes function names? Source-line mapping is not relevant for me.

By default, when you execute this code with valid RUNTIME_FUNCTION tables installed, the call stack shows a correct stack trace, but the functions obviously appear only as addresses instead of names:

    Engine.dll!ae::event::handleException(_EXCEPTION_RECORD * exceptionRecord, unsigned __int64 establisherFrame, _CONTEXT * contextRecord, _DISPATCHER_CONTEXT * dispatcherContext) Line 104  C++
    ntdll.dll!00007ffdeadf441f()    Unknown
    ntdll.dll!00007ffdead6e466()    Unknown
    ntdll.dll!00007ffdeadf340e()    Unknown
>   0000016c80000102()  Unknown // my code starts here
    000000106348ff90()  Unknown
    0000016c40000002()  Unknown
    0000016c802e9170()  Unknown
    000000106348fee0()  Unknown
    0000016d27b1f2f0()  Unknown

This can make it extremely difficult to debug such code, especially when errors occur, like in the example above.

I've heard that people create full PDB files, even for JITed code, but I wonder whether there is a simpler, programmatic solution. Dealing with PDBs, which would AFAIK require switching to some sort of common executable format, would be a real undertaking that is not worth it at the moment, especially since I only need function names, not full source-line mapping, breakpoints, etc.

Potential solutions

I did find that you could call SymLoadModuleEx with SLMFLAG_VIRTUAL, and then call SymAddSymbol. Though the calls were all successful, and this resulted in StackWalker recognizing the code, it didn't seem to affect the debugger at all. So I'm assuming that this will simply only affect the modules that can be queried by the app, and not the VS-debugger - if you know otherwise, please let me know.
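For reference, here is a minimal sketch of that in-process approach. The names `JitSymbol`, `ModuleRange`, `coveringRange` and `registerWithDbgHelp` are made up for illustration; only `SymLoadModuleEx`, `SymAddSymbol` and `SLMFLAG_VIRTUAL` come from DbgHelp. It assumes `SymInitialize` has already been called for the process, and, as described above, it should only be expected to affect code that queries DbgHelp inside the app (such as StackWalker), not the attached Visual Studio debugger.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One JIT-compiled function: where it lives and what to call it (hypothetical type).
struct JitSymbol {
    std::string   name;
    std::uint64_t address;
    std::uint32_t size;
};

// Address range of the virtual module that has to cover every symbol.
struct ModuleRange {
    std::uint64_t base;
    std::uint32_t size;
};

// Computes the smallest range containing all symbols (portable helper).
inline ModuleRange coveringRange(const std::vector<JitSymbol>& symbols) {
    assert(!symbols.empty());
    std::uint64_t lo = symbols.front().address;
    std::uint64_t hi = lo + symbols.front().size;
    for (const JitSymbol& s : symbols) {
        lo = std::min(lo, s.address);
        hi = std::max(hi, s.address + s.size);
    }
    return {lo, static_cast<std::uint32_t>(hi - lo)};
}

#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")

// Registers the symbols with DbgHelp inside the current process.
// Assumption: SymInitialize(GetCurrentProcess(), ...) was called beforehand.
inline bool registerWithDbgHelp(const char* moduleName,
                                const std::vector<JitSymbol>& symbols) {
    const HANDLE process = GetCurrentProcess();
    const ModuleRange range = coveringRange(symbols);

    // SLMFLAG_VIRTUAL creates a module that has no image file behind it.
    // The call returns 0 on failure (and also if the module is already loaded).
    const DWORD64 base = SymLoadModuleEx(process, nullptr, nullptr, moduleName,
                                         range.base, range.size, nullptr,
                                         SLMFLAG_VIRTUAL);
    if (base == 0) return false;

    for (const JitSymbol& s : symbols) {
        // The last parameter (Flags) is reserved and passed as 0.
        if (!SymAddSymbol(process, base, s.name.c_str(), s.address, s.size, 0))
            return false;
    }
    return true;
}
#endif
```

The range helper is kept separate from the Windows-only part so the bookkeeping can be checked without DbgHelp; on Windows you would call `registerWithDbgHelp("jit", symbols)` once per block of generated code.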

I've also found documentation for Symbol Providers, but it is really shallow and doesn't even give a starting point. I'm not even sure this would solve my problem. If it does, is there any solid, complete documentation on this topic? Certain interfaces and methods are documented, but I haven't dealt with VS extensions (which this seems to be) so far, and would need to learn that from scratch. If anybody has experience with it, it would also be valuable to know whether this is even less work than generating a PDB would be.

Or is there something else that I'm missing? I'm assuming there would be some way to supply symbols by calling some functions, the same way you have RtlAddFunctionTable to supply unwind-information. But I get a feeling I might be out of luck.

Answer by Juliean

Solved it, after hours of research and work. So for anyone trying to solve a similar problem:

Writing a Visual Studio extension is a must, but it's not that hard. There are templates, and you can even debug the extension while running it from within VS (woah).

Working with the debugger is a totally different story. I didn't have any luck getting my hands on IDebugSymbols3 or DebugEngine (I used C#), but after a lot of searching I found the "Concord" SDK for programming against the Visual Studio debugger.

The "Hello World" sample already does roughly what I need, but the real thing is a lot more involved. I could have done some hackery to just manually rewrite the call stack, but there is also the problem of getting the symbol data to the debugger.

Anyway, the real life-saver is the Microsoft PTVS (Python Tools for Visual Studio) repository, which provides a full implementation of the necessary interfaces. And you need a huge number of interfaces to get this to work. The baseline documentation of the debugger SDK is really poor. Everything is technically explained, but nothing explains how things link together. The implementation in PTVS is also pretty complicated. Some of that is necessary, like the split between two components, but a lot of things are only there because Python seems to be self-hosting, etc.

Anyway, a detailed guide would probably fill an article, if not a paper. I'll try to boil it down to the most important points, to help anyone who might read this:

  1. Follow the GUID setup in the repo. You need GUIDs for your vendor, language and runtime type, as well as for the different components.

  2. You need two ways of supplying symbol modules. The first is for when the debugger is attached (or reattached) and queries the modules that already exist; this is done via GetStaticVariable. The second is for when a module is added in your app while the debugger is attached; this is done via CreateRuntimeDllFunctionBreakpoint (set on a function in your app that is called on module creation; this can be a stub).

  3. You need to do a lot of book-keeping, storing and retrieving state via DkmProcess. Just stick with what the repo offers, and extend it (GetPythonRuntimeInstance etc). A lot of their helpers/wrappers are also pretty useful, if not necessary.

  4. In terms of interfaces, you need a lot of them, mostly everything dealing with message handling and loaded-callbacks. You don't need expression evaluation if you don't want it, nor do you need IDkmSymbolDocumentCollectionQuery unless you want to support stepping through your code. You can also just leave a NotImplementedException in most of the methods while working on them - the debugger will accept those. You just have to be careful to implement the methods that are actually needed. You mainly need IDkmSymbolQuery, IDkmLanguageFrameDecoder and IDkmCallStackFilter. As for implementation:

  5. The main work is in creating a lot of custom information. Those are classes usually starting with "DkmCustom", and they have a "Create" method. For your modules, you create DkmModule plus a DkmCustomModuleInstance. You then have to do the frame-fixup yourself, creating a DkmCustomInstructionAddress for a new DkmStackWalkFrame (don't be concerned that the input-frame won't have any symbols set - this is expected). If done right, at the end IDkmLanguageFrameDecoder.GetFrameName will be invoked, allowing you to provide the final name for the frame.

  6. Make sure to follow the .vsdconfigxml setup as well. Setting the GUIDs the way the repo does ensures that your interfaces are invoked only for your own code (mostly; this is not true for all of them). Doing it wrong will result in interfaces not being called.
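To make step 2 a bit more concrete, here is a rough sketch of what the app side could look like in C++: a global list that the debugger component can read after attaching, plus a stub function it can put a breakpoint on. Everything here is an assumption for illustration (`JitModuleRecord`, `g_jitModules`, `OnJitModuleAdded`, `registerJitModule`, `findJitModule` and the record layout are hypothetical names, not part of PTVS or the debugger SDK), and the debugger-side extension code is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(_MSC_VER)
#  define JIT_DEBUG_API extern "C" __declspec(dllexport) __declspec(noinline)
#else
#  define JIT_DEBUG_API extern "C" __attribute__((noinline))
#endif

// Fixed-layout record so a debugger component can read it from process memory.
struct JitModuleRecord {
    std::uint64_t    base;      // start of the generated code region
    std::uint64_t    size;      // size of the region in bytes
    char             name[64];  // NUL-terminated module name
    JitModuleRecord* next;      // singly linked list, newest first
};

extern "C" {
// Head of the module list (hypothetical name); meant to be read on attach.
JitModuleRecord* g_jitModules = nullptr;
}

// Written by the stub so the call has an observable side effect.
static const JitModuleRecord* volatile g_lastAdded = nullptr;

// Stub whose only purpose is to give the debugger a place to break
// whenever a new module is published.
JIT_DEBUG_API void OnJitModuleAdded(const JitModuleRecord* record) {
    g_lastAdded = record;
}

// Fills in a record, publishes it, then notifies the debugger.
// Not thread-safe; a real app would need to guard the list.
inline void registerJitModule(JitModuleRecord& record, const char* name,
                              const void* base, std::uint64_t size) {
    assert(name != nullptr && size > 0);
    record.base = reinterpret_cast<std::uintptr_t>(base);
    record.size = size;
    std::strncpy(record.name, name, sizeof(record.name) - 1);
    record.name[sizeof(record.name) - 1] = '\0';
    record.next = g_jitModules;
    g_jitModules = &record;     // publish before notifying
    OnJitModuleAdded(&record);  // a breakpoint set on the stub would hit here
}

// Finds the module that contains an instruction address, or nullptr.
inline const JitModuleRecord* findJitModule(std::uint64_t address) {
    for (const JitModuleRecord* m = g_jitModules; m != nullptr; m = m->next) {
        if (address >= m->base && address - m->base < m->size) return m;
    }
    return nullptr;
}
```

Publishing the record before calling the stub means that both paths (reading the list on attach, and breaking on the stub later) see the same data. The address lookup is the same kind of range check the extension has to do when it decides whether a stack frame belongs to your code.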

Soo... yeah. Hope this is at least somewhat understandable. Turning it into an easy-to-follow, step-by-step list is very hard, given how complex it is. But maybe this will at least save you some trouble figuring out certain steps.

The upside of this approach is that you have a lot of options on top of just providing symbol names. You could show local variables; you could implement expression evaluators; you could even implement a custom debugger that steps through your own source code. For me, one very important thing is being able to attach the origin stack of my stack-based coroutines via FilterStack (I haven't implemented it yet, but from what I've seen you can do this). So it seems to at least be worth the trouble compared to just doing something with AddSyntheticSymbol, if that were to work.