NUMA memory allocation with hwloc


I'm trying to do NUMA-aware memory allocation with hwloc and am seeing somewhat strange behavior.

My goal is to allocate blocks of memory on different NUMA nodes, as I need this for a project. To verify that the correct amount of memory is allocated, I have been using valgrind's memcheck tool. The tool always reports the same number of bytes allocated, no matter how many elements I ask for, which doesn't make sense to me. The overall number of allocations also seems too high. I understand that hwloc needs to allocate some things internally to work properly, but that alone doesn't explain a total that never changes.

Even if I try to allocate a bigger chunk of memory, the allocated bytes reported by valgrind remain the same.

Here is the code I have been using to allocate memory on NUMA node 0.

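#include <iostream>
#include <hwloc.h>
constexpr size_t SIZE = 1024;  // element count; placeholder, not given in the original post
int main() {
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    // First NUMA node in the topology (node 0).
    auto node = hwloc_get_obj_by_type(topo, HWLOC_OBJ_NUMANODE, 0);
    const size_t bytes = SIZE * sizeof(float);

    // Strictly bind the allocation to that node's nodeset.
    auto mem = hwloc_alloc_membind(topo, bytes, node->nodeset,
                    HWLOC_MEMBIND_BIND, HWLOC_MEMBIND_STRICT | HWLOC_MEMBIND_BYNODESET);

    if (mem == NULL) {
        std::cout << "Allocation failed" << std::endl;
    } else {
        hwloc_free(topo, mem, bytes);  // length is in bytes, matching the allocation
    }

    hwloc_topology_destroy(topo);
}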

And this is the relevant part of the valgrind memcheck report: [image: valgrind report]

I really hope someone can explain what is going on here.

I'm using hwloc 2.7.1 and valgrind 3.15.0


There is 1 answer

Paul Floyd

Valgrind 3.15 is quite old - can you try something more recent?

To see everything, you need to use:

valgrind --leak-check=full --show-reachable=yes --default-suppressions=no

With that I get the following (using a SIZE of 1024):

==1709== HEAP SUMMARY:
==1709==     in use at exit: 2,777 bytes in 4 blocks
==1709==   total heap usage: 224 allocs, 220 frees, 31,139 bytes allocated
==1709== 
==1709== 9 bytes in 1 blocks are definitely lost in loss record 1 of 4
==1709==    at 0x484CBC4: malloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==1709==    by 0x48A060D: ??? (in /usr/local/lib/libhwloc.so.15.6.0)
==1709==    by 0x48751E5: hwloc_topology_load (in /usr/local/lib/libhwloc.so.15.6.0)
==1709==    by 0x202980: main (hwloc.cpp:7)
==1709== 
==1709== 64 bytes in 1 blocks are still reachable in loss record 2 of 4
==1709==    at 0x48500D5: calloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==1709==    by 0x4FFD7B2: ??? (in /lib/libthr.so.3)
==1709==    by 0x4FF5FC9: ??? (in /lib/libthr.so.3)
==1709==    by 0x4FF5139: ??? (in /lib/libthr.so.3)
==1709==    by 0x400B0FC: ??? (in /libexec/ld-elf.so.1)
==1709==    by 0x400938A: ??? (in /libexec/ld-elf.so.1)
==1709==    by 0x4006F88: ??? (in /libexec/ld-elf.so.1)
==1709== 
==1709== 1,040 bytes in 1 blocks are still reachable in loss record 3 of 4
==1709==    at 0x484CBC4: malloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==1709==    by 0x4B57254: ??? (in /lib/libc.so.7)
==1709==    by 0x4B573D2: __cxa_atexit (in /lib/libc.so.7)
==1709==    by 0x4E53A68: ??? (in /usr/local/lib/libze_loader.so.1.8.12)
==1709==    by 0x400B0FC: ??? (in /libexec/ld-elf.so.1)
==1709==    by 0x400938A: ??? (in /libexec/ld-elf.so.1)
==1709==    by 0x4006F88: ??? (in /libexec/ld-elf.so.1)
==1709== 
==1709== 1,664 bytes in 1 blocks are still reachable in loss record 4 of 4
==1709==    at 0x48500D5: calloc (in /usr/local/libexec/valgrind/vgpreload_memcheck-amd64-freebsd.so)
==1709==    by 0x4FF5FB8: ??? (in /lib/libthr.so.3)
==1709==    by 0x4FF5139: ??? (in /lib/libthr.so.3)
==1709==    by 0x400B0FC: ??? (in /libexec/ld-elf.so.1)
==1709==    by 0x400938A: ??? (in /libexec/ld-elf.so.1)
==1709==    by 0x4006F88: ??? (in /libexec/ld-elf.so.1)
==1709== 
==1709== LEAK SUMMARY:
==1709==    definitely lost: 9 bytes in 1 blocks
==1709==    indirectly lost: 0 bytes in 0 blocks
==1709==      possibly lost: 0 bytes in 0 blocks
==1709==    still reachable: 2,768 bytes in 3 blocks
==1709==         suppressed: 0 bytes in 0 blocks

Also try using --trace-malloc=yes to see all of the allocation and deallocation calls.