I have a class of objects in a multithreaded application where each thread can mark an object for deletion, then a central garbage collector thread actually deletes the object. The threads communicate via member methods that access an internal bool:
class MyObjects {
...
    bool shouldBeDeleted() const
    {
        return m_Delete;
    }
    void markForDelete()
    {
        m_Delete = true;
    }
...
    std::atomic<bool> m_Delete;
};
The bool was made atomic by someone else in the past because Thread Sanitizer kept complaining. However, perf now suggests that there is processing overhead in the internal atomic load:
       │     ↓ cbz    x0, 3f4
       │     _ZNKSt13__atomic_baseIbE4loadESt12memory_order():
       │           {
       │             memory_order __b = __m & __memory_order_mask;
       │             __glibcxx_assert(__b != memory_order_release);
       │             __glibcxx_assert(__b != memory_order_acq_rel);
       │
       │             return __atomic_load_n(&_M_i, __m);
       │       add    x0, x0, #0x40
 86,96 │       ldarb  w0, [x0]
The target platform is AArch64 running Yocto Linux, compiled with GCC.
Now my questions are as follows:
Is atomic really needed in this case? The transition of the bool is one way (from false to true) with no way back while the object lives, so an inconsistency would merely mean that the object is deleted a little later, right?
Is there an alternative to std::atomic<bool> that will silence Thread Sanitizer but is computationally cheaper than std::atomic<bool>?
                        
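An obvious modification would be to specify memory_order_relaxed to minimise memory barriers. The default for load() and store() is memory_order_seq_cst, which is what produces the ldarb (load-acquire) instruction in the perf output. See https://en.cppreference.com/w/cpp/atomic/memory_order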
and https://bartoszmilewski.com/2008/12/01/c-atomics-and-memory-ordering/
Also see Herb Sutter's classic "Atomic Weapons" : https://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2
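As a minimal sketch of the relaxed variant, assuming the flag is the only thing the two threads share: the class name FlaggedObject and the helper flagGoesOneWay are hypothetical stand-ins, not the asker's real code. With GCC on AArch64 a relaxed load of a std::atomic<bool> should normally compile to a plain byte load rather than ldarb, but that is worth confirming in the disassembly of your own build.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the class in the question.
class FlaggedObject {
public:
    // Relaxed load: still atomic (so no data race for Thread Sanitizer
    // to report), but imposes no ordering on surrounding memory accesses.
    bool shouldBeDeleted() const noexcept {
        return m_Delete.load(std::memory_order_relaxed);
    }

    // Relaxed store: the flag only ever goes from false to true.
    void markForDelete() noexcept {
        m_Delete.store(true, std::memory_order_relaxed);
    }

private:
    std::atomic<bool> m_Delete{false};
};

// Single-threaded walk-through of the one-way transition.
inline bool flagGoesOneWay() {
    FlaggedObject obj;
    assert(!obj.shouldBeDeleted());  // freshly constructed: not marked
    obj.markForDelete();
    return obj.shouldBeDeleted();    // marked, and never reset
}
```

Keeping std::atomic<bool> and only relaxing the memory order keeps the code free of data races in the C++ sense, which a plain bool would not.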
Caveat (see the articles above): if anything else depends on the object being flagged for deletion (e.g. another state variable, freeing resources, etc.), then you may need to use memory_order_release on the store to ensure that setting the "can be deleted" flag occurs last and is not reordered by the compiler optimiser or the CPU.
Assuming the "garbage collector" checks only the "can be deleted" flag, it would not need to use memory_order_acquire in the load; relaxed would be sufficient. Otherwise it would need to use acquire to guarantee that any co-dependent accesses are not reordered to occur before reading the flag.
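The release/acquire pairing described in the caveat can be sketched as follows. This is an illustration under assumptions, not the asker's code: PublishedObject, its payload member, shouldBeDeletedAcquire and collectAfterMark are all hypothetical names. Since the collector goes on to destroy the object, it is plausible that the co-dependent case applies here, so the cost of the acquire load is worth measuring before ruling it out.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical object with one piece of co-dependent state (payload).
class PublishedObject {
public:
    // Release store: every write this thread made before the call
    // (e.g. to payload) becomes visible to a thread that observes
    // the flag with an acquire load.
    void markForDelete() noexcept {
        m_Delete.store(true, std::memory_order_release);
    }

    // Cheap poll, sufficient when only the flag itself is inspected.
    bool shouldBeDeleted() const noexcept {
        return m_Delete.load(std::memory_order_relaxed);
    }

    // Acquire load, for when the caller will go on to touch other state
    // of the object that was written before the flag was set.
    bool shouldBeDeletedAcquire() const noexcept {
        return m_Delete.load(std::memory_order_acquire);
    }

    int payload = 0;  // deliberately non-atomic

private:
    std::atomic<bool> m_Delete{false};
};

// A worker writes payload and then marks the object; the "collector"
// (here simply the calling thread) waits for the flag and reads payload.
inline int collectAfterMark() {
    PublishedObject obj;
    std::thread worker([&obj] {
        obj.payload = 42;     // ordinary write ...
        obj.markForDelete();  // ... published by the release store
    });
    while (!obj.shouldBeDeletedAcquire()) {
        std::this_thread::yield();
    }
    int seen = obj.payload;   // ordered after the worker's write
    worker.join();
    assert(obj.shouldBeDeleted());
    return seen;
}
```

One possible compromise is to poll with the relaxed load in the hot path and issue the acquire load only once the flag has been seen set, just before the object is actually destroyed.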