Skip to content

Potential memory order inconsistency on freelist block reusage #404

Open
@ayles

Description

@ayles

Hi! So I noticed that TSAN complains about races inside concurrentqueue. I was able to reproduce it with the following minimal example:

#include "concurrentqueue.h"

#include <thread>
#include <vector>

int main() {
    moodycamel::ConcurrentQueue<int> q;

    std::vector<std::thread> threads;

    for (size_t i = 0; i < 5; ++i) {
        threads.emplace_back([&q]() {
            while (true) {
                int item;
                q.try_dequeue(item);
            }
        });
    }

    for (size_t i = 0; i < 10; ++i) {
        threads.emplace_back([&q]() {
            while (true) {
                q.try_enqueue(1);
            }
        });
    }

    for (auto& t : threads) {
        t.join();
    }
}
...

WARNING: ThreadSanitizer: data race (pid=1709853)
  Write of size 4 at 0x728c00000718 by thread T14:
    #0 bool moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::ImplicitProducer::enqueue<(moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::AllocationMode)1, int>(int&&) ... concurrentqueue.h:2544:4
    #1 bool moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::inner_enqueue<(moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::AllocationMode)1, int>(int&&) ... concurrentqueue.h:1376:94
    #2 moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::try_enqueue(int&&) ... concurrentqueue.h:1074:15
    #3 main::$_1::operator()() const ... main.cpp:23:19
    
...

  Previous read of size 4 at 0x728c00000718 by thread T5:
    #0 bool moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::ImplicitProducer::dequeue<int>(int&) ... concurrentqueue.h:2596:17
    #1 bool moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::ProducerBase::dequeue<int>(int&) ... concurrentqueue.h:1720:50
    #2 bool moodycamel::ConcurrentQueue<int, moodycamel::ConcurrentQueueDefaultTraits>::try_dequeue<int>(int&) ... concurrentqueue.h:1146:32
    #3 main::$_0::operator()() const ... main.cpp:15:19

...

I believe this race (in terms of C++ memory model) happens like this:
T0: Thread A reads from memory slot S.
T1: Thread A releases memory slot S of block X.
T2: Thread B releases last slot of block X, pushes block into the freelist.
T3: Thread C acquires block X from the freelist, writes to slot S.

I guess release in this and similar places should be acq_rel (or release + acquire fence on the last increment):
https://github.com/cameron314/concurrentqueue/blob/master/concurrentqueue.h#L1607
to build happens-before relation between the slot release (thread A) and the whole block release (thread B) and subsequent push to the freelist.
(Tested it, TSAN stopped complaining).

I am not so good with C++ memory model, so maybe I am missing something here?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions