I'm taking a course on concurrency and lock-free programming, and we touched on RCU as a way to do memory reclamation. Assume we know all threads involved. The following toy implementation of RCU was provided:
```cpp
// shared data
std::atomic<int> current_phase{};
std::array<std::atomic<int>, THREADS> rcu_phase{};
std::array<std::atomic<int>, THREADS> rcu_rcs{};

void rcu_read_lock()
{
    int tmp = current_phase.load(std::memory_order_acquire);
    rcu_phase[this_thread].store(tmp, std::memory_order_release);
    rcu_rcs[this_thread].store(1, std::memory_order_release);
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

void rcu_read_unlock()
{
    rcu_rcs[this_thread].store(0, std::memory_order_release);
}

void rcu_synchronize()
{
    std::atomic_thread_fence(std::memory_order_seq_cst);
    current_phase.store(1, std::memory_order_release);
    for (size_t i = 0; i < THREADS; i++)
    {
        int a, b;
        do {
            a = rcu_rcs[i].load(std::memory_order_acquire);
            b = rcu_phase[i].load(std::memory_order_acquire);
        } while (a == 1 && b == 0);
    }
    for (size_t i = 0; i < THREADS; i++)
    {
        int a, b;
        do {
            a = rcu_rcs[i].load(std::memory_order_acquire);
            b = rcu_phase[i].load(std::memory_order_acquire);
        } while (a == 1 && b == 1);
    }
}
```
There are at least three things I don't understand about this:

1. What is the need for `rcu_phase`? Isn't `rcu_rcs` sufficient, since that is what signals whether a reader thread is actually inside the critical section?
2. If having the phases makes sense, why does `rcu_synchronize` need to wait for both phases in turn? Couldn't it simply look at what the current phase is, flip it, and then wait only for the threads still in the previous phase?
3. What is the purpose of the two sequentially consistent fences?
Thank you for your help! Online resources about this either shy away from the implementation details and mostly explain how to use the API, or they refer to more complex/different implementations.