Atomic Store Ordering: memory_order_release vs memory_order_seq_cst
Release Store

```cpp
static void publishIndices(std::atomic<std::uint32_t>& tail, int n) {
    for (int i = 0; i < n; ++i) {
        tail.store(static_cast<std::uint32_t>(i), std::memory_order_release);
        benchmark::DoNotOptimize(tail);
    }
}

publishIndices(tail_release, kIterations);
```

Seqcst Store
```cpp
static void publishIndices(std::atomic<std::uint32_t>& tail, int n) {
    for (int i = 0; i < n; ++i) {
        tail.store(static_cast<std::uint32_t>(i), std::memory_order_seq_cst);
        benchmark::DoNotOptimize(tail);
    }
}

publishIndices(tail_seqcst, kIterations);
```

Shared test data (shared-setup)

```cpp
constexpr int kIterations = 65536;
```

Which snippet is faster? (Assume x86-64.)
The Release Store snippet is faster. On x86-64 a release store compiles to a plain MOV, because the architecture's strong (TSO) memory model already forbids the reorderings that release semantics rule out. A seq_cst store additionally requires a full memory fence, typically an XCHG or a MOV followed by MFENCE, which drains the store buffer and stalls the pipeline. (On AArch64 both orderings compile to the same STLR instruction, so this particular gap is largely an x86 phenomenon; on 32-bit ARM a seq_cst store needs an extra DMB barrier.) This pattern appears in SPSC queues: the producer writes the payload, then publishes the index with a release store, not seq_cst.
Benchmark results
| Snippet | CPU time / iteration | Speedup |
|---|---|---|
| Release Store | 12.2 µs | 7.9× |
| Seqcst Store | 96.0 µs | 1.0× |
Explore the source

Open in Compiler Explorer