Comment 10 for bug 1876230

Rafael David Tinoco (rafaeldtinoco) wrote :

@sil2100,

thanks for the trust. My TL;DR version for you is:

From one of liburcu maintainers (https://github.com/compudj):

"""
Posted Nov 24, 2013 23:55 UTC (Sun) by compudj (subscriber, #43335) [Link]

Tracking threads running in multiple processes using a common shared memory is not possible with the currently implemented URCU flavors, but we look forward to adding a new URCU flavor to support this kind of use case.
"""

So that settles the corner case I had thought of.

By that, I'm +1 on the SRU.

----

continuing the longer version...

> I agree with @ddstreet, I don't think liburcu gives that sort of guarantee when it comes to cross process synchronisation. It was my belief that liburcu targets synchronisation across a set of threads within the current process only.

It does not (as stated above), but I had to check, especially because I was only looking at it from the membarrier() syscall point of view (not too deep into the liburcu implementation).

NOW with all that I got curious =)...

From liburcu documentation:

"""
There are multiple flavors of liburcu available:

memb, qsbr, mb, signal, bp.

The API members start with the prefix "urcu_<flavor>_", where <flavor> is the chosen flavor name.

Usage of liburcu-memb

#include <urcu/urcu-memb.h>

Link the application with -lurcu-memb

This is the preferred version of the library, in terms of grace-period detection speed, read-side speed and flexibility. Dynamically detects kernel support for sys_membarrier().

Falls back on urcu-mb scheme if support is not present, which has slower read-side. Use the --disable-sys-membarrier-fallback configure option to disable the fall back, thus requiring sys_membarrier() to be available. This gives a small speedup when sys_membarrier() is supported by the kernel, and aborts in the library constructor if not supported.

Usage of liburcu-qsbr

#include <urcu/urcu-qsbr.h>

Link with -lurcu-qsbr

The QSBR flavor of RCU needs to have each reader thread executing rcu_quiescent_state() periodically to progress. rcu_thread_online() and rcu_thread_offline() can be used to mark long periods for which the threads are not active. It provides the fastest read-side at the expense of more intrusiveness in the application code.

Usage of liburcu-mb

#include <urcu/urcu-mb.h>

Link with -lurcu-mb

This version of the urcu library uses memory barriers on the writer and reader sides. This results in faster grace-period detection, but results in slower reads.

Usage of liburcu-signal

#include <urcu/urcu-signal.h>

Link the application with -lurcu-signal

Version of the library that requires a signal, typically SIGUSR1. Can be overridden with -DSIGRCU by modifying Makefile.build.inc.

Usage of liburcu-bp

#include <urcu/urcu-bp.h>

Link with -lurcu-bp

The BP library flavor stands for "bulletproof". It is specifically designed to help tracing library to hook on applications without requiring to modify these applications.

urcu_bp_init(), and urcu_bp_unregister_thread() all become nops, whereas calling urcu_bp_register_thread() becomes optional. The state is dealt with by the library internally at the expense of read-side and write-side performance.
"""

> If the program links against liburcu 0.9 or lower, the sys_membarrier syscall did not exist yet, and liburcu will use the default compiler based membarrier, which is only good within the current process. Synchronisation across shared memory pages fails. This is the case on Xenial, Trusty and the like.

> If the program links against liburcu 0.11 or newer, the sys_membarrier syscall does exist, but MEMBARRIER_CMD_SHARED is only used if the current running kernel does not support MEMBARRIER_CMD_PRIVATE_EXPEDITED.

Yep. Showed here => https://tinyurl.com/y96692o8

> There is no toggle option in the API at all, so for users with a kernel 4.14 or higher, MEMBARRIER_CMD_PRIVATE_EXPEDITED will be used, and synchronisation across shared memory pages will fail. This is the case on Eoan, Focal, Groovy.

Understood and agreed. The SRU line of thinking is always "do not introduce regressions", so I was more interested in the "change of behavior" (even if "it is all broken").

> If the program links against liburcu 0.10, and uses the -qsbr, -mb and -signal variants, sys_membarrier is not used at all, and it falls back to the compiler based membarrier, which is only good within the current process. Synchronisation across shared memory pages will fail.

Agreed per documentation.

> If the program links against liburcu 0.10, and is used within a container, with a kernel version less than 4.3 that does not support sys_membarrier, such as a Bionic container on a Trusty 3.13 host, or on a 3.10 RHEL host, the sys_membarrier syscall fails, and it falls back to the compiler based membarrier. Synchronisation across shared memory pages will fail.

Agreed, per "urcu_bp_sys_membarrier_status()".

> Now, the upstream developers added MEMBARRIER_CMD_PRIVATE_EXPEDITED as the default in liburcu 0.11. They did not change the API to accommodate both MEMBARRIER_CMD_SHARED and MEMBARRIER_CMD_PRIVATE_EXPEDITED, and instead, if the kernel is greater than 4.14, MEMBARRIER_CMD_PRIVATE_EXPEDITED will be used. Upstream are well aware of their consumers, and they would not break everyone's usages out of the blue, without adding some sort of API provision for legacy users.

I see your point, but that is usually not an assumption we can make; hence the review.

> Thus, our initial assumption that liburcu can be used to synchronise access to shared memory pages for IPC between a sister process is wrong, since no one will create a program that potentially only works in one specific environment, which is bionic on bare metal and liburcu 0.10 only. I'm not even sure how you would co-ordinate liburcu over multiple processes either.

I was checking from sys_membarrier() POV only, so I agree with you.

> So, because of the above, I don't think any librcu consumers are depending on a full membarrier, driven by the kernel, for shared pages among different processes.
>
> I still think this is safe to SRU.

Just like in the TL;DR version, now backed by solid arguments: +1.

Thanks for all this information!