Comment 5 for bug 1880193

Rafael David Tinoco (rafaeldtinoco) wrote :

From systemd 245 release notes (https://lwn.net/Articles/814068/):

----
  * A new component "userdb" has been added, along with a small daemon
          "systemd-userdb.service" and a client tool "userdbctl". The framework
          allows defining rich user and group records in a JSON format,
          extending on the classic "struct passwd" and "struct group"
          structures. Various components in systemd have been updated to
          process records in this format, including systemd-logind and
          pam-systemd. The user records are intended to be extensible, and
          allow setting various resource management, security and runtime
          parameters that shall be applied to processes and sessions of the
          user as they log in. This facility is intended to allow associating
          such metadata directly with user/group records so that they can be
          produced, extended and consumed in unified form. We hope that
          eventually frameworks such as sssd will generate records this way, so
          that for the first time resource management and various other
          per-user settings can be configured in LDAP directories and then
          provided to systemd (specifically to systemd-logind and pam-system)
          to apply on login. For further details see:

          https://systemd.io/USER_RECORD
          https://systemd.io/GROUP_RECORD
          https://systemd.io/USER_GROUP_API
----

And yet we have neither the userdbctl tool nor the systemd-userdb daemon:

https://www.freedesktop.org/software/systemd/man/userdbctl.html

It looks like an ongoing effort to unify the user/group information that flows from
pam-systemd into logind's management scheme within systemd.

I believe that making all the information that flows from pam-systemd to logind
available through this varlink interface is what causes the issue, and that is
where the problem lies.

----

Nevertheless...

The error comes from the userdb code, from this assertion:

        assert_se(set_remove(iterator->links, link) == link);

which fires when the userdb code is called through the varlink protocol.
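
To make the failure mode concrete, here is a minimal sketch of the contract that assertion relies on. This is not systemd code: toy_set, toy_set_put() and toy_set_remove() are hypothetical stand-ins for systemd's Set helpers, and the idea that the same link could be removed twice is only an assumption about one way the check might fail. The point is that removal returns the pointer the first time and NULL afterwards, so a second removal of the same link would make "set_remove(...) == link" false.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SET_MAX 8

/* Hypothetical stand-in for systemd's Set: a tiny array of pointers. */
struct toy_set {
    void *items[TOY_SET_MAX];
    size_t n;
};

/* Add a pointer; returns 0 on success, -1 if the set is full. */
static int toy_set_put(struct toy_set *s, void *p) {
    if (s->n >= TOY_SET_MAX)
        return -1;
    s->items[s->n++] = p;
    return 0;
}

/* Remove a pointer; returns the removed pointer, or NULL if it was
 * not in the set. The assertion in userdb depends on the non-NULL case. */
static void *toy_set_remove(struct toy_set *s, void *p) {
    for (size_t i = 0; i < s->n; i++) {
        if (s->items[i] == p) {
            /* Fill the hole with the last element and shrink the set. */
            s->items[i] = s->items[--s->n];
            return p;
        }
    }
    return NULL;
}
```

With a link stored once, the first toy_set_remove() returns the link and a second call returns NULL, which is the case an assert_se() of this shape would abort on.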

Many subsystems within systemd now embed a varlink server to provide IPC over a
simple JSON protocol. The journal daemon, for example, creates its own varlink
server through systemd-journald -> server_init -> server_open_varlink() ->
varlink_server_listen_fd().
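
For readers unfamiliar with the wire format: a varlink message is a JSON object terminated by a single NUL byte. The sketch below frames a method call that way. toy_varlink_frame_call() is a hypothetical helper, not a systemd function, and it assumes the method name needs no JSON escaping (interface and method names consist of letters, digits and dots).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of systemd: writes a varlink method call
 * with empty parameters into buf. Returns the number of bytes to send,
 * including the terminating NUL that delimits the message on the wire,
 * or -1 if buf is too small. Assumes method needs no JSON escaping. */
static int toy_varlink_frame_call(char *buf, size_t size, const char *method) {
    int n = snprintf(buf, size, "{\"method\":\"%s\",\"parameters\":{}}", method);

    if (n < 0 || (size_t) n >= size)
        return -1;

    /* snprintf() already stored the NUL; count it as part of the frame. */
    return n + 1;
}
```

Calling it with "org.varlink.service.GetInfo" yields the JSON text followed by one NUL byte, which is what a client would write to the socket.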

The execution path for this error comes from either:

(1)

process_connection() -> varlink_process() -> varlink_dispatch_reply() -> reply_callback()

Here reply_callback points to userdb_on_query_reply(), since that callback is set with varlink_bind_reply().

if (IN_SET(v->state, VARLINK_AWAITING_REPLY, VARLINK_AWAITING_REPLY_MORE)) {
    varlink_set_state(v, VARLINK_PROCESSING_REPLY);

    if (v->reply_callback)
        r = v->reply_callback(v, parameters, error, flags, v->userdata);
    ...
}

OR

(2) an error raised in one of:

varlink_dispatch_disconnect()
varlink_dispatch_method()
varlink_dispatch_reply()
varlink_dispatch_timeout()

all of which call varlink_dispatch_local_error().

These errors originate in the main logic of varlink_process(), which processes the varlink protocol.

- A connection timeout triggers varlink_dispatch_local_error().
- A varlink protocol error while dispatching a reply with an "invalid" JSON object triggers varlink_dispatch_local_error().
- A varlink protocol error when asked to dispatch a method. For example,
  - org.varlink.service.GetInfo
  - org.varlink.service.GetInterface
  - org.varlink.service.*
  are not implemented and would cause a call to varlink_dispatch_local_error().
- A disconnect also causes a call to varlink_dispatch_local_error().

varlink_dispatch_local_error():

r = v->reply_callback(v, NULL, error, VARLINK_REPLY_ERROR|VARLINK_REPLY_LOCAL, v->userdata);
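
The two paths can be modelled with a small, self-contained sketch. None of this is systemd code: every toy_* name is hypothetical, the flag values are made up (only the flag names mirror the ones quoted above), and the state checks that the real varlink_process() performs are left out. It only shows that a regular reply and a locally generated error both end up in the same bound callback, distinguished by the flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, named after the flags quoted above. */
enum {
    TOY_REPLY_ERROR = 1 << 0,
    TOY_REPLY_LOCAL = 1 << 1,
};

struct toy_link;
typedef int (*toy_reply_fn)(struct toy_link *v, const char *error,
                            unsigned flags, void *userdata);

struct toy_link {
    toy_reply_fn reply_callback;
    void *userdata;
};

/* Counterpart of binding a reply callback to a connection. */
static void toy_bind_reply(struct toy_link *v, toy_reply_fn cb, void *userdata) {
    v->reply_callback = cb;
    v->userdata = userdata;
}

/* Path (1): a reply arrived from the peer. */
static int toy_dispatch_reply(struct toy_link *v, const char *error) {
    if (!v->reply_callback)
        return 0;
    return v->reply_callback(v, error, error ? TOY_REPLY_ERROR : 0, v->userdata);
}

/* Path (2): timeout, disconnect or protocol error detected locally. */
static int toy_dispatch_local_error(struct toy_link *v, const char *error) {
    if (!v->reply_callback)
        return 0;
    return v->reply_callback(v, error, TOY_REPLY_ERROR | TOY_REPLY_LOCAL, v->userdata);
}

/* Example callback: records how often it ran and the last flags seen. */
struct toy_stats {
    unsigned calls;
    unsigned last_flags;
};

static int toy_on_query_reply(struct toy_link *v, const char *error,
                              unsigned flags, void *userdata) {
    struct toy_stats *st = userdata;

    (void) v;
    (void) error;
    st->calls++;
    st->last_flags = flags;
    return 0;
}
```

In this model, a reply followed by a local error on the same link runs toy_on_query_reply() twice. Whether something comparable can happen for a userdb link is exactly what would need to be checked against the real state machine.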

-----------------

Commits related to varlink that are not merged:

$ git log --no-merges v246..HEAD --oneline --grep varlink
8d91b2206c varlink: properly allocate connection event source
77472d06a4 varlink: do not parse invalid messages twice
0c73f4f075 nss-resolve: port over to new varlink interface
9581bb8424 resolved: add minimal varlink api for resolving hostnames/addresses
65a01e8242 resolved: move query bus tracking to resolved-bus.c
c9de4e0f5b resolved: rename request → bus_request
7466e94f13 varlink: add helper for generating errno errors

It looks like they are adding features to nss-resolve so that it can resolve hostnames
using systemd-resolved, but not fixing anything related to this issue.