Comment 1 for bug 1695789

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

## Valgrind

==31831== Invalid read of size 1
==31831== at 0x4C2E0F4: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==31831== by 0x56FC243: find_mp_by_alias (structs.c:295)
==31831== by 0x404B2F: ev_add_map (main.c:264)
==31831== by 0x404A8B: uev_add_map (main.c:244)
==31831== by 0x40623C: uev_trigger (main.c:756)
==31831== by 0x5713958: service_uevq (uevent.c:118)
==31831== by 0x5713B67: uevent_dispatch (uevent.c:167)
==31831== by 0x406485: uevqloop (main.c:815)
==31831== by 0x4E3F183: start_thread (in /lib/x86_64-linux-gnu/libpthread-2.19.so)
==31831== by 0x5A2EBEC: clone (clone.S:111)
==31831== Address 0x728d8d1 is 1 bytes inside a block of size 6 free'd
==31831== at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==31831== by 0x404A9A: uev_add_map (main.c:245)
==31831== by 0x40623C: uev_trigger (main.c:756)
==31831== by 0x5713958: service_uevq (uevent.c:118)
==31831== by 0x5713B67: uevent_dispatch (uevent.c:167)
==31831== by 0x406485: uevqloop (main.c:815)
==31831== by 0x4E3F183: start_thread (in /lib/x86_64-linux-gnu/libpthread-2.19.so)
==31831== by 0x5A2EBEC: clone (clone.S:111)

All the errors above exist because of a use-after-free caused by the code:

static int
uev_add_map (struct uevent * uev, struct vectors * vecs)
{
 char *alias;
 int major = -1, minor = -1, rc;

 condlog(2, "%s: add map (uevent)", uev->kernel);
 alias = uevent_get_dm_name(uev);
        ...
 rc = ev_add_map(uev->kernel, alias, vecs);
 FREE(alias); <--- freed here
 return rc
}

Mainly because ev_add_map will call:

 /*
  * now we can register the map
  */
 if (map_present && (mpp = add_map_without_path(vecs, alias))) {
  sync_map_state(mpp);
  condlog(2, "%s: devmap %s registered", alias, dev);
  return 0;
 }

which will set:

extern struct multipath *
add_map_without_path (struct vectors * vecs, char * alias)
{
 struct multipath * mpp = alloc_multipath();

 if (!mpp)
  return NULL;

 mpp->alias = alias;

So when this logic returns, mpp-> alias will be already freed by uev_add_map.
Any other code traversing mpvecs using "mpp" as "struct multipath" will have
pointer "alias" pointing into random memory causing all kinds of memory
corruption.

This logic is triggered every time the "ueventloop" gets a new uevent from
kernel saying that there was a device map change. If device is "dm-" and
its action is "change" (meaning a device mapper was changed), this logic
will try to create a new "mpp" structure in global vecs->mpvec vectors
with this discovered multipath.

This is not used in daemon startup, since all paths are initially created
by the configure() -> {path_discovery, map_discovery, coalesce_paths, coalesce_
maps}. Multipath devices are more susceptible to be created from the beginning,
or by a discovered path, that would trigger another logic to for mpp structures
(thus creating multipath devices): That might explain why this bug has not
been fixed before.

Example of crashes because of this:

#0 malloc_consolidate (av=av@entry=0x7f5b58000020) at malloc.c:4149
#1 0x00007f5b62df3cf8 in _int_malloc (av=0x7f5b58000020, bytes=16384) at malloc.c:3423
#2 0x00007f5b62df66d0 in __GI___libc_malloc (bytes=16384) at malloc.c:2891
#3 0x00007f5b638134d7 in dm_task_run () from /lib/x86_64-linux-gnu/libdevmapper.so.1.02.1
#4 0x00007f5b6314be5c in dm_map_present (str=0x7f5b58000990 "lun02") at devmapper.c:304
#5 0x0000000000404ac7 in ev_add_map (dev=, alias=, vecs=) at main.c:257
#6 0x0000000000000000 in ?? ()

Another possible error is when trying to free mpp->alias again when garbage collector runs:

#0 0x00007f13a5933c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f13a5937028 in __GI_abort () at abort.c:89
#2 0x00007f13a59702a4 in __libc_message (do_abort=do_abort@entry=1, fmt=fmt@entry=0x7f13a5a81ef0 "") at ../sysdeps/posix/libc_fatal.c:175
#3 0x00007f13a597c56e in malloc_printerr (ptr=<optimized out>, str=0x7f13a5a82020 "double free or corruption (out)", action=1) at malloc.c:4996
#4 _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:3840
#5 0x00007f13a5cdbe86 in free_multipath (mpp=0x7f138c033d60, free_paths=0) at structs.c:174
#6 0x00007f13a5cfe117 in _remove_map (mpp=0x7f138c033d60, vecs=0x8adaa0, stop_waiter=1, purge_vec=1) at structs_vec.c:143
#7 0x00007f13a5cfe175 in remove_map_and_stop_waiter (mpp=0x7f138c033d60, vecs=0x8adaa0, purge_vec=1) at structs_vec.c:156
#8 0x0000000000406b4d in mpvec_garbage_collector (vecs=<error reading variable: can't compute CFA for this frame>) at main.c:950
...
#14 0x00000000004076b7 in checkerloop (ap=<error reading variable: can't compute CFA for this frame>) at main.c:1163

This can also be seen in valgrind:

==31831==
==31831== Invalid read of size 1
==31831== at 0x4C2E902: strncmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==31831== by 0x56FC26E: find_mp_by_alias (structs.c:296)
==31831== by 0x404B2F: ev_add_map (main.c:264)
==31831== by 0x404A8B: uev_add_map (main.c:244)
==31831== by 0x40623C: uev_trigger (main.c:756)
==31831== by 0x5713958: service_uevq (uevent.c:118)
==31831== by 0x5713B67: uevent_dispatch (uevent.c:167)
==31831== by 0x406485: uevqloop (main.c:815)
==31831== by 0x4E3F183: start_thread (in /lib/x86_64-linux-gnu/libpthread-2.19.so)
==31831== by 0x5A2EBEC: clone (clone.S:111)
==31831== Address 0x728d8d1 is 1 bytes inside a block of size 6 free'd
==31831== at 0x4C2BDEC: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==31831== by 0x404A9A: uev_add_map (main.c:245)
==31831== by 0x40623C: uev_trigger (main.c:756)
==31831== by 0x5713958: service_uevq (uevent.c:118)
==31831== by 0x5713B67: uevent_dispatch (uevent.c:167)
==31831== by 0x406485: uevqloop (main.c:815)
==31831== by 0x4E3F183: start_thread (in /lib/x86_64-linux-gnu/libpthread-2.19.so)
==31831== by 0x5A2EBEC: clone (clone.S:111)

mpvec_garbage_collector runs from the checkerloop, the thread responsible to
check if all paths are up by using selected pathcheckers (readsector0,
directio). Every 5 checker - all paths - loops, the garbage collector runs and
it checks every multipath (from vecs->mpvec vectors) is still present in the OS
by querying device mapper. If the "dm" (multipath) is gone, it removes the map
from internal vector, stops waiter threads () and orphan multipath paths. When
freeing multipath, the "mpp->alias" field is freed and, since it was already
freed before, it gets freed again causing random errors.

The following upstream patch, solves this by simply creating a new string that
will me freed when appropriate.

commit 735f2992398d4179601f3d7043fad606c90171d4
Author: Benjamin Marzinski <email address hidden>
Date: Thu Sep 1 08:50:54 2011 +0200

    multipath: strdup multipath alias, so that it isn't deleted

    When a multipath device is added to multipathd with ev_add_map(),
    the alias is not duplicated, and is freed immediately after ev_add_map()
    returns, causing a memory error. This patch corrects that.

diff --git a/libmultipath/structs_vec.c b/libmultipath/structs_vec.c
index 189f25b..364e36e 100644
--- a/libmultipath/structs_vec.c
+++ b/libmultipath/structs_vec.c
@@ -373,7 +373,7 @@ add_map_without_path (struct vectors * vecs, char * alias)
        if (!mpp || !alias)
                return NULL;

- mpp->alias = alias;
+ mpp->alias = STRDUP(alias);

        if (setup_multipath(vecs, mpp)) {
                mpp->alias = NULL;