nscd segmentation fault

Bug #997096 reported by Vincent Fortier
This bug affects 10 people
Affects            Status        Importance  Assigned to  Milestone
eglibc (Ubuntu)    Incomplete    Undecided   Unassigned
eglibc (openSUSE)  Fix Released  Low

Bug Description

nscd segfaults a few seconds after every getent netgroup request:

root@cmo-cluster2:~# nscd -d
Wed 09 May 2012 12:59:54 PM UTC - 13953: register trace file /etc/passwd for database passwd
Wed 09 May 2012 12:59:54 PM UTC - 13953: register trace file /etc/group for database group
Wed 09 May 2012 12:59:54 PM UTC - 13953: register trace file /etc/services for database services
Wed 09 May 2012 12:59:54 PM UTC - 13953: register trace file /etc/netgroup for database netgroup
Wed 09 May 2012 12:59:54 PM UTC - 13953: cannot stat() file `/etc/netgroup': No such file or directory
Wed 09 May 2012 12:59:56 PM UTC - 13953: handle_request: request received (Version = 2) from PID 13962
Wed 09 May 2012 12:59:56 PM UTC - 13953: GETFDNETGR
Wed 09 May 2012 12:59:56 PM UTC - 13953: handle_request: request received (Version = 2) from PID 13962
Wed 09 May 2012 12:59:56 PM UTC - 13953: GETNETGRENT (cmo-nfs-netgroup)
Wed 09 May 2012 12:59:56 PM UTC - 13953: Haven't found "cmo-nfs-netgroup" in netgroup cache!
Wed 09 May 2012 12:59:56 PM UTC - 13953: provide access to FD 11, for netgroup
Segmentation fault (core dumped)

Vincent Fortier (th0ma7) wrote :

root@cmo-cluster2:~# gdb -c core /usr/sbin/nscd
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /usr/sbin/nscd...(no debugging symbols found)...done.
[New LWP 13858]
[New LWP 13850]
[New LWP 13856]
[New LWP 13855]
[New LWP 13857]
[New LWP 13854]
[New LWP 13853]
[New LWP 13852]
[New LWP 13851]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `nscd -d'.
Program terminated with signal 11, Segmentation fault.
#0 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:32
32 ../sysdeps/x86_64/multiarch/../strlen.S: No such file or directory.
(gdb) backtrace
#0 __strlen_sse2 () at ../sysdeps/x86_64/multiarch/../strlen.S:32
#1 0x000000000041afe1 in ?? ()
#2 0x000000000041bb5e in ?? ()
#3 0x00000000004071b7 in ?? ()
#4 0x00007fee33576e9a in start_thread (arg=0x7fee29027700) at pthread_create.c:308
#5 0x00007fee3308a4bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6 0x0000000000000000 in ?? ()

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in eglibc (Ubuntu):
status: New → Confirmed

vhp (vhp) wrote :

Faced a similar issue with the newest version of nscd. Users in netgroups would be able to use sudo one time. Once they had authenticated, the nscd process would die for all users. `/etc/init.d/nscd status` showed it was not running:

> /etc/init.d/nscd status
 * Status of Name Service Cache Daemon service: * not running.

syslog showed the following:

May 22 11:46:31 default kernel: [84228.494446] nscd[28565]: segfault at 0 ip 00007f3400ed03e1 sp 00007f33f7d2f108 error 4 in libc-2.15.so[7f3400d6f000+1b3000]
May 22 11:53:18 default nscd: 29514 cannot stat() file `/etc/netgroup': No such file or directory
May 22 11:53:29 default nscd: 29558 cannot stat() file `/etc/netgroup': No such file or directory
May 22 11:55:19 default nscd: 29991 cannot stat() file `/etc/netgroup': No such file or directory
May 22 12:01:38 default nscd: 30481 cannot stat() file `/etc/netgroup': No such file or directory
May 22 12:02:03 default kernel: [85160.598268] nscd[30489]: segfault at 0 ip 00007f794433a3e1 sp 00007f793af98108 error 4 in libc-2.15.so[7f79441d9000+1b3000]
May 22 12:09:42 default nscd: 31130 cannot stat() file `/etc/netgroup': No such file or directory
May 22 12:09:50 default kernel: [85627.475382] nscd[31140]: segfault at 0 ip 00007f57b81ef3e1 sp 00007f57aea4b108 error 4 in libc-2.15.so[7f57b808e000+1b3000]
May 22 12:17:01 default CRON[31512]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
May 22 12:20:39 default kernel: [86276.573029] nscd[31569]: segfault at 0 ip 00007f0a8626b3e1 sp 00007f0a7cac7108 error 4 in libc-2.15.so[7f0a8610a000+1b3000]
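
The kernel lines above give both the faulting instruction pointer and the base address at which libc was mapped, so subtracting one from the other yields an offset that stays the same from run to run. A minimal sketch of that arithmetic follows; the function name `fault_offset` is hypothetical, and the parsing assumes the line has exactly the shape shown in this log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Parse a kernel line of the form
 *   "... segfault at 0 ip <ip> sp <sp> error 4 in <lib>[<base>+<size>]"
 * and store the offset of <ip> inside the mapped library in *offset.
 * Returns 0 on success, -1 if the line does not have that shape.
 * (Hypothetical helper, written only for the log format shown above.) */
static int fault_offset(const char *line, uint64_t *offset)
{
    assert(line != NULL && offset != NULL);

    unsigned long long ip = 0, base = 0, size = 0;

    /* The instruction pointer follows the literal " ip ". */
    const char *ip_field = strstr(line, " ip ");
    if (ip_field == NULL || sscanf(ip_field, " ip %llx", &ip) != 1)
        return -1;

    /* The mapping "[base+size]" is the last bracketed group on the line. */
    const char *map = strrchr(line, '[');
    if (map == NULL || sscanf(map, "[%llx+%llx]", &base, &size) != 2)
        return -1;

    /* Reject an instruction pointer that lies outside the mapping. */
    if (ip < base || ip - base >= size)
        return -1;

    *offset = (uint64_t)(ip - base);
    return 0;
}
```

Applied to the four segfault lines in this log, the subtraction gives the same offset, 0x1613e1, every time, which suggests that all four crashes happen at the same instruction in libc-2.15.so.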

To resolve this, I removed the following from the bottom of /etc/nscd.conf:

       enable-cache netgroup yes
       positive-time-to-live netgroup 28800
       negative-time-to-live netgroup 20
       suggested-size netgroup 211
       check-files netgroup yes
       persistent netgroup yes
       shared netgroup yes
       max-db-size netgroup 33554432

Just removing these lines cured the slow authentication and, more importantly, stopped the segfaults. Users in netgroups can now use sudo and authenticate as many times as they like. There is still a bug to fix here. Hopefully this workaround helps some people until it is patched.

Kees Bakker (keestux) wrote :

Our configuration is a NIS network with Linux PCs. As soon as I do "rsh" to one of these PCs it triggers a netgroup request and that leads to a segfault in nscd.

This is the stdout of "nscd -d".

Thu 12 Jul 2012 09:16:34 AM CEST - 29485: handle_request: request received (Version = 2) from PID 30398
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: GETFDHST
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: handle_request: request received (Version = 2) from PID 30398
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: GETHOSTBYADDR (172.17.2.127)
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: handle_request: request received (Version = 2) from PID 30398
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: GETFDPW
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: provide access to FD 5, for passwd
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: handle_request: request received (Version = 2) from PID 30398
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: GETFDNETGR
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: provide access to FD 11, for netgroup
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: handle_request: request received (Version = 2) from PID 30398
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: INNETGR (general)
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: Haven't found "general (hamina.tasking.nl,,)" in netgroup cache!
Thu 12 Jul 2012 09:16:34 AM CEST - 29485: Haven't found "general" in netgroup cache!
Segmentation fault

Perhaps this bug has always been present, and it only shows up now that these extra lines have been added to /etc/nscd.conf (see the previous comment).

So the workaround is indeed to remove these extra lines, or just to set "enable-cache netgroup no".

Raubvogel (raubvogel) wrote :

Just to let everyone know, it still happens in 12.04:

Sep 20 14:29:21 ubuntu64 kernel: [84118.210121] nscd[17964]: segfault at 0 ip 00007f6d44ef2b91 sp 00007f6d3d9515e8 error 4 in libc-2.15.so[7f6d44e6a000+1b3000]

Raubvogel (raubvogel) wrote :

In 12.04, after I went to /etc/nscd.conf and changed

        enable-cache netgroup yes

to

        enable-cache netgroup no

and restarted nscd, I stopped seeing segfault messages in my syslog file. Is this the solution? I honestly do not know, but it has worked for me... so far. ;)

Adam Conrad (adconrad) wrote :

Disabling netgroup caching certainly doesn't fix the bug, but it works around it well enough. The glibc in precise-proposed does this.

Twild (twild) wrote :

I frequently get this error in the syslog files:

nscd[2320]: segfault at 0 ip 00007f911bd87741 sp 00007f910e202188 error 4 in libc-2.17.so[7f911bc38000+1a3000]

Fortunately, systemd respawns nscd immediately. After a quick core dump analysis, I'm quite sure that the problem comes from:

glibc-2.17-4.4.1, nscd netgroupcache.c in line 205:

size_t userlen = strlen (nuser) + 1; <------ core dump

because nuser in the triple is 0 (a NULL pointer) for (host,,) entries in NIS. Calling strlen on it is most likely what causes the core dump.
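
The failure mode described above, strlen being called on a NULL field of the triple, can be illustrated with a small guard. This is only a sketch and not necessarily how the fix was made upstream: `struct triple`, `field_or_empty` and `triple_storage_size` are hypothetical names, and it assumes that treating a NULL field as an empty string is acceptable for the cache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the (host,user,domain) triple that nscd
 * receives from the netgroup lookup; any field may be NULL. */
struct triple {
    const char *host;
    const char *user;
    const char *domain;
};

/* Map a NULL field to the empty string so strlen never sees NULL. */
static const char *field_or_empty(const char *s)
{
    return s != NULL ? s : "";
}

/* Number of bytes needed to store all three fields back to back,
 * each followed by its terminating NUL byte. */
static size_t triple_storage_size(const struct triple *t)
{
    assert(t != NULL);
    return strlen(field_or_empty(t->host)) + 1
         + strlen(field_or_empty(t->user)) + 1
         + strlen(field_or_empty(t->domain)) + 1;
}
```

With a triple whose user and domain are NULL, the size is computed as the host length plus three NUL bytes instead of crashing.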

I found two workarounds to avoid the nscd crash:

1.) switch off netgroup caching in nscd.conf
        enable-cache netgroup yes -> no
or

2.) in NIS change the entries in netgroup
        (host,,) -> (host,-,<yourdomain>)
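
The second workaround amounts to never leaving a field of the triple empty. A minimal sketch of that rewrite is given below; the function name `format_full_triple` and the `fallback_domain` parameter are hypothetical, and the domain used in any call is whatever applies to your site.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "(host,user,domain)" into buf with no field left empty.
 * An empty or NULL user becomes "-"; an empty or NULL domain becomes
 * fallback_domain (hypothetical parameter, e.g. the site's NIS domain).
 * Returns 0 on success, -1 if buf is too small for the result. */
static int format_full_triple(char *buf, size_t buflen, const char *host,
                              const char *user, const char *domain,
                              const char *fallback_domain)
{
    assert(buf != NULL && host != NULL && fallback_domain != NULL);

    if (user == NULL || user[0] == '\0')
        user = "-";
    if (domain == NULL || domain[0] == '\0')
        domain = fallback_domain;

    int n = snprintf(buf, buflen, "(%s,%s,%s)", host, user, domain);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

Note that this changes the meaning of the entry: in a netgroup triple an empty field acts as a wildcard, while "-" matches no value, so the rewritten entry is only equivalent where the wildcard was never relied on.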

Regards, Thomas

E-kuemmerle (e-kuemmerle) wrote :

I can confirm this bug; I observe exactly the same segfaults on my server:

(gdb) where
#0 0x00007f32fd360741 in __strlen_sse2_pminub () from /lib64/libc.so.6
#1 0x00007f32fde513cd in addgetnetgrentX (db=db@entry=0x7f32fe059640 <dbs+1440>,
    fd=fd@entry=-1, req=req@entry=0x7f32f19977a0,
    key=key@entry=0x7f32f19979e0 "asslgc", uid=uid@entry=4294967295, he=he@entry=0x0,
    dh=dh@entry=0x0, resultp=resultp@entry=0x7f32f1997798) at netgroupcache.c:205
#2 0x00007f32fde51eb5 in addinnetgrX (db=db@entry=0x7f32fe059640 <dbs+1440>,
    fd=fd@entry=15, req=req@entry=0x7f32f1997860, key=<optimized out>,
    key@entry=0x7f32f19979e0 "asslgc", uid=uid@entry=4294967295, he=he@entry=0x0,
    dh=dh@entry=0x0) at netgroupcache.c:487
#3 0x00007f32fde52074 in addinnetgr (db=db@entry=0x7f32fe059640 <dbs+1440>,
    fd=fd@entry=15, req=req@entry=0x7f32f1997860, key=key@entry=0x7f32f19979e0,
    uid=uid@entry=4294967295) at netgroupcache.c:652
#4 0x00007f32fde3ed36 in handle_request (pid=<optimized out>, key=0x7f32f19979e0,
    req=0x7f32f1997860, fd=15, uid=<optimized out>) at connections.c:1326
#5 nscd_run_worker (p=<optimized out>) at connections.c:1792
#6 0x00007f32fd9fde0f in start_thread () from /lib64/libpthread.so.0
#7 0x00007f32fd2f97dd in clone () from /lib64/libc.so.6

(gdb) print data.val.triple
$3 = {host = 0x7f32f19971c8 "ass801", user = 0x0, domain = 0x0}

Swamp-a (swamp-a) wrote :

openSUSE-SU-2013:1510-1: An update that solves 6 vulnerabilities and has 5 fixes is now available.

Category: security (moderate)
Bug References: 779320,801246,805054,813121,813306,819383,819524,824046,830257,834594,839870
CVE References: CVE-2012-4412,CVE-2013-0242,CVE-2013-1914,CVE-2013-2207,CVE-2013-4237,CVE-2013-4332
Sources used:
openSUSE 12.3 (src): glibc-2.17-4.7.1, glibc-testsuite-2.17-4.7.2, glibc-testsuite-2.17-4.7.3, glibc-utils-2.17-4.7.1

Schwab-5 (schwab-5) wrote :

Fixed.

Moritz Hassert (mhassert) wrote :

This bug, or a similar one, is still present in 13.10 (and was present in 12.10 and 13.04).

I believe this is essentially the same as a bug report over at Novell:
https://bugzilla.novell.com/show_bug.cgi?id=819524

It seems strlen is being called on NULL pointers. As the bug report at Novell was fixed very recently, maybe that fix could be ported to Ubuntu 13.10?

-------
Some more info:
On my machine nscd is *not* respawning after it segfaults. The only noticeable effects are the nasty log messages and apport greeting me every morning with a crash report when logging in.

Running "nscd -d" in a terminal and running "sudo echo test" in another gives me the following:

mhassert@mhassert:/var/log# sudo echo test
[[stalls for a few seconds]]
test

root@mhassert:/var/log# LC_ALL=C nscd -d
Tue Oct 22 11:18:46 2013 - 11188: register trace file /etc/passwd for database passwd
Tue Oct 22 11:18:46 2013 - 11188: register trace file /etc/group for database group
Tue Oct 22 11:18:46 2013 - 11188: register trace file /etc/hosts for database hosts
Tue Oct 22 11:18:46 2013 - 11188: register trace file /etc/resolv.conf for database hosts
Tue Oct 22 11:18:46 2013 - 11188: register trace file /etc/services for database services
Tue Oct 22 11:18:52 2013 - 11188: handle_request: request received (Version = 2) from PID 11201
Tue Oct 22 11:18:52 2013 - 11188: GETFDPW
Tue Oct 22 11:18:52 2013 - 11188: provide access to FD 5, for passwd
Tue Oct 22 11:18:52 2013 - 11188: handle_request: request received (Version = 2) from PID 11201
Tue Oct 22 11:18:52 2013 - 11188: GETFDGR
Tue Oct 22 11:18:52 2013 - 11188: provide access to FD 7, for group
Tue Oct 22 11:18:52 2013 - 11188: handle_request: request received (Version = 2) from PID 11201
Tue Oct 22 11:18:52 2013 - 11188: GETFDHST
Tue Oct 22 11:18:52 2013 - 11188: provide access to FD 9, for hosts
Tue Oct 22 11:18:52 2013 - 11188: handle_request: request received (Version = 2) from PID 11201
Tue Oct 22 11:18:52 2013 - 11188: GETAI (mhassert)
Tue Oct 22 11:18:52 2013 - 11188: Haven't found "mhassert" in hosts cache!
Speicherzugriffsfehler (Speicherabzug geschrieben)
[[German for: "Segmentation fault (core dumped)"]]

last entry in dmesg:
[ 4972.251625] traps: nscd[11193] general protection ip:7fb19e3a2c41 sp:7fb194680178 error:0 in libc-2.17.so[7fb19e319000+1bd000]

Running nscd -d within gdb:
root@mhassert:/var/log# LC_ALL=C gdb --args nscd -d
[...]
Reading symbols from /usr/sbin/nscd...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/sbin/nscd -d
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Tue Oct 22 11:21:56 2013 - 11339: register trace file /etc/passwd for database passwd
Tue Oct 22 11:21:56 2013 - 11339: register trace file /etc/group for database group
Tue Oct 22 11:21:56 2013 - 11339: register trace file /etc/hosts for database hosts
Tue Oct 22 11:21:56 2013 - 11339: register trace file /etc/resolv.conf for database hosts
Tue Oct 22 11:21:56 2013 - 11339: register trace file /etc/services for database services
[New Thread 0x7fffedb03700 (LWP 11343)]
[New Thread 0x7fffed902700 (LWP 11344)]
[New Thread 0x7fffed701700 (LWP 1...

Changed in eglibc (openSUSE):
importance: Unknown → Low
status: Unknown → Fix Released

Dimitri John Ledkov (xnox) wrote :

Hasn't this been fixed by the latest security update, http://www.ubuntu.com/usn/usn-1991-1/, which was pushed out across the board?

Changed in eglibc (Ubuntu):
status: Confirmed → Incomplete

Moritz Hassert (mhassert) wrote :

Dmitrijs:
I don't see how this bug is related to any of the points mentioned in that security notice.
On a closer look at the openSUSE bug report, comment #2 just mentions those security fixes without directly claiming that they fix the bug itself. In comment #3 the bug was declared fixed, perhaps prematurely.

So I guess the bug in nscd is still there and the openSUSE bug report was false hope?
