Comment 15 for bug 1688575

Andreas Hasenack (ahasenack) wrote :

Hm, I'm not getting a segfault.
I have two databases on the server: dc=example,dc=com and dc=example,dc=org. Both have syncprov, and my slave replicates from both via syncrepl, using GSSAPI.

I created a replicator principal and added an ACL to allow it to read everything in both trees.

I didn't use k5start on the slave, since this is just a test. Instead, I ran kinit for the replicator principal, chowned the credentials cache file to the openldap user, and set KRB5CCNAME in /etc/default/slapd.

Upon starting the slave, the master logs two connections from it, each followed by the search for its respective tree (see http://pastebin.ubuntu.com/25171586/ for full log and better formatting):
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 fd=13 ACCEPT from IP=10.0.100.149:60168 (IP=0.0.0.0:389)
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2230 fd=20 ACCEPT from IP=10.0.100.149:60170 (IP=0.0.0.0:389)
(...)
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2230 op=2 BIND authcid="Replicator" authzid="Replicator"
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2230 op=2 BIND dn="uid=replicator,cn=gssapi,cn=auth" mech=GSSAPI sasl_ssf=56 ssf=56
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2230 op=2 RESULT tag=97 err=0 text=
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2230 op=3 SRCH base="dc=example,dc=com" scope=2 deref=0 filter="(objectClass=*)"
(...)
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 op=2 BIND authcid="Replicator" authzid="Replicator"
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 op=2 BIND dn="uid=replicator,cn=gssapi,cn=auth" mech=GSSAPI sasl_ssf=56 ssf=56
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 op=2 RESULT tag=97 err=0 text=
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 op=3 SRCH base="dc=example,dc=org" scope=2 deref=0 filter="(objectClass=*)"
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 op=3 SRCH attr=* +
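
To check a longer log for the same pattern (one GSSAPI bind and one search per tree), the lines can be grouped by connection number. The following is a minimal sketch, assuming the stats-level log format shown above; the function name summarize and the regular expressions are illustrative and not part of slapd.

```python
import re
import sys
from collections import defaultdict

# Patterns for the two kinds of log lines of interest (assumed format,
# taken from the excerpt above).
BIND_RE = re.compile(r'conn=(\d+) op=\d+ BIND dn="([^"]*)" mech=(\S+)')
SRCH_RE = re.compile(r'conn=(\d+) op=\d+ SRCH base="([^"]*)"')


def summarize(lines):
    """Return {conn: {"bind_dn": ..., "mech": ..., "base": ...}}."""
    conns = defaultdict(dict)
    for line in lines:
        m = BIND_RE.search(line)
        if m:
            conns[int(m.group(1))].update(bind_dn=m.group(2), mech=m.group(3))
            continue
        m = SRCH_RE.search(line)
        if m:
            conns[int(m.group(1))]["base"] = m.group(2)
    return dict(conns)


if __name__ == "__main__":
    # Usage (hypothetical file name): python3 summarize.py < slapd.log
    for conn, info in sorted(summarize(sys.stdin).items()):
        print(conn, info.get("bind_dn"), info.get("mech"), info.get("base"))
```

With a healthy consumer, each of its connections should show the replicator DN, mech GSSAPI, and exactly one of the two search bases.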

I tried multiple restarts of the slave, including between master updates made with a script that creates 100 users in each tree, but saw no segfault.
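
A minimal sketch of such a load script follows. It is not the script actually used for this test: the ou=people container (assumed to exist already), the testuserN naming, and the inetOrgPerson object class are all assumptions chosen for illustration.

```python
def user_ldif(suffix, count=100, ou="people"):
    """Build LDIF for `count` test users under ou=<ou>,<suffix>.

    Hypothetical layout: assumes ou=<ou>,<suffix> already exists and that
    the inetOrgPerson schema is loaded on the master.
    """
    entries = []
    for i in range(1, count + 1):
        uid = f"testuser{i}"
        entries.append(
            f"dn: uid={uid},ou={ou},{suffix}\n"
            "objectClass: inetOrgPerson\n"
            f"uid: {uid}\n"
            f"cn: Test User {i}\n"
            f"sn: User{i}\n"
        )
    # Each entry ends with a newline, so joining with "\n" leaves the
    # blank line LDIF requires between entries.
    return "\n".join(entries)


if __name__ == "__main__":
    # One batch per tree; pipe the output to ldapadd on the master.
    for suffix in ("dc=example,dc=com", "dc=example,dc=org"):
        print(user_ldif(suffix))
```

The output can be fed to ldapadd on the master between slave restarts, so that each restart has fresh changes to pick up from both trees.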

Could you share your syncrepl and syncprov settings for both databases?

On the master, each database has just:
olcOverlay: {0}syncprov
olcSpCheckpoint: 100 10
olcSpSessionlog: 100

And on the slave I have:
olcSyncrepl: {0}rid=0 provider=ldap://xenial-slapd-segfault-1688575.lxd bind
 method=sasl saslmech=GSSAPI searchbase="dc=example,dc=com" schemachecking=
 off type=refreshAndPersist retry="60 +"

and

olcSyncrepl: {0}rid=1 provider=ldap://xenial-slapd-segfault-1688575.lxd bind
 method=sasl saslmech=GSSAPI searchbase="dc=example,dc=org" schemachecking=
 off type=refreshAndPersist retry="60 +"

I understand the segfault or 100% CPU usage might not show up immediately, but from the bug description I assumed it happened more or less constantly. Maybe I'm missing something.