Hm, I'm not getting a segfault.
I have two databases on the server: dc=example,dc=com and dc=example,dc=org. Both have syncprov, and my slave is syncrepling from both using GSSAPI.
I created a replicator principal and added an ACL to allow it to read everything in both trees.
I didn't use k5start on the slave, since this is just a test. I kinit'ed the replicator user, chowned the credentials cache file to openldap and set KRB5CCNAME in /etc/default/slapd.
Upon starting the slave, I get two connections from it logged on the master and their respective searches for each tree (see http://pastebin.ubuntu.com/25171586/ for full log and better formatting):
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 fd=13 ACCEPT from IP=10.0.100.149:60168 (IP=0.0.0.0:389)
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2230 fd=20 ACCEPT from IP=10.0.100.149:60170 (IP=0.0.0.0:389)
(...)
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2230 op=2 BIND authcid="Replicator" authzid="Replicator"
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2230 op=2 BIND dn="uid=replicator,cn=gssapi,cn=auth" mech=GSSAPI sasl_ssf=56 ssf=56
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2230 op=2 RESULT tag=97 err=0 text=
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2230 op=3 SRCH base="dc=example,dc=com" scope=2 deref=0 filter="(objectClass=*)"
(...)
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 op=2 BIND authcid="Replicator" authzid="Replicator"
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 op=2 BIND dn="uid=replicator,cn=gssapi,cn=auth" mech=GSSAPI sasl_ssf=56 ssf=56
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 op=2 RESULT tag=97 err=0 text=
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 op=3 SRCH base="dc=example,dc=org" scope=2 deref=0 filter="(objectClass=*)"
Jul 25 18:48:23 xenial-slapd-segfault-1688575 slapd[4697]: conn=2229 op=3 SRCH attr=* +
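For anyone comparing their own master log against this one, a small script can pull out which DN each connection bound as and which base it searched. This is only a rough sketch: the function name and the regular expressions are my own assumptions based on the line format shown above, not anything slapd ships.

```python
import re
import sys
from collections import defaultdict

# Matches the "conn=N op=M VERB rest" part of a slapd syslog line.
OP_RE = re.compile(r"conn=(\d+) op=(\d+) (\w+) (.*)")


def summarize(lines):
    """Return {conn: {"dn": ..., "mech": ..., "base": ...}} for the given log lines."""
    conns = defaultdict(dict)
    for line in lines:
        m = OP_RE.search(line)
        if not m:
            continue  # ACCEPT lines, "(...)" markers, and anything else without an op
        conn, verb, rest = int(m.group(1)), m.group(3), m.group(4)
        if verb == "BIND":
            # Only the second BIND line of each pair carries dn= and mech=.
            dn = re.search(r'dn="([^"]*)"', rest)
            mech = re.search(r"mech=(\S+)", rest)
            if dn and mech:
                conns[conn]["dn"] = dn.group(1)
                conns[conn]["mech"] = mech.group(1)
        elif verb == "SRCH":
            base = re.search(r'base="([^"]*)"', rest)
            if base:
                conns[conn]["base"] = base.group(1)
    return dict(conns)


if __name__ == "__main__":
    # Reads log lines from standard input and prints one line per connection.
    for conn, info in sorted(summarize(sys.stdin).items()):
        print(conn, info.get("dn"), info.get("mech"), info.get("base"))
```

Feeding it the master's log on standard input should give one line per syncrepl connection, which makes it easy to see whether both trees are actually being searched by the replicator identity.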
I tried multiple restarts on the slave, also between master updates with a script that creates 100 users in each tree, but no segfault.
Could you share your syncrepl and syncprov settings for both databases?
On the master I have just this for each db:
olcOverlay: {0}syncprov
olcSpCheckpoint: 100 10
olcSpSessionlog: 100
And on the slave I have:
olcSyncrepl: {0}rid=0 provider=ldap://xenial-slapd-segfault-1688575.lxd bindmethod=sasl saslmech=GSSAPI searchbase="dc=example,dc=com" schemachecking=off type=refreshAndPersist retry="60 +"
and
olcSyncrepl: {0}rid=1 provider=ldap://xenial-slapd-segfault-1688575.lxd bindmethod=sasl saslmech=GSSAPI searchbase="dc=example,dc=org" schemachecking=off type=refreshAndPersist retry="60 +"
I understand it might not be an immediate segfault, or 100% cpu usage, but given the bug description I thought it was more or less constant. But maybe I'm missing something.
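To spot differences between two syncrepl statements (mine versus yours, say), it can help to split each value into key/value pairs and diff them. The following is a minimal sketch under my own assumptions: `parse_syncrepl` and `diff_syncrepl` are hypothetical helper names, and the parser only handles the simple `key=value` and `key="quoted value"` forms that appear in this comment, not the full syncrepl syntax.

```python
import re


def parse_syncrepl(value):
    """Split an olcSyncrepl value (the text after "olcSyncrepl:") into a dict."""
    # Drop the {N} ordering prefix that cn=config adds, if present.
    value = re.sub(r"^\{\d+\}", "", value.strip())
    # A value is either double-quoted (may contain spaces) or a bare token.
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', value)
    return {key: val.strip('"') for key, val in pairs}


def diff_syncrepl(a, b):
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    pa, pb = parse_syncrepl(a), parse_syncrepl(b)
    return {
        key: (pa.get(key), pb.get(key))
        for key in sorted(pa.keys() | pb.keys())
        if pa.get(key) != pb.get(key)
    }
```

Run against the two statements on my slave, `diff_syncrepl` should report only `rid` and `searchbase` as differing; any other key showing up when compared with your config would be a place to look.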