gtk file dialog blocks on trackerd (via dbus) for 25s for users with NFS homedirs

Bug #218230 reported by markgalassi
This bug affects 2 people
Affects            Status        Importance  Assigned to  Milestone
gtk+2.0 (Ubuntu)   Fix Released  Low         Unassigned
    Nominated for Hardy by Andreas Heinlein
tracker (Ubuntu)   Fix Released  Low         Unassigned
    Nominated for Hardy by Andreas Heinlein

Bug Description

mozart->~$ lsb_release -rd
Description: Ubuntu 8.04
Release: 8.04
mozart->~$

with a full "apt-get -u dist-upgrade" from 2008-04-16

Every program which brings up a gnome file dialog (open or save) freezes for half a minute to a minute before showing the dialog.

Examples: Firefox File->Open File, OpenOffice Writer File->Save As, GIMP File->Open image

Since the gimp also has it, I wonder if it's a gtk problem instead of a gnome problem.

I consider it a crippling problem.

nossal (nossal) wrote :

Yes, I have the same problem

JL Falcone (jl-falcone) wrote :

Me too.

JL Falcone (jl-falcone) wrote :

The problem is perhaps due to the fact that I use an NFS export as my home. I created a new account with a local home on the same machine and did not experience the problem there.

nossal (nossal) wrote :

My whole system is local.
This is the layout:

/dev/sda1 /boot (ext3)
/dev/sda2 Swap
/dev/sda3 / (xfs)
/dev/sda4 /home (xfs)

[]s,

P.S.: The latest Ubuntu update doesn't fix it.

Christian Lins (cli) wrote :

I have the same problem, whereas browsing the filesystem via Nautilus is fast as always.

dustingram (dustin-ingram) wrote :

Same problem here.

HeWhoE (hewhoe) wrote :

I just noticed the same problem.

HeWhoE (hewhoe) wrote :

% df -Th
Filesystem  Type   Size  Used  Avail  Use%  Mounted on
/dev/sdb1   ext3   5.6G  4.0G  1.4G   76%   /
varrun      tmpfs  375M  236K  375M   1%    /var/run
varlock     tmpfs  375M  0     375M   0%    /var/lock
udev        tmpfs  375M  68K   375M   1%    /dev
devshm      tmpfs  375M  12K   375M   1%    /dev/shm
lrm         tmpfs  375M  38M   338M   10%   /lib/modules/2.6.24-17-generic/volatile
/dev/sdb2   ext3   7.3G  724M  6.2G   11%   /var/cache/apt
/dev/sda1   ext3   2.0G  208M  1.7G   11%   /usr/src
/dev/sda4   ext3   7.3G  724M  6.2G   11%   /var
/dev/sda2   ext3   46G   43G   511M   99%   /home
/dev/sdc1   ext3   2.0G  1.4G  567M   71%   /home/hewhoe

% cat /etc/fstab
# /etc/fstab: static file system information.
#
# <file system>  <mount point>   <type>  <options>                   <dump>  <pass>
proc             /proc           proc    defaults                    0       0
/dev/sdb1        /               ext3    defaults,errors=remount-ro  0       1
/dev/sdb2        /var/cache/apt  ext3    defaults,errors=remount-ro  0       1
/dev/sdb3        none            swap    sw                          0       0
/dev/sda1        /usr/src        ext3    defaults,errors=remount-ro  0       1
/dev/sda4        /var            ext3    defaults,errors=remount-ro  0       1
/dev/sda2        /home           ext3    defaults,errors=remount-ro  0       1
/dev/sda3        none            swap    sw                          00
/dev/sdc1        /home/hewhoe    ext3    defaults,errors=remount-ro  0       1

Emil Sit (emilsit) wrote :

I can reproduce this by running:
  $ wget http://www.pygtk.org/pygtk2tutorial/examples/filechooser.py
  $ python filechooser.py

It appears to hang on a poll call to dbus:

  $ strace -tt -T -o /tmp/trace python filechooser.py
  $ perl -ne '/<(\d+\.\d+)>/ && print "$1 $_"' /tmp/trace | sort -n | tail -2
  0.104393 14:03:02.498640 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 25000) = 1 <0.104393>
  24.999640 14:03:02.848054 poll([{fd=7, events=POLLIN}], 1, 25000) = 0 <24.999640>

... since fd 7 is the dbus connection.

  $ grep connect /tmp/trace | tail -1
  14:03:02.736829 connect(7, {sa_family=AF_FILE, path=@/tmp/dbus-ZJ0rg3SfV8}, 23) = 0 <0.000008>

dbus-monitor is not terribly informative (for me, anyway)... the calls right before the hang are:

method call sender=:1.91 -> dest=org.freedesktop.DBus path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=GetNameOwner
   string "org.freedesktop.Tracker"
method call sender=:1.91 -> dest=org.freedesktop.Tracker path=/org/freedesktop/tracker; interface=org.freedesktop.Tracker; member=GetVersion
<hangs here>
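
The strace post-processing above (sort the calls by the duration that `strace -T` appends, keep the slowest) can also be sketched in Python. This is only an illustration under assumptions: the function name `slowest_calls` is made up here, the pattern only looks at the trailing `<seconds>` field, and the sample lines are copied from the trace output quoted in this comment.

```python
import re

# `strace -T` appends the time spent in each syscall as "<seconds>".
DURATION_RE = re.compile(r"<(\d+\.\d+)>\s*$")


def slowest_calls(trace_lines, count=2):
    """Return the `count` slowest calls as (seconds, line) pairs, slowest last."""
    timed = []
    for line in trace_lines:
        match = DURATION_RE.search(line)
        if match:  # skip lines without a duration (signals, unfinished calls)
            timed.append((float(match.group(1)), line.rstrip("\n")))
    timed.sort(key=lambda pair: pair[0])
    return timed[-count:] if count > 0 else []


# Sample lines taken from the trace quoted above.
sample = [
    "14:03:02.498640 poll([{fd=3, events=POLLIN, revents=POLLIN}], 1, 25000) = 1 <0.104393>",
    "14:03:02.848054 poll([{fd=7, events=POLLIN}], 1, 25000) = 0 <24.999640>",
]
for seconds, line in slowest_calls(sample):
    print(f"{seconds:.6f} {line}")
```

To run it against a real trace, pass an open file instead of the sample list, for example `slowest_calls(open("/tmp/trace"))`.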

Emil Sit (emilsit) wrote :

Given the number of packages affected and the reproduction steps, it seems most likely to be a problem with gtk.

nossal (nossal) wrote :

I just uninstalled the Tracker Search Tool and the system works fine now!

Thanks Emil Sit!

Emil Sit (emilsit) wrote :

Ah, it does appear that trackerd is attempting to do some SQLite locking on a file that is (in my case) on an NFS home directory. It repeatedly fails (fcntl returns EAGAIN), and it seems that trackerd is spinning there and thus not responding to dbus RPCs.
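
To illustrate the failure mode described above, here is a minimal Python sketch of a non-blocking fcntl lock attempt that gives up after a bounded number of retries instead of spinning. It is not trackerd's or SQLite's actual code; the function name `try_lock` and the retry parameters are assumptions chosen for the example.

```python
import errno
import fcntl
import tempfile
import time


def try_lock(fd, attempts=5, delay=0.05):
    """Try to take an exclusive, non-blocking fcntl lock on `fd`.

    Returns True on success and False if the lock is still unavailable
    after `attempts` tries, so the caller can stay responsive.
    """
    for _ in range(attempts):
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError as err:
            # EAGAIN/EACCES mean "lock not available right now"; anything else is a real error.
            if err.errno not in (errno.EAGAIN, errno.EACCES):
                raise
            time.sleep(delay)
    return False


# Demonstration on a local scratch file, where locking normally succeeds at once.
with tempfile.TemporaryFile() as scratch:
    print("lock acquired:", try_lock(scratch.fileno()))
```

A daemon that retries like this without a bound, and without returning to its main loop, cannot answer incoming requests in the meantime, which would match the unanswered D-Bus call seen in the trace.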

One work-around (besides 'sudo apt-get remove tracker') is to symlink the directories that trackerd needs into a local directory. e.g.
  $ mkdir -p /var/tmp/${USER}-tracker/{cache,local}
  $ killall -9 trackerd
  $ cd ~/.cache && rm -rf tracker && ln -s /var/tmp/${USER}-tracker/cache
  $ cd ~/.local/share && rm -rf tracker && ln -s /var/tmp/${USER}-tracker/local
You may need to log out and back in again. Obviously, this blows away any existing tracker database you might have.

Not sure if this should be assigned to tracker or gtk. On the one hand, why should gtk be talking to tracker at all for a file dialog? On the other, trackerd is not behaving in an NFS friendly fashion. Adding tracker and maybe someone there will triage more completely.

nossal (nossal) wrote :

The description "...users with NFS homedirs" is not entirely correct: my home dir is not on NFS, and my whole file system is local.

Ville Koskinen (villek) wrote :

I'm experiencing the same problem. Ubuntu 8.04 desktop, 64-bit version. We are using NIS, Winbind, and NFS shares for home directories. Home directories are automounted (autofs).

First:
  $ pwd
  /home/villek
  $ wget http://www.pygtk.org/pygtk2tutorial/examples/filechooser.py
  $ strace -tt -T -o /tmp/trace python filechooser.py

Then:
  $ perl -ne '/<(\d+\.\d+)>/ && print "$1 $_"' /tmp/trace | sort -n | tail -2
  3.987531 12:23:35.026965 open("/home/.hidden", O_RDONLY) = -1 ENOENT (No such file or directory) <3.987531>
  125.049059 12:23:39.016066 open("/home/.hidden", O_RDONLY) = -1 ENOENT (No such file or directory) <125.049059>

After about 125 seconds the window opens, but it does not list anything (only the Open and Cancel buttons are visible), and the window does not redraw itself (i.e. minimizing and maximizing produces a blank window).

The home directory is mounted from the server.

  $ mount | grep home
  automount(pid6125) on /home type autofs (rw,fd=4,pgrp=6125,minproto=2,maxproto=4)
  skippy:/vol/home/villek on /home/villek type nfs (rw,nosuid,nodev,hard,intr,udp,addr=192.168.8.3

Removing tracker did not solve the issue for me.
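
Since several reports in this thread hinge on whether the home directory is local or on NFS, a small Python sketch for checking which mount a path belongs to may help. It assumes the whitespace-separated `device mount-point type options ...` layout used by /proc/mounts; the function name `filesystem_for` is made up, and the sample table is illustrative, loosely adapted from the `mount` output above rather than copied from a real machine.

```python
def filesystem_for(path, mounts_text):
    """Return (mount_point, fs_type) for the longest mount point containing `path`."""
    best = None
    wanted = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # not a mount entry
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if wanted.startswith(prefix):
            # ">=" lets a later entry win when the same mount point is listed twice.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


# Illustrative mount table (assumed values, /proc/mounts layout).
SAMPLE = """\
/dev/root / ext3 rw 0 0
automount /home autofs rw,fd=4,pgrp=6125,minproto=2,maxproto=4 0 0
skippy:/vol/home/villek /home/villek nfs rw,nosuid,nodev,hard,intr,udp 0 0
"""

print(filesystem_for("/home/villek/.cache/tracker", SAMPLE))
```

On a live system the second argument could be the contents of /proc/mounts, for example `filesystem_for(os.path.expanduser("~"), open("/proc/mounts").read())` after `import os`.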

Emil Sit (emilsit) wrote :

nossal: my hypothesis is that SQLite in trackerd is attempting to lock a file in NFS which fails and causes it to hang. However, it's clear from Ville's report and yours that perhaps other things may be at fault.

Can you re-install the tracker packages, log in again (to restart trackerd in a clean setup) and then run:
  $ lsof -p $(pidof trackerd) > /tmp/trackerd.lsof
  $ strace -tt -T -o /tmp/trackerd.trace -p $(pidof trackerd)
in one window while reproducing the problem with the python filechooser command?
Then control-C the trackerd strace and attach /tmp/trackerd.lsof and /tmp/trackerd.trace to the bug report?

Ville Koskinen: Can you check to see if the updated packages in bug 214041 solve your problem? (They may already be released.) Thanks.

Ville Koskinen (villek) wrote :

Confirmed: the fix to bug 214041 solves the problem.

(I could not find the updated package in hardy-updates, though, so I manually patched the source and created a new package out of it.)

Dmitriy Geels (dmig) wrote :

I have a local home mounted from a separate partition, but I still experience this bug.
Here are the last 10 lines of the filechooser.py example trace:
0.025732 14:54:58.490295 access("/usr/share/X11/locale/ru_RU.UTF-8/XLC_LOCALE", R_OK) = 0 <0.025732>
0.026288 14:54:57.194259 open("/usr/lib/python2.5/site-packages/python-support.pth", O_RDONLY|O_LARGEFILE) = 5 <0.026288>
0.029566 14:54:58.612365 select(5, [4], [], NULL, NULL) = 1 (in [4]) <0.029566>
0.038757 14:55:27.299303 select(5, [4], [4], NULL, NULL) = 1 (out [4]) <0.038757>
0.040829 14:54:58.429995 open("/usr/share/X11/locale/locale.alias", O_RDONLY) = 4 <0.040829>
0.043928 14:54:56.715207 execve("/usr/bin/python", ["python", "filechooser.py"], [/* 37 vars */]) = 0 <0.043928>
0.045297 14:54:58.651040 access("/usr/share/themes/Human/gtk-2.0/gtkrc", F_OK) = 0 <0.045297>
0.051719 14:55:27.114798 select(5, [4], [], NULL, NULL) = 1 (in [4]) <0.051719>
0.061372 14:55:27.053096 select(5, [4], [], NULL, NULL) = 1 (in [4]) <0.061372>
24.999649 14:55:02.002804 poll([{fd=8, events=POLLIN}], 1, 25000) = 0 <24.999649>

As you see, it hangs in poll().
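
The `25000` in that poll() call is a timeout in milliseconds, presumably the D-Bus client library's default wait for a method reply, which would explain the 25-second freeze. The behaviour can be reproduced in miniature with the Python sketch below. It uses a plain socket pair rather than D-Bus, a 250 ms timeout instead of 25000 ms, and a made-up function name `wait_for_reply`; all of that is assumed for illustration.

```python
import select
import socket
import time


def wait_for_reply(sock, timeout_ms):
    """Poll `sock` for readable data; return (got_reply, seconds_waited)."""
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    start = time.monotonic()
    events = poller.poll(timeout_ms)  # blocks until data arrives or the timeout expires
    return bool(events), time.monotonic() - start


# The "service" end never answers, so the caller blocks for the full timeout.
client, service = socket.socketpair()
client.sendall(b"GetVersion")
got_reply, waited = wait_for_reply(client, 250)
print(f"reply received: {got_reply}, waited {waited:.3f}s")
client.close()
service.close()
```

If the call is made synchronously from the thread that draws the dialog, nothing is shown until the wait ends, whether because the peer replies or because the timeout expires.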

Dmitriy Geels (dmig) wrote :

BTW, trackerd isn't running

mjv (mjvolrath) wrote :

Ran into this problem in FC6, but posting here since here is where I found the path to the solution.

Following the suggestions Ville Koskinen wrote on 2008-05-15: https://bugs.launchpad.net/ubuntu/+source/tracker/+bug/218230/comments/14
I found the .beagle directory to be the problem:

9.874704 15:44:53.648968 connect(7, {sa_family=AF_FILE, path="/home/haemer/.beagle/socket"}, 110) = ? ERESTARTSYS (To be restarted) <9.874704>
16.922473 15:45:07.476954 connect(7, {sa_family=AF_FILE, path="/home/haemer/.beagle/socket"}, 110) = ? ERESTARTSYS (To be restarted) <16.922473>

The fix for me was simply:
mv .beagle .beagle.bak

Now all the gtk file dialogs in Firefox, acroread, etc. work OK.
I hadn't been using beagle, so I'm not sure what impact this has for anyone who does.

markgalassi (mark-galassi) wrote : Re: [Bug 218230] Re: gtk file dialog blocks on trackerd (via dbus) for 25s for users with NFS homedirs

> Following the suggestions Ville Koskinen wrote on 2008-05-15: https://bugs.launchpad.net/ubuntu/+source/tracker/+bug/218230/comments/14
> I found the .beagle directory to be the problem:

Doesn't fix it for me. I wonder how many other directories with stale
stuff are still an issue -- people have suggested beagle, nfs, but
ultimately socket reads and network filesystem reads can take a long
time to time out, but it should not hang an application that is not
reading/writing from/to them. So it's a bug in gtk. KDE applications
do not have this problem.

Traxter (mattias-smartcore) wrote :

I'd like to add my two cents as I got this problem today and was able to resolve it.

I followed the advice above to uninstall the tracker tool but this itself didn't get rid of the problem. I therefore restarted dbus with "sudo /etc/init.d/dbus restart" and after a couple of seconds when I got my desktop back the problem was gone.

markgalassi (mark-galassi) wrote :

    Traxter> I'd like to add my two cents as I got this problem today
    Traxter> and was able to resolve it.

    Traxter> I followed the advice above to uninstall the tracker tool
    Traxter> but this itself didn't get rid of the problem. I therefore
    Traxter> restarted dbus with "sudo /etc/init.d/dbus restart" and
    Traxter> after a couple of seconds when I got my desktop back the
    Traxter> problem was gone.

That was one of the things people recommended long ago, but it did not
work for me. Amazing how a bug that makes a system almost unusable is
still around after so long.

Svein Tore (sveint) wrote :

Wow, I recently started experiencing this bug and after getting really, really annoyed I started searching around for a solution. I found this thread after some extensive search:
http://ph.ubuntuforums.com/showthread.php?t=784326

Deleting the tracker cache (~/.cache/tracker) solves the problem (but of course removes the cache). It's incredible that this bug is still around given its consequences.

markgalassi (mark-galassi) wrote :

    Svein> Deleting the tracker cache (~/.cache/tracker) solves the
    Svein> problem (but of course removes the cache). It's incredible
    Svein> that this bug is still around given it's consequences.

That does not solve the problem for everyone else. People have proposed
nfs, dbus, tracker, various other things. The problem is still there
when I apply all the solutions I have seen.

Clearly there is some foolish delicate code in the file selection dialog
which almost anything can cause to hang. We can enumerate many of the
things that can make it hang, but that won't solve the problem. For
example, nobody's solution has worked for me :-)

Alan Trick (trick) wrote :

I have this problem on my work computer, but I have never used NFS. My home folder is local. Maybe this is a more general problem?

Andrew Pollock (apollock) wrote :

I think there are two potentially separate problems here. We're certainly seeing the stats for /home/.hidden, which causes autofs to want to reload its maps because /home/.hidden doesn't exist.

Simon Lehmann (simon-lehmann) wrote :

I also think it might be a more general problem, but as it seems, on my computer it was caused by trackerd. I tried to disable the use of tracker by removing it from my session and disabling it completely. But apparently it is started automagically as soon as the file dialog is opened, which is strange enough.

It then produces heavy network/disk activity (home is on an NFS share), but this doesn't seem to be the main reason the file dialog blocks, because after some time the dialog opens fully functional while the disk activity continues. If I manually kill trackerd during the blocking period, the file dialog comes up immediately.

For now I moved /usr/bin/trackerd somewhere else and put a symlink to /bin/true there instead, which solves the issue for me. Of course this disables trackerd completely, but I haven't really used it.

Andrew Pollock (apollock) wrote :

In my environment, where we're seeing the problem with NFS, we have trackerd disabled already.

Tessa Lau (tlau) wrote :

I'm seeing the same problem. I first observed the symptoms in Evolution, where replying or composing a new message took 2+ minutes before the compose window appeared. (See my previous comment and stacktrace at https://bugs.launchpad.net/ubuntu/+source/evolution/+bug/159153). I traced the problem to the tracker_get_version call in libtrackerclient.so, which led me to this bug report. I can reproduce the problem with the filechooser as described above too.

My home directory is local (no NFS whatsoever).

Removing the ~/.cache/tracker directory did not help, but uninstalling libtrackerclient0 and all its dependencies did work.

Brian Wang (dingting) wrote :

Hi everyone,

I think there are two separate problems here - one with trackerd and one with GTK+ itself. In our setup we have trackerd completely disabled, but we still have the same behavior with certain GTK+ widgets trying to access /home/.hidden.

In addition to file dialogs, we also have the problem with gnome-appearance-properties ( https://bugs.launchpad.net/ubuntu/+source/gnome-control-center/+bug/218959 ).

If it makes things easier, would it be possible to split this bug into two? One for GTK+ and one for trackerd, since they seem to be separate issues that just manifest in similar ways.

In any case, this is a major problem for us right now, and we hope everyone can try their best to find a solution.

Cheers,
Brian

Sebastien Bacher (seb128) wrote :

having a different bug for the gtk issue would make sense indeed

Heiko Harders (heiko-harders) wrote :

I experienced the same problem. Opening a file dialog in any application causes a 25 second freeze of the program. Thereafter the dialog appears and can be used.

I am using fully updated Ubuntu 8.04.2 (64 bits) clients and server. The problem occurs only with user accounts that have NFS mounted homedirs, when using a program on a client pc.

Removing `tracker' from the clients solved the problem for me.

I have also seen some programs trying to access `/home/.hidden', as described by somebody above, although I am not sure yet whether this causes any problems in my case.

Martyn Russell (martyn-lanedo) wrote :

This was problematic with a broken API in older versions of libtrackerclient and the file chooser dialog complaining about missing symbols IIRC. This should have been fixed for a while now.

Sebastien Bacher (seb128) wrote :

closing since that works now, reopen if you still get it in jaunty

Changed in tracker (Ubuntu):
importance: Undecided → Low
status: New → Fix Released
Changed in gtk+2.0 (Ubuntu):
importance: Undecided → Low
status: Confirmed → Fix Released
Andreas Heinlein (aheinlein) wrote :

Reopen this bug (at least the one regarding trackerd) since it needs to be fixed in hardy (LTS Release!) as well. NFS Homes are quite common in corporate environments, as is the use of LTS releases...

Changed in tracker (Ubuntu):
status: Fix Released → Confirmed
Chris Coulson (chrisccoulson) wrote :

Please don't reopen bugs that are already fixed in later releases. If it affects an earlier release, use the "Nominate for release" button.

Changed in tracker (Ubuntu):
status: Confirmed → Fix Released