Comment 5 for bug 1666676

Carlos Garnacho (carlosg) wrote : Re: Enable tracker by default for Unity too

Hey Jorge, some answers, for the same definition of "some" :P

- What does an upgrade look like? Let's say I have a home directory with gigs of data, when I accomplish an upgrade to 17.10 when/how does indexing take place?

Indexing would take place once the services have been started in the user session, presumably by the install scripts, or on session restart. tracker-miner-fs has a configurable startup delay (defaults to 15s) to avoid adding I/O during session startup, since I/O is already pretty high at that time.

- Is this something that happens in the background or during a package installation?

All indexing happens in the background. On startup, tracker-miner-fs performs an initial crawl where it 1) sets up directory monitors and 2) ensures its idea of the FS is up-to-date (e.g. that there have been no changes between reboots).

After that, tracker-miner-fs just listens to directory monitor events and updates its DB. Subdirectories newly added to recursively indexed folders are crawled in a similar way when discovered. The same goes for locally mounted volumes, although we don't index those by default.
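To make the "is my idea of the FS still up-to-date" step a bit more concrete, here is a rough Python sketch of that kind of check. It is not Tracker's actual code (which is C and keeps its state in a database); the `snapshot` and `reconcile` helpers and the use of mtimes as the only change signal are assumptions made for illustration.

```python
import os


def snapshot(root):
    """Map every file and directory below root to its mtime in nanoseconds."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            state[path] = os.stat(path).st_mtime_ns
    return state


def reconcile(stored, current):
    """Compare the state saved by a previous run with the filesystem now.

    Returns three sorted lists: paths that appeared, paths that vanished,
    and paths whose mtime differs from the stored one.
    """
    added = sorted(set(current) - set(stored))
    removed = sorted(set(stored) - set(current))
    changed = sorted(
        path for path in current
        if path in stored and current[path] != stored[path]
    )
    return added, removed, changed
```

The idea is that `stored` would come from the previous session and `current` from the startup crawl, so only the entries in the three returned lists need any further work.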

Indexing is also influenced by AC adapter/battery state and configuration: it slows to a crawl or pauses altogether given the right conditions.

- Is indexing a long process?

It depends. Tracker only indexes XDG folders (and $HOME non-recursively) by default, so indexing time depends on the number of files/directories and on I/O throughput. There are of course worst cases like multi-TB 2500rpm HDDs with millions of files, but on more average setups tracker-miner-fs should take on the order of seconds. A somewhat favorable example: reindexing ~8K items from scratch takes tracker-miner-fs ~2s on this SSD-powered laptop.

But tracker-miner-fs only manages FS-level information. The isolated tracker-extract process performs the actual content sniffing, and the time spent there varies per mimetype too, e.g. heavy PDF documents might take poppler several seconds each, while plain text files could be handled in the ballpark of hundreds per second.

Oh, and there's also file/dir name pattern matching to cut out uninteresting portions of the filesystem, e.g. tracker-miner-fs tries to avoid git repos, unpacked tarballs and whatnot.
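For illustration, that kind of filtering could look roughly like the Python sketch below. The pattern lists are made up for the example and are not Tracker's real defaults; the "skip a directory if it contains a marker entry" rule is just one assumed way to recognise things like git checkouts.

```python
import fnmatch
import os

# Illustrative values only, not Tracker's actual configuration.
IGNORED_FILE_PATTERNS = ["*~", "*.o", "*.tmp"]
IGNORED_DIR_PATTERNS = ["CVS", ".svn"]
IGNORED_DIRS_WITH_CONTENT = [".git"]  # marker entries that disqualify a directory


def should_skip_file(name):
    """True if the file name matches any ignored glob pattern."""
    return any(fnmatch.fnmatchcase(name, pat) for pat in IGNORED_FILE_PATTERNS)


def should_skip_dir(path):
    """True if the directory name is ignored or it contains a marker entry."""
    name = os.path.basename(path)
    if any(fnmatch.fnmatchcase(name, pat) for pat in IGNORED_DIR_PATTERNS):
        return True
    try:
        entries = set(os.listdir(path))
    except OSError:
        return True  # unreadable directories are skipped as well
    return any(marker in entries for marker in IGNORED_DIRS_WITH_CONTENT)


def indexable_files(root):
    """Walk root, pruning ignored directories, and return the remaining files."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending into them.
        dirnames[:] = [
            d for d in dirnames
            if not should_skip_dir(os.path.join(dirpath, d))
        ]
        found.extend(
            os.path.join(dirpath, f)
            for f in filenames
            if not should_skip_file(f)
        )
    return sorted(found)
```

Pruning whole directories during the walk is what makes this cheap: a large source checkout costs one directory listing instead of a full crawl.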

- Is it something we need to display to the user?

It's your call really... GNOME doesn't bother, for example. Some apps like gnome-music do track activity and show a "Loading..." in-app banner, but little else.

- Is it one of those things where we'll need to inform users in the release notes that an expensive io operation will take place as part of the upgrade?

It's probably wise to do that; it will churn a few extra cycles globally.

- In the past I recall having to modify inotify handles for performance, at some point the default handles we set in ubuntu would run out and search wouldn't work well at all. Since that was years ago I'm assuming these sorts of issues have been sorted out?

I assume so. Tracker is admittedly greedy with inotify handles, but for quite some time now it has checked the per-user limit at runtime, also leaving some room for other apps that want file monitors.

If, even with the big slice of handles that Tracker takes for itself, there really aren't enough to cover the indexed portions of the filesystem, Tracker treats this as a soft failure: the non-monitored directories would just be checked for changes during the startup phase.
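As a rough sketch of that budgeting logic (in Python, and not how Tracker actually implements it): the `/proc/sys/fs/inotify/max_user_watches` path is the standard Linux knob for the per-user watch limit, while the `reserved` headroom of 500, the fallback default and the first-come-first-served ordering are assumptions picked for the example.

```python
def read_watch_limit(path="/proc/sys/fs/inotify/max_user_watches", default=8192):
    """Read the per-user inotify watch limit, falling back to an assumed default."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return default


def plan_monitors(directories, user_limit, reserved=500):
    """Split directories into those that get a monitor and those that do not.

    `reserved` watches are left for other applications (hypothetical value).
    Directories beyond the budget are not an error: they are simply
    re-checked for changes on the next startup crawl.
    """
    budget = max(user_limit - reserved, 0)
    monitored = directories[:budget]
    crawl_only = directories[budget:]
    return monitored, crawl_only
```

With `plan_monitors(dirs, read_watch_limit())`, everything in `crawl_only` still gets indexed, it just won't pick up changes live.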