Comment 4 for bug 75695

Peter Cordes (peter-cordes) wrote :

The performance hit of -i hasn't changed with 12.04 LTS. I'll have to check with a newer grep, I guess. On the .c/.h files in a Linux source tree I'm seeing about 25 secs for grep -i, 0.5 secs for grep without -i, and 1.3 secs for a LANG=C grep -i. No disk I/O is involved; the files are all cached.

  So case-insensitive grepping is about 20 times slower in en_CA.utf8 than in the POSIX locale (25 secs vs. 1.3 secs).
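
For reference, the ratios implied by those three timings can be checked with a one-liner. This is only a sketch of the arithmetic; `ratio` is a made-up helper name, and the inputs are the rounded timings quoted above.

```shell
# ratio A B: print A/B to one decimal place (hypothetical helper)
ratio() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'; }

ratio 25 1.3    # utf8 grep -i vs. LANG=C grep -i  -> 19.2
ratio 25 0.5    # utf8 grep -i vs. plain grep      -> 50.0
ratio 1.3 0.5   # LANG=C grep -i vs. plain grep    -> 2.6
```

So -i on its own costs under 3x in the C locale, and nearly all of the slowdown comes from the utf8 locale.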

 Ubuntu 12.04 does set LANG=en_CA.utf8, and /usr/lib/locale now just contains locale-archive. So I'm not seeing any system calls trying to open nonexistent files like ahendry was.

 Again, I haven't yet tried with the most recent Ubuntu. This should be trivially easy for most people to test, as it doesn't require grep to actually match anything. (I still used the volatile.*s3tc pattern from my original report when searching the Linux tree.) You just need a new version of grep, and locale support for a UTF-8 English locale (e.g. en_US.utf8).

 Just run these 3 commands from the top of a source tree:
time find -name '*.[ch]' | xargs grep -i 'volatile.*s3tc'
time find -name '*.[ch]' | xargs grep 'volatile.*s3tc'
time find -name '*.[ch]' | LANG=C xargs grep -i 'volatile.*s3tc'

 If the LANG=C run isn't much faster than grep -i in your default locale (or with LANG=en_US.utf8, if your default locale for some reason isn't a slow one), then the problem is fixed and grep has fast case-insensitive UTF-8 matching.
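
The same comparison can be wrapped in a small helper if you want to repeat it across machines. This is a rough sketch, not something from the original report: `time_grep` is a made-up name, it assumes GNU date (for `%N`) and a find/xargs that support `-print0`/`-0`, and it sets LC_ALL rather than LANG, because a LANG=C override has no effect if LC_ALL is already set in your environment.

```shell
#!/bin/sh
# Sketch only: time_grep is an illustrative helper, not part of grep.
# Usage: time_grep TREE LOCALE [grep options...]
time_grep() {
    tree=$1 locale=$2
    shift 2
    start=$(date +%s%N)    # nanoseconds since the epoch (GNU date extension)
    # LC_ALL overrides LANG and all other LC_* variables, for xargs and grep only.
    # grep exits non-zero when nothing matches; that is not an error here.
    find "$tree" -name '*.[ch]' -print0 |
        LC_ALL=$locale xargs -0 grep "$@" 'volatile.*s3tc' >/dev/null || true
    end=$(date +%s%N)
    echo "$locale grep $*: $(( (end - start) / 1000000 )) ms"
}
```

For example, `time_grep . en_US.utf8 -i`, `time_grep . en_US.utf8` and `time_grep . C -i` correspond to the three commands above. Check `locale -a` first: if the named locale isn't installed, grep falls back to the C locale and the run will look misleadingly fast.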