Optimisation issues with glibc-2.2 and LANG=???

simos at pc96.ma.rhbnc.ac.uk
Sat Jan 20 23:13:20 EET 2001


Dear All,
	I would like to draw your attention to a minor optimisation
issue in the glibc-2.2 library.
	It affects non-English users, or anyone who sets the "LANG"
environment variable to change the locale. I am using RH 7.0 with
glibc-2.2, but I assume other distributions are affected too.

	I will make the following demonstration using the "el" or "greek"
language. Other languages have similar results.
	In /usr/share/locale/locale.alias, "greek" is an alias to
el_GR.ISO-8859-7.

	When someone invokes an application like "ls", several files
related to NLS are opened:
	a. 13 files are opened in /usr/lib/locale/el_GR.ISO-8859-7/*
	b. libc.mo may be opened (in case a text message of libc needs to
be printed)
	c. the application's .mo file will be opened (in case a text
message of the application needs to be printed).

	In total, about 15 files are opened.

	However, using "strace" to check which system calls an
application invokes, we see that the application does not find the
correct 15 files on the first attempt.
	We used the command

		strace -o output.txt ls -z

		strace: the "strace" application (available on the
			installation CD, if not already installed;
			alternative: ltrace)
		-o output.txt: the output is sent to this file.
		ls -z: this is the test command. When you run it, it
			prints:

			$ ls -z
			ls: invalid option -- z		(a libc message)
			Try `ls --help' for more information. (a
							fileutils/ls message)

	Then we counted the failed attempts in the output of:

		$ grep open output.txt
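
	The counting itself can also be automated. The following is a
minimal Python sketch, not part of the original measurement. It assumes
that output.txt holds one system call per line, as strace writes it, and
that the calls appear as open(...) as in Appendix A. The name
count_opens is hypothetical.

```python
import re

# One strace line for an open() call: captures the path and the return value.
OPEN_RE = re.compile(r'^open\("([^"]+)",.*\)\s*=\s*(-?\d+)')

def count_opens(trace_text):
    """Return (succeeded, failed) counts of open() calls in strace output."""
    succeeded = failed = 0
    for line in trace_text.splitlines():
        match = OPEN_RE.match(line)
        if match is None:
            continue  # not an open() call
        if int(match.group(2)) < 0:
            failed += 1  # e.g. "= -1 ENOENT (No such file or directory)"
        else:
            succeeded += 1
    return succeeded, failed
```

	For the file produced above, the call would be
count_opens(open("output.txt").read()). Note that the totals include
every open() in the trace, not only the locale files.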

	Possible values of the LANG variable for Greek users are:

	a. setenv LANG el
	This worked with previous versions of libc; it is no longer
	recognised.

	b. setenv LANG el_GR
	This worked with previous versions of libc and still works.

	The mechanism that opens the corresponding locale files makes
	2 failed "open" attempts.

	c. setenv LANG greek
	This is the proper value for glibc-2.2.

	The mechanism that opens the corresponding locale files makes
	34 failed "open" attempts.

	Appendix A shows a sample output for case "c".
	Appendix B gives a simple benchmark of the difference
	and demonstrates what happens if LANG is not set.

Conclusion
	Setting the LANG variable appears to slow the system down.
	It would be good to optimise the procedure for opening the
	related NLS files. A first step (important on slow machines)
	is to enable libc to locate the needed files on the first attempt.

Thanks,
Simos Xenitellis
Hellenic Localisation Project
http://hlp.sourceforge.net

Appendix A.
Internals of the command "ls -z", when run with "LANG=greek".
Only the "open" system calls are shown.

open("/usr/share/locale/locale.alias", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_IDENTIFICATION", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_IDENTIFICATION", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_IDENTIFICATION", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_MEASUREMENT", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_MEASUREMENT", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_MEASUREMENT", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_TELEPHONE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_TELEPHONE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_TELEPHONE", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_ADDRESS", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_ADDRESS", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_ADDRESS", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_NAME", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_NAME", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_NAME", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_PAPER", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_PAPER", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_PAPER", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_MESSAGES", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_MESSAGES", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_MESSAGES", O_RDONLY) = 4
open("/usr/lib/locale/el_GR/LC_MESSAGES/SYS_LC_MESSAGES", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_MONETARY", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_MONETARY", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_MONETARY", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_COLLATE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_COLLATE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_COLLATE", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_TIME", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_TIME", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_TIME", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_NUMERIC", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_NUMERIC", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_NUMERIC", O_RDONLY) = 4
open("/usr/lib/locale/el_GR.ISO-8859-7/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR.iso88597/LC_CTYPE", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/lib/locale/el_GR/LC_CTYPE", O_RDONLY) = 4
open("/usr/share/locale/el_GR.ISO-8859-7/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/el_GR.iso88597/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/el_GR/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/el.ISO-8859-7/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/el.iso88597/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/el/LC_MESSAGES/libc.mo", O_RDONLY) = 4
open("/usr/lib/gconv/gconv-modules", O_RDONLY) = 4
open("/usr/lib/gconv/ISO8859-7.so", O_RDONLY) = 4
open("/usr/share/locale/el_GR.ISO-8859-7/LC_MESSAGES/fileutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/el_GR.iso88597/LC_MESSAGES/fileutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/el_GR/LC_MESSAGES/fileutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/el.ISO-8859-7/LC_MESSAGES/fileutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/el.iso88597/LC_MESSAGES/fileutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/el/LC_MESSAGES/fileutils.mo", O_RDONLY) = 4
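
The order in which directory names are tried in this trace, and the
total of 34 failed attempts, can be reproduced with a short Python
sketch. It only mirrors what the trace shows and is not taken from the
glibc sources. The function names are hypothetical, and the figures of
12 locale categories and 2 message catalogues are read off the trace
above.

```python
import re

def normalize_codeset(codeset):
    """Lower-case the codeset and drop punctuation: ISO-8859-7 -> iso88597."""
    return re.sub(r"[^0-9a-z]", "", codeset.lower())

def candidates(language, territory, codeset):
    """Directory names in the order they are tried in the trace."""
    names = []
    for base in (f"{language}_{territory}", language):
        names += [
            f"{base}.{codeset}",                     # el_GR.ISO-8859-7
            f"{base}.{normalize_codeset(codeset)}",  # el_GR.iso88597
            base,                                    # el_GR
        ]
    return names

def misses_before(names, existing):
    """Number of failed attempts before the directory that exists."""
    return names.index(existing)

names = candidates("el", "GR", "ISO-8859-7")
# 12 locale categories are found under el_GR (2 misses each);
# libc.mo and fileutils.mo are found under el (5 misses each).
total_misses = 12 * misses_before(names, "el_GR") + 2 * misses_before(names, "el")
print(total_misses)  # 34
```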


Appendix B.

Running the command "ls -z 2&> /dev/null" one thousand times
with LANG=greek produces the following results.

1. Using default installation parameters
Time spent in user mode   (CPU seconds) : 8.150s
Time spent in kernel mode (CPU seconds) : 2.430s
Total time                              : 0:10.80s
CPU utilisation (percentage)            : 97.9%
Times the process was swapped           : 0
Times of major page faults              : 175218
Times of minor page faults              : 60836

2. Using modified parameters so that all locale files are found
on the first attempt (using symlinks)
Time spent in user mode   (CPU seconds) : 8.390s
Time spent in kernel mode (CPU seconds) : 2.350s
Total time                              : 0:10.95s
CPU utilisation (percentage)            : 98.0%
Times the process was swapped           : 0
Times of major page faults              : 175218
Times of minor page faults              : 62838
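
The post does not say which symlinks were created for case 2. One
plausible reading, given Appendix A, is a link from the first name that
is tried to the directory that actually exists. The Python sketch below
illustrates that idea only; the function name and its arguments are
hypothetical, and it is an assumption that this matches what was done.

```python
import os

def link_first_candidate(locale_root, real_name, first_candidate):
    """Create locale_root/first_candidate as a relative symlink to real_name."""
    target = os.path.join(locale_root, real_name)
    link = os.path.join(locale_root, first_candidate)
    if not os.path.isdir(target):
        raise FileNotFoundError(target)
    if not os.path.lexists(link):
        # Relative link, so it stays valid if locale_root is moved.
        os.symlink(real_name, link)
    return link
```

Under that assumption, the calls for the Greek locale would be
link_first_candidate("/usr/lib/locale", "el_GR", "el_GR.ISO-8859-7") and
link_first_candidate("/usr/share/locale", "el", "el_GR.ISO-8859-7"),
run with sufficient privileges.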

The difference between cases 1 and 2 is about 150 ms over the 1000
runs, that is, roughly 0.15 ms per invocation.
We used the tcsh "time" facility as enabled by the tcshrc package
(http://tcshrc.sourceforge.net).
The important value is the recorded "Total time".
Other values, such as the split between user and kernel mode and the
CPU utilisation, may not be very reliable.
We conducted the benchmark on a PIII 700 MHz system with 128 MB RAM.
Slower systems with less RAM should be affected more.
The affected files were evidently cached in RAM the whole time.

i. Setting LANG=spanish we got (default installation)
Time spent in user mode   (CPU seconds) : 8.180s
Time spent in kernel mode (CPU seconds) : 2.540s
Total time                              : 0:10.94s
CPU utilisation (percentage)            : 97.9%
Times the process was swapped           : 0
Times of major page faults              : 175218
Times of minor page faults              : 62839

ii. Unsetting the LANG variable (default installation)
Time spent in user mode   (CPU seconds) : 1.480s
Time spent in kernel mode (CPU seconds) : 1.300s
Total time                              : 0:02.83s
CPU utilisation (percentage)            : 98.2%
Times the process was swapped           : 0
Times of major page faults              : 124200
Times of minor page faults              : 39331


Disclaimer: We did not run the above tests in single-user mode, but
under X and with network connectivity. We repeated the measurement
5 times and recorded a typical value.
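
For anyone who wants to repeat the measurement without tcsh, a rough
equivalent can be sketched in Python. This is not the method used
above: it reports only wall-clock time, and it includes the cost of
starting each process from Python. The name time_runs is hypothetical.

```python
import os
import subprocess
import time

def time_runs(argv, runs=1000, lang=None):
    """Run argv `runs` times and return the total wall-clock seconds.

    lang=None removes LANG from the child's environment; otherwise LANG
    is set to the given value. LC_ALL and LC_* are left untouched, so
    they should be unset beforehand if they would override LANG.
    """
    env = dict(os.environ)
    env.pop("LANG", None)
    if lang is not None:
        env["LANG"] = lang
    start = time.perf_counter()
    for _ in range(runs):
        # Discard output, as the "/dev/null" redirection does above.
        subprocess.run(argv, env=env, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start
```

Comparing time_runs(["ls", "-z"], lang="greek") with
time_runs(["ls", "-z"]) corresponds to cases 1 and ii above.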



