A Technical Comparison of glibc 2.x With Legacy System Libraries

For application software to run, an operating system must provide several key components. On Linux systems, the Linux kernel is widely known as one of those components, but not everyone is aware of the integral role that the standard C library (commonly referred to as "libc") plays in the framework necessary to run virtually all software. UNIX has historically been written in the C programming language, so it is no surprise that programmers created a set of programming routines to make writing software in C easier. The IEEE codified a standard, POSIX.1, to specify the programming routines that an operating system must make available to be compatible with the UNIX standard.

History

Over its brief but fast-moving history, the GNU/Linux operating system has used several different series of libc's to provide this functionality. During the very early days of Linux development, several libc series followed one another in quick succession, each one attempting to come a bit closer to POSIX.1. These series, libc's 1-4, were the C libraries for programs stored in the a.out binary format.

libc 5 heralded the move to storing programs in the ELF (Executable and Linking Format) binary format, which provided much greater flexibility and the ability to easily make libraries of routines that could be shared between running programs. While a.out had this ability, ELF made it easy, and brought Linux up to speed with other modern UNIX-like OS's (such as Sun Solaris, Digital UNIX, and IRIX) that use the same format.

The Move to glibc

libc 5 worked well, but it had several drawbacks. The GNU project had its own C library (hereafter called glibc), from which libc 4 and 5 had been derived. Although work on glibc had been halted for a long time, programmers on the Internet decided to resume work, and glibc 2.0 showed definite advantages over libc 5. H.J. Lu, the libc 5 maintainer, decided not to continue support of libc 5, and recommended the use of glibc. Red Hat Software, following this lead, decided to switch to using glibc as the main C library for their Linux distribution, and released Red Hat Linux 5.0 in December of 1997, with glibc 2.0.5c serving in this role. Other distributions, most notably Debian GNU/Linux, are planning to switch to glibc in the future, too.

Why Switch?

People are reluctant to make any change when the present solution seems to work well, and the transition from libc 5 to glibc was no exception. Here are the main reasons why the switch was needed:

glibc offers much greater standards conformance. For example, the POSIX.1 test suite wouldn't even compile under libc5, but all of the tests compile and run almost perfectly under glibc. This gives UNIX software developers confidence that software that uses the POSIX.1 Application Programming Interface (API) is portable between Linux and other UNIX-like systems.
glibc offers programming features that libc 5 does not have:
- Name Service Switch - the ability to change the database access methods that programs use to look up basic system information. For example, with these switchable name service modules, your system can instantly recognize all of the users listed in a network server's Lightweight Directory Access Protocol (LDAP) listing.
- Support for IPv6, the next generation networking protocol of the Internet, is included in glibc.
- glibc has support for 64-bit file data access, allowing programs to access up to 16 million terabytes of data (by my estimation, the amount of information stored in 87 billion encyclopedia sets, enough to form a line all the way from the Earth to Venus).
- Improved support for internationalization/localization, allowing the user to receive program messages in their native language.
- libc 4 & 5 had been hacked together from a very very very old glibc (version 1.09, I believe). libc 5 had reached the end of its usable lifecycle. The code base was becoming ugly and unmaintainable, and many security holes were still hidden. It was also very difficult to port to new architectures and operating systems.
  
  In contrast, glibc 2.x has been engineered for ease of portability and maintainability. The code base is much cleaner, which makes the remaining problems easier to find.
- In addition, the portability mechanism mentioned above allows programmers to provide optimized routines for a particular architecture. This makes some programs run faster with glibc than the same programs do with libc 5.
- Programs written for glibc are also more portable than those written for libc 5, because they are insulated from the operating system kernel internals.
- Writing multithreaded programs using libc 5 ranged between difficult and impossible. In contrast, glibc supports multithreading properly, allowing programmers to take the fullest possible advantage of symmetric multiprocessing and other performance-increasing techniques.
- Programs compiled to use libc 5 will not run with glibc, and vice versa. However, glibc includes support for "symbol versioning". This will virtually eliminate the need for another incompatible libc switchover, like the one from libc 5 to glibc, to ever happen again.
The present glibc developers have a good bug tracking mechanism to ensure that problems get fixed.
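
To make the Name Service Switch item above more concrete, here is a minimal sketch in C. It assumes a glibc system where the "passwd" line of /etc/nsswitch.conf decides which sources are consulted (for instance "passwd: files ldap" when an LDAP module is installed). The helper names user_is_known and print_user are made up for this illustration. The point is that the program only calls the standard getpwnam() routine and never needs to know where the account is actually stored.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Returns 1 if the C library can resolve `name` to a user account, else 0.
 * On glibc, getpwnam() asks whichever sources the name service
 * configuration lists, so the caller is unaware of the backing store. */
int user_is_known(const char *name)
{
    return getpwnam(name) != NULL;
}

/* Prints the UID and home directory of `name` if the lookup succeeds. */
void print_user(const char *name)
{
    const struct passwd *pw = getpwnam(name);

    if (pw == NULL) {
        /* NULL means "not found" or "lookup failed"; errno can tell them apart. */
        printf("%s: not found by any configured source\n", name);
        return;
    }
    printf("%s: uid=%lu home=%s\n",
           name, (unsigned long)pw->pw_uid, pw->pw_dir);
}
```

Calling print_user("alice") would behave the same whether "alice" lives in /etc/passwd or in a network directory; only the system configuration changes, not the program.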
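
As a small illustration of the IPv6 support mentioned in the list above, the following C sketch parses an IPv6 address from text and prints it back in canonical form using the standard inet_pton() and inet_ntop() routines. The function name canonical_ipv6 is an assumption made for this example, and no network access is needed to run it.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Parses `text` as an IPv6 address and writes its canonical text form
 * to `out`. Returns 1 on success, 0 if `text` is not a valid IPv6
 * address or `out` is too small to hold the result. */
int canonical_ipv6(const char *text, char *out, size_t out_len)
{
    struct in6_addr addr;   /* 128-bit IPv6 address in network byte order */

    if (inet_pton(AF_INET6, text, &addr) != 1)
        return 0;
    if (inet_ntop(AF_INET6, &addr, out, (socklen_t)out_len) == NULL)
        return 0;
    return 1;
}
```

A buffer of INET6_ADDRSTRLEN bytes is large enough for any result. An IPv4 dotted-quad string such as "192.168.0.1" is rejected, because the address family requested is AF_INET6.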
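
The "16 million terabytes" figure for 64-bit file access in the list above appears to correspond to 2^64 bytes, counting a terabyte as 2^40 bytes: 2^64 / 2^40 = 2^24, or about 16.8 million. The C sketch below checks that arithmetic and shows one way a program can ask for 64-bit file offsets. It assumes a glibc release that honors the _FILE_OFFSET_BITS feature macro, and the function names are invented for this example. Note that off_t is a signed type, so the largest single file offset is half of the 2^64 figure.

```c
#define _FILE_OFFSET_BITS 64   /* request a 64-bit off_t, even on 32-bit systems */

#include <stdint.h>
#include <sys/types.h>

/* Number of bits in off_t, the type used for file offsets and sizes. */
int file_offset_bits(void)
{
    return (int)(sizeof(off_t) * 8);
}

/* How many terabytes (taken here as 2^40 bytes each) fit in a space of
 * 2^offset_bits bytes. Only valid for 40 <= offset_bits <= 103. */
uint64_t addressable_terabytes(unsigned offset_bits)
{
    return (uint64_t)1 << (offset_bits - 40);
}
```

With these definitions, addressable_terabytes(64) evaluates to 16777216. The macro must be defined before any system header is included for it to take effect.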

During the transition, glibc has definitely had its teething pains, but it is needed to ensure that Linux is ready to expand into new markets.

Please send comments, questions, additions, and corrections to Elliot Lee

References

The actual list of reasons is largely taken from a newsgroup post of mine on the subject.

The main glibc web page is here.

Last modified: Thu Jul 9 1998
