One of the most remarkable achievements in the history of mankind is the computer. Another amazing fact about it is that it's a collection of different electronic components that work together in coordination to give you meaningful output. We use this complex system in one way or another in our day-to-day lives, but not many of us know how these parts work together.
In this tutorial, we will discuss how a computer boots, or more precisely, how a computer running the Linux operating system boots. A couple of days back, one of our readers asked us to write this up, as it is a common interview question. I hope it will be helpful to other readers as well.
Understanding the boot process will help you understand how the hardware and software work together, and will also give you the information you need to begin troubleshooting a boot problem.
The Linux boot process can be divided into multiple stages. The diagram below shows the different stages. We will discuss each of them in detail in its own section.
Step 1: Power Supply & SMPS
One of the main components of a computer is the SMPS (Switched-Mode Power Supply). The primary objective of this component is to provide the required voltage levels to the motherboard and other computer components. Computer internals work on DC, whereas the power supplied to homes and other places is AC. The SMPS converts AC to DC and maintains the required voltage levels so that the computer can work reliably.
But the task that matters most for booting is that the SMPS reports whether the power supply is good. As I said, a computer cannot work if the voltage is too high or too low. As soon as you supply power to the computer, the SMPS checks the voltage levels it is providing to the motherboard. If the levels are within the required range, the SMPS sends a POWER GOOD signal to the motherboard timer.
On receiving this POWER GOOD signal from the SMPS, the motherboard timer stops sending the reset signal to the CPU. This means the power level is good and the computer can boot.
Step 2: Bootstrapping
Something has to be programmed by default, so that the CPU knows where to search for instructions.
This is an address location in the ROM. The location is almost always the same on x86-based computers: FFFF:0000h.
This address lies in the last region of the ROM and contains only one instruction: a jump to another memory address. This JUMP instruction tells the CPU where the BIOS program is located in the ROM.
This is how the computer will come to know where the BIOS program is located.
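To make the address above concrete, here is a minimal sketch of how a real-mode segment:offset pair such as FFFF:0000h maps to a physical address. The function name is my own, and the sketch assumes plain 16-bit real-mode addressing (segment shifted left by 4 bits, plus offset).

```python
def real_mode_physical_address(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair into a physical address.

    Assumption: classic 16-bit real-mode addressing, where the segment
    value is multiplied by 16 (shifted left by 4 bits) and the offset is added.
    """
    return (segment << 4) + offset


reset_vector = real_mode_physical_address(0xFFFF, 0x0000)
print(hex(reset_vector))          # 0xffff0
print(0x100000 - reset_vector)    # 16 (bytes left below the 1 MiB boundary)
```

Only 16 bytes remain between this address and the 1 MiB boundary, which is why the location holds nothing more than a jump instruction. The same physical address is often written as F000:FFF0h.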
Step 3: The Role of BIOS in booting process
The word booting comes from bootstrapping. The computer knows how to bring itself up when you press the power button because of the instructions contained in a program called the BIOS. BIOS stands for Basic Input/Output System. The most important job of the BIOS during the boot process is POST, which stands for Power-On Self-Test. It's a series of tests conducted by the BIOS to confirm the proper functioning of the different hardware components attached to the computer.
POST is very important to run before the operating system is loaded. Just imagine having a faulty hard drive or faulty memory; such things can sometimes cause data loss. POST checks and confirms the integrity of the following hardware components.
Timer ICs
DMA controllers
CPU
Video ROM
A full POST check will confirm the integrity of the following devices as well.
Motherboard
Keyboard
Printer port
Hard drive, etc.
If you are doing a warm start (which means you reset a running machine; the reset button is usually the small one near the power button on the computer case), the BIOS will not conduct a full POST. However, if you are doing a cold start, which means the power has just been applied, it will conduct a full POST.
The BIOS determines whether it's a cold or warm start by looking at a flag in a predefined memory location. Once the POST completes, the BIOS informs you about any problems it found with the help of beep codes (through the system speaker). Different beep patterns have different meanings.
There are two things that people often confuse: CMOS and BIOS. They are two completely different things on the computer motherboard. CMOS is a small RAM chip present on the motherboard. This RAM is different from the computer's main RAM (the replaceable memory modules, which these days come in gigabyte sizes). Unlike main RAM, CMOS RAM does not lose its contents when the computer is turned off. It retains all its configuration with the help of a battery called the CMOS battery.
Removing the CMOS battery makes the CMOS forget all the configuration you previously saved.
This is the reason you can unlock a computer that's protected with a CMOS password by simply removing the CMOS battery.
Removing the CMOS battery will also make the operating system show the wrong time, because the system time is kept alongside the CMOS settings.
So it's always advisable to replace your CMOS battery at regular intervals for proper functioning of the computer.
People normally say that they have modified the BIOS settings, but strictly speaking that is wrong. What they modified is in fact the CMOS settings. The CMOS settings are where you change the boot order and so on.
The BIOS program itself cannot be altered by the user directly. Changing it requires a flash program provided by the manufacturer.
So let's get back to our boot process. Once the POST check completes successfully, the BIOS looks at the CMOS settings to find the boot order. The boot order is nothing but a user-defined order that tells the BIOS where to look for the operating system.
The order will be something like the following.
CD ROM
HARD DISK
USB
Floppy DISK
The order shown above means that the BIOS will look at the CD ROM first to check whether an OS can be loaded from there. If it does not find a bootable disc in the CD ROM drive, it will check whether there is a bootable OS on the hard disk, then on USB, and then on the floppy disk.
Let's assume that you don't have a bootable CD in your CD ROM drive, so the BIOS turns to the hard disk.
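The boot order walk described above can be sketched roughly as follows. This is a simplified illustration, not how any particular BIOS is implemented: the function names and the fake device table are hypothetical, and the sketch assumes a device counts as bootable when its first 512-byte sector ends with the bytes 55h AAh (the magic number covered in step 4). Real CD booting uses a different mechanism.

```python
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a valid boot sector


def is_bootable(first_sector):
    """Return True if the sector looks like a valid boot sector (assumed rule)."""
    return (
        first_sector is not None
        and len(first_sector) >= 512
        and first_sector[510:512] == BOOT_SIGNATURE
    )


def pick_boot_device(boot_order, read_first_sector):
    """Walk the boot order and return the first device with a bootable sector."""
    for device in boot_order:
        if is_bootable(read_first_sector(device)):
            return device
    return None  # nothing bootable was found


# Hypothetical machine: empty CD drive, hard disk with a valid boot sector.
fake_sectors = {
    "CD ROM": None,
    "HARD DISK": bytes(510) + BOOT_SIGNATURE,
    "USB": None,
    "Floppy DISK": None,
}
boot_order = ["CD ROM", "HARD DISK", "USB", "Floppy DISK"]
print(pick_boot_device(boot_order, fake_sectors.get))  # HARD DISK
```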
Step 4: MBR and GRUB
Since you don't have a bootable CD in your CD ROM drive, the BIOS looks at the second device in the boot order settings, which is your hard disk.
The BIOS is programmed to look at a fixed location on the hard disk to complete its task. This location is called the boot sector. It is nothing but the first sector of your hard disk, and it is commonly called the MBR (Master Boot Record). This is the location that contains the program that helps the computer load the operating system. As soon as the BIOS finds a valid MBR, it loads the entire content of the MBR into RAM, and further execution is carried out by the code in the MBR.
This first sector of the hard disk is only 512 bytes long. Yeah, that's too small an area for an entire boot loader program to fit in. Hence most operating systems store only the first stage of their boot loader here. Only the first 440 bytes of the 512 are used by the first-stage boot loader; the rest holds a disk signature, the partition table (four 16-byte entries), and a 2-byte magic number.
Read: What is MBR and GPT Partition Table
As the title of this article says "Linux Booting process", we will have GRUB stage 1 in the first 440 bytes of the MBR. I would suggest reading the article below to understand GRUB in detail.
Read: Grub tutorial
You can take an entire dump/backup of your MBR in linux with the help of the following command.
```
root@ubuntu-dev:~# dd if=/dev/sda of=mbr bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000134061 s, 3.8 MB/s
```
The above command simply dumps the contents of the first sector, which is 512 bytes, to a file named mbr. In the command, bs stands for block size, and count=1 means that only one block is copied. The dump starts from the beginning of the hard disk, so exactly the first 512 bytes are copied.
The MBR contains the first stage of GRUB and the partition table information. You can see the partition table information in our MBR backup file as shown below.
```
root@ubuntu-dev:~# file mbr
mbr: x86 boot sector; partition 1: ID=0x83, active, starthead 32, startsector 2048, 19451904 sectors; partition 2: ID=0x5, starthead 254, startsector 19455998, 2093058 sectors, code offset 0x63
```
I have only two partitions on the system, which is why the output shows partition 1 and partition 2. A conventional boot loader in the MBR looks in this partition table for the partition with the active flag set (GRUB stage 1 itself locates its next stage by sector address rather than by this flag). Please remember that only one of the four primary partitions can be active. In my partition table output above, partition 1 has the active flag. To summarize, the MBR contains the following things.
First stage grub
Partition table information
Error messages
Magic Number
The fourth item is a magic number of 2 bytes, which serves as a way of verifying the MBR. These last two bytes of the sector hold the values 55h and AAh, which read as the little-endian 16-bit number AA55h. A different magic number indicates a corrupted or invalid MBR.
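As a rough illustration of the layout summarized above, the sketch below checks the magic number and decodes the four partition table entries of a 512-byte MBR. The function names are my own, and the demo sector is synthetic: it is built in memory using the partition values from the file output shown earlier, not read from a real disk. The field offsets follow the classic MBR layout (partition table at byte 446, 16 bytes per entry).

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446
ENTRY_SIZE = 16
# status (1 byte), CHS start (3, skipped), type (1), CHS end (3, skipped),
# first LBA sector (4), sector count (4); all little-endian
ENTRY_FORMAT = "<B3xB3xII"


def parse_mbr(sector: bytes):
    """Validate the magic number and return the non-empty partition entries."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes long")
    (magic,) = struct.unpack_from("<H", sector, 510)
    if magic != 0xAA55:
        raise ValueError(f"invalid magic number: {magic:#06x}")
    partitions = []
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        status, ptype, start, count = struct.unpack_from(ENTRY_FORMAT, sector, offset)
        if ptype == 0:  # type 0 marks an unused entry
            continue
        partitions.append({
            "number": index + 1,
            "active": status == 0x80,
            "type": ptype,
            "start_sector": start,
            "sectors": count,
        })
    return partitions


def build_demo_mbr() -> bytes:
    """Build a synthetic MBR that mimics the two partitions shown in the article."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into(ENTRY_FORMAT, sector, 446, 0x80, 0x83, 2048, 19451904)
    struct.pack_into(ENTRY_FORMAT, sector, 462, 0x00, 0x05, 19455998, 2093058)
    struct.pack_into("<H", sector, 510, 0xAA55)
    return bytes(sector)


for entry in parse_mbr(build_demo_mbr()):
    print(entry)
```

To try it on a real dump, you could pass the contents of the file created by dd, for example `parse_mbr(open("mbr", "rb").read())`.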
The primary job of the stage 1 boot loader is to load the second stage boot loader. The second stage boot loader is GRUB stage 2, which actually does the job of loading the kernel and the initrd image files (we will come to that part in some time). GRUB (GRand Unified Bootloader) is the combined name given to the different stages.
Now there is a little surprise that I am going to reveal. There are not two stages of GRUB but three in total. I will explain the reason behind this. The three stages are listed below.
GRUB Stage 1
GRUB Stage 1.5
GRUB Stage 2
Now where does this stage 1.5 come from? Hard disk sectors are counted from 0 to the last sector. As explained previously, the first sector (sector 0) contains GRUB stage 1. Traditionally, partitions do not start before sector 63, so the first partition starts at sector number 63. Hence sectors 1 to 62 are free. This space is used for storing GRUB stage 1.5. This free space between the MBR and the beginning of the first partition is called the MBR gap.
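To get a feel for how much room the MBR gap offers, here is a small worked calculation. The helper name is hypothetical, and it assumes 512-byte sectors with the MBR occupying sector 0 only. With the first partition at sector 63, the sectors in between number 62.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed)


def mbr_gap_bytes(first_partition_start: int, sector_size: int = SECTOR_SIZE) -> int:
    """Bytes available between the MBR (sector 0) and the first partition."""
    if first_partition_start < 1:
        raise ValueError("the first partition cannot start at sector 0")
    # Sector 0 holds the MBR, so the gap covers sectors 1 .. start - 1.
    return (first_partition_start - 1) * sector_size


print(mbr_gap_bytes(63))    # 31744
print(mbr_gap_bytes(2048))  # 1048064
```

The second call uses the start sector 2048 reported for partition 1 in the file output earlier, which leaves a much larger gap than the traditional layout.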
Now you might wonder why GRUB needs an additional stage. If you are a Linux user, you might have already edited the GRUB configuration file. If you remember that configuration, it contains the location and name of the kernel file and its partition. Now how will GRUB access those kernel files without file system drivers?
GRUB stage 1.5, located in the MBR gap (sectors 1 to 62, before the beginning of the first partition), basically contains the drivers for reading file systems. So GRUB stage 1 loads GRUB stage 1.5 into RAM and passes control to it.
GRUB stage 1.5 then loads the file system drivers, and once they are loaded, it can access the /boot/grub/grub.conf file, which contains details such as the kernel path and the initrd path.
This is the point where you are presented with a beautiful TUI (terminal user interface), where you can select your operating system kernel and press Enter to boot it.
Step 5: Loading the Kernel Image
Similar to GRUB, the kernel is also loaded in stages. The Linux kernel is responsible for handling process management, memory management, users, inter-process communication, etc. I must say the kernel is never used directly by a user. What the kernel does is maintain a good environment for programs to run. Yeah, we use the kernel through different programs.
The kernel is a compressed image file. The location of this compressed kernel image is specified in the GRUB configuration file. It's basically an executable bzImage file.
Now you need a lot of drivers and modules to access the underlying hardware and other things. For example, if you have RAID configured on your / partition, how will you mount it without the programs that understand it? Or think about how you would load or remove a kernel module. All these things require different sets of programs and code. Including all of this code inside the kernel would make the image file much larger.
But our kernel image file needs to be small, which is why it is a compressed image file. So most of these drivers and tools, along with a small root-file-system-like environment, are provided by initrd. Initrd is sometimes called the initial root file system. It is used by the kernel before the real root file system is mounted. Initrd is available in the form of an image, similar to the kernel image file. Let's see what this initrd image file contains. You can find the initrd image file and the kernel image file in the /boot directory.
```
[root@localhost initrd]# mv initramfs-2.6.32-358.14.1.el6.x86_64.img initramfs-2.6.32-358.14.1.el6.x86_64.gz
[root@localhost initrd]# gunzip initramfs-2.6.32-358.14.1.el6.x86_64.gz
[root@localhost initrd]# cpio -id < initramfs-2.6.32-358.14.1.el6.x86_64
90587 blocks
[root@localhost initrd]# ls
bin                 initqueue                                mount        sysroot
cmdline             initqueue-finished                       pre-pivot    tmp
dev                 initqueue-settled                        pre-trigger  usr
dracut-004-303.el6  initqueue-timeout                        pre-udev     var
emergency           initramfs-2.6.32-358.14.1.el6.x86_64     proc
etc                 lib                                      sbin
init                lib64                                    sys
```
In the commands above, we first renamed the initrd image so that gunzip accepts it, uncompressed it, and then extracted its contents into the current directory with the cpio command.
Now you can see the contents of the initrd image file. There are folders that are very similar to our Linux directory structure. There is /etc, /lib, and some necessary commands in /sbin, etc. It's a small root file system that the kernel uses as a temporary root file system before the real root file system is mounted.
Loading and unloading of kernel modules is done with the help of programs like insmod and rmmod, present in the initrd image.
Now that the kernel is loaded into memory, execution begins by checking the processor family and architecture. The kernel performs a lot of hardware-specific operations, and the first user space program it executes is /sbin/init.
As this is the first program executed by the kernel, it gets process ID 1. PID 1 was not reserved for init intentionally; it simply follows from the fact that init is the first process the kernel executes. As soon as the kernel starts the init process, init looks at the /etc/inittab configuration file to find the default run level.
There are different run levels in Linux.
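The gunzip and cpio steps used earlier work because this initrd image is a gzip-compressed cpio archive. A minimal sketch of how one might check that from the file's leading bytes is shown below. The function name is my own; the sketch only recognizes the gzip magic bytes (1f 8b) and the cpio "newc" magic string (070701), and images built with other compressors would be reported as unknown.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"        # first two bytes of any gzip stream
CPIO_NEWC_MAGIC = b"070701"     # ASCII magic of the cpio "newc" format


def describe_initrd(data: bytes) -> str:
    """Guess the container format of an initrd image from its magic bytes."""
    if data.startswith(GZIP_MAGIC):
        inner = gzip.decompress(data)
        if inner.startswith(CPIO_NEWC_MAGIC):
            return "gzip-compressed cpio archive"
        return "gzip-compressed data with an unknown payload"
    if data.startswith(CPIO_NEWC_MAGIC):
        return "uncompressed cpio archive"
    return "unknown format"


# Synthetic stand-in for a real image: a fake cpio header, gzip-compressed.
fake_image = gzip.compress(CPIO_NEWC_MAGIC + bytes(104))
print(describe_initrd(fake_image))  # gzip-compressed cpio archive
```

On a real system you would pass the bytes of a file from /boot instead of the synthetic image used here.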
| Run-Level | Usage |
|---|---|
| 0 | System Halt/Shut Down |
| 1 | Single User Mode |
| 2 | Multiuser Mode Without Networking |
| 3 | Full Multiuser Mode |
| 4 | Unused |
| 5 | GUI/X11 |
| 6 | Reboot |
The /etc/inittab file contains the default run level like the one shown below.
```
id:3:initdefault:
```
That means we have run level 3 as the default run level. Once this is identified, the programs specific to that run level are started by init. This is the reason you have the following directories.
```
[root@localhost]# cd /etc/rc.d/
[root@localhost rc.d]# ll
drwxr-xr-x. 2 root root  4096 Oct  8 00:50 init.d
-rwxr-xr-x. 1 root root  2617 Jul 24 08:53 rc
drwxr-xr-x. 2 root root  4096 Oct  8 00:50 rc0.d
drwxr-xr-x. 2 root root  4096 Oct  8 00:50 rc1.d
drwxr-xr-x. 2 root root  4096 Oct  8 00:50 rc2.d
drwxr-xr-x. 2 root root  4096 Oct  8 00:50 rc3.d
drwxr-xr-x. 2 root root  4096 Oct  8 00:50 rc4.d
drwxr-xr-x. 2 root root  4096 Oct  8 00:50 rc5.d
drwxr-xr-x. 2 root root  4096 Oct  8 00:50 rc6.d
-rwxr-xr-x. 1 root root   499 Aug 13 10:55 rc.local
-rwxr-xr-x. 1 root root 19216 Jul 24 08:53 rc.sysinit
```
The folders rc0.d, rc1.d, rc2.d, etc. contain run-level-specific programs that are executed depending on the default run level set in your inittab configuration file.
Let's see what's inside these run level folders.
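The default run level lookup described above can be sketched in a few lines. This is only an illustration of the inittab line format (id:runlevels:action:process), not what init actually runs; the function name and the sample text are my own.

```python
def default_runlevel(inittab_text: str):
    """Return the run level named on the initdefault line, or None if absent."""
    for line in inittab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":", 3)  # id : runlevels : action : process
        if len(fields) == 4 and fields[2] == "initdefault":
            return fields[1]
    return None


sample_inittab = """\
# Default run level for this (hypothetical) machine
id:3:initdefault:
"""
print(default_runlevel(sample_inittab))  # 3
```

With the value "3" in hand, the matching directory from the listing above would be rc3.d.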
```
[root@localhost rc3.d]# ls
K01numad    K80kdump           S13cpuspeed    S28autofs
K01smartd   K84wpa_supplicant  S13irqbalance  S50choose_repo
K02oddjobd  K86cgred           S13rpcbind     S50cloud-init-local
K10psacct   K87restorecond     S15mdmonitor   S50mcelogd
```
If you look at the files inside these run-level-specific folders, their names begin with either S or K, followed by a number. Files beginning with S are started (run with the start argument) when the run level is entered, and files beginning with K are killed (run with the stop argument) when the system enters that run level, for example during shutdown.
The number after the S or K is the sequence in which they are executed.
Once init has started all the programs in your desired run level directory, you will get a login screen to log in to your booted system. Hope this article was helpful and easy to understand. Please let us know if you find a mistake in this article, so that we can correct it.
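The S/K naming and numbering scheme can be illustrated with a short sketch that turns a list of file names into an ordered plan. It is a simplification under my own assumptions: names follow the pattern of a letter, a two-digit number, and a service name, K entries are handled before S entries, and the function name is hypothetical.

```python
import re

# S or K, a two-digit sequence number, then the service name
RC_NAME = re.compile(r"^([SK])(\d{2})(.+)$")


def plan_runlevel_actions(entries):
    """Return (service, action) pairs in the order they would be processed."""
    actions = []
    # Plain lexical sorting puts K entries before S entries and orders
    # each group by its two-digit sequence number.
    for name in sorted(entries):
        match = RC_NAME.match(name)
        if match is None:
            continue  # ignore files that do not follow the naming scheme
        kind, _number, service = match.groups()
        actions.append((service, "start" if kind == "S" else "stop"))
    return actions


sample = ["S13cpuspeed", "K80kdump", "S28autofs", "K01numad"]
for service, action in plan_runlevel_actions(sample):
    print(service, action)
```

For the four sample names taken from the listing above, numad and kdump are stopped first, then cpuspeed and autofs are started.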