Case Study: Pintos Booting
Pintos is a simple operating system framework for the 80x86 architecture. It supports kernel threads, loading and running user programs, and a file system, but it implements all of these in a very simple way.
Loading
The loader is in threads/loader.S.
The PC BIOS loads the loader into memory from the first sector of the first hard disk, called the master boot record (MBR). The MBR contains the boot loader and the partition table.
The loader finds the kernel by reading the partition table on each hard disk and looking for a bootable partition of the type used for a Pintos kernel. It then loads the kernel into memory and jumps to it.
PC conventions reserve 64 bytes of the MBR for the partition table, and Pintos uses about 128 additional bytes for kernel command-line arguments. This leaves a little over 300 bytes for the loader's own code.
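The byte budget above can be checked with a little arithmetic. The following is a minimal sketch, not Pintos code: the 2-byte boot signature at the end of the sector comes from PC convention rather than from the text, the exact argument size of 128 bytes is an assumption (the text says "about 128"), and the function name is hypothetical.

```python
# Byte budget of the 512-byte MBR sector described above.
SECTOR_SIZE = 512          # one disk sector
PARTITION_TABLE_SIZE = 64  # reserved by PC convention (4 entries of 16 bytes)
BOOT_SIGNATURE_SIZE = 2    # 0x55 0xAA marker in the last two bytes (PC convention)
KERNEL_ARGS_SIZE = 128     # assumption: the text only says "about 128" bytes


def loader_code_budget(args_size: int = KERNEL_ARGS_SIZE) -> int:
    """Return the bytes left for loader code after the fixed MBR fields."""
    return SECTOR_SIZE - PARTITION_TABLE_SIZE - BOOT_SIGNATURE_SIZE - args_size


if __name__ == "__main__":
    print(loader_code_budget())  # 318
```

Under these assumptions the loader has 318 bytes for code, which agrees with "a little over 300 bytes".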
PC Bootstrap
Bootstrapping: the process of loading the operating system into memory for running after a PC is powered on.
A bootloader can be stored in memory, on a floppy disk, or on partitioned mass storage devices such as fixed disks or removable drives.
IA32 bootloaders generally have to fit within 512 bytes, the limit for a bootloader stored in a partition or on a floppy disk. A bootloader in the Master Boot Record (MBR) has to fit in an even smaller 436 bytes. The BIOS and bootloader are typically written in assembly.
The PC’s Physical Address Space
The 640KB area marked "Low Memory" was the only random-access memory (RAM) that Intel 8086 could use.
The 384KB area from 0x000A0000 through 0x000FFFFF was reserved by the hardware for special uses such as video display buffers and firmware held in non-volatile memory. The most important part of this reserved area is the BIOS.
Modern PCs still preserve the original layout of the low 1MB of physical address space to ensure backward compatibility with existing software, especially software written for the Intel 8086 with its 20 address lines.
Modern PCs therefore have a "hole" in physical memory from 0x000A0000 to 0x00100000, the range that 8086-era PCs used for devices. The hole divides RAM into "low" or "conventional memory" (the first 640KB) and "extended memory" (everything else). In addition, space at the very top of the modern PC's 32-bit physical address space, above all physical RAM, is now commonly reserved by the BIOS for use by 32-bit PCI devices.
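The sizes quoted in this section follow directly from the addresses. As a rough check, the sketch below recomputes them; the constant and function names are hypothetical and chosen only for illustration.

```python
KIB = 1024

ADDRESS_LINES_8086 = 20      # the 8086 has 20 address lines
LOW_MEMORY_END = 0x000A0000  # end of low memory, start of the reserved area
ONE_MIB = 0x00100000         # end of the reserved area, start of extended memory


def region_sizes_kib() -> dict:
    """Return the sizes, in KB, of the regions in the low 1MB."""
    return {
        "addressable": (1 << ADDRESS_LINES_8086) // KIB,
        "low_memory": LOW_MEMORY_END // KIB,
        "reserved_hole": (ONE_MIB - LOW_MEMORY_END) // KIB,
    }


if __name__ == "__main__":
    print(region_sizes_kib())
```

Twenty address lines give 1024KB of address space, which splits into 640KB of low memory and the 384KB reserved area.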
The 80386 and later CPUs place predefined values in the CPU registers after the computer resets.
Bootloader
Floppy and hard disks for PCs are divided into 512-byte regions called sectors.
If the disk is bootable, the first sector is called the boot sector, since this is where the boot loader code resides.
When the BIOS finds a bootable floppy or hard disk, it loads the 512-byte boot sector into memory at physical addresses 0x7c00 through 0x7dff, and then uses a jmp instruction to set the CS:IP to 0000:7c00, passing control to the boot loader.
IA32 bootloaders have the unenviable task of running in real mode (also known as 16-bit mode). In this mode, the segment registers are used to compute memory addresses with the following formula: address = 16 * segment + offset. In later chapters, we will see more details about addressing with segment registers.
The code segment CS is used for instruction execution. For instance, if the BIOS jumps to 0x0000:7c00, the corresponding physical address would be 16 * 0 + 0x7c00 = 0x7c00. Other segment registers include SS for the stack segment, DS for the data segment, and ES for data movement.
It should be noted that each segment is 64KiB in size. Since bootloaders often need to load kernels larger than 64KiB, they must carefully utilize the segment registers.
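To make the address formula and the 64KiB segment limit concrete, here is a minimal sketch. The function names are hypothetical; only the formula address = 16 * segment + offset comes from the text.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address of segment:offset in real mode: 16 * segment + offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return 16 * segment + offset


def segment_span(segment: int) -> tuple:
    """First and last physical address reachable through one segment value."""
    base = real_mode_address(segment, 0)
    return base, base + 0xFFFF


if __name__ == "__main__":
    # The BIOS jump target 0000:7c00 from the text.
    print(hex(real_mode_address(0x0000, 0x7C00)))  # 0x7c00
    # A different segment:offset pair naming the same physical byte.
    print(hex(real_mode_address(0x07C0, 0x0000)))  # 0x7c00
    # One segment covers exactly 64KiB.
    first, last = segment_span(0x1000)
    print(hex(first), hex(last))  # 0x10000 0x1ffff
```

Because the offset is only 16 bits wide, a loader that copies more than 64KiB has to change the segment register as it goes, for example by adding 0x1000 to it for each further 64KiB block.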
Physical Memory Map
Memory Range         Owner     Contents
00000000 - 000003ff  CPU       Real mode interrupt table.
00000400 - 000005ff  BIOS      Miscellaneous data area.
00000600 - 00007bff  --        ---
00007c00 - 00007dff  Pintos    Loader.
0000e000 - 0000efff  Pintos    Stack for loader; kernel stack and struct thread for initial kernel thread.
0000f000 - 0000ffff  Pintos    Page directory for startup code.
00010000 - 00020000  Pintos    Page tables for startup code.
00020000 - 0009ffff  Pintos    Kernel code, data, and uninitialized data segments.
000a0000 - 000bffff  Video     VGA display memory.
000c0000 - 000effff  Hardware  Reserved for expansion card RAM and ROM.
000f0000 - 000fffff  BIOS      ROM BIOS.
00100000 - 03ffffff  Pintos    Dynamic memory allocation.
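One way to read the map is as a lookup from a physical address to the region that owns it. The sketch below is illustrative only: the ranges are copied from the map as given (including the end address 00020000 of the page-table row), while the table name and the owner_of function are hypothetical.

```python
from bisect import bisect_right

# (start, end inclusive, owner, contents), copied from the memory map above.
MEMORY_MAP = [
    (0x00000000, 0x000003FF, "CPU", "Real mode interrupt table"),
    (0x00000400, 0x000005FF, "BIOS", "Miscellaneous data area"),
    (0x00000600, 0x00007BFF, "--", "---"),
    (0x00007C00, 0x00007DFF, "Pintos", "Loader"),
    (0x0000E000, 0x0000EFFF, "Pintos", "Stack for loader"),
    (0x0000F000, 0x0000FFFF, "Pintos", "Page directory for startup code"),
    (0x00010000, 0x00020000, "Pintos", "Page tables for startup code"),
    (0x00020000, 0x0009FFFF, "Pintos", "Kernel code and data"),
    (0x000A0000, 0x000BFFFF, "Video", "VGA display memory"),
    (0x000C0000, 0x000EFFFF, "Hardware", "Expansion card RAM and ROM"),
    (0x000F0000, 0x000FFFFF, "BIOS", "ROM BIOS"),
    (0x00100000, 0x03FFFFFF, "Pintos", "Dynamic memory allocation"),
]
STARTS = [row[0] for row in MEMORY_MAP]


def owner_of(addr: int):
    """Return (owner, contents) for a physical address, or None if unlisted."""
    i = bisect_right(STARTS, addr) - 1  # last row whose start is <= addr
    if i < 0:
        return None
    start, end, owner, contents = MEMORY_MAP[i]
    return (owner, contents) if addr <= end else None


if __name__ == "__main__":
    print(owner_of(0x7C00))   # ('Pintos', 'Loader')
    print(owner_of(0xB8000))  # ('Video', 'VGA display memory')
    print(owner_of(0x8000))   # None: between the loader and its stack
```

Addresses that fall between two listed ranges, such as 0x8000, return None because the map does not assign them.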
Low-level Kernel Initialization
The entry point for the kernel is the start() function in threads/start.S. This function's primary role is transitioning the CPU from 16-bit "real mode" to 32-bit "protected mode," which modern 80x86 operating systems use.
Key steps in this process include:
Determining Memory Size: The code first queries the BIOS for the PC's memory size and stores the result, in pages, in init_ram_pages.
Enabling the A20 Line: This step is required for accessing memory beyond 1 MB.
Introduction of A20 Line
80286 and Higher Addressing: On the 8086, with its 20 address lines, addresses past 1 MB wrapped around to the start of memory. The 80286 processor introduced a 24-bit address bus, allowing direct addressing of up to 16 MB of memory. This change meant that the address wraparound at the 1 MB boundary no longer occurred naturally.
Compatibility Issues: Many older programs written for the 8086 relied on this wraparound behavior for certain operations. On the 80286, this software would malfunction because addresses beyond 1 MB no longer wrapped around to the start of memory but continued into higher memory.
Solution – The A20 Line: To maintain backward compatibility, a switch for the A20 line was introduced. A20 is the 21st address line in the 80286 and later processors. When the A20 line is enabled, addresses can access memory beyond 1 MB. When it is disabled, the address line is forced low (0), so addresses just above 1 MB wrap around to the beginning of memory, simulating the 8086's behavior.
Creating a Basic Page Table: This table maps the first 64 MB of virtual memory, starting at virtual address 0, directly to the identical physical addresses. It also maps the same physical memory starting at virtual address LOADER_PHYS_BASE (default: 3 GB).
Final Preparations for Protected Mode: The startup code enables protected mode, turns on paging, and sets up the segment registers. Interrupts are disabled at this stage because the system isn't ready to handle them yet.
Initial Kernel Functions: The startup code then calls pintos_init().
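Two of the steps above, the A20 gate and the startup page table, come down to address arithmetic. The sketch below models both under stated assumptions: the A20 gate is treated as a mask on bit 20, and the page table is assumed to use standard 32-bit two-level paging with 4 KB pages and 1024 entries per table. All function names are hypothetical; none of this is Pintos code.

```python
A20_BIT = 1 << 20                # address line 20, the 21st line
LOADER_PHYS_BASE = 0xC0000000    # default of 3 GB, from the text
PAGE_SIZE = 4096                 # assumption: 4 KB pages
ENTRIES_PER_TABLE = 1024         # assumption: 1024 entries per page table
MAPPED_BYTES = 64 * 1024 * 1024  # the 64 MB mapped by the startup page table


def physical_with_a20(segment: int, offset: int, a20_enabled: bool) -> int:
    """Real-mode address, with line 20 forced low when the A20 gate is off."""
    linear = 16 * segment + offset
    return linear if a20_enabled else linear & ~A20_BIT


def startup_virtual_aliases(paddr: int) -> tuple:
    """The two virtual addresses that map to paddr in the startup page table."""
    if not 0 <= paddr < MAPPED_BYTES:
        raise ValueError("only the first 64 MB is mapped")
    return paddr, LOADER_PHYS_BASE + paddr


def page_indices(vaddr: int) -> tuple:
    """Split a virtual address into (page directory index, page table index)."""
    return vaddr >> 22, (vaddr >> 12) & 0x3FF


def page_tables_needed(nbytes: int = MAPPED_BYTES) -> int:
    """Number of page tables needed to map nbytes."""
    return nbytes // (PAGE_SIZE * ENTRIES_PER_TABLE)


if __name__ == "__main__":
    # ffff:0010 is the first address past 1 MB that real mode can form.
    print(hex(physical_with_a20(0xFFFF, 0x0010, a20_enabled=False)))  # 0x0
    print(hex(physical_with_a20(0xFFFF, 0x0010, a20_enabled=True)))   # 0x100000
    print([hex(v) for v in startup_virtual_aliases(0x7C00)])
    print(page_indices(LOADER_PHYS_BASE))  # (768, 0)
    print(page_tables_needed())            # 16
```

Under these assumptions, mapping 64 MB takes 16 page tables of 4 KB each, or 64 KB in total, which is consistent with the size of the page-table region starting at 00010000 in the memory map.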
High-level Kernel Initialization
In pintos_init():
Call bss_init(), which zeroes the kernel's BSS. In most C implementations, whenever you declare a variable outside a function without providing an initializer, that variable goes into the BSS.
Call read_command_line() to break the kernel command line into arguments, then parse_options() to read any options at the beginning of the command line.
Call thread_init() to initialize the thread system.
Initialize the console and print a startup message to the console.
Initialize the kernel’s memory system:
palloc_init() sets up the kernel page allocator, which doles out memory one or more pages at a time.
malloc_init() sets up the allocator that handles allocations of arbitrary-size blocks of memory.
paging_init() sets up a page table for the kernel.
Initialize the interrupt system:
intr_init() sets up the CPU's interrupt descriptor table (IDT) to ready it for interrupt handling.
timer_init() and kbd_init() prepare for handling timer interrupts and keyboard interrupts, respectively.
input_init() sets up to merge serial and keyboard input into one stream.
Start the scheduler with thread_start(), which creates the idle thread and enables interrupts.
serial_init_queue() switches to interrupt-driven serial port I/O mode.
timer_calibrate() calibrates the timer for accurate short delays.
If the file system is compiled in, initialize the IDE disks with ide_init(), then the file system with filesys_init().
Boot complete.
run_actions() parses and executes actions specified on the kernel command line.
Finally, if -q was specified on the kernel command line, call shutdown_power_off() to terminate the machine simulator. Otherwise, pintos_init() calls thread_exit(), which allows any other running threads to continue running.