
How the Linux Boot Process Actually Works


Press the power button on a Linux machine and you do not start Linux. You start a sequence of environments, each one building just enough state for the next one to exist. Firmware starts with no kernel. The bootloader starts with no userspace. The kernel starts with no real root filesystem. The initramfs starts with no final service graph. PID 1 starts with a half-built machine that still needs devices, mounts, sockets, sessions, and policy. By the time you reach a shell prompt or a display manager, control has crossed multiple trust boundaries, memory views, storage abstractions, and process trees.

This is the source of most boot confusion. People say "Linux will not boot" as if there is one boot component. There is not. A blank screen before GRUB, a Secure Boot failure in shim, a kernel panic after unpacking the initramfs, a missing /dev/mapper/cryptroot, a dead systemd-udevd, and a slow NetworkManager-wait-online.service all feel similar from the outside. Internally they belong to different owners, different logs, and different debugging tools.

The clean way to understand Linux boot is to ask one question at every moment: who currently owns the machine? When you know that, the next question becomes obvious. Which layer's assumptions are no longer true?

This article follows a modern x86-64 Linux system from power-on to a working login session. The main reference path is:

  • UEFI firmware
  • Secure Boot enabled
  • shim and GRUB
  • Linux kernel with EFI stub
  • initramfs in RAM
  • encrypted LUKS root on NVMe
  • systemd as PID 1

Where older BIOS systems, embedded boards, or cloud images differ, those differences will be called out explicitly. The point is not to document one distribution. The point is to build a model you can reuse when a machine in Athens drops into an initramfs shell, when a server in Frankfurt starts hanging after firmware updates, or when a laptop in Berlin suddenly takes twenty seconds longer to reach a login prompt.

Stage 0: The CPU Starts in Firmware, Not in Linux

On a normal PC the CPU leaves reset and executes firmware from flash. There is no kernel yet, no userspace, and often no normal RAM access in the way the operating system will later use RAM. On a modern machine the firmware is usually UEFI. Its responsibilities include:

  • bringing up the CPU and chipset
  • training DRAM
  • enumerating buses such as PCIe
  • building ACPI tables
  • exposing a memory map
  • choosing a boot entry from NVRAM

By the time Linux enters the picture, firmware has already decided a great deal about the machine's shape. Device paths exist. Boot entries exist. The display may be active. Storage may be reachable through firmware drivers. Secure Boot policy may already have rejected some binaries.

On many systems the firmware boot manager is reading variables like these:

BootCurrent: 0003
Timeout: 1 seconds
BootOrder: 0003,0000,0001,0002
Boot0000* Windows Boot Manager
Boot0001* UEFI OS
Boot0002* UEFI PXEv4
Boot0003* Debian

You can inspect those after boot with:

sudo efibootmgr -v

That command is useful because it exposes the state firmware used before the kernel ever started. If the wrong loader appears, the bug may sit in UEFI variables rather than anywhere in Linux.
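To read those variables the way the firmware boot manager does, it helps to print the entries in BootOrder sequence rather than numeric order. The sketch below is one way to do that. The function name boot_order_labels is made up for illustration, and it assumes the efibootmgr output layout shown above (with -v, it also assumes the device path follows the label after a tab).

```shell
# Print UEFI boot entries in the order firmware will try them.
# Reads efibootmgr output on stdin; prints "<id> <label>" per line.
boot_order_labels() {
    awk '
        # "BootOrder: 0003,0000,..." gives the sequence to follow
        /^BootOrder:/ { n = split($2, order, ",") }

        # "Boot0003* Debian" style lines give the label for each id
        /^Boot[0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f]/ {
            id = substr($1, 5, 4)
            label = $0
            sub(/^Boot[0-9A-Fa-f]+\*?[ \t]+/, "", label)  # drop "Boot0003* "
            sub(/\t.*$/, "", label)                       # drop device path (-v)
            labels[id] = label
        }

        END {
            for (i = 1; i <= n; i++)
                printf "%s %s\n", order[i], labels[order[i]]
        }
    '
}
```

Typical use would be sudo efibootmgr -v | boot_order_labels. The first line printed is the entry firmware tries first, which is the one to check when the wrong loader starts.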

At this stage Linux is only a file on disk, usually on the EFI System Partition. Firmware does not "know Linux". It knows it can load an EFI executable from a FAT filesystem and jump into it.

Stage 1: The First Executable Is Usually Not the Kernel

On a Secure Boot machine the first EFI executable is often shim, not GRUB and not the kernel. The firmware trusts keys stored in its databases. Consumer machines commonly trust Microsoft-signed EFI binaries by default. Linux distributions use that fact to insert a small signed bridge, shim, into the path.

The chain often looks like this:

  1. UEFI loads shimx64.efi from the EFI System Partition.
  2. shim verifies GRUB or another next-stage loader against an embedded certificate or Machine Owner Key set.
  3. GRUB verifies or loads the kernel and initramfs.
  4. Control moves into Linux.

The reason this extra stage exists is practical. Without it, every distribution would need its own key enrolled into firmware on commodity hardware. shim lets distributions work with the trust store that machines already ship with.

You can see the relevant EFI binaries on a typical Debian-family system:

ls -R /boot/efi/EFI

Typical output:

/boot/efi/EFI:
BOOT  debian  Microsoft
 
/boot/efi/EFI/debian:
grubx64.efi  mmx64.efi  shimx64.efi

If Secure Boot breaks, the symptom may look dramatic, but the root cause often is not. A revoked signing certificate, a replaced but unsigned GRUB binary, or a stale NVRAM boot entry can stop the chain before Linux has emitted a single kernel log line.

Stage 2: GRUB Chooses a Kernel, an Initramfs, and a Command Line

GRUB is more than a menu. It is a program loader, a filesystem reader, a configuration interpreter, and in some cases a crypto-aware pre-boot environment. On an ordinary Linux installation GRUB is responsible for:

  • reading grub.cfg
  • presenting entries or auto-selecting one
  • loading the kernel image into memory
  • loading the initramfs into memory
  • constructing the kernel command line
  • handing off boot parameters in the format the kernel expects

A menu entry may boil down to two important lines:

linux   /boot/vmlinuz-6.11.0 root=UUID=4d1d3f77-... ro quiet splash rd.luks.uuid=luks-1f2e...
initrd  /boot/initrd.img-6.11.0

The kernel command line matters more than most people realise. It tells Linux where the root filesystem lives, whether it should mount root read-only first, which console to use, whether to suppress boot output, whether to enter a different systemd target, and what early boot helpers must activate.

Common parameters include:

  • root=UUID=...
  • ro
  • rw
  • quiet
  • splash
  • console=ttyS0,115200
  • systemd.unit=rescue.target
  • rd.luks.uuid=...
  • resume=UUID=...
  • loglevel=7
  • init=/bin/sh

This is a direct control plane over early boot. If you add init=/bin/sh, Linux still boots, but the normal PID 1 handoff is replaced by a shell. If you add systemd.unit=emergency.target, the service graph changes before userspace fully comes up. If you remove quiet, the kernel becomes far more talkative, which is often the fastest route to identifying where boot stopped.

You can inspect the command line of the running system with:

cat /proc/cmdline

That file is often the shortest path from symptom to cause. If the root UUID is wrong, or the machine is still carrying an old resume= target from a deleted swap partition, the boot problem may have been baked in at the bootloader stage.
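When checking several parameters, pulling a single value out of the command line is less error-prone than reading the whole string by eye. The helper below is a minimal sketch. The name cmdline_get is hypothetical, and it assumes parameters are separated by whitespace with no quoted values containing spaces.

```shell
# Print the value of a kernel parameter from a command line string.
# $1 = parameter name, $2 = command line text
# Returns 0 if the parameter is present (bare flags print an empty line),
# 1 if it is absent. If the parameter repeats, the last occurrence wins.
cmdline_get() (
    set -f                      # no glob expansion while word-splitting $2
    key=$1 found=1 value=
    for word in $2; do
        case $word in
            "$key"=*) value=${word#"$key"=}; found=0 ;;
            "$key")   value=;                found=0 ;;
        esac
    done
    [ "$found" -eq 0 ] && printf '%s\n' "$value"
    exit "$found"
)
```

On a running system, cmdline_get root "$(cat /proc/cmdline)" prints the root specification the kernel was actually given, and cmdline_get resume "$(cat /proc/cmdline)" shows whether a resume target is still being carried.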

Stage 3: The Kernel Image Is Loaded, but Linux Still Is Not Ready

Once GRUB jumps into the kernel, Linux begins in a tiny architecture-specific entry path. The kernel image is usually compressed. The early entry code has to:

  • establish a stack
  • interpret boot parameters
  • decompress the real kernel image if needed
  • set up early page tables
  • discover the memory map
  • reserve memory regions used by firmware, the kernel image, and the initramfs

This is still not "normal kernel execution". The system is in a special early boot regime where many conveniences that later kernel code expects do not exist yet.

On x86-64 the EFI stub and setup code bridge the gap between bootloader conventions and what the real kernel wants. Early console output starts here if you have not silenced it. A stripped-down log may look like this:

[    0.000000] Linux version 6.11.0-amd64 ...
[    0.000000] Command line: root=UUID=... ro quiet splash
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] e820: [mem 0x0000000000000000-0x000000000009ffff] usable
[    0.000000] e820: [mem 0x0000000000100000-0x000000087b7fffff] usable
[    0.000000] NX protection: active
[    0.000000] SMBIOS 3.4 present.

The kernel is reading what firmware told it about the machine and turning that into internal structures it can trust. If this stage fails, you often see:

  • a panic early
  • hard hangs after "Loading initial ramdisk"
  • complaints about ACPI, APIC, EFI memory descriptors, or page tables

That is already a strong clue. If you can see early kernel logs, firmware and bootloader stages mostly succeeded.

Stage 4: Memory Management Comes Online Before Most Other Things

The kernel cannot do much without an allocator. Before filesystems, drivers, and process creation can become normal, Linux has to turn the firmware-provided memory map into allocator state it can use. Very early in boot the kernel:

  • identifies usable physical memory ranges
  • reserves regions that must not be overwritten
  • creates page-frame metadata
  • initialises the buddy allocator
  • sets up higher-level allocators for small kernel objects

This is one of the reasons boot can fail in ways that mention memory zones or allocation flags before anything "interesting" seems to have happened. The machine may still be only seconds from reset, but allocation invariants already matter.

A real early boot log often includes lines like:

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.000000]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.000000]   Normal   [mem 0x0000000100000000-0x000000087b7fffff]
[    0.000000] Built 1 zonelists, mobility grouping on.
[    0.000000] Kernel command line: ...

The presence of those lines tells you memory setup reached the point where the core allocator structures exist. If the machine dies before this area, the failure is even earlier, usually in firmware handoff or low-level architecture setup.

Stage 5: ACPI, CPU Topology, Interrupts, and the Machine Description

After the earliest entry code stabilises the environment, Linux begins understanding the machine in more practical terms. It reads and processes firmware-provided data structures such as ACPI tables, CPU topology information, interrupt controller layouts, PCI configuration mechanisms, and NUMA details.

This stage matters because the kernel is not booting on an abstract computer. It is booting on one specific machine with:

  • a specific APIC layout
  • specific power states and AML methods
  • specific PCIe root complexes
  • specific CPU and memory topology
  • specific quirks that firmware may or may not describe correctly

A large category of "Linux boot regression" issues on laptops sits here. A firmware update changes ACPI tables. A new kernel uses them differently. Suddenly suspend or a specific controller probe starts misbehaving during boot.

Typical messages:

[    0.126114] ACPI: PM-Timer IO Port: 0x1808
[    0.128322] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[    0.143017] smpboot: CPU0: Intel(R) Core(TM) ...
[    0.151001] x2apic enabled
[    0.173400] pci 0000:00:1f.2: AHCI controller

These are not background details. They determine whether later driver and scheduler work can proceed sanely.

Stage 6: Initcalls Build the Rest of the Kernel in Waves

Much of the kernel is registered as initcalls. Subsystems ask to be initialised at one of several ordered levels. The effect is a layered bring-up:

  • core kernel bits first
  • architecture-specific setup next
  • bus infrastructure after that
  • filesystems and device drivers later
  • late init routines at the end

This ordering exists because subsystems depend on one another, so the kernel cannot initialise them in arbitrary order. A block driver depends on lower-level bus infrastructure. Filesystem code depends on VFS and memory allocators. Networking depends on core socket and packet infrastructure.

This staged design is also visible in logs. You can often identify where the boot is hanging by the class of messages that appear last:

  • if you are still seeing ACPI and CPU lines, you are early
  • if you are seeing PCI and storage controller lines, bus setup is underway
  • if you are seeing filesystem registration and block devices, device init is progressing
  • if you are seeing service manager logs, the kernel stage is mostly complete

A rendered timeline makes this easier to see. systemd-analyze plot draws the boot phases as an SVG, with per-unit detail for the userspace portion:

systemd-analyze plot > boot.svg

For deeper kernel-focused inspection you can also boot with:

initcall_debug ignore_loglevel

Then the kernel logs the timing and completion of initcalls. This is noisy, but for certain regressions it is exactly the noise you need.
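One way to cut through that noise is to rank initcalls by how long they took. The sketch below assumes the log lines have the usual "initcall NAME+0x.../0x... returned N after M usecs" shape. The function name slowest_initcalls is made up for illustration.

```shell
# Print the slowest initcalls as "<usecs> <function>", slowest first.
# Reads kernel log text on stdin; $1 = how many to show (default 10).
slowest_initcalls() {
    awk '
        /initcall .* returned .* after [0-9]+ usecs/ {
            name = ""; usecs = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "initcall") name  = $(i + 1)
                if ($i == "after")    usecs = $(i + 1)
            }
            sub(/\+.*/, "", name)   # drop the +offset/size suffix
            print usecs, name
        }
    ' | sort -rn | head -n "${1:-10}"
}
```

After booting with initcall_debug, something like sudo dmesg | slowest_initcalls 15 gives a short list of candidates. A regression usually shows up as one entry that is far larger than it was on the previous kernel.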

Stage 7: The Initramfs Is a Whole Temporary Userspace

Many people speak as if the bootloader loads the kernel and then the kernel mounts the root disk. On many real systems there is a whole userspace environment in between. The initramfs is a compressed cpio archive unpacked into RAM. The kernel mounts it as the first root filesystem and executes /init from that environment.

This is not a trick or a minor helper. It is a temporary operating environment with its own jobs:

  • load kernel modules not built in
  • start userspace device management
  • discover block devices
  • unlock encrypted volumes
  • assemble RAID
  • activate LVM volume groups
  • mount the final root filesystem
  • hand off to the real root

You can inspect the initramfs content on a running machine:

lsinitramfs /boot/initrd.img-6.11.0-amd64 | less

You often find:

  • kernel modules
  • cryptsetup
  • udev rules
  • scripts for stages such as local-top or local-premount
  • hooks for resume, networking, mdraid, LVM

That list tells you what the distro expects to need before the normal root filesystem is available.

On a dracut system, the boot logic tends to be modular and event-driven. On initramfs-tools systems, the script layout differs. The purpose does not. The initramfs exists because the kernel alone often cannot mount the final root yet.

Stage 8: udev Makes Dynamic Devices Usable

The kernel can discover hardware, but userspace often provides the naming and policy layer that makes those discoveries practical. During initramfs boot, udevd or systemd-udevd receives uevents from the kernel and turns them into useful device nodes and symlinks.

Without that layer you might have a block device in the kernel sense, but not the device nodes and persistent symlinks that scripts depend on:

  • /dev/nvme0n1p2
  • /dev/disk/by-uuid/...
  • /dev/mapper/cryptroot
  • /dev/md0

This matters because early boot scripts often reference devices by UUID or mapper names rather than hard-coded kernel names. The initramfs waits for the right node to appear, then proceeds.

A common failure mode looks like this:

ALERT! UUID=4d1d3f77-... does not exist. Dropping to a shell!

That message means more than "the disk is gone". It can also mean:

  • the storage driver never loaded
  • the device was found but renamed differently
  • LUKS unlock failed so the mapped device never appeared
  • LVM was not activated
  • the wrong UUID was baked into the command line

At this point you are already in early userspace. Firmware, bootloader, and much of the kernel path succeeded. The machine is now failing while trying to construct the final storage view.
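From the initramfs shell, the first check is usually whether the path early userspace was waiting for exists at all. The sketch below maps a root= style specification to the udev symlink that normally represents it. The name spec_to_path is hypothetical, and labels containing unusual characters are not handled because udev encodes them in the symlink name.

```shell
# Translate a root=/resume= style device spec into the /dev path
# that udev normally provides for it. Plain paths pass through unchanged.
spec_to_path() {
    case $1 in
        UUID=*)      printf '/dev/disk/by-uuid/%s\n'      "${1#UUID=}" ;;
        PARTUUID=*)  printf '/dev/disk/by-partuuid/%s\n'  "${1#PARTUUID=}" ;;
        LABEL=*)     printf '/dev/disk/by-label/%s\n'     "${1#LABEL=}" ;;
        PARTLABEL=*) printf '/dev/disk/by-partlabel/%s\n' "${1#PARTLABEL=}" ;;
        *)           printf '%s\n' "$1" ;;
    esac
}
```

If the printed path does not exist, ls /dev/disk/by-uuid shows which UUIDs udev did create. An empty directory points at a missing driver or a dead udev, while a directory full of other UUIDs points at a wrong value on the command line.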

Stage 9: Encryption, LVM, RAID, and Other Storage Stacking

Modern Linux root filesystems often sit behind several layers:

  • NVMe or SATA block device
  • partition table
  • LUKS encryption
  • device mapper target
  • LVM logical volume
  • ext4, xfs, or btrfs filesystem

The initramfs is where these layers become usable. Consider an encrypted machine:

  1. the storage driver makes /dev/nvme0n1p3 visible
  2. cryptsetup prompts for or otherwise retrieves the key
  3. the kernel creates /dev/mapper/cryptroot
  4. LVM scans and activates volume groups if needed
  5. /dev/mapper/vg0-root appears
  6. the filesystem is mounted at /sysroot

When boot fails on encrypted systems, the exact missing layer matters. If the kernel cannot see the NVMe controller, the problem is in driver or hardware bring-up. If the NVMe device exists but cryptsetup fails, the problem is in initramfs scripts or key handling. If decryption works but the logical volume is absent, LVM activation is the suspect.

This is where an initramfs shell is valuable. You can inspect reality directly:

blkid
ls /dev
ls /dev/mapper
cat /proc/cmdline
modprobe nvme
cryptsetup luksOpen /dev/nvme0n1p3 cryptroot
lvm vgchange -ay

That is far more informative than calling the whole problem "a Linux boot issue".
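The same inspection can be phrased as a single question: which layer of the stack is the first one missing? The function below is a small sketch of that idea. Its name, first_missing_layer, is made up, and the caller is expected to list the device paths bottom-up.

```shell
# Print the first path in the argument list that does not exist and
# return 1, or print "all present" and return 0.
# Pass the storage layers bottom-up: partition, mapped device, LV.
first_missing_layer() {
    for path in "$@"; do
        if [ ! -e "$path" ]; then
            printf '%s\n' "$path"
            return 1
        fi
    done
    printf 'all present\n'
}
```

For the example stack above, the call would be first_missing_layer /dev/nvme0n1p3 /dev/mapper/cryptroot /dev/mapper/vg0-root. The path it prints names the layer to debug: the partition means driver or hardware, the mapper device means unlock, the logical volume means LVM activation.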

Stage 10: The Kernel and Initramfs Meet at switch_root

Once the initramfs has mounted the final root filesystem, the bootstrap environment is no longer needed as the main root. The handoff is usually done with switch_root or a similar mechanism.

A canonical form is:

switch_root /sysroot /usr/lib/systemd/systemd

This step is easy to underestimate. It is not a cosmetic change of directory. It is the moment the machine stops treating the RAM-backed bootstrap filesystem as its world and starts treating the real root filesystem as the system.

Before switch_root:

  • the machine is in early userspace
  • the initramfs owns the process tree
  • many mounts are temporary or preparatory

After switch_root:

  • the final root filesystem is /
  • PID 1 for the real system is about to start
  • the bootstrap environment can be discarded

If you land in an initramfs shell, the kernel has already succeeded in creating a userspace process. That narrows the fault domain sharply. The bug is somewhere between early userspace and the final root handoff.

Stage 11: PID 1 Starts, and Boot Stops Being Linear

After the root handoff, the initramfs (not the kernel) executes the init program from the real root, which takes over as PID 1. On most general-purpose Linux systems in 2026, that means systemd becomes PID 1.

This changes the shape of boot completely. Up to here, boot mostly looked like a chain. From here onward it looks like a dependency graph. systemd reads unit files and resolves relationships between:

  • mount units
  • service units
  • socket units
  • device units
  • timer units
  • target units

The machine moves through targets such as:

  • initrd.target
  • sysinit.target
  • basic.target
  • multi-user.target
  • graphical.target

What users feel as "boot time" is often mostly this graph settling. The kernel may be healthy while userspace still waits on:

  • filesystem checks
  • network readiness
  • storage timeouts
  • slow service startup
  • dependency cycles

A useful split is:

  • kernel time
  • userspace time

systemd-analyze shows both:

systemd-analyze

Example:

Startup finished in 7.214s (firmware) + 1.842s (loader) + 3.116s (kernel) + 6.908s (userspace) = 19.081s
graphical.target reached after 6.881s in userspace.

That line is gold. It tells you immediately whether the machine feels slow because of firmware, loader, kernel, or userspace.
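For fleet checks or scripts, the same judgement can be made mechanically. The sketch below picks the largest phase out of a "Startup finished in" line. The name slowest_phase is hypothetical, and it assumes each phase is printed as plain seconds or milliseconds, so durations that include a minutes component are undercounted.

```shell
# Print the slowest boot phase and its duration in seconds.
# Reads systemd-analyze output on stdin.
slowest_phase() {
    awk '
        /^Startup finished in/ {
            best = -1
            for (i = 1; i < NF; i++) {
                # A phase looks like: 7.214s (firmware)
                if ($(i + 1) !~ /^\(.*\)$/) continue
                if ($i ~ /^[0-9.]+ms$/)      secs = ($i + 0) / 1000
                else if ($i ~ /^[0-9.]+s$/)  secs = $i + 0
                else continue
                if (secs > best) {
                    best = secs
                    name = $(i + 1)
                    gsub(/[()]/, "", name)
                }
            }
            if (best >= 0) printf "%s %.3f\n", name, best
        }
    '
}
```

Running systemd-analyze | slowest_phase on the example above would name firmware as the largest contributor, which immediately rules out service tuning as the first fix.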

Stage 12: Device Units, Mount Units, and Socket Activation

Once systemd is in charge, the rest of boot is mostly about satisfying dependencies. A service rarely starts merely because another service "should go first". It starts because the units it depends on became active.

Examples:

  • a mount unit becomes active once the filesystem is available
  • a socket unit may activate before its service, allowing connections to queue
  • a device unit appears when udev and the kernel agree that a device exists
  • a service may depend on network-online.target, which can be much slower than network.target

This is one reason boot debugging changed over the last fifteen years. Old SysV init scripts encouraged thinking in serial order. Modern Linux userspace is more like a graph scheduler over units.

To see what actually blocked progress, use:

systemd-analyze critical-chain

To see what merely took time, use:

systemd-analyze blame

Those are different questions. A service can take ten seconds and still not control wall-clock boot time if nothing important waited on it.

Stage 13: Journald Turns Boot into a Single Timeline

One of the most useful things about a systemd machine is that kernel messages and userspace messages can be inspected together in the journal:

journalctl -b

This is often better than staring at console output because boot problems frequently straddle the kernel and userspace boundary. You may need to see:

  • kernel storage-driver messages
  • logs forwarded from the initramfs
  • systemd-udevd state
  • mount failures
  • login manager errors

Useful variants include:

journalctl -b -p warning
journalctl -b -u systemd-udevd
journalctl -b -u NetworkManager
journalctl -b -g cryptroot

If the machine reaches a rescue shell or later userspace, the journal is often the most complete story of what happened.

Stage 14: A Concrete End-to-End Boot Walk

Let us walk one realistic path on a laptop with:

  • UEFI firmware
  • shim and GRUB
  • Linux kernel 6.11
  • initramfs built by dracut
  • LUKS-encrypted root on NVMe
  • ext4 root filesystem
  • systemd

The flow looks like this:

  1. UEFI firmware finishes memory training, enumerates PCIe, validates Secure Boot state, and chooses the "Debian" NVRAM boot entry.
  2. Firmware loads shimx64.efi from the EFI System Partition.
  3. shim verifies and loads GRUB.
  4. GRUB reads grub.cfg, selects the default kernel, loads vmlinuz-6.11.0-amd64 and initrd.img-6.11.0-amd64, and passes root=UUID=... rd.luks.uuid=... ro quiet on the kernel command line.
  5. The kernel enters, sets up page tables, parses the memory map, builds the buddy allocator, initialises ACPI and CPU topology, and starts initcalls.
  6. The kernel unpacks the initramfs into RAM and executes /init.
  7. Early userspace starts udev, loads NVMe and dm-crypt modules if needed, waits for the root partition, prompts for the disk passphrase, and maps /dev/mapper/cryptroot.
  8. The initramfs mounts the ext4 root filesystem at /sysroot.
  9. switch_root replaces the RAM root with the real root and executes systemd as PID 1.
  10. systemd mounts filesystems, starts sockets, initialises services, and reaches graphical.target.
  11. A display manager launches and presents the login screen.

At every line in that sequence there is a different class of failure. If GRUB appears, firmware mostly worked. If the kernel banner appears, GRUB mostly worked. If an initramfs shell appears, the kernel mostly worked. If PID 1 starts but login does not appear, the root handoff mostly worked. This is how stage models turn vague symptoms into actionable debugging.

Stage 15: Common Failures by Owner

Firmware owner failures

Symptoms:

  • machine loops back to firmware setup
  • no GRUB menu
  • Secure Boot complaints before Linux starts

Likely causes:

  • wrong BootOrder
  • deleted or stale EFI path
  • revoked or unsigned next-stage binary
  • firmware bug after update

Useful tools:

  • firmware setup screen
  • efibootmgr -v
  • checking files under /boot/efi/EFI

Bootloader owner failures

Symptoms:

  • GRUB rescue prompt
  • missing menu entries
  • wrong kernel chosen
  • root UUID mismatch in config

Likely causes:

  • broken grub.cfg
  • outdated initramfs path
  • old disk UUID after cloning or repartitioning
  • missing filesystem modules in GRUB

Useful tools:

  • GRUB command line
  • /boot/grub/grub.cfg
  • cat /proc/cmdline after a successful boot

Kernel owner failures

Symptoms:

  • panic before initramfs shell
  • hard hang during ACPI or storage controller setup
  • no root-device driver

Likely causes:

  • bad kernel build
  • broken ACPI interaction
  • missing built-in or early-load module
  • architecture regression

Useful tools:

  • remove quiet
  • add loglevel=7 ignore_loglevel
  • serial console via console=ttyS0,115200

Initramfs owner failures

Symptoms:

  • dropped into emergency BusyBox shell
  • cryptroot missing
  • root UUID not found
  • RAID or LVM not assembled

Likely causes:

  • stale initramfs
  • missing module
  • changed device names
  • wrong passphrase or key retrieval path
  • bad command line

Useful tools:

  • lsinitramfs
  • manual cryptsetup
  • lvm vgchange -ay
  • blkid

PID 1 owner failures

Symptoms:

  • boot reaches userspace, then stalls
  • emergency target
  • login screen missing
  • network waits dominating boot

Likely causes:

  • failed unit dependency
  • broken mount
  • display manager failure
  • service timeout

Useful tools:

  • systemd-analyze
  • systemctl --failed
  • journalctl -b

Stage 16: Resume from Hibernate Is a Special Boot Path

Hibernate complicates the story because boot may include a resume attempt before normal userspace starts. If the kernel command line contains resume=UUID=..., the initramfs often checks the target swap area for a hibernation image.

If present, the machine may restore kernel and userspace memory state from disk rather than constructing everything from scratch. If absent or stale, the boot continues normally.

This creates a subtle failure class:

  • the machine appears to boot normally but wastes time probing a dead resume target
  • the machine fails resume and then falls back
  • a moved or recreated swap partition leaves a stale resume= parameter behind

These issues can add seconds to boot or create misleading hangs. Checking /proc/cmdline and the initramfs resume hooks often resolves them quickly.

Stage 17: Network Boot, Cloud Images, and Embedded Boards Change the Early Path

Not every Linux system follows the exact same pre-kernel story.

Network boot

Firmware or a bootloader may fetch the kernel and initramfs over the network through PXE, iPXE, HTTP, or similar mechanisms. The storage discovery problem shifts earlier. The initramfs may still need networking too if the real root is remote.

Cloud images

Virtual machines often use slim firmware and device sets. Boot may feel simpler because the hardware model is narrower. The kernel still goes through the same broad stages, but with fewer buses, fewer quirks, and a less elaborate device graph.

Embedded Linux

Many boards do not use UEFI and GRUB. They may use:

  • on-chip Boot ROM
  • SPL
  • U-Boot
  • kernel
  • initramfs or direct root mount

The vocabulary changes; the staged ownership model does not.

Stage 18: Containers Skip Boot, MicroVMs Compress It

A Linux container has no firmware stage of its own and no kernel bring-up stage of its own. It enters an already running kernel. The "boot" of a container is really process creation plus namespace and cgroup setup. That distinction matters because many people moving between container work and real-machine work forget how much machinery a full boot normally includes.

A microVM compresses the path. A Firecracker-style guest often has:

  • tiny virtual firmware or direct kernel loading
  • a minimal virtual hardware model
  • a small root filesystem
  • short userspace path

The same broad questions remain:

  • who loads the kernel
  • how does the kernel find root
  • what is the first userspace
  • what does PID 1 do

The timeline is shorter, not conceptually different.

Stage 19: BIOS and UEFI Still Produce Different Linux Failure Modes

Most new Linux installations use UEFI, but legacy BIOS boot is still operationally relevant: older hardware and some virtual machines still boot that way, and Linux tooling supports both worlds. A BIOS machine and a UEFI machine can both "fail to boot Linux" while requiring completely different recovery logic.

On a BIOS system:

  • firmware reads the first sector of the boot disk (the MBR) and jumps into it
  • the first-stage loader space is tiny
  • GRUB embedding rules matter
  • GPT layouts may need a BIOS boot partition

On a UEFI system:

  • firmware loads EFI executables from a FAT filesystem
  • boot entries live in NVRAM
  • the EFI System Partition is part of the design
  • Secure Boot can block the chain

The practical consequence is simple. The last visible failure point decides the right tool. If a machine in a lab in Vienna still boots in BIOS mode, efibootmgr is not your first move. If a laptop in Stockholm lost its UEFI entry after a firmware reset, MBR repair is not your first move. Same symptom category, different owner.
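From a running or rescue system, the quickest way to tell which world you are in is to check whether the kernel exposes EFI runtime data in sysfs. The sketch below does that. The function name boot_mode is made up, and the optional argument exists only so the check can be pointed at a different sysfs root. On non-x86 boards without EFI, the "bios" answer should be read as "not UEFI".

```shell
# Report whether the running kernel was booted through UEFI.
# $1 = sysfs mount point (defaults to /sys)
boot_mode() {
    if [ -d "${1:-/sys}/firmware/efi" ]; then
        echo uefi
    else
        echo bios
    fi
}
```

If boot_mode prints bios, efibootmgr has nothing to work with. If it prints uefi, the NVRAM entries and the EFI System Partition are the first things to inspect.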

Stage 20: The Initramfs Is Part of the System, Not a Disposable Side Product

Many people talk about the initramfs as if it were a generated blob that happens to exist. In practice it is part of the machine's real boot design. It decides what the system can do before the final root filesystem is available.

Typical rebuild commands:

sudo update-initramfs -u -k all
sudo dracut -f

Those commands determine:

  • which drivers are present in early userspace
  • whether cryptsetup support exists
  • whether LVM activation logic exists
  • whether hibernate resume logic exists
  • whether network-root support exists

If a new storage controller needs a module that never made it into the initramfs, Linux can fail before the final root is even visible. If the initramfs still carries an old root UUID or lacks the cryptsetup hook, the installed root filesystem may look fine on disk while the machine itself is unbootable.

This is one reason disciplined operators treat the kernel, modules, and initramfs as one deployable unit. Updating only one of them is often how a machine becomes "mostly updated" and still fails where it counts.
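A cheap guard against a "mostly updated" machine is to confirm that every installed kernel has its matching initramfs and module directory before rebooting. The sketch below assumes Debian-style file names (vmlinuz-VERSION and initrd.img-VERSION). Other distributions name these files differently, and the function name check_boot_set is hypothetical.

```shell
# For every kernel image in the boot directory, report whether a matching
# initramfs and module directory exist. Returns 1 if any set is incomplete.
# $1 = boot directory (default /boot), $2 = modules directory (default /lib/modules)
check_boot_set() {
    boot=${1:-/boot}
    mods=${2:-/lib/modules}
    status=0
    for kernel in "$boot"/vmlinuz-*; do
        [ -e "$kernel" ] || continue          # glob matched nothing
        ver=${kernel##*/vmlinuz-}
        if [ ! -e "$boot/initrd.img-$ver" ]; then
            echo "$ver: missing initramfs"; status=1
        elif [ ! -d "$mods/$ver" ]; then
            echo "$ver: missing modules"; status=1
        else
            echo "$ver: ok"
        fi
    done
    return "$status"
}
```

This only proves the three pieces exist for the same version. It does not prove the initramfs contains the right modules, which is what lsinitramfs is for.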

Stage 21: switch_root Separates Bootstrap from the Real System

The root handoff is not a small implementation detail. Before switch_root, the machine is still living inside a temporary bootstrap environment whose entire job is to make the final system possible. After switch_root, the machine has committed to the durable root filesystem that PID 1 will own.

That border is useful in debugging because it cleanly separates two broad classes of failure:

  • failures before switch_root, usually root discovery or assembly problems
  • failures after switch_root, usually PID 1, mounts, or later service-graph problems

That distinction saves time. An initramfs shell proves the kernel and early userspace are alive. A later emergency target proves the final root exists and the bug sits further along.

Stage 22: systemd Generators Make Part of the Boot Graph Dynamic

Not every unit that participates in boot comes from a static file an administrator wrote by hand. systemd also uses generators, small helper programs that inspect system state and create transient units or dependency fragments during boot.

Examples include:

  • fstab-derived mount generation
  • cryptsetup generation
  • GPT auto-discovery
  • command-line-driven unit generation

This means the boot graph depends on more than the static contents of /usr/lib/systemd/system or /etc/systemd/system. It also depends on:

  • current block-device layout
  • /etc/fstab
  • kernel command-line parameters
  • encryption and volume-manager state

Generated units often appear under:

/run/systemd/generator
/run/systemd/generator.late

This is valuable when a dependency seems to appear from nowhere. It usually did not. It was generated from current machine state at boot time.

Stage 23: emergency.target, rescue.target, and Initramfs Shell Mean Different Things

Operators often blur several degraded states together. They are not the same.

Initramfs shell

Early userspace started, but the handoff to the final root did not complete.

emergency.target

PID 1 started on the final root, but the machine entered a minimal emergency state because critical units or mounts failed.

rescue.target

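PID 1 started on the final root, mounted local filesystems, and completed basic system initialisation, then stopped at a single-user root shell. It is a reduced environment, but with more of normal userspace available than emergency.target.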

Each state proves different earlier stages already succeeded. That changes the right debugging tools immediately.

Stage 24: Serial Consoles and Early Output Paths Decide What You Can Actually Observe

A blank local screen does not always mean nothing happened. It may only mean the chosen console path is wrong or too late in the boot sequence.

Useful kernel parameters include:

console=ttyS0,115200
earlycon
loglevel=7
ignore_loglevel

These control where early output goes and how much of it you can see. On headless systems and cloud machines, serial output is often the most truthful view of boot. On systems with graphics issues, it can separate "the display path failed" from "the kernel never got this far".
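Because console= can appear more than once, it is worth listing every occurrence in order. Kernel messages go to all listed consoles, and the last one becomes /dev/console, so the order decides where userspace output lands. The helper below is a minimal sketch with a made-up name, cmdline_consoles, and it assumes whitespace-separated parameters.

```shell
# Print every console= value on a kernel command line, in order.
# $1 = command line text
cmdline_consoles() (
    set -f                      # no glob expansion while word-splitting $1
    for word in $1; do
        case $word in
            console=*) printf '%s\n' "${word#console=}" ;;
        esac
    done
)
```

On a running machine, cmdline_consoles "$(cat /proc/cmdline)" shows the configured consoles. No output means the kernel fell back to its default console, which on a headless machine may be one nobody can see.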

Stage 25: Firmware Time Can Dominate and Linux Cannot Fix That Internally

It is common to blame Linux for slow boot when firmware is actually consuming most of the time. Firmware may spend seconds on:

  • memory training
  • USB probing
  • Thunderbolt enumeration
  • TPM interaction
  • Secure Boot verification

If systemd-analyze reports firmware time as the biggest chunk, tuning userspace services is not the first lever to pull. The machine feels slow because Linux started late, not because Linux itself is necessarily inefficient.

Stage 25A: Bootloader Choice Changes the Recovery Surface

GRUB is common, but it is not the only Linux bootloader in active use. systemd-boot, rEFInd, U-Boot, iPXE, and vendor-specific chains all appear in real systems. The practical recovery surface changes with the loader.

With GRUB, you often debug:

  • grub.cfg
  • GRUB filesystem modules
  • menu entries
  • embedded core images on legacy systems

With systemd-boot, you often debug:

  • EFI stub kernels
  • loader entries under the ESP
  • NVRAM boot selection

With U-Boot, the relevant problems may live in:

  • boot scripts
  • environment variables
  • device-tree handoff
  • storage probing before the kernel starts

This is worth calling out because "the bootloader" is often treated as one interchangeable box. In practice the debugging route depends on which loader owns the machine after firmware hands off. A broken GRUB menu and a broken U-Boot script are both bootloader problems, but almost none of the repair steps overlap.

Stage 25B: The Root Filesystem Transition Often Includes a Read-Only Phase

Many Linux systems do not immediately mount the real root as read-write. A common pattern is:

  1. the initramfs mounts the final root read-only
  2. switch_root hands control to PID 1
  3. userspace later remounts / read-write

This gives the system a cleaner way to handle:

  • journal replay
  • integrity checks
  • early mount ordering
  • failure modes where the root should not be modified until more of the machine is stable

This detail matters because a machine may technically have mounted the root filesystem while still not being ready for services that expect a fully writable system view. If a service fails early with odd filesystem assumptions, the mount mode and remount sequence can be part of the explanation.
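One way to check the mount mode at a given moment is to read the options recorded for / in /proc/mounts. This is a sketch with a hypothetical function name, root_mount_mode, and it takes the file as an argument so it can be run against a saved copy.

```shell
# Report whether / is mounted read-only (ro) or read-write (rw).
# Input is a file in /proc/mounts format: device, mount point, type,
# options, and two trailing fields. The last entry for / wins because
# later mounts shadow earlier ones.
root_mount_mode() {
    mounts_file="${1:-/proc/mounts}"
    awk '$2 == "/" { opts = $4 }
         END {
             n = split(opts, o, ",")
             mode = "unknown"
             for (i = 1; i <= n; i++)
                 if (o[i] == "ro" || o[i] == "rw") mode = o[i]
             print mode
         }' "$mounts_file"
}
```

On systemd machines the remount is normally performed by systemd-remount-fs.service, so a root that stays ro long after boot is a reason to look at that unit and at the options for / in /etc/fstab. findmnt -no OPTIONS / gives the same information interactively.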

Stage 25C: Later Boot Delays Often Belong to One Unit, Not to "Linux"

Once PID 1 is in charge, a surprising number of "slow boot" complaints collapse into one slow edge in the unit graph:

  • NetworkManager-wait-online.service
  • remote mounts waiting for a server
  • a display manager retrying GPU state
  • a filesystem check on a slow disk
  • a service with an aggressive timeout

The right framing is not "Linux boot got slower". The right framing is "which unit is on the critical chain". systemd-analyze critical-chain is valuable because it turns a vague complaint into one concrete blocking path.

This is also why post-login inspection can mislead. A service that looks healthy after the machine settles may still have delayed boot badly because it had to wait for a dependency that is no longer visible as a problem after the fact.

Stage 25D: Root Device Naming Is Stable Only If You Ask for Stable Names

One of the oldest avoidable boot problems is relying on kernel device names that are convenient but not stable enough across hardware or timing changes.

Examples:

  • /dev/sda2 may become /dev/sdb2
  • NVMe device names such as /dev/nvme0n1 can change when controller enumeration order changes
  • USB storage order is notoriously fragile

Modern boot setups therefore usually prefer:

  • UUID=...
  • PARTUUID=...
  • /dev/disk/by-uuid/...
  • mapped device names such as /dev/mapper/cryptroot

The boot chain depends on those names at several stages:

  • the bootloader command line may point to root=UUID=...
  • the initramfs may wait for a specific LUKS UUID
  • /etc/fstab may define mounts by UUID
  • resume logic may reference a swap UUID

If any of those identifiers go stale after cloning disks, restoring images, replacing storage, or regenerating partitions, the machine can fail in ways that look like a missing driver even when the device is physically present. Stable naming is therefore part of boot reliability, not just administration style.
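A quick audit of the first of those stages, the root= parameter, can be sketched as follows. The function name classify_root_param and the three labels it prints are illustrative choices, and the list of kernel-name prefixes is an assumption that covers common cases rather than every driver.

```shell
# Classify the root= parameter of a kernel command line.
# Prints "stable <value>", "kernel-name <value>", "other <value>",
# or "missing". If root= appears more than once, the last one is used.
classify_root_param() {
    cmdline_file="${1:-/proc/cmdline}"
    root_param=$(tr ' ' '\n' < "$cmdline_file" | sed -n 's/^root=//p' | tail -n 1)
    case "$root_param" in
        "")
            echo "missing" ;;
        UUID=*|PARTUUID=*|/dev/disk/by-*|/dev/mapper/*)
            echo "stable $root_param" ;;
        /dev/sd*|/dev/hd*|/dev/vd*|/dev/nvme*|/dev/mmcblk*)
            echo "kernel-name $root_param" ;;
        *)
            echo "other $root_param" ;;
    esac
}
```

A result of kernel-name does not mean the machine is broken. It means the boot depends on enumeration order staying the same, which is exactly the assumption that fails after a disk is added or replaced.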

Stage 25E: Filesystem Checks and Journal Replay Are Real Boot Work

Not all storage-related boot delay belongs to discovery or decryption. Sometimes the root filesystem is found quickly, but the machine still waits because the filesystem itself has cleanup or verification work to do.

Typical examples:

  • ext4 journal replay after an unclean shutdown
  • fsck on a filesystem marked dirty
  • xfs log recovery
  • btrfs mount-time checks and replay work

These cases matter because the machine can appear to hang "after root was found". In reality, Linux is doing exactly what you want before exposing the system for broader write activity.

This is also why blindly forcing filesystem checks off is a bad operational reflex. A boot delay caused by replay or verification may be preserving consistency after a crash or power loss. The right response is to confirm:

  • which filesystem is involved
  • whether replay is expected
  • whether the delay is proportional to the amount of dirty state
  • whether storage performance made recovery slower than usual

At the ownership level, this is no longer a firmware or initramfs problem. It is storage integrity work happening at or just after the root mount boundary.
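To confirm whether replay or a check actually ran, the boot journal can be filtered for recovery-related lines. The sketch below uses a hypothetical function name, fs_recovery_lines, and its patterns are assumptions based on commonly seen message wording. Exact text varies by kernel version and filesystem, so treat a miss as inconclusive.

```shell
# Filter log text on stdin for lines that suggest journal replay,
# log recovery, or an fsck run. Matching is case-insensitive.
fs_recovery_lines() {
    grep -E -i 'EXT4-fs.*recover|XFS.*recovery|BTRFS.*replay|systemd-fsck' || true
}
```

Typical use on a systemd machine would be journalctl -b | fs_recovery_lines. The timestamps on the matching lines show how much of the delay belongs to recovery rather than to device discovery.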

Stage 25F: Network-Online Targets Confuse Many Otherwise Healthy Systems

network.target and network-online.target are not the same thing, and boot delays often come from units depending on the stronger one without really needing it.

network.target only means the network management stack has been started. It says nothing about whether any interface is configured, and its main practical use is ordering at shutdown.

network-online.target usually means the system believes a usable network configuration is actually available. That can require:

  • DHCP completion
  • carrier detection
  • Wi-Fi association
  • waiting out a timeout when the link or a remote dependency never responds

On laptops, cloud images, and servers with flaky links, this target can dominate later boot time if:

  • a unit depends on it unnecessarily
  • NetworkManager-wait-online.service or an equivalent wait unit is aggressive
  • the machine is expected to boot usefully even when the network is absent

This is why a machine can "boot slowly" while every kernel and root-mount stage is completely healthy. The delay belongs to a late policy choice in userspace, not to Linux boot in the broad sense.
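Finding the units that asked for the stronger target is usually the first step. The sketch below scans a directory of unit files for references to network-online.target. The function name units_needing_online is made up for illustration, and it only sees static files in the directory it is given, not generated units or drop-ins elsewhere.

```shell
# List unit files under a directory that order themselves after, or
# pull in, network-online.target. Defaults to /etc/systemd/system.
units_needing_online() {
    unit_dir="${1:-/etc/systemd/system}"
    grep -r -l -E '^(After|Wants|Requires)=.*network-online\.target' "$unit_dir" 2>/dev/null | sort
}
```

For the live view, including units from every search path, systemctl list-dependencies --reverse network-online.target answers the same question from the loaded unit graph.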

Stage 25G: Display Managers Fail Late Enough to Mislead People About Earlier Stages

If the graphical login screen never appears, many users conclude the whole machine failed to boot. Often the opposite is true. The kernel, root filesystem, and much of userspace may already be healthy. The failure may sit specifically in:

  • GPU driver initialisation
  • display manager startup
  • Wayland or Xorg session setup
  • PAM or seat management integration

That distinction changes the debugging path immediately. A system that reaches multi-user mode but fails to start the display manager is already far beyond initramfs and early service-graph problems.

Useful checks after reaching a text console include:

systemctl status display-manager
journalctl -b -u display-manager

and often GPU-related logs in the wider boot journal.

This matters because "black screen" is one of the least precise symptom descriptions in Linux boot work. It can mean:

  • firmware never left graphics init
  • the kernel never selected the expected console
  • userspace boot succeeded but the display manager failed
  • graphics came up and then the session crashed

The active owner at the moment of failure is what separates those cases.

Stage 25H: Immutable and Image-Based Systems Shift More Boot Logic Into Early Policy

Traditional Linux distributions assume a mutable root and a package-managed filesystem tree. Image-based and more immutable designs change that assumption. Systems using:

  • read-only roots
  • overlay filesystems
  • OSTree-style deployments
  • A/B slots on appliances

often push more policy into the bootloader, kernel command line, initramfs, or early systemd generators.

That changes recovery patterns. You may need to determine:

  • which deployment slot was selected
  • whether the overlay mounted successfully
  • whether the deployment metadata matched the kernel and initramfs
  • whether rollback logic selected an older tree

The useful lesson is broader than any one distribution style. Boot is always shaped by what "the final system" is supposed to look like. If the final system is image-based and partly immutable, the root handoff and early userspace policy become even more central than on a traditional mutable host.

Stage 25I: Many Boot Regressions Are Really Contract Mismatches Between Layers

The cleanest mental model for real failures is often "two neighbouring layers stopped agreeing".

Examples:

  • firmware says a boot entry exists, but the ESP contents no longer match
  • GRUB points to a kernel and initramfs pair that no longer belong together
  • the kernel command line still names a root UUID that has changed
  • the initramfs expects a storage module that is no longer present
  • PID 1 expects mounts or units derived from machine state that is no longer true

This framing is practical because it avoids vague blame. Instead of saying "Linux boot broke", ask which contract broke:

  • firmware to loader
  • loader to kernel
  • kernel to initramfs
  • initramfs to real root
  • real root to PID 1

That question is often enough to cut the search space from hundreds of files and services down to one boundary.

Stage 25J: Kernel, Modules, and Initramfs Versions Need to Stay in Lockstep

Another common source of hard-to-explain boot failure is version skew between the kernel image, the module tree, and the initramfs. On a healthy system those three pieces move together. When they do not, the boot can fail in confusing ways.

Examples:

  • GRUB loads a new kernel with an older initramfs
  • the initramfs expects modules from a different kernel release
  • /lib/modules/<version> does not match the kernel actually booted
  • out-of-tree drivers were built for the old kernel only

The symptoms depend on what mismatched first:

  • missing storage drivers in early boot
  • GPU or network drivers failing later in userspace
  • cryptsetup or mdraid support missing because the initramfs was built from stale state

This is why package managers and distro tooling are careful to regenerate boot artifacts as one transaction. If you ever have to debug a machine that looks inconsistent after an interrupted update, check:

uname -r
ls /lib/modules
cat /proc/cmdline

and compare the loaded kernel release to the installed module trees and initramfs names under /boot.
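That comparison can be sketched as a small check. The function name check_boot_artifacts is hypothetical, and the two initramfs file name patterns are assumptions that match common dracut and Debian-style layouts. Distributions that use unversioned image names will need a different pattern.

```shell
# Check that a kernel release has a matching module tree and initramfs.
# Arguments: release, module root, boot directory. All three default to
# the running system. Returns non-zero if either piece is missing.
check_boot_artifacts() {
    release="${1:-$(uname -r)}"
    modules_root="${2:-/lib/modules}"
    boot_dir="${3:-/boot}"
    status=0
    if [ -d "$modules_root/$release" ]; then
        echo "modules: ok"
    else
        echo "modules: missing $modules_root/$release"
        status=1
    fi
    if [ -e "$boot_dir/initramfs-$release.img" ] || [ -e "$boot_dir/initrd.img-$release" ]; then
        echo "initramfs: ok"
    else
        echo "initramfs: no image for $release in $boot_dir"
        status=1
    fi
    return "$status"
}
```

A missing module tree for the running release points at an interrupted update or a manual cleanup. A missing initramfs for an installed kernel explains why that menu entry fails while an older one still boots.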

Stage 26: Measuring Boot Properly

If you want to measure boot with discipline, split it into owners.

Firmware and loader time

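systemd-analyze reports firmware and loader time only when the platform exposes that timing data, which typically means a UEFI system. Where it is not available, the output starts at the kernel.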

Kernel time

systemd-analyze shows this directly. On systems that boot through an initramfs, time spent there is listed as a separate initrd phase.

Userspace critical path

Use:

systemd-analyze critical-chain
systemd-analyze blame

Full visual graph

Use:

systemd-analyze plot > boot.svg

Kernel-only detail

Use:

dmesg --ctime

Good measurement means not mixing these layers together. If firmware grew from 3 seconds to 12 seconds, tuning a userspace service is irrelevant. If userspace is slow because NetworkManager-wait-online.service takes 18 seconds, rebuilding the initramfs changes nothing.
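The per-owner split can be pulled out of the summary line that systemd-analyze prints. This is a sketch that assumes the usual "Startup finished in ... (firmware) + ... (kernel) + ... = ..." layout. The function name boot_phases is hypothetical.

```shell
# Read `systemd-analyze` output on stdin and print one "phase seconds"
# pair per boot owner, largest first. Only the first line is parsed.
boot_phases() {
    head -n 1 |
    sed -e 's/^Startup finished in //' -e 's/ = .*$//' |
    tr '+' '\n' |
    LC_ALL=C awk '
        {
            secs = 0
            phase = ""
            for (i = 1; i <= NF; i++) {
                # "(kernel)" style tokens name the phase; the rest are
                # durations such as 1min, 2.5s, or 300ms.
                if ($i ~ /^\(.*\)$/)          { phase = substr($i, 2, length($i) - 2) }
                else if ($i ~ /^[0-9.]+min$/) { secs += 60 * $i }
                else if ($i ~ /^[0-9.]+ms$/)  { secs += $i / 1000 }
                else if ($i ~ /^[0-9.]+s$/)   { secs += $i }
            }
            if (phase != "") printf "%s %.3f\n", phase, secs
        }' |
    LC_ALL=C sort -k2,2nr
}
```

Used as systemd-analyze | boot_phases, the first output line names the owner that consumed the most time, which is the layer worth investigating before any other.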

Stage 27: A Practical Debugging Playbook

When a Linux system fails to boot, use this order:

  1. Ask what was the last clearly visible owner.

    • firmware screen only
    • GRUB menu
    • early kernel logs
    • initramfs shell
    • systemd emergency mode
    • graphical login missing
  2. Remove output suppression.

    • remove quiet
    • add loglevel=7
  3. Confirm the kernel command line.

cat /proc/cmdline
  4. If the issue is storage-related, inspect the initramfs and try the steps manually.

  5. If the issue is after PID 1 starts, use the journal and systemd analysis tools.

  6. If the issue started after firmware updates, compare NVRAM entries and Secure Boot state.

This method sounds simple because it is simple. The hard part is resisting the urge to treat all boot failures as one class of bug.
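Step 2 of the playbook is a small text transformation on the kernel command line, sketched here so the result can be previewed before typing it into a bootloader edit screen. The function name verbose_cmdline is made up for illustration. It only prints the rewritten line and changes nothing on disk.

```shell
# Rewrite a kernel command line for a one-off diagnostic boot:
# drop "quiet" and any existing loglevel=, then append loglevel=7.
verbose_cmdline() {
    printf '%s\n' "$1" | awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "quiet" || $i ~ /^loglevel=/) continue
            out = out $i " "
        }
        print out "loglevel=7"
    }'
}
```

On a machine that still boots, verbose_cmdline "$(cat /proc/cmdline)" shows what the next diagnostic boot should use. Some distributions add their own splash-related parameters, which this sketch deliberately leaves alone.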

The Full Mental Model

Linux boot is a relay:

  • firmware owns hardware bring-up and the first trusted handoff
  • the bootloader owns kernel selection and boot parameters
  • the kernel owns early memory, topology, and driver bring-up
  • the initramfs owns root-device preparation
  • switch_root hands the machine into its final filesystem view
  • PID 1 owns service orchestration

Every layer leaves evidence in different places:

  • firmware variables and firmware screens
  • bootloader config and command line
  • early kernel logs
  • initramfs shell state
  • journal and unit graph

This is the reason strong Linux operators and kernel developers sound calm when a system "will not boot". They are not guessing. They are locating the current owner and then asking the owner-specific questions that make sense at that stage.

Boot is not a magical slide from off to on. It is a sequence of carefully staged environments, each one making the next one possible. Once you see the boundaries, the process stops looking mysterious and starts looking inspectable.

The companion lab renders that ownership model directly. You can step through firmware, GRUB, kernel early boot, initramfs, switch_root, and systemd, with the active log line and current owner changing at each stage. It is meant to teach the one habit that makes real boot debugging faster: always know who owns the machine right now.