How Android Flash Storage Actually Works, And Why You Cannot Really Delete A File
Open the back of an old laptop and you will find a hard drive with spinning platters and a tiny actuator arm. Ask the drive to overwrite a file and it will fly the head to the right track, find the right sector, and literally magnetise the same spot on the platter with new bits. The old data is gone in the most physical sense of the word. The atoms that carried it are now carrying something else.
Open a modern Android phone and there is no such thing. There is a package of silicon soldered to the mainboard, about the size of a SIM card, that holds the entire userland: the operating system, every installed app, every photo in the gallery, every WhatsApp message, every saved password, and the key material in the Trusted Execution Environment. Ask the phone to overwrite a file and almost nothing happens at the silicon level. The file system returns success, the app carries on, and the original bits are still sitting inside the chip, unchanged, waiting to be read back by anyone who can reach them.
That gap between "the file is gone" and "the bits are still there" is the story of this article. It is also the reason that the Android factory reset does not erase your data in any meaningful sense of the word. What it erases is a 256-bit key inside the Trusted Execution Environment, and once that key is destroyed the ciphertext in the flash chip becomes mathematically worthless. The data is still there, bit for bit. It is just permanently unreadable.
To understand why Android works this way, and why it is actually a reasonable design given the physics of the storage chip, you have to walk through the stack from the NAND die up to the file system. Every layer adds a constraint, and every constraint explains something about how a modern phone behaves. By the end of the walk it should be clear that flash storage is not a replacement for a hard drive. It is a completely different kind of device with its own rules, its own failure modes, and its own relationship to the concept of deletion.
The Chip On The Board: eMMC And UFS
The storage inside a phone is not a loose NAND die wired up to the SoC. It is a small self-contained subsystem with its own controller, its own DRAM cache, and its own firmware, presented to the rest of the phone through a standard interface. On Android the two interfaces that matter are eMMC and UFS.
eMMC (Embedded MultiMediaCard) is the older of the two. It is a direct descendant of the MultiMediaCard (MMC) standard, the same family the SD card grew out of, repackaged as a soldered ball grid array part. Under the hood it speaks a command set very similar to SD: the host issues a command like CMD24 (single block write), the card returns status, the host streams data over a parallel bus with a data-strobe signal. The bus is eight bits wide, the clock tops out at around 200 MHz in HS400 mode with data transferred on both clock edges, and the theoretical peak bandwidth is about 400 MB per second. In practice a mid-range eMMC part from 2020 will deliver around 300 MB per second on sequential read and around 150 MB per second on sequential write. Random I/O is much slower, usually under 50 MB per second for small blocks.
UFS (Universal Flash Storage) is the newer interface and has almost completely replaced eMMC in flagship phones. UFS borrows its command set from SCSI: the host sends SCSI command descriptor blocks over a full-duplex serial link built on the MIPI M-PHY physical layer and the UniPro link layer. The electrical layer runs as differential pairs, like PCIe, and each lane carries data independently in each direction. UFS 3.1, which is the dominant version in phones sold between 2021 and 2024, uses two lanes at HS-Gear4, and the raw line rate is 11.6 Gbps per lane per direction. UFS 4.0, shipping in 2023 and later flagships, doubles that to 23.2 Gbps per lane. Real-world sequential reads on a UFS 4.0 part reach about 4000 MB per second, and sequential writes reach about 2800 MB per second. That is not far from a desktop NVMe SSD from a few years ago.
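The peak figures quoted for both interfaces can be reproduced with a few lines of arithmetic. The sketch below is a rough estimate only: the bus width, clock and line rates come from the text, while the 8b/10b coding efficiency of 0.8 and the two-lane configuration for UFS 4.0 are assumptions made for illustration.

```python
# Back-of-envelope peak bandwidth for eMMC and UFS.
# The 0.8 coding efficiency (8b/10b) and the 2-lane UFS 4.0 setup are assumptions.

def emmc_peak_mb_s(bus_bits: int = 8, clock_mhz: float = 200.0, ddr: bool = True) -> float:
    """Parallel bus: bytes per transfer times transfers per second."""
    transfers_per_s = clock_mhz * 1e6 * (2 if ddr else 1)
    return bus_bits / 8 * transfers_per_s / 1e6


def ufs_peak_mb_s(lanes: int, gbps_per_lane: float, coding_efficiency: float = 0.8) -> float:
    """Serial lanes: raw line rate scaled by the assumed coding efficiency, per direction."""
    return lanes * gbps_per_lane * 1e9 * coding_efficiency / 8 / 1e6


emmc_hs400 = emmc_peak_mb_s()        # HS400: 8 bits, 200 MHz, both clock edges
ufs31 = ufs_peak_mb_s(2, 11.6)       # two lanes at HS-Gear4
ufs40 = ufs_peak_mb_s(2, 23.2)       # two lanes at the doubled line rate

print(f"eMMC HS400 peak: {emmc_hs400:.0f} MB/s")
print(f"UFS 3.1 peak:    {ufs31:.0f} MB/s per direction")
print(f"UFS 4.0 peak:    {ufs40:.0f} MB/s per direction")
```

Under these assumptions the UFS 4.0 ceiling sits a little above the real-world read figure of about 4000 MB per second, which is what you would expect once protocol overhead is taken into account.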
The reason UFS feels so much faster than eMMC is not only the raw bandwidth. It is the command queue. eMMC is fundamentally half-duplex and can have only a small number of commands in flight at once. UFS supports SCSI-style command queuing, meaning the host can enqueue 32 commands and the controller is free to reorder and overlap them. When Chrome is loading a web page and asking for 40 small images from the cache at the same time, UFS can pipeline all of them into the NAND channels in parallel. eMMC has to service them one at a time.
Both standards hide a lot of complexity behind that simple "block device" interface. The host sends logical block addresses (LBAs), just like on a SATA drive, and the controller translates those LBAs into physical NAND pages on the die. The host has no visibility into the real layout. It asks for LBA 0x123456, the controller looks up where that LBA is currently stored, reads it out, and returns the bytes. That lookup step is called the flash translation layer, and it is the single most important piece of firmware in any flash device.
Before the FTL makes sense, the NAND it is translating for has to make sense.
Inside NAND: Why You Cannot Overwrite
NAND flash is built from floating-gate transistors arranged in a grid. A single cell is a MOSFET with an extra insulated gate layer floating between the control gate and the channel. You program the cell by tunnelling electrons onto the floating gate through the thin oxide that separates it from the channel. The trapped electrons raise the threshold voltage of the transistor, and you read the cell back by checking whether the transistor conducts at a reference voltage. More trapped electrons means a higher threshold, means the cell reads as a 0. Fewer electrons means a 1.
The physical asymmetry of this process is where all of the weirdness comes from. Writing a cell is easy: you pulse a high voltage on the control gate and electrons tunnel in. Reading a cell is easy: you apply a small voltage and check conduction. But removing electrons is not easy. To clear a cell back to the erased state you have to reverse the electric field and pull the electrons out the other way, and that takes a large negative voltage applied across a much larger region of the die than a single cell. That region is the erase block, and it is the smallest unit you can reset to the all-ones state.
A typical 3D NAND die today has the following hierarchy: a cell holds one to four bits, a string of cells (64 to 176 cells, depending on how many layers the die has) runs vertically through the stack, a page is a horizontal slice through many strings and is typically 16 KiB of user data plus some spare area for error correction and metadata, and a block is a collection of pages (usually 256 or 512 pages per block, so four to eight mebibytes at 16 KiB per page). You can read a single page. You can program a single page, but only if it is currently erased. And you can erase an entire block, and only an entire block, at once.
That last rule is the killer. You cannot overwrite. If the host wants to update one byte inside a logical block, the controller cannot walk up to the NAND and say "change page 17 of block 42 from this to that". The NAND physically will not do it. The only operations the NAND supports are "read this page", "program this currently-erased page", and "erase this entire block". To change one byte the controller has to read the old page into its own DRAM buffer, modify the byte, find a currently-erased page somewhere else on the die, and program the modified data there. The old page is still sitting in block 42, untouched, with its original contents intact, because nobody has erased block 42 yet.
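The three permitted operations are easy to model. The following is a toy sketch, not a description of any real NAND interface: the class name `NandBlock`, the four-page block and the byte strings are all made up for illustration. It shows an in-place overwrite being refused and the read-modify-write detour the controller has to take instead.

```python
class NandBlock:
    """Toy erase block: pages can be read, programmed once, or erased all together."""

    def __init__(self, pages_per_block: int = 4):
        self.pages = [None] * pages_per_block  # None means "erased"
        self.erase_count = 0

    def read(self, page: int):
        return self.pages[page]

    def program(self, page: int, data: bytes) -> None:
        # Programming is only legal on an erased page.
        if self.pages[page] is not None:
            raise RuntimeError("page already programmed; erase the whole block first")
        self.pages[page] = data

    def erase(self) -> None:
        # Erase is all-or-nothing for the block and consumes one endurance cycle.
        self.pages = [None] * len(self.pages)
        self.erase_count += 1


old_block, spare_block = NandBlock(), NandBlock()
old_block.program(1, b"hello")

# Attempting to overwrite the page in place is refused.
try:
    old_block.program(1, b"jello")
    overwrite_refused = False
except RuntimeError:
    overwrite_refused = True

# The controller's workaround: read, modify in a buffer, program somewhere erased.
buf = bytearray(old_block.read(1))
buf[0:1] = b"j"
spare_block.program(0, bytes(buf))

print(overwrite_refused, old_block.read(1), spare_block.read(0))
```

After the update, the old page still reads back its original contents, which is the whole point of the section.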
This is the single fact that explains everything else about flash storage, file deletion, wear levelling, garbage collection, and why you cannot shred a file on Android. The chip cannot overwrite data in place. Every write is a write to a new location. Every "delete" just marks the old location as garbage. The only way to actually reset cells to the erased state is to erase an entire multi-megabyte block, and that is a destructive operation that can only be done so many times before the oxide layer wears out and the block can no longer hold a reliable charge.
The Flash Translation Layer
The flash translation layer is the firmware running on the storage controller that hides all of this ugliness from the host. The host thinks it is talking to a normal block device with stable LBAs that can be overwritten in place. The FTL is the fiction that makes that illusion hold.
At the heart of the FTL is a mapping table. For every logical block address the host knows about, the table stores the current physical location on the NAND: which die, which plane, which block, which page. When the host writes to LBA 0x123456, the FTL:
- Picks a currently-erased page on some block somewhere in the device (this is called the active block for writes).
- Programs the new data into that physical page.
- Updates the mapping table so that LBA 0x123456 now points to the new physical location.
- Marks the old physical page as stale.
The old page is not touched. It still holds the previous contents of that LBA. The FTL has simply stopped pointing at it. From the host's perspective the LBA has been updated atomically. From the NAND's perspective a new page has been programmed and an old page has been logically abandoned.
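The four steps of the write path can be sketched in a few lines. This is a deliberately minimal model under stated assumptions: `TinyFTL` is a hypothetical name, the device is a flat list of eight pages with no blocks, dies or planes, and there is no garbage collection or power-loss journal.

```python
class TinyFTL:
    """Minimal logical-to-physical mapping with out-of-place writes."""

    def __init__(self, num_pages: int = 8):
        self.nand = [None] * num_pages  # physical pages; None means erased
        self.l2p = {}                   # mapping table: LBA -> physical page number
        self.stale = set()              # physical pages no LBA points at any more
        self.next_free = 0              # next erased page in the active block

    def write(self, lba: int, data: bytes) -> int:
        if self.next_free >= len(self.nand):
            raise RuntimeError("no erased pages left; garbage collection needed")
        ppn = self.next_free            # 1. pick a currently-erased page
        self.nand[ppn] = data           # 2. program the new data there
        self.next_free += 1
        old = self.l2p.get(lba)
        self.l2p[lba] = ppn             # 3. repoint the LBA at the new page
        if old is not None:
            self.stale.add(old)         # 4. mark the old page stale (not erased)
        return ppn

    def read(self, lba: int) -> bytes:
        return self.nand[self.l2p[lba]]


ftl = TinyFTL()
ftl.write(0x123456, b"version 1")
ftl.write(0x123456, b"version 2")

print(ftl.read(0x123456))   # what the host sees
print(ftl.nand[0])          # what is still physically on the NAND
```

The host only ever sees the second version, while the first one remains in physical page 0 until something erases it.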
The mapping table is large. A typical UFS 3.1 part with 256 GiB of user capacity has to map about 64 million 4 KiB logical blocks. Even at four bytes per entry that is 256 MiB of pure metadata. The controller cannot keep all of that in its SRAM, so it keeps the hot parts in DRAM and pages the cold parts into NAND, with its own mini-journal to recover the mapping table after a power failure. A well-designed FTL uses a mix of fine-grained mapping for frequently-updated regions and coarse-grained mapping (64 KiB or 256 KiB granularity) for cold regions, to keep the resident working set small.
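The table-size arithmetic is worth checking, because it also shows why coarse-grained mapping is attractive. The helper below is a small illustrative calculation; the four-byte entry size is the figure from the text, and the function name is hypothetical.

```python
KIB, MIB, GIB = 1024, 1024**2, 1024**3


def mapping_table_bytes(capacity_bytes: int, granularity: int = 4 * KIB, entry_bytes: int = 4) -> int:
    """One fixed-size entry per mapped unit of the given granularity."""
    return capacity_bytes // granularity * entry_bytes


entries = 256 * GIB // (4 * KIB)
fine = mapping_table_bytes(256 * GIB)                           # 4 KiB granularity
coarse = mapping_table_bytes(256 * GIB, granularity=256 * KIB)  # 256 KiB granularity

print(f"{entries:,} entries")
print(f"fine-grained table:   {fine // MIB} MiB")
print(f"coarse-grained table: {coarse // MIB} MiB")
```

Mapping the whole device at 256 KiB granularity shrinks the table by a factor of 64, at the cost of more read-modify-write work whenever a small region inside a coarse unit changes.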
The performance of the device depends on the cleverness of the FTL far more than on the raw NAND speed. A dumb FTL that picks physical pages in address order and never coalesces writes will thrash the garbage collector to death, stall the host for hundreds of milliseconds at a time, and wear out blocks unevenly so that the device dies years before its theoretical life. A good FTL, like the ones shipping in Samsung and Micron UFS parts today, manages to keep write amplification close to 1.0x on light, mostly sequential workloads and around 1.5x on phone workloads, while still spreading wear evenly across the die.
Garbage Collection And The Write Amplification Problem
As the device runs, stale pages accumulate. Every file modification, every log rotation, every overwrite, leaves a trail of pages marked "no longer in use" inside otherwise-full erase blocks. After a few hours of real use, a fresh phone will have a patchwork of blocks where, say, 60 percent of the pages are live and 40 percent are stale. That stale space cannot be reused until the entire block is erased. And the block cannot be erased because it still holds 60 percent live data.
Garbage collection is the process that solves this. The FTL picks a victim block with a lot of stale pages, reads the live pages out of it into a buffer, writes those live pages back into a different, currently-erased block, updates the mapping table to point at the new location, and then erases the victim block. The victim is now available again as a fresh write target.
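A greedy version of that procedure fits in a short sketch. Everything here is a simplification for illustration: blocks are plain Python lists, a page is either a payload string, the marker `"stale"`, or `None` for erased, and the mapping-table update is left out. Real controllers use more elaborate victim-selection policies.

```python
def pick_victim(blocks):
    """Greedy policy: the block with the most stale pages is the cheapest to reclaim."""
    return max(range(len(blocks)), key=lambda i: sum(p == "stale" for p in blocks[i]))


def garbage_collect(blocks, spare):
    """Copy live pages from the victim into `spare` (an erased block), then erase the victim.

    Returns (victim index, number of live pages that had to be copied).
    A real FTL would also update the mapping table for each moved page.
    """
    v = pick_victim(blocks)
    live = [p for p in blocks[v] if p not in ("stale", None)]
    for i, page in enumerate(live):
        spare[i] = page                      # rewrite live data elsewhere
    blocks[v] = [None] * len(blocks[v])      # block erase: every page reset at once
    return v, len(live)


blocks = [
    ["a", "stale", "b", "c"],
    ["stale", "stale", "d", "stale"],
    ["e", "f", "g", "stale"],
]
spare = [None] * 4

victim, copied = garbage_collect(blocks, spare)
print(victim, copied, spare, blocks[victim])
```

Here the middle block is chosen because three of its four pages are stale, so reclaiming it costs a single page copy.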
This is where write amplification comes from. To erase one block and reclaim its stale space, the controller has to rewrite all the live pages first. If the victim block is 60 percent live, erasing it frees only 40 percent of a block for new host data while costing 60 percent of a block in copying, so every byte of reclaimed space costs the NAND 0.6 / 0.4 = 1.5 bytes of rewriting on top of the byte the host is writing. The effective write cost is 1 + 1.5 = 2.5 times the host write volume, so in steady state the NAND is doing 2.5x the work the host thinks it is doing. On a heavily fragmented device, where victim blocks hold even more live data, the amplification can climb to 3x or 4x.
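The relationship between how full the victim blocks are and how much extra work the NAND does can be written down directly. The sketch below uses a simple steady-state model, which is an assumption: every victim block has the same live fraction, and every reclaimed byte is immediately filled with host data. Under that model the write amplification is 1 / (1 - live fraction).

```python
def copy_cost_per_reclaimed_byte(live_fraction: float) -> float:
    """Bytes of live data rewritten for each byte of space that erasing the victim frees."""
    return live_fraction / (1.0 - live_fraction)


def write_amplification(live_fraction: float) -> float:
    """NAND bytes written per host byte, assuming every victim has this live fraction."""
    if not 0.0 <= live_fraction < 1.0:
        raise ValueError("live_fraction must be in [0, 1)")
    return 1.0 + copy_cost_per_reclaimed_byte(live_fraction)


for live in (0.0, 0.25, 0.5, 0.6, 0.75):
    print(f"{live:.0%} live -> {write_amplification(live):.2f}x")
```

A block that is entirely stale costs nothing to collect, which is why controllers work so hard to group data with similar lifetimes into the same block.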
This matters for two reasons. First, the NAND has a limited number of erase cycles per block, typically 1,000 for QLC, 3,000 for TLC, 10,000 for MLC, and 100,000 for the ancient SLC that nobody uses in phones. Write amplification directly eats those cycles. A phone that writes 10 GB a day to a 256 GB TLC part at an amplification factor of 2.0 puts 20 GB a day onto a NAND budget of 256 GB × 3,000 cycles = 768,000 GB, which works out to roughly a hundred years if wear is perfectly even. That is comfortable, but the margin shrinks in direct proportion as amplification rises or the cycle rating falls, and it is not infinite. Second, the garbage collection itself uses NAND channels that the host is trying to use, which shows up as latency spikes during heavy use. Anyone who has seen an Android phone stutter in the Play Store while updating apps in the background is looking at garbage collection contention.
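The endurance budget is simple enough to compute for any combination of capacity, cycle rating and workload. The function below is an idealised estimate: it assumes perfectly even wear and ignores spare capacity, retired blocks and retention effects, so treat the output as an upper bound rather than a prediction.

```python
def lifetime_years(capacity_gb: float, pe_cycles: int,
                   host_gb_per_day: float, write_amp: float) -> float:
    """Years until every block has used its rated erase cycles, with perfectly even wear."""
    total_nand_writes_gb = capacity_gb * pe_cycles
    nand_gb_per_day = host_gb_per_day * write_amp
    return total_nand_writes_gb / nand_gb_per_day / 365.25


tlc = lifetime_years(256, 3_000, host_gb_per_day=10, write_amp=2.0)
qlc_heavy = lifetime_years(256, 1_000, host_gb_per_day=10, write_amp=4.0)

print(f"TLC, WA 2.0: {tlc:.0f} years")
print(f"QLC, WA 4.0: {qlc_heavy:.0f} years")
```

Dropping from TLC to QLC and doubling the amplification cuts the budget by a factor of six, which illustrates how quickly the comfortable margin can erode.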
The FTL tries to minimise write amplification by grouping writes that are likely to change together onto the same erase block. A clever controller keeps several active write streams going at once: one for hot metadata, one for warm file data, one for cold user files. If the hot stream ends up with a lot of stale pages, the victim block is cheap to collect because most of its pages are already stale. If the cold stream fills a block with long-lived data, that block rarely needs collection at all.
Wear Levelling: Keeping The Die Alive
Erase cycles are not just finite. They are finite per block. Every block on the die has its own counter, and once that counter reaches the endurance limit the block starts to produce uncorrectable errors and has to be retired. If the FTL were naive and always wrote new data to the same few blocks, those blocks would wear out within weeks while the rest of the die stayed nearly pristine.
Wear levelling is the policy that prevents this. The goal is simple: keep the erase counts across all blocks roughly equal, so that the die ages uniformly. There are two flavours of wear levelling, dynamic and static.
Dynamic wear levelling is the easy one. When the controller needs an erased block to write into, it picks the currently-erased block with the lowest erase count. This spreads hot writes across the available pool. It works well for data that changes frequently, because the pool of blocks rotates naturally as pages are invalidated and reclaimed.
Static wear levelling is harder and more important. Imagine a block that contains cold data: the manufacturer's firmware image, a bootloader stage, an installed app you never update. That block is completely full of live pages and never gets invalidated. Dynamic wear levelling will never touch it. Its erase count stays at zero while the hot blocks climb into the thousands. Over time the hot blocks wear out and the cold blocks remain untouched, and the effective endurance of the device collapses because only the hot region is really in use.
To fix this, the controller periodically forces a rewrite of cold blocks. It picks a block with a low erase count, copies its contents to a different physical block, ideally one that is already heavily worn, and erases the original. Now the original block enters the free pool and can be used for hot data, while the cold data sits on a block that will not be asked to do much more work. This costs write amplification (you are rewriting data that the host did not ask to rewrite), but it buys you uniform ageing across the die. A good controller does this in the background, during idle windows, so the host never sees it as a stall.
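Both policies reduce to small selection rules over the per-block erase counters. The sketch below is illustrative only: the function names, the dictionary of counters and the threshold are hypothetical, and real firmware tunes these decisions with far more state.

```python
def pick_write_target(free_blocks, erase_counts):
    """Dynamic wear levelling: among erased blocks, take the least-worn one."""
    return min(free_blocks, key=lambda b: erase_counts[b])


def pick_static_candidate(cold_blocks, erase_counts, threshold):
    """Static wear levelling: return the least-worn cold block if it lags the
    most-worn block in the device by more than `threshold` cycles, else None."""
    if not cold_blocks:
        return None
    coldest = min(cold_blocks, key=lambda b: erase_counts[b])
    if max(erase_counts.values()) - erase_counts[coldest] > threshold:
        return coldest
    return None


erase_counts = {0: 1200, 1: 900, 2: 0, 3: 1500}   # block id -> erase cycles used
free_blocks = [0, 1, 3]                           # currently erased
cold_blocks = [2]                                 # full of live, never-rewritten data

target = pick_write_target(free_blocks, erase_counts)
to_relocate = pick_static_candidate(cold_blocks, erase_counts, threshold=1000)
left_alone = pick_static_candidate(cold_blocks, erase_counts, threshold=2000)

print(target, to_relocate, left_alone)
```

The threshold is the knob that trades extra background writes against how far the erase counts are allowed to drift apart.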
On a phone, both kinds of wear levelling run continuously. The controller is always shuffling pages around, even when the screen is off. Some of the battery drain attributed to "standby" on modern phones is really the storage controller doing background maintenance on the NAND.
Why Android Picked f2fs
On top of the FTL the kernel still runs a file system, because the host needs to organise data into files, directories, inodes, permissions, timestamps, and all the usual Unix abstractions. Early Android used ext4, inherited from its Linux desktop heritage. Starting around Android 7, Google began shipping f2fs (Flash-Friendly File System) on some devices, and by Android 10 it was the default on most flagships. Pixel phones have shipped f2fs as the userdata partition file system since the Pixel 3.
ext4 was never designed for flash. It is a journaling file system that updates metadata in place, overwrites inodes, and assumes that writing to the same logical block repeatedly is cheap. None of that is true on NAND. Every metadata update turns into a page rewrite at the FTL level, every in-place journal commit triggers garbage collection, and the write pattern ends up scattered across the die in a way that maximises write amplification.
f2fs is a log-structured file system designed specifically for the underlying physics of flash. Instead of updating blocks in place, it treats the storage as a circular log and always appends new writes to the head. Old blocks become invalid as the log moves on, and a background cleaner reclaims them, much like an FTL garbage collector but at the file system layer. The design philosophy is: "make every write sequential, let the FTL do its job, and never fight the garbage collector".
The file system divides the disk into segments of 2 MiB each (matching a typical NAND erase block), groups segments into sections and zones, and maintains six separate logs for different kinds of data: hot node (frequently updated inodes), warm node, cold node (rarely updated inodes), hot data (frequently updated file contents), warm data, cold data. Every write is classified by temperature and sent to the matching log. Hot writes invalidate each other quickly, so their segments become garbage-rich fast and are cheap to reclaim. Cold writes accumulate in segments that almost never need cleaning.
The benefit is visible in benchmarks. On a fresh Pixel with a UFS 3.1 part, f2fs random write throughput is roughly 30 percent higher than ext4 on the same hardware, and the write amplification at the FTL level is substantially lower, because f2fs is already doing some of the appending work before the writes even reach the controller. The cost is complexity. f2fs has had a string of corruption bugs and power-loss recovery quirks over the years, and a few high-profile data-loss incidents on Samsung devices around 2018 were traced back to f2fs interacting badly with specific UFS firmware. Both sides have been hardened since, but the file system still feels a little younger than ext4.
f2fs also supports something called atomic writes, which let an application mark a group of writes as all-or-nothing. The file system stages them in a separate log and either commits the whole group or discards it, which is useful for databases like SQLite (the backing store for almost every Android app's local state). That feature, added in f2fs 1.13, cut SQLite transaction latency on Android roughly in half when it was introduced.
The TRIM Command And Why It Matters On Phones
The FTL has a fundamental visibility problem. When the host deletes a file, all the file system does is update some metadata to mark the blocks as free. It does not tell the storage device that those blocks no longer contain useful data. From the controller's point of view, those blocks are still live. It will happily rewrite them during garbage collection, preserve them through wear levelling, and generally treat them as precious. For weeks. Until the host eventually overwrites them with something else, at which point the FTL finally learns that the old contents are garbage.
This is exactly the worst case for write amplification. The FTL is moving dead data around because it does not know the data is dead.
TRIM is the fix. TRIM (on ATA), UNMAP (on SCSI and UFS), and the Dataset Management Deallocate command (on NVMe) are all variants of the same idea: a command the host sends to the device saying "these LBAs no longer contain data I care about, feel free to drop them". When the device receives a TRIM, the FTL can mark the corresponding physical pages as stale immediately, without waiting for the host to overwrite them. The next time garbage collection runs, those pages are cheap to reclaim because they are already dead.
Android runs TRIM on a schedule. The fstrim utility walks the file system's free block bitmap and issues discard commands for every range of free blocks it finds. On most devices this runs once a day, usually while the phone is charging and idle, and you can see it in the system logs as a line like fstrim: /data: .... On fast UFS devices the whole pass takes a few seconds. On eMMC it can take longer, sometimes minutes, which is part of the reason it runs overnight.
There is also a mount option called discard that tells the file system to issue TRIM inline, as soon as blocks are freed, instead of waiting for the periodic batch job. Android used to disable this on ext4 because the inline discard stall was too visible on slow storage. On modern UFS parts and with f2fs the inline discard is fast enough to be acceptable, and Android has been slowly re-enabling it by default.
TRIM is not a security feature. After a TRIM command the LBA will usually read as zeroes, but the underlying NAND cells are not erased immediately. The FTL has only updated its own metadata. The physical cells still hold the old contents and will continue to hold them until garbage collection eventually gets around to reclaiming the block. If you trim a file and then rip the NAND die off the phone and read it directly with a flash programmer, the trimmed data is still sitting there. Which brings us to the real question.
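The gap between what the host reads after a TRIM and what the NAND still holds can be shown with a toy model. As before, this is a sketch under simplifying assumptions: `TrimFTL` is a hypothetical class, the NAND is an append-only list of pages, and a trimmed LBA is assumed to read back as zeroes.

```python
class TrimFTL:
    """Toy FTL where trim only drops the mapping entry and never touches the cells."""

    PAGE = 16  # bytes per page in this toy model

    def __init__(self):
        self.nand = []      # programmed physical pages, in write order
        self.l2p = {}       # LBA -> index into self.nand
        self.stale = set()  # physical pages awaiting garbage collection

    def write(self, lba: int, data: bytes) -> None:
        old = self.l2p.get(lba)
        if old is not None:
            self.stale.add(old)
        self.nand.append(data)
        self.l2p[lba] = len(self.nand) - 1

    def trim(self, lba: int) -> None:
        ppn = self.l2p.pop(lba, None)   # metadata-only update
        if ppn is not None:
            self.stale.add(ppn)

    def read(self, lba: int) -> bytes:
        ppn = self.l2p.get(lba)
        return bytes(self.PAGE) if ppn is None else self.nand[ppn]


dev = TrimFTL()
dev.write(7, b"secret-photo-001")
dev.trim(7)

print(dev.read(7))    # what the host sees after the trim
print(dev.nand[0])    # what a chip-off read of the raw NAND would find
```

The host gets sixteen zero bytes, while the raw page still contains the original payload until the block is eventually erased.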
File-Based Encryption: The Real Delete Mechanism
If the hardware cannot overwrite data in place, and the file system cannot force the hardware to erase on command, and TRIM only lies to the host about zeroing, how on earth can a phone guarantee that deleted data is actually gone?
The answer Android settled on is: do not try to erase the data. Encrypt it in the first place, throw away the key, and make the still-present ciphertext useless.
This is file-based encryption (FBE), introduced in Android 7 and mandatory on new devices since Android 10. The old system, full-disk encryption (FDE), used a single key to encrypt the entire userdata partition at the block layer. FBE is finer-grained. Each file has its own encryption key for its contents, and each directory has its own key for its file names. Those per-file and per-directory keys are derived from a handful of master keys, one per user profile and per encryption class. The master keys themselves are wrapped by a hardware-backed key that lives inside the Trusted Execution Environment and never leaves it.
The encryption algorithm is AES-256 in XTS mode for file contents and AES-256 in CBC-CTS for file names. XTS is the standard choice for sector-level encryption because it produces deterministic, tweakable ciphertext for each 16-byte block and does not require storing an IV per block. For every 4 KiB file system block, the tweak is derived from the file's encryption key and the logical block number, and the XTS mode handles the per-block randomisation internally.
There are two encryption classes in FBE: device-protected and credential-protected. Device-protected files (Device Encrypted, DE) are available as soon as the phone boots, before the user has entered their PIN. This is where the alarm clock, accessibility services, and the default dialler live: the parts of the OS that must work in Direct Boot mode. Credential-protected files (Credential Encrypted, CE) are only available after the user unlocks the device for the first time after boot, because the CE key is derived from the user's credential plus the hardware-backed key. Everything else on the phone, including almost all app data, lives in CE storage.
The master keys are not derived directly from the PIN. If they were, a brute-force attack on the PIN would be trivial because the PIN space is tiny (10,000 combinations for a four-digit PIN). Instead, the PIN is used as input to a key derivation step that runs inside the TEE, with a hardware-enforced rate limit. Every guess has to go through the TEE or, on devices that have one, the Secure Element, which adds a fixed delay of at least 80 milliseconds per attempt on modern Pixels (sometimes higher if the attacker hits the lockout threshold). Even a one-billion-guess-per-second desktop cracker becomes a one-attempt-per-80-milliseconds crawler once it has to go through the hardware. Four-digit PINs are still weak in absolute terms, but they are not catastrophic, because the rate limit is enforced by silicon.
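The effect of the rate limit is easiest to see as arithmetic. The calculation below uses the guess rates quoted in the text and considers only the fixed per-attempt delay; it does not model the escalating lockout, which is an assumption that makes the throttled numbers a best case for the attacker.

```python
def worst_case_seconds(pin_digits: int, seconds_per_guess: float) -> float:
    """Time to try every PIN of the given length at a fixed cost per guess."""
    return 10 ** pin_digits * seconds_per_guess


offline_4 = worst_case_seconds(4, 1e-9)      # a billion guesses per second, no hardware in the loop
throttled_4 = worst_case_seconds(4, 0.080)   # every guess forced through the hardware
throttled_6 = worst_case_seconds(6, 0.080)

print(f"4 digits, offline:   {offline_4 * 1e6:.0f} microseconds")
print(f"4 digits, throttled: {throttled_4 / 60:.1f} minutes")
print(f"6 digits, throttled: {throttled_6 / 3600:.1f} hours")
```

The fixed delay alone turns microseconds into minutes, which is a large factor but not a comfortable one for a four-digit PIN. That is why the lockout threshold, and a longer PIN, matter more than the base delay.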
Each file encryption key (FEK) is generated when the file is created, encrypted with a parent master key, and stored as part of the file's extended attributes. When an app opens the file, the kernel asks the userspace Keymaster daemon to unwrap the FEK, loads the resulting AES key into the fscrypt subsystem in the kernel, and then reads the file as normal. Every page that comes in from the block layer is decrypted on the fly using the FEK and the file's block offset as the XTS tweak.
Now think about what "delete" looks like on this stack. The user deletes a photo. The gallery app tells the media store to remove the file, the media store calls unlink() on the underlying path, and the kernel removes the directory entry and schedules the inode for cleanup. The file system marks the blocks as free. The next fstrim pass issues discard commands for those LBAs. The FTL marks the corresponding physical NAND pages as stale. Sooner or later, garbage collection reclaims the physical block and the actual NAND cells are erased, at which point the ciphertext is physically gone.
But the crucial detail is that none of those steps matter for security. From the moment the unlink happens, the decrypted key for that file no longer exists in the kernel. The file's FEK was decrypted on demand and lived only in kernel memory while the file was open. The wrapped FEK in the file's extended attributes still exists in the now-free blocks, but without the parent master key to unwrap it, those bytes are just random noise. And the only way to get the parent master key is to authenticate to the TEE, which only releases it when the right user credential is presented.
The security of "delete" on Android does not come from overwriting data. It comes from the fact that the data was encrypted with a key chain that the attacker cannot assemble, even if they have bit-perfect access to the raw NAND.
The Factory Reset: A Cryptographic Wipe
Now we can answer the question the article opened with: what does Factory Reset actually do?
Before Android 7, factory reset was a best-effort format. The system would unmount /data, call mke2fs or mkfs.f2fs on the partition, and reboot. That re-initialised the file system metadata but left the actual user data blocks sitting in the NAND, protected only by the fact that the re-formatted file system would not reference them. If you pulled the flash chip and read it directly with a programmer, you could still find the old data. Forensic tools did exactly this for years.
On a modern Android device, Factory Reset is a completely different operation. The flow looks roughly like this:
- The Settings app calls the RecoverySystem.rebootWipeUserData() API.
- The system writes a small command file to the /cache partition telling the recovery image to perform a wipe on next boot.
- The device reboots into recovery mode.
- Recovery mounts the TEE, requests that the hardware-backed user key be destroyed, and waits for the TEE to confirm.
- Recovery unmounts and formats /data and /metadata, which rewrites the superblock and trims the entire partition.
- The device reboots into the setup wizard, which on first boot will generate a new hardware key and new user profile keys.
Step four is the one that matters. The TEE holds the master encryption key used to wrap all the per-user and per-file keys on the device. When the TEE is asked to destroy that key, it does so by calling into the RPMB (Replay Protected Memory Block) region of the storage device, which is a small area of eMMC or UFS that can only be written through an authenticated command path. The RPMB entry holding the wrapped master key is overwritten with zeroes through the authenticated interface. On devices with a Secure Element (Titan M on Pixel, Samsung's dedicated chip on Galaxy phones), the key is held inside the Secure Element itself, in a small amount of one-time-programmable or flash-backed storage that the SE can erase directly and irreversibly.
After this step, every ciphertext block still sitting in the NAND is unreadable. The key chain has been broken at its root. The actual data still exists, bit for bit, for however long it takes the FTL to eventually garbage-collect the free blocks, but without the root key nothing can turn ciphertext back into plaintext. You could read the raw NAND die with a JTAG probe and a custom flasher and you would get gigabytes of perfect AES-256 ciphertext, which is the cryptographic equivalent of perfect noise.
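The principle of a cryptographic wipe can be demonstrated with nothing but the standard library. The sketch below is a toy: it uses a SHAKE-256 keystream as a stand-in for AES-256-XTS, `root_key` stands in for the hardware-held key, and none of it reflects how Android's actual key hierarchy is implemented. It only shows that the stored bytes never change while their usefulness does.

```python
import hashlib
import secrets


def keystream_xor(key: bytes, block_no: int, data: bytes) -> bytes:
    """Toy cipher: XOR data with a keystream derived from the key and block number.

    Applying it twice with the same key and block number returns the input.
    This is a stand-in for illustration, not AES-XTS.
    """
    stream = hashlib.shake_256(key + block_no.to_bytes(8, "little")).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


root_key = secrets.token_bytes(32)                  # stands in for the key held in hardware
plaintext = b"holiday photo, block 0"
nand_page = keystream_xor(root_key, 0, plaintext)   # what the flash actually stores

# While the key exists, the page decrypts normally.
roundtrip = keystream_xor(root_key, 0, nand_page)

# "Factory reset": the key is destroyed, the NAND page is not touched at all.
root_key = None
page_after_reset = nand_page

# An attacker with the raw page but a guessed key gets only noise.
attacker_view = keystream_xor(secrets.token_bytes(32), 0, page_after_reset)

print(roundtrip == plaintext, page_after_reset == nand_page, attacker_view == plaintext)
```

Nothing about the stored page changes across the reset. The only thing that disappears is the 32-byte value needed to make sense of it.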
This is why Google can market the factory reset as "secure" even though almost no actual erasing happens. The promise is not "we zeroed the bits". The promise is "we destroyed the key that makes the bits meaningful". Those are very different claims, and the second one is much stronger, because it does not depend on the cooperation of the storage controller. No matter how lazy the FTL is about garbage collection, no matter how much the NAND refuses to actually clear its cells, the data is gone the moment the key in the RPMB disappears.
Why shred Does Not Work On NAND
On a traditional hard drive, the Unix shred utility gives you a reasonable approximation of secure deletion. You tell it to overwrite a file, and it opens the file, writes random bytes over it repeatedly, calls fsync(), and closes it. On a spinning disk, those writes really do land on the same physical sectors, because the drive has no reason to relocate them. The original bits are physically replaced with new bits. Run shred a few times and the drive head has passed over the same magnetic region multiple times with fresh noise, and recovering the old data becomes impractical even with a magnetic force microscope (though the "Gutmann method" panic from the 1990s was always a little overblown).
On NAND, shred is completely useless, and you can see why as soon as you unpack the layers the writes have to pass through.
The shred process opens the file and issues a write for the first 4 KiB block. The kernel passes the write down to f2fs. f2fs, being log-structured, does not overwrite the old block. It allocates a fresh logical block at the head of its current segment and writes the new content there. The old block is marked as invalid in the file system metadata. The write then reaches the block layer, which sends it to the UFS driver, which sends it to the controller, which sends it to the FTL, which writes it to a fresh NAND page and updates its mapping table. The old physical page is marked stale. The old data has not been touched. It still holds the original file contents.
shred issues the second pass. The same dance happens. A fresh logical block is allocated by f2fs, a fresh physical page is allocated by the FTL, and yet another copy of the "old" data sits on the NAND. Now there are three copies of the file floating around in stale pages inside various erase blocks.
shred issues the tenth pass. There are now eleven versions of the file's blocks in various states of invalidation, and the actual cells holding the original file are still untouched. The ten overwrite passes are just noise sitting in stale pages. The one copy an attacker cares about, the original, is exactly where it was, ready to be recovered by anyone who can read the raw NAND. All shred has achieved is to burn ten files' worth of write endurance.
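The ten-pass scenario can be replayed against a toy FTL to confirm the page count. This is a minimal sketch with made-up names: the NAND is an append-only list, one LBA stands in for the whole file, and `os.urandom` plays the role of shred's random passes.

```python
import os

nand = []   # physical pages in write order; nothing is erased in this model
l2p = {}    # LBA -> index into nand


def ftl_write(lba: int, data: bytes) -> None:
    """Out-of-place write: append a new page and repoint the LBA."""
    nand.append(data)
    l2p[lba] = len(nand) - 1


original = b"tax-return-2023 contents"
ftl_write(0, original)

# shred-style overwrite: ten passes of random bytes to the same logical block.
for _ in range(10):
    ftl_write(0, os.urandom(len(original)))

copies_on_nand = len(nand)
original_still_present = nand[0] == original
host_view_is_noise = nand[l2p[0]] != original

print(copies_on_nand, original_still_present, host_view_is_noise)
```

The host sees random bytes at the file's address and concludes the job is done, while physical page 0 still holds the original contents.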
The same problem applies to any other "secure delete" tool. wipe, srm, dd if=/dev/zero of=file, any approach that relies on overwriting specific bytes. They all assume that the block device is literally overwriting the sectors the file system gives them. On flash with an FTL, that assumption is false at every layer.
The only way to actually remove data from NAND is one of the following:
- Issue an erase command on every block that might contain it. You cannot do this from userspace, because the FTL does not expose a "physical erase" operation to the host.
- Fill the entire device with new data until the FTL has garbage-collected every existing block at least once. This works but is slow and uneven, because the wear-levelling allocator picks blocks at the controller's discretion.
- Use a device-level sanitize command such as SCSI Sanitize, NVMe Sanitize, or NVMe Format NVM, which instruct the controller to do a cryptographic or block-level erase of the entire device. UFS defines a Purge operation that physically clears the blocks the host has already unmapped. On a phone, this is what the factory reset path can invoke to actually clear the storage beyond the key destruction.
- Destroy the key material that protects the data and accept that the ciphertext will stay on the die until it is eventually overwritten by natural use.
Android picks the last option, key destruction, and supplements it with the third, a controller-level purge, on devices that support it. Key destruction is the only approach that is fast (milliseconds, not minutes), reliable (it does not depend on the controller cooperating), and forensically strong (the ciphertext is unrecoverable without the key, not "probably unrecoverable").
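The key-destruction approach can be illustrated in a few lines. This is a sketch under loud assumptions: the Python standard library has no AES, so keystream_xor below is a toy cipher built from SHA-256 in counter mode, standing in for the real storage encryption. It must not be used to protect anything. The names key, nand, and keystream_xor are hypothetical.

```python
import hashlib
import os


def keystream_xor(key, data):
    """XOR data with a SHA-256 counter keystream (toy stand-in for AES)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        # One 32-byte keystream block per 32 bytes of data.
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(a ^ b for a, b in zip(chunk, block))
    return bytes(out)


key = os.urandom(32)  # a 256-bit key, the size the article describes
plaintext = b"photo bytes that must not survive a factory reset"
nand = keystream_xor(key, plaintext)  # what actually sits on the flash

# Normal operation: the key is available, so the ciphertext decrypts.
recovered = keystream_xor(key, nand)

# "Factory reset": the ciphertext is left untouched, only the key is dropped.
key = None
wrong_guess = keystream_xor(os.urandom(32), nand)

print(recovered == plaintext, wrong_guess == plaintext)
```

Nothing in the reset step touches nand. The only thing that changes is that no key able to turn it back into the plaintext exists any more, which is why the operation can finish in milliseconds regardless of how much data is on the device.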
The Trusted Execution Environment: Where The Key Lives
The key that makes all of this work is not stored in ordinary NAND. If it were, the whole scheme would collapse, because that key would be subject to the same "cannot really delete" problem as the user data, and an attacker who read the NAND directly could just read out the key alongside the ciphertext.
Instead, the root key lives in the Trusted Execution Environment. The TEE is a small execution domain on the main SoC that runs in parallel with the normal (non-secure) world. On ARM chips it is implemented with TrustZone: the CPU has two security states, secure and non-secure, and a special instruction (SMC, secure monitor call) is used to transition between them. The secure world has its own kernel (on Android phones this is typically QSEE on Qualcomm chips, Kinibi on Exynos, or Trusty on Pixel), its own memory regions (carved out of system RAM and protected by the memory controller so the non-secure world cannot see them), and its own trusted applications.
On top of that base, Android defines an interface called Keymaster (recently renamed KeyMint) that exposes key operations to the non-secure world. The non-secure world can say "generate an AES-256 key and store it", "unwrap this blob with my key", "sign this hash with my key", and so on. The TEE performs the operation internally and returns only the result. The raw key bytes never leave the secure world. They are never copied into non-secure memory, never swapped to storage, never visible to the Linux kernel.
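The shape of that interface, where callers hold a handle and never the key, can be sketched as follows. ToySecureWorld and its methods are hypothetical and are not the Keymaster or KeyMint API. A Python object also provides no real isolation, so this shows only the contract, not the enforcement that TrustZone hardware supplies.

```python
import hashlib
import hmac
import os


class ToySecureWorld:
    """Hypothetical stand-in for a TEE key service; not the KeyMint API."""

    def __init__(self):
        self._keys = {}        # handle -> raw key bytes, never returned
        self._next_handle = 1

    def generate_key(self):
        handle = self._next_handle
        self._next_handle += 1
        self._keys[handle] = os.urandom(32)
        return handle          # the caller receives an opaque handle only

    def sign(self, handle, message):
        # The key is used inside the service; only the MAC crosses the boundary.
        return hmac.new(self._keys[handle], message, hashlib.sha256).digest()

    def delete_key(self, handle):
        # After this, every blob that depended on the key is unusable.
        del self._keys[handle]


tee = ToySecureWorld()
h = tee.generate_key()
tag = tee.sign(h, b"hash to sign")

tee.delete_key(h)
try:
    tee.sign(h, b"hash to sign")
    usable_after_delete = True
except KeyError:
    usable_after_delete = False

print(len(tag), usable_after_delete)
```

The design point is that delete_key is the only operation a reset needs: once the entry is gone, the handle held by the normal world refers to nothing.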
Pixel phones add another layer with the Titan M chip. Titan M is a separate secure element, physically distinct from the main SoC, with its own CPU, its own flash, and its own firmware. It is wired to the main SoC over a small serial bus, and Android reaches it through StrongBox, the Keystore tier reserved for keys held in a dedicated secure element. The root key is held inside Titan M's own flash, not in the TEE on the SoC. This matters because TrustZone on the SoC is still subject to side-channel attacks, memory disclosure bugs, and occasional breaks in the secure world kernel. Titan M has its own attack surface, but it is smaller and completely separate, so a break in the Android TEE does not give you the key material on the Pixel.
Samsung's equivalent is the Secure Processor included in recent Exynos and Snapdragon-for-Samsung parts, plus the Knox Vault on newer flagships. The exact boundary varies by manufacturer, but the principle is the same: the root key lives in a tamper-resistant region, its use is mediated by a small trusted codebase, and it can be destroyed on demand.
When Factory Reset destroys the root key, it is destroying an entry inside this tamper-resistant region. On Pixel devices with Titan M, the destruction is a physical erase of a cell inside the SE's on-die flash. There is no FTL between the SE CPU and its storage; the erase is direct. After the erase, the key is physically gone, and no amount of desoldering, decapping, or electron microscopy will recover it.
Forensic Realities: What Police Labs Can And Cannot Do
The practical consequence of all of this is that forensic recovery of deleted data on a modern Android device is extraordinarily difficult, and in many cases flatly impossible, regardless of how much budget the lab has.
Consider the two scenarios a forensic lab usually faces.
Scenario A: the phone is unlocked or the user credential is known. In this case the lab has full access to the decrypted file system. They can walk the f2fs structures, read every live file, and extract anything the user has not deleted. For files the user has deleted, the story depends on how long ago the deletion happened and how actively the phone has been used since. If TRIM has not run yet and the FTL has not garbage-collected the blocks, there may still be recoverable data in the free space of the file system. If the phone has been used heavily since the delete, that data has probably been overwritten by other writes. In practice, labs using tools like Cellebrite UFED or GrayKey can recover a substantial fraction of recently deleted data from a cooperating phone, but the recovery rate drops off quickly as the phone continues to be used.
Scenario B: the phone is locked and the user credential is not known. This is where the forensic situation collapses. To read any user data the lab has to either brute-force the credential (rate-limited by the TEE to a few attempts per second at best, and often shut down entirely after 30 wrong guesses) or extract the root key from the TEE. Extracting the root key has been done in the past via vulnerabilities in TrustZone implementations (Qualcomm's QSEE has had several memorable holes, most famously the 2016 TrustZone key extraction disclosed by Gal Beniamini), but each vulnerability is patched once it becomes known, and on devices with a dedicated Secure Element like Titan M the attack surface is even smaller. For most current-generation phones, a lab with no credential and no vulnerability has essentially no path to the data.
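The arithmetic behind the brute-force option is worth seeing. The sketch below takes "a few attempts per second" as 3 and the lockout as 30 guesses, both read off the paragraph above. Real throttling policies vary by device, so these are illustrative values, and the two function names are hypothetical.

```python
def worst_case_hours(keyspace, attempts_per_second):
    """Hours needed to try every credential at a fixed, throttled guess rate."""
    return keyspace / attempts_per_second / 3600


def success_probability(keyspace, max_guesses):
    """Chance of hitting a uniformly random credential before lockout."""
    return min(1.0, max_guesses / keyspace)


RATE = 3       # assumed guesses per second
LOCKOUT = 30   # assumed guesses allowed before the device shuts the door

for label, keyspace in [
    ("4-digit PIN", 10**4),
    ("6-digit PIN", 10**6),
    ("8 chars, lowercase + digits", 36**8),
]:
    hours = worst_case_hours(keyspace, RATE)
    chance = success_probability(keyspace, LOCKOUT)
    print(f"{label}: {hours:,.1f} hours unthrottled, "
          f"{chance:.2e} chance within {LOCKOUT} guesses")
```

Under these assumptions the rate limit alone is a weak defence for a short PIN. It is the hard cap on guesses that pushes the odds down, and a longer credential that makes exhaustive search unrealistic even without a cap.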
Scenario B after a factory reset is even worse. The root key is gone. There is no credential to guess, no TEE secret to extract, no wrapped master key to unwrap. The NAND still holds gigabytes of valid ciphertext, but it is all protected by a key that was physically destroyed during the reset. Forensic recovery from this state is not rate-limited or difficult. It is impossible in the cryptographic sense of the word, assuming the reset ran to completion and the hardware key destruction path worked correctly.
This is not marketing from Google. It is a consequence of the architecture. The whole chain from NAND to TEE was designed on the assumption that deleting data through overwriting is physically unreliable on flash, and that the only robust guarantee of deletion is cryptographic. That assumption is correct, and the design follows from it.
Caveats, Failure Modes, And Edge Cases
The picture above is clean, but real devices have rough edges.
The factory reset flow depends on the recovery image, the bootloader, the TEE, and the storage firmware all doing the right thing. On older devices or devices with customised firmware, the key destruction step might silently fail, and the reset falls back to a plain file system format with the key still intact in the TEE. In that case an attacker who later obtains the TEE secret can still decrypt whatever ciphertext survived. In 2019 a researcher showed exactly this on a handful of budget phones from lesser-known brands: the factory reset did not actually invoke Keymaster, so the old data remained recoverable by anyone who could authenticate to the unchanged TEE.
The RPMB region itself is a single area, and on some implementations the "erase" is really a counter-incremented overwrite with zeroes. That is still a logical destruction, but the physical cells may retain the old ciphertext of the wrapped key long enough for a high-end lab to read them out, at least in theory. This is considered acceptable because the wrapped key is useless without the TEE's internal secret, so the RPMB ciphertext would still need a second layer of attack.
File-based encryption does not cover everything. Until Android 10, some metadata was stored unencrypted on the /metadata partition, including file names for Direct Boot apps. Android 11 moved this into the encrypted region. Photos stored on an adopted SD card follow a slightly different encryption path and have historically been weaker protected; several CVEs in 2020 and 2021 were traced to the adoptable storage flow. On removable SD cards without adoption, the data is not encrypted at all and a factory reset does nothing to it.
The TRIM schedule on Android is aggressive but not instantaneous. If you delete a large file and then immediately dump the storage before the next fstrim pass, the FTL still holds the mapping and the ciphertext is still addressable. Only after TRIM plus garbage collection plus key destruction is the chain of recoverability fully broken.
And none of this protects you from software running on the phone itself. Malware that has root privileges can read any file the kernel can decrypt, and since the kernel decrypts files on demand when the user unlocks the phone, the encryption is transparent to anything running in the normal execution environment. FBE stops a cold attacker, not a hot one.
What This Means For You
Three practical takeaways, whether you are a developer, a user, or a curious reader.
First, when you tell an Android phone to delete a file, assume the actual bits are still on the NAND for some uncertain amount of time. If the data is sensitive, the only correct mental model is that deletion is a promise about the key chain, not about the storage cells. The file is gone in the sense that nothing on the phone can read it. It is not gone in the sense that the silicon forgets it.
Second, factory reset on a modern, unmodified Android phone (Android 10 or later, on mainstream hardware) is genuinely secure if the hardware root of trust is intact. The key destruction path is the part that matters, not the partition format. A phone that has been factory-reset properly is safe to sell or donate, provided you also remove any SD card and sign out of your Google account beforehand so that Factory Reset Protection does not lock the new owner out.
Third, the performance characteristics of a phone's storage are dominated by the FTL and the file system, not by the raw NAND speed. A phone that feels slow at year four is usually suffering from a combination of a full userdata partition (which makes garbage collection expensive), a fragmented log in f2fs, and accumulated wear on the busiest erase blocks. Clearing space and letting fstrim run aggressively often restores a surprising amount of the original speed, because it gives the FTL a bigger pool of fresh erased blocks to play with. You cannot make NAND young again, but you can make the controller's job easier.
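The link between a full partition and expensive garbage collection follows from a simple model. Assume, as a deliberate simplification, that every erase block the controller reclaims still has the same fraction v of valid pages. To free a block of N pages it must first copy vN pages elsewhere, which leaves room for only (1 - v)N pages of new host data. Every host page therefore costs 1 / (1 - v) page programs on the NAND. Real controllers pick better victims than this and keep spare area in reserve, so treat the numbers as a trend rather than a measurement. The function name is hypothetical.

```python
def write_amplification(valid_fraction):
    """NAND pages programmed per host page when GC victims are this full."""
    if not 0 <= valid_fraction < 1:
        raise ValueError("valid_fraction must be in [0, 1)")
    # Copying v*N valid pages frees room for (1 - v)*N host pages.
    return 1 / (1 - valid_fraction)


for fullness in (0.50, 0.80, 0.95):
    factor = write_amplification(fullness)
    print(f"victim blocks {fullness:.0%} valid: {factor:.1f}x NAND writes")
```

The curve is steep near the top, which matches the observation that freeing a modest amount of space on a nearly full phone can produce a disproportionate improvement.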
The chip soldered to the mainboard of your phone is not a hard drive. It is a finite, asymmetric, rewriting-hostile piece of physics with a sophisticated controller trying to pretend it is a hard drive. Once you see the rewriting-hostile part clearly, the rest of Android's storage stack, from f2fs up through FBE up through the TEE up through Factory Reset, stops looking like a bag of independent design choices and starts looking like one coherent answer to a single physical constraint: you cannot overwrite, so you had better make sure the data you cannot overwrite is meaningless to anyone without the key.