FAT: DOS File Allocation Table
The FAT is a linked-list table that DOS uses to keep track of the physical
position of data on a disk and for locating free space for storing new
files.
The word at offset 1aH in a directory entry is a cluster number of the
first cluster in an allocation chain. If you locate that cell in the FAT,
it will either indicate the end of the chain or the next cell, etc.
Observe:
starting cluster number ══╗
Directory ╓───────────────────┬─┬───────────────────┬───┬───┬─▼─┬───────╖
Entry ══► ║M Y F I L E T X T│a│ │tim│dat│08 │ size ║
╙─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─║─┴─┴─┴─┴─╜
╔═════════════════════════╝
00 01 02 03 04 05 06 07 ║8 09 0a 0b 0c 0d 0e 0f
┌──┐┌──┐┌──┐┌──┐┌──┐┌──┐┌──┐┌──┐┌▼─┐┌──┐┌──┐┌──┐┌──┐┌──┐┌──┐┌──┐
0x │ID││ff││03═►04═►05═►ff││00││00││09═►0a═►0b═►15││00││00││00││00│
└──┘└──┘└──┘└──┘└──┘└──┘└──┘└──┘└──┘└──┘└──┘└║─┘└──┘└──┘└──┘└──┘
╔═══════════════════════╝
┌──┐┌──┐┌──┐┌──┐┌──┐┌▼─┐┌──┐┌──┐┌──┐┌──┐┌──┐┌──┐┌──┐┌──┐┌──┐┌──┐
1x │00││00││00││00││00││16═►17═►19││f7││1a═►1b═►ff││00││00││00││00│
└──┘└──┘└──┘└──┘└──┘└──┘└──┘└║─┘└──┘└▲─┘└──┘└──┘└──┘└──┘└──┘└──┘
╚═══════╝
This diagram illustrates the main concepts of reading the FAT. In it:
■ The file, MYFILE.TXT, is 10 clusters long. The first byte is in
cluster 08 and the last is in cluster 1bH. The chain is
8,9,0a,0b,15,16,17,19,1a,1b. Each entry indicates the next entry in
the chain, with a special code in the last entry.
■ Cluster 18H is marked bad and is not part of any allocation chain.
■ Clusters 6,7, 0cH-14H, and 1cH-1fH are empty and available for
allocation.
■ Another chain starts at cluster 2 and ends at cluster 5.
Notes: The diagram is simplified in that it shows the FAT as if it held
8-bit values (00-nn). A FAT actually contains 12-bit or 16-bit
entries (see below).
To read the data associated with a FAT cluster number, you must do some
math and use INT 25H. See Converting a Cluster Number, below.
█▌FAT Facts▐█
The FAT normally starts at DOS logical sector 0001H in the DOS partition
(i.e., you can read it with INT 25H with DX=0001H). One way to locate the
FAT is to read the boot sector (DX=0000H), and examine the wResSects field
of the BootSectorRec (offset 0eH). This tells how many boot and reserved
sectors come before the FAT. Use that number (usually 1) in DX to read
the FAT via INT 25H. You may also use DOS fn 32H or 1fH (get drive
parameter block) and examine the word at offset 6 of the DPBRec.
There are usually two complete copies of the FAT. If so, they are always
adjacent (the second FAT directly follows the first). The number of FATs
is in wFatCnt and the size, in sectors, of one FAT is in wFatSects of the
BootSectorRec or DPBRec.
You have the following services available to help you obtain information
about the FAT:
■ Use DOS Fn 36H or 1cH to determine total disk sectors and clusters
■ Use INT 25H to read the Boot Sector and examine the data therein
■ Use DOS Fn 44H (if the device driver supports Generic IOCTL)
■ Use DOS Fn 32H to get all kinds of useful information
Notes: ■ The boot sector of non-booting disks (such as network block
devices and very old hard disks) may contain nothing but
garbage. Use DOS fn 32H to get info on these devices.
■ DoubleSpace simulates all disk data structures and all of the
above-mentioned services work transparently for a mounted,
compressed volume. See Compressed Volume File Layout and
Mapping DOS FAT to MDFAT for details on how it manages this.
█▌12-bit/16-bit▐█
The FAT can contain 12-bit or 16-bit entries.
12-bit entries are efficient for media less than 384K--the entire FAT can
fit in a single 512-byte disk sector. A 12-bit fat can map only 4086
clusters (2^12=4096 and ff7H-fffH have special meaning--see below).
The DOS Format command creates a 12-bit FAT when the partition is less
than 10 MB. It could actually handle larger disks, but the clusters would
need to be very large and wasteful. For instance, on a 32 MB disk, the
clusters would need to be 8K (averaging 4K of cluster slack▲ per file).
16-bit entries were introduced with DOS 3.0 with the necessity of
efficient handling the AT's "whopping" 20-Megabyte hard disk. A 16-bit
FAT can track 65530 data clusters. Assuming a 16-sector cluster, that
maxes out at 512 MB of storage. DoubleSpace's "virtual clusters" hit this
FAT max.
To determine if the FAT is 12-bit or 16-bit: The correct way (according to
Microsoft) is to check the total clusters (e.g., via fn 32H). If there
are 4086 or more, then it is a 16-bit fat.
This should always work for partitions created/formatted after DOS 3.3.
As a sanity check, you should consider looking at the abFileSysID field of
the BootSectorRec and/or validating by seeing how big a FAT is (wFatSects)
compared to the number of entries it contains (wTotSects/bClustSects) and
you might look at the bFileSysCode field of the PartitionEntryRec.
Note: It's a common misconception that it was that creation of the
16-bit FAT that first let DOS work with disks larger than 32
Megabytes. In fact, the limiting factor is that INT 25H/26H
(through which DOS performs its disk I/O) is unable to access
a SECTOR number higher than 65535. Normally, sectors are 512
bytes (½-K), so that sets the 32M limit.
In DOS 4.0+, INT 25H/26H supports a technique for accessing
sector numbers higher than 65535, and thus supports trans-32MB
partitions. This has no effect on the layout of the FAT itself.
Using 16-bit FAT entries and 16-sector clusters, DOS supports
partitions up to 512 MB (twice that for 32-sector clusters,
etc.).
█▌Reading the FAT▐█
To read the value of any entry in a FAT (as when following a FAT chain),
first read the entire FAT into memory and obtain a starting cluster number
from a directory.
Then, for 12-bit entries:
■ Multiply the cluster number by 3 ═╗
■ Divide the result by 2 ═════════╩═► (each entry is 1.5 (3/2) bytes)
■ Read the WORD at the resulting address (offset from start of the FAT)
■ If the cluster number was even, mask the value by 0fffH (keep the low
12 bits). If the cluster number was odd, shift the value right by 4
bits (keep the upper 12 bits)
■ The result is the entry for the next cluster in the chain (0fffH=the
end).
Note: A 12-bit entry can cross over a sector boundary, so be careful
with 1-sector FAT-buffering schemes.
For 16-bit entries, the action is simpler--each entry contains the 16-bit
offset (from the start of the FAT) of the next entry in the chain (ffffH
indicates the end of the chain).
█▌FAT Entries▐█
The first byte of the FAT is called the Media Descriptor (AKA the FAT ID
byte). The next 2 bytes (12-bit FATs) or 3 bytes (16-bit FATs) are 0ffH.
The rest of the FAT is composed of 12-bit or 16-bit cells that each
represent one disk cluster. These cells will contain one of the following
values:
■ 0000H ................. an available cluster
■ fff0H through fff7H ... a reserved cluster
■ fff7H ................. a bad cluster
■ fff8H through ffffH ... the end of an allocation chain
■ 0002H through ffefH ... the number of the next cluster in a chain
Notes: ■ The high nibble of the value is used only in 16-bit FATs; for
instance, a bad cluster is marked with ff7H in 12-bit FATs,
and fff7H with 16-bit FATs.
■ Cluster numbers 0001H and 0002H should never appear in a FAT
chain. You can think of the first two FAT entries as being
"symbolic representations" of the FAT and root directory.
█▌Converting a Cluster Number to a Sector Number▐█
After you obtain a file's starting cluster number from a directory entry
you will want to locate to actual disk sector that holds the file (or
subdirectory) data.
A diskette (or a DOS partition of a hard disk) is laid out like so:
■ Boot and reserved sector(s)
■ FAT #1
■ FAT #2 (optional -- not used on RAM disks; simulated by Dblspace)
■ root directory
■ data area (all file data reside here, including files for directories)
Every section of this layout is variable and the sizes of each section
must be known in order to perform a correct cluster-to-sector conversion.
The following formulae represent all of the steps:
RootDirSectors = wSecSize / (wRootEntries * 32)
FatSectors = bFatCnt * wFatSects
DataStart = wResSects + FatSectors + RootDirSectors
INT 25h/26h Sector = DataStart + ((AnyClusterNo-2) * bClustSects)
Where the variables in thisColor are obtained from the Boot Sector or
from a BPB or DPBRec. The resulting sector number can be used in DX for
INT 25H/26H DOS absolute disk access.
Easier method: DOS fn 32H (first documented with DOS 5.0, but available
with 2.0+) returns a DPBRec containing a field named wFirstData at offset
0bH. It is simply a pre-calculation of "DataStart" in the above formulae.
█▌So What???▐█
All this information about the FAT is irrelevant to most applications.
However, writers of disk utilities must know this stuff in order to
recover deleted files and lost data.
Also, a DoubleSpace-aware program must look through the FAT and compare it
to the MDFAT in order to report the actual compression ratio obtained for
a file (that's how COMMAND.COM does it).
Finally, a high-speed file management program may find use for this
information. For instance, to create a list all files on a disk, it is
roughly twice as fast to read the directory sectors yourself (compared to
using DOS fns to find files via 4eH).
See Also: INT 25H/26H (direct disk read/write)
Mapping DOS FAT to MDFAT
Boot Sector Layout
BPB: BIOS Parameter Block
DPB: Drive Parameter Block
Directory Entry
Device Drivers
Disk Drive Functions
-♦-