Disk Partitions

A hard disk stores numbered sectors. Traditionally sectors have held 512 bytes of data, but newer disks increase the sector size to 4096 bytes. The operating system presents the disk with a starting sector number, the number of sectors to read or write, and the address in memory of the data to write to disk or a buffer to receive data read from the disk.
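
As a rough illustration of sector addressing, the sketch below treats an in-memory byte buffer as a disk. The names `SECTOR_SIZE`, `read_sectors`, and `write_sectors` are invented for this example, and the 512-byte sector size is an assumption. A real driver talks to a disk controller, not to a Python buffer.

```python
import io

SECTOR_SIZE = 512  # assumed; newer disks use 4096


def read_sectors(disk, start_sector, count):
    """Return the bytes of `count` sectors beginning at `start_sector`."""
    disk.seek(start_sector * SECTOR_SIZE)
    return disk.read(count * SECTOR_SIZE)


def write_sectors(disk, start_sector, data):
    """Write `data` (a whole number of sectors) beginning at `start_sector`."""
    if len(data) % SECTOR_SIZE != 0:
        raise ValueError("data must be a whole number of sectors")
    disk.seek(start_sector * SECTOR_SIZE)
    disk.write(data)


# A pretend disk of 8 zeroed sectors.
disk = io.BytesIO(bytes(8 * SECTOR_SIZE))
write_sectors(disk, 3, b"A" * SECTOR_SIZE)
block = read_sectors(disk, 2, 2)  # sectors 2 and 3
```

The only arithmetic involved is multiplying the sector number by the sector size to find the byte position; everything else about files and directories is layered on top of this interface.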

An operating system puts a “filesystem” on the hard drive. A filesystem has tables that convert the sequence of numbered sectors into a tree of directories that store named files. It has tables to look up file and directory names, entries to store attributes such as the creation date of each file, and tables to track the free space, the unused sectors that can be used to create new files. When you reference a particular file name, the operating system uses the filesystem tables to translate that name into the sequence of sector numbers that hold the data for that file. When you create a new file, the filesystem assigns unused sectors from the free space to hold the data, then creates an entry that maps the file name to the sector numbers it just allocated.
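
The bookkeeping described above can be sketched with two structures: a name table and a free-sector list. `ToyFilesystem` is a hypothetical class made up for this illustration. It has no directories, attributes, or on-disk format, and no real filesystem is this simple.

```python
class ToyFilesystem:
    """Hypothetical flat filesystem: a name table plus a free-sector list."""

    def __init__(self, total_sectors):
        self.free = list(range(total_sectors))  # unused sector numbers
        self.files = {}                         # file name -> list of sector numbers

    def create(self, name, sectors_needed):
        """Take sectors from the free list and record them under `name`."""
        if name in self.files:
            raise FileExistsError(name)
        if sectors_needed > len(self.free):
            raise OSError("not enough free space")
        allocated = self.free[:sectors_needed]
        self.free = self.free[sectors_needed:]
        self.files[name] = allocated
        return allocated

    def lookup(self, name):
        """Translate a file name into the sectors that hold its data."""
        return self.files[name]

    def delete(self, name):
        """Remove the name and return its sectors to the free list."""
        self.free.extend(self.files.pop(name))


fs = ToyFilesystem(total_sectors=10)
fs.create("notes.txt", 3)
fs.create("photo.jpg", 4)
fs.delete("notes.txt")
```

After these calls the sectors that held notes.txt are back in the free list and can be handed to the next file that is created.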

In the 1960s, when computers measured their memory in thousands of bytes, an operating system kept all the data it could on disk. A disk might take 10,000 times as long as regular memory to respond with data, but when there simply wasn’t enough memory to go around it was necessary to write everything you didn’t absolutely need off to disk. One side effect was that IBM mainframe operating systems could share a single disk between two computers because all the control information was on the disk itself.

However, memory got bigger, faster, and less expensive, while disks remained limited by the mechanical speed of rotation and arm movement. A laptop now has billions of bytes of memory and a disk that is a million times slower to respond than memory. So modern systems adopt the opposite strategy of keeping the most frequently used data from disk in memory. In practice this means that some of the file data and many of the filesystem tables are copied into memory. The operating system locates files and allocates space using the in-memory copy of the tables and then writes changes back to the disk.

This means that modern operating systems can no longer share a filesystem on disk. If each system keeps its own separate copy of filesystem tables in memory, then when any OS deletes a file or creates a new file using the tables in its memory there is a long delay before the other system notices the changes. In the interim, the other system may try to open the deleted file and read the wrong data. The only way a file system can be shared across computers is for one system to control it and then use network sharing protocols to make the files available to other computers.
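
A small simulation may make the hazard concrete. Everything here is invented for illustration: the two "hosts" are plain dictionaries standing in for in-memory copies of the filesystem tables, and the file names and sector number are arbitrary.

```python
# Shared "disk": a directory table and the contents of each sector.
disk_directory = {"report.txt": [5]}
disk_sectors = {5: b"quarterly report"}

# Each host copies the directory into its own memory at startup.
host_a_cache = dict(disk_directory)
host_b_cache = dict(disk_directory)

# Host A deletes report.txt and reuses sector 5 for a new file.
del host_a_cache["report.txt"]
host_a_cache["payroll.txt"] = [5]
disk_sectors[5] = b"salary data"
disk_directory = dict(host_a_cache)  # A writes its tables back to disk

# Host B still trusts its stale in-memory table.
stale_sectors = host_b_cache["report.txt"]
what_b_reads = b"".join(disk_sectors[s] for s in stale_sectors)
```

Host B asks for report.txt and receives the contents of payroll.txt, because its table still maps the old name to a sector that now belongs to a different file.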

A disk can be divided up into partitions. Each partition has a starting and ending sector number and therefore a size. The next partition starts after the previous one ends. Each partition has its own file system. Different partitions can contain different types of filesystems or different copies of the same operating system. Small special partitions at the start of the first disk can be used by the BIOS or firmware to load diagnostic programs, data recovery tools, or to select which operating system to boot from a menu of available systems. Large partitions can contain data for different computers, or different operating systems, or data with a different performance or availability profile.
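
The layout rule can be sketched in a few lines. `Partition` and `lay_out` are hypothetical names for this example, and the starting sector of 2048 is only an assumed value. The point is that a sector number inside a partition is translated to a disk sector by adding the partition's starting sector.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    start: int  # first sector on the disk (inclusive)
    end: int    # last sector on the disk (inclusive)

    @property
    def size(self):
        return self.end - self.start + 1

    def to_disk_sector(self, relative_sector):
        """Translate a sector number inside the partition to a disk sector."""
        if not 0 <= relative_sector < self.size:
            raise ValueError("sector is outside this partition")
        return self.start + relative_sector


def lay_out(sizes, first_sector):
    """Place partitions back to back, each starting where the last one ended."""
    partitions = []
    next_start = first_sector
    for name, size in sizes:
        partitions.append(Partition(name, next_start, next_start + size - 1))
        next_start += size
    return partitions


boot, data = lay_out([("boot", 1000), ("data", 5000)], first_sector=2048)
```

Here sector 0 of the data partition is disk sector 3048, the sector immediately after the boot partition ends.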

For example, a partition can be “mirrored” onto a second partition of the same size located on a different hard disk. Any data written to the partition is automatically duplicated on both disks. If one of the two hard drives fails, all the data is still available on the good disk. In theory, different partitions can contain file systems optimized for desktop or server use, for large video files, or for random-access databases.
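
Mirroring can be sketched as "write to both, read from whichever still works". `MirroredVolume` is a hypothetical class for this illustration, with each disk modeled as a dictionary from sector number to data. It ignores resynchronization and everything else a real mirror has to handle.

```python
class MirroredVolume:
    """Hypothetical two-way mirror over two dict-backed 'disks'."""

    def __init__(self):
        self.disks = [{}, {}]          # sector number -> bytes, one dict per disk
        self.failed = [False, False]

    def write(self, sector, data):
        """Duplicate every write onto each disk that is still working."""
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                disk[sector] = data

    def read(self, sector):
        """Read from the first disk that has not failed."""
        for i, disk in enumerate(self.disks):
            if not self.failed[i]:
                return disk[sector]
        raise IOError("both disks have failed")

    def fail(self, index):
        """Simulate a drive failure: its contents are gone."""
        self.failed[index] = True
        self.disks[index] = {}


mirror = MirroredVolume()
mirror.write(7, b"important")
mirror.fail(0)
survivor = mirror.read(7)
```

After the first disk fails, the read is satisfied from the second copy and the data is unchanged.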

Two partitions on the same disk share the same physical hardware and bus. If the disk is busy servicing an I/O request for one partition, then a request for another partition on the same disk will have to wait until the first request completes. However, two partitions share no data or control information and are, in that regard, logically independent.

The first sector on every physical hard drive holds the Master Boot Record (MBR). This was the original structure created when the first 10 megabyte hard drive was added to the first PC. The MBR has four entries that can be used to define four “Primary” partitions.
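
For readers who want to see the four entries, here is a minimal sketch that decodes them from a 512-byte sector. It assumes the conventional MBR layout: the table at byte offset 446, 16 bytes per entry, and the two-byte signature 0x55 0xAA at the end. The function name `parse_mbr` and the sample values are made up for this example, and the cylinder/head/sector fields are skipped.

```python
import struct

PARTITION_TABLE_OFFSET = 446
ENTRY_SIZE = 16


def parse_mbr(sector):
    """Return the non-empty primary partition entries found in an MBR sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("missing MBR boot signature")
    entries = []
    for i in range(4):
        start = PARTITION_TABLE_OFFSET + i * ENTRY_SIZE
        raw = sector[start:start + ENTRY_SIZE]
        # status, 3 skipped bytes, type, 3 skipped bytes, first sector, sector count
        status, ptype, first_lba, count = struct.unpack("<B3xB3xII", raw)
        if ptype == 0:
            continue  # a type of zero marks an unused slot
        entries.append({
            "index": i,
            "active": status == 0x80,
            "type": ptype,
            "first_lba": first_lba,
            "sectors": count,
        })
    return entries


# Build a sample sector with one Active partition (values are arbitrary).
sample = bytearray(512)
sample[510:512] = b"\x55\xaa"
sample[446:462] = struct.pack("<B3xB3xII", 0x80, 0x07, 2048, 204800)
partitions = parse_mbr(bytes(sample))
```

The starting sector and sector count are each stored in four bytes, which is where the size limit of the MBR scheme comes from.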

One of the primary partitions can be marked Active by the operating system installation or disk management software. At boot time, the BIOS firmware of the PC reads the MBR from the first hard disk (as the BIOS has been configured to order the hard disks connected to the machine), and the small program in the MBR looks for the Active partition. It loads the first sector of the Active partition, which contains the boot loader program. That program loads a larger program that may load the OS or display a menu of boot options.
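
The selection step can be sketched as follows. This is a simplified model, not firmware code: `find_boot_partition` is a hypothetical function, disks are given as a list already in BIOS order, and each partition is a small dictionary invented for the example.

```python
def find_boot_partition(disks_in_bios_order):
    """Return (disk name, partition) for the Active partition on the first disk.

    `disks_in_bios_order` is a list of (name, partitions) pairs, where each
    partition is a dict with at least "active" and "first_lba" keys.
    """
    if not disks_in_bios_order:
        raise RuntimeError("no hard disks configured")
    name, partitions = disks_in_bios_order[0]  # only the first disk is consulted
    active = [p for p in partitions if p["active"]]
    if len(active) != 1:
        raise RuntimeError("expected exactly one Active partition on " + name)
    return name, active[0]


disks = [
    ("disk0", [{"index": 0, "active": False, "first_lba": 2048},
               {"index": 1, "active": True, "first_lba": 206848}]),
    ("disk1", [{"index": 0, "active": True, "first_lba": 2048}]),
]
boot_disk, boot_partition = find_boot_partition(disks)
boot_sector_lba = boot_partition["first_lba"]  # sector holding the boot loader
```

Note that the Active partition on the second disk is never considered; changing which disk boots means changing the disk order in the BIOS configuration.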

Four partitions quickly became insufficient. The solution that developed was to create an Extended partition containing “logical volumes”. The Extended partition begins with a table that describes the first logical volume and points to a similar table for the next one, so the logical volumes form a chain. Logical volumes are like partitions, except that they are not defined in the MBR, cannot be marked Active, and cannot be booted directly by the BIOS. However, a logical volume can contain a copy of an operating system provided that the boot loader for that system is contained in a small Active Primary partition that is defined in the MBR and can be loaded by the BIOS.

Recently a much more powerful alternative to the MBR has been defined, called the GUID Partition Table (GPT). The GPT has room for 128 partitions, and it uses 64-bit entries so that partitions and disks can be larger than 2 terabytes. GPT is supported by modern operating systems like Windows 7 and by the firmware on some servers, but it is not directly supported by the BIOS firmware on most desktop or laptop computers. A GPT disk therefore still has an MBR written to its first sector, which protects the disk if it is installed in an old system that doesn’t support GPT.
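
The 2 terabyte figure follows from simple arithmetic, worked through below. The calculation assumes 32-bit sector numbers in the MBR and 512-byte sectors; the variable names are only for this illustration.

```python
SECTOR_SIZE = 512          # assumed traditional sector size
TIB = 2 ** 40              # bytes in one binary terabyte

# MBR entries hold 32-bit sector numbers and counts.
mbr_max_sectors = 2 ** 32
mbr_max_bytes = mbr_max_sectors * SECTOR_SIZE
mbr_limit_tib = mbr_max_bytes / TIB

# The same 32-bit fields go further with 4096-byte sectors.
mbr_limit_tib_4k = (mbr_max_sectors * 4096) / TIB

# GPT entries hold 64-bit sector numbers.
gpt_max_bytes = (2 ** 64) * SECTOR_SIZE
gpt_to_mbr_ratio = gpt_max_bytes // mbr_max_bytes
```

With 512-byte sectors the MBR limit works out to exactly 2 binary terabytes, and the 64-bit fields of the GPT raise the ceiling by a factor of 2 to the 32nd power.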

Almost every Windows computer sold today uses BIOS firmware. BIOS can only boot when the first hard disk has an MBR that designates an Active partition. When the boot sector is loaded from the Active partition, BIOS support is still the only way to do disk I/O, and the BIOS only works with an MBR disk. If you could get a computer with EFI firmware and install Windows 7 on it, then a boot loader would be installed that understands EFI and could load the system from a GPT disk. As long as only BIOS computers are available, Windows doesn’t load GPT support until after the kernel and drivers are loaded into memory, and that means that the whole Windows system has to reside on an old-fashioned MBR disk.

A Windows disk can be Dynamic. On Dynamic disks, a single volume can extend across more than one disk. If you are using this to mirror volumes for recovery after a failure, this is a good idea. If you are using it to get more space for a single volume, this may be a mistake. Disks fail eventually. Some fail sooner than others. If you have a single volume across several hard drives, then the contents of the entire volume are lost if any one of the disks fails. It may be more work to manage files on separate volumes on separate disks, but that also puts you in control of the location of every file and gives you the opportunity to plan a sensible backup and recovery strategy.
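
The risk can be put in rough numbers. The sketch below assumes each disk fails independently with the same probability over some period, and the 5% figure is purely hypothetical. Real drives do not fail this tidily, so treat the result as an illustration of the trend, not a prediction.

```python
def spanned_loss_probability(p_disk_failure, disk_count):
    """A spanned volume is lost if ANY of its disks fails."""
    return 1 - (1 - p_disk_failure) ** disk_count


def mirrored_loss_probability(p_disk_failure, copies=2):
    """A mirrored volume is lost only if ALL of its copies fail."""
    return p_disk_failure ** copies


p = 0.05  # hypothetical chance that one disk fails in the period
one_disk = spanned_loss_probability(p, 1)
three_disk_span = spanned_loss_probability(p, 3)
two_disk_mirror = mirrored_loss_probability(p)
```

Under these assumptions, spanning one volume over three disks nearly triples the chance of losing it, while mirroring across two disks cuts that chance sharply.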

When the first hard disks were added to PCs, the DOS operating system assigned the disk letters (C, D, E, …) to hard disk partitions in the order the disks were physically encountered. This worked in simple cases, but it produced problems when a new hard disk or a new partition changed the letters assigned to existing partitions.

So modern Windows systems write a Unique ID value on any disk they encounter that does not already have one. Then, as disk letters are assigned to partitions on the disk, the Unique ID and partition number are remembered in the Windows registry. If you pull your system apart to clean it, and then put it back together and get some of the cables confused, the only thing you have to get right is that the boot disk comes first in the BIOS configuration of hard drives. The other disks will be recognized by their Unique ID.
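
The difference between the two schemes can be sketched by comparing them on the same set of disks. Both functions are simplified models with hypothetical names: the first hands out letters purely by detection order, and the second looks each partition up in a dictionary that stands in for the remembered registry entries. The ID strings are invented.

```python
def dos_style_letters(disks_in_detection_order):
    """Hand out C, D, E, ... purely by the order partitions are encountered."""
    letters = {}
    next_letter = ord("C")
    for disk_id, partition_numbers in disks_in_detection_order:
        for number in partition_numbers:
            letters[chr(next_letter)] = (disk_id, number)
            next_letter += 1
    return letters


def id_based_letters(disks_in_detection_order, remembered):
    """Reuse the letter remembered for each (disk ID, partition number)."""
    letters = {}
    for disk_id, partition_numbers in disks_in_detection_order:
        for number in partition_numbers:
            letter = remembered.get((disk_id, number))
            if letter is not None:
                letters[letter] = (disk_id, number)
    return letters


# Stand-in for the remembered assignments (made-up IDs).
remembered = {("ID-AAAA", 1): "C", ("ID-BBBB", 1): "D", ("ID-CCCC", 1): "E"}
original_order = [("ID-AAAA", [1]), ("ID-BBBB", [1]), ("ID-CCCC", [1])]
cables_swapped = [("ID-AAAA", [1]), ("ID-CCCC", [1]), ("ID-BBBB", [1])]

by_order = dos_style_letters(cables_swapped)
by_id = id_based_letters(cables_swapped, remembered)
```

With the cables swapped, the order-based scheme moves D and E to different disks, while the ID-based scheme gives every partition the letter it had before.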

This, however, adds a complication if you need to upgrade your system hard disk. If you only copy the partitions from the system disk to the new hard drive and then try to boot from the new disk, the new disk’s own Unique ID, and whatever disk letters Windows has already assigned to it, will conflict with its new role as the BIOS boot disk. Windows will try to reassign the old disk letters, but then will not run because there is no C drive. The solution is to use a utility that will “clone” a hard drive instead of just copying the partitions. Cloning a drive will also copy over the Unique ID string to the new drive. After cloning there will be two disks with the same “Unique” ID, so they cannot both be placed in the same computer at the same time. However, now if you replace the old drive with the new one and boot the computer, the new drive will be regarded as the same as the old drive, the C disk letter will be found, and the new copy of the system will run as expected.

The functions previously provided by disk partitions can also be provided (in special cases) by a VHD file. A Virtual Hard Disk was originally created to support Microsoft virtual machines. The VHD is a file in a regular file system of a regular partition. However, with Windows 7 or Server 2008 R2 the VHD is beginning to act as a substitute for a hard drive or partition. For example, you can add a VHD file to the Windows 7 boot menu and boot an operating system from the VHD as if it were a physical disk. If two different computers both have hardware access to the same hard disk, then each of them can use a different VHD in the same file system in the same way that each could previously have used a different partition of the same physical disk. If the VHD is expandable instead of fixed size, then another system using the VHD may have to make a network request to the system that owns and manages the parent file system to make the VHD larger if it runs out of space internally, but all other requests can be managed inside the VHD.