Block, File, and Object Storage

Storage systems organize data in several common ways, and we need to be clear with our terminology.  This is particularly important for people who have worked with only one type of organization!  Traditionally, IT storage is what we would call “block storage”. Technical and desktop computing generally works with “file storage”.  Web and cloud-based applications primarily use “object storage”.  Let’s briefly explore these different types of storage.

Block storage

Block storage: Address by location within a volume.

A block storage system organizes data into volumes.  Data is then addressed by its location in the volume (an offset from the beginning).  Generally the volume consists of a sequence of physical records, or blocks, where each block is a fixed-length sequence of bytes.  Data is read and written as entire blocks, rather than individual bytes or bits.
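
As a minimal sketch of block addressing, the Python below treats an in-memory buffer as a volume.  The 512-byte block size and the helper names (read_block, write_block, write_bytes) are assumptions made for illustration, not part of any real storage API.  It shows that changing even two bytes means reading and rewriting the whole block that contains them.

```python
import io

BLOCK_SIZE = 512  # assumed block size; real devices commonly use 512 or 4096 bytes


def read_block(volume, block_number):
    """Read one whole block, addressed by its position in the volume."""
    volume.seek(block_number * BLOCK_SIZE)
    return volume.read(BLOCK_SIZE)


def write_block(volume, block_number, data):
    """Write one whole block; partial blocks are rejected."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block long")
    volume.seek(block_number * BLOCK_SIZE)
    volume.write(data)


def write_bytes(volume, offset, payload):
    """Change a few bytes with a read-modify-write of the containing block."""
    block_number, start = divmod(offset, BLOCK_SIZE)
    if start + len(payload) > BLOCK_SIZE:
        raise ValueError("payload crosses a block boundary in this simple sketch")
    block = bytearray(read_block(volume, block_number))
    block[start:start + len(payload)] = payload
    write_block(volume, block_number, bytes(block))


# A simulated 8-block volume filled with zero bytes.
volume = io.BytesIO(bytes(8 * BLOCK_SIZE))

# Byte offset 1030 falls in block 2, at position 6 within that block.
write_bytes(volume, 1030, b"hi")
```

The caller never names the data; it only supplies an offset, and the volume has no idea what the bytes mean.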

In the simplest case, a volume represents a single physical storage device, such as a hard drive.  These volumes might be divided into smaller sub-volumes, or partitions.  In other cases, several storage devices are aggregated to create a composite volume, such as a RAID array.  Enterprise storage controllers organize disks into RAID groups and then partition these RAID groups into volumes, often referred to as logical units, or LUNs.  Storage may be further virtualized by aggregating volumes from multiple storage systems, creating composite volumes and subdividing those into smaller volumes, and often providing higher-level operations like data compression, encryption, thin provisioning, and mirroring.

Block storage devices are connected to computers using protocols like SAS, iSCSI, or Fibre Channel.  In simple cases, the block storage device is directly connected to the computer.  In more complex situations, storage devices are connected to computers through Storage Area Networks (SANs).  A SAN fabric can connect many storage systems, computers, and SAN switches.

Operating systems present volumes as disk devices, usable like internal disk devices.  Most computers boot operating systems from volumes, and many computers can boot from volumes connected over a SAN.

File systems are formatted on block storage volumes.

File storage

File storage: Address by name in (usually) a hierarchical name space (usually implemented over block storage)

A file storage system organizes data into files with names, collected into file systems.  Frequently file systems support hierarchies of named directories (folders) containing files and other directories.  The ordered list of names of directories that must be traversed to reach a file, followed by the name of the file, is the path name of the file.  A file is addressed by its path name.
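
To make path-name addressing concrete, here is a small sketch that models a directory hierarchy as nested Python dictionaries.  The tree contents, the example path, and the resolve function are all hypothetical; a real file system stores its directories on a block device rather than in dictionaries.

```python
from pathlib import PurePosixPath

# Hypothetical directory tree: directories are dicts, files are bytes.
root = {
    "home": {
        "alice": {
            "reports": {"q3.txt": b"quarterly numbers"},
        },
    },
}


def resolve(tree, path_name):
    """Walk the directories named in an absolute path and return the file."""
    parts = PurePosixPath(path_name).parts
    if not parts or parts[0] != "/":
        raise ValueError("this sketch only handles absolute path names")
    node = tree
    for name in parts[1:]:  # skip the leading "/" that names the root
        node = node[name]
    return node


path = PurePosixPath("/home/alice/reports/q3.txt")
directories = path.parent.parts  # the directories to traverse, in order
file_name = path.name            # the final component names the file itself
contents = resolve(root, str(path))
```

Each step of the loop corresponds to traversing one directory, which is why a path name is an ordered list rather than a single key.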

Generally files can be read, written, and modified.  Some metadata is associated with each file, including at least its length.  Metadata commonly also includes the time the file was last modified, access controls for the file, and often other time stamps.
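
The metadata described above can be inspected from Python with os.stat.  This is a minimal sketch: the file name and contents are made up for the example, and the exact permission string depends on the operating system and its defaults.

```python
import os
import stat
import tempfile
import time

with tempfile.TemporaryDirectory() as directory:
    # Hypothetical file, created only so there is something to inspect.
    path = os.path.join(directory, "example.txt")
    with open(path, "wb") as f:
        f.write(b"hello, file storage\n")

    info = os.stat(path)
    length = info.st_size                      # file length in bytes
    modified = time.ctime(info.st_mtime)       # last-modified time stamp
    permissions = stat.filemode(info.st_mode)  # access controls, in "ls -l" style

print(length, modified, permissions)
```

Note that none of this metadata lives in the file's contents; the file system keeps it alongside the file.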

A file system is usually formatted on a block storage device, which may be internal, direct-attached, or accessed over a SAN.

Object storage

Object storage: Address by name in (usually) a flat name space (usually implemented over file storage)

Object storage deals with data as objects addressable by name.  Generally the name space is fairly flat: objects are not nested inside a deep directory hierarchy.  Object-based storage devices exist (using the SCSI protocol), but in the past few years “object storage” has predominantly meant storage objects accessed over HTTP; such storage is also known as “cloud storage”.

Cloud storage objects typically cannot be modified, other than by appending.  Objects are grouped into containers, which are associated with accounts.  Common cloud object storage systems include Amazon S3 and OpenStack Swift.
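
The account, container, and object relationship can be sketched with a toy in-memory store.  The ObjectStore class and its method names are hypothetical and do not mirror the Amazon S3 or OpenStack Swift APIs; real systems expose comparable operations over HTTP.  The sketch assumes objects are written whole and afterwards only appended to.

```python
class ObjectStore:
    """Toy in-memory store: account -> container -> object name -> bytes."""

    def __init__(self):
        self._accounts = {}

    def put(self, account, container, name, data):
        """Store a whole object under a name in a container."""
        containers = self._accounts.setdefault(account, {})
        containers.setdefault(container, {})[name] = bytes(data)

    def get(self, account, container, name):
        """Fetch a whole object by name."""
        return self._accounts[account][container][name]

    def append(self, account, container, name, data):
        """Add bytes to the end of an existing object."""
        objects = self._accounts[account][container]
        objects[name] = objects[name] + bytes(data)

    def list_names(self, account, container, prefix=""):
        """List object names in a container, optionally filtered by prefix."""
        names = self._accounts[account][container]
        return sorted(n for n in names if n.startswith(prefix))


store = ObjectStore()
# The slashes are just characters in a flat name, not directories.
store.put("acme", "backups", "logs/app.log", b"started\n")
store.put("acme", "backups", "images/logo.png", b"\x89PNG")
store.append("acme", "backups", "logs/app.log", b"stopped\n")
```

Listing by prefix is how a flat name space can still give the appearance of folders without any real hierarchy.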

