Where in the software stack should we encrypt data? My answer is: do encryption as low in the storage stack as it works well. In particular, encrypted data does not compress well. Compression works by looking for and removing redundancy from the data. Effective encryption hides patterns and redundancies in the data, … Continue reading Where to do it? Encryption
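The claim that encrypted data does not compress well can be checked with a rough sketch. It assumes that the output of an effective cipher is statistically close to random bytes, so `os.urandom` stands in for real ciphertext here; no actual encryption is performed. The sample record and the helper name `compression_ratio` are illustrative only.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower means better compression)."""
    return len(zlib.compress(data, 9)) / len(data)


# Highly redundant "plaintext": the same short record repeated many times.
plaintext = b"sensor=42;status=OK;" * 5000

# Assumption: random bytes of the same length approximate what a good cipher
# would produce from the plaintext. This is a stand-in, not real encryption.
ciphertext_like = os.urandom(len(plaintext))

plain_ratio = compression_ratio(plaintext)
cipher_ratio = compression_ratio(ciphertext_like)

print(f"plaintext ratio:       {plain_ratio:.3f}")
print(f"ciphertext-like ratio: {cipher_ratio:.3f}")
```

The redundant input shrinks to a small fraction of its size, while the random stand-in does not shrink at all. That is why compression has to happen before encryption if both are wanted.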
Category: basics
Where to do it? Compression
Where in the software stack should we compress data? My answer is: do compression as high in the storage stack as it works well. In other words, if an application can do its own compression, let it compress. Compression relies on reducing redundancies in the data set - if something can … Continue reading Where to do it? Compression
Where to do it? Layers in the storage stack
Storage operations, such as compression, tiering, and replication, may happen throughout the "storage stack", and in some cases the same general operation could happen in more than one layer of the stack. When there is a choice, choosing correctly can make a big difference in performance and scalability. But before I talk about these different … Continue reading Where to do it? Layers in the storage stack
Storage metrics: Bandwidth, “IOPs”, and Latency
When I talk with clients about storage performance metrics, they typically focus on either bandwidth or "IOPs". But really there are three dimensions of storage performance! Let's consider these three storage performance metrics and how we can design systems to work around shortcomings. First, there is bandwidth (or throughput). This is a … Continue reading Storage metrics: Bandwidth, “IOPs”, and Latency
Cluster quorum and Spectrum Scale
Spectrum Scale systems are organized into clusters. File systems and underlying resources are owned by clusters, managed by clusters, and possibly exported to remote clusters. Within a cluster, there is a need for certain management functions. Because we would like the cluster to remain active even in the face of systems or network links failing, … Continue reading Cluster quorum and Spectrum Scale
Modeling disk performance (traditional RAID)
Many factors go into determining the performance of a storage system, and a common question is: What will be the performance of this storage system? Ultimately, actual performance can only be determined through benchmarking with the actual workloads, but this isn't always possible when planning to acquire a storage system. We need a way to … Continue reading Modeling disk performance (traditional RAID)
POSIX file system basics
POSIX is the Portable Operating System Interface standard, IEEE Std 1003.1-1988 and related. These standards, based on the Unix operating system, define a set of programming and command interfaces. Programs and scripts following these standards are supposed to be easily portable between operating system platforms providing these interfaces. The POSIX standards imply a model for file system … Continue reading POSIX file system basics