
Why data’s “Roaring Twenties” calls for fast file and object storage

Object storage is becoming increasingly important and well-established, driven by the web and the rise of the cloud

Fred Lherault, field CTO, EMEA, Pure Storage

This decade is shaping up to be the Roaring Twenties of unstructured data. According to Gartner, unstructured data growth rates have hit 30% per year, which means total unstructured data volumes will almost quadruple by 2027.
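That "almost quadruple" figure follows directly from compounding 30% annual growth over five years (assuming, for illustration, a 2022 baseline — the article does not state the starting year):

```python
# Compound growth check: 30% per year over five years (hypothetical 2022 baseline).
growth_rate = 0.30
years = 5

multiplier = (1 + growth_rate) ** years
print(round(multiplier, 2))  # ~3.71, i.e. almost 4x the starting volume
```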

Such data growth is a challenge in itself, but unstructured data also comes in a variety of sizes and can be stored as files or objects, with increasingly demanding storage performance needs. This has resulted in a new category of storage emerging to provide unified fast file and object storage.

What’s driving the need for fast file and object storage?

The general backdrop is the growth in unstructured data, which can comprise very large numbers of very small files or objects — often billions of them. Unstructured data can also come as a smaller number of much larger files or objects, such as video or high-definition images. It could also be a combination of the two. Modern analytics workflows, for example, may need to access a wide variety of data types of different sizes.

Fast file and object: I/O performance and throughput

Another key driver of fast file and object storage is storage performance to access this unstructured data. We’ve seen an explosion in analytics and machine learning, driven by the need to distill value from enormous amounts of raw data.

Meanwhile, digital imagery is a rapidly growing use case, such as PACS (picture archiving and communication systems) in the healthcare industry. An example here is the pioneering use of machine learning for cancer diagnosis by US-based Paige, which needs petabyte-scale storage capacity with rapid access and high throughput to allow machine recognition across millions of images in patient tissue samples. This demands high-performance access to file and object data.

In addition, backup and data protection can produce large numbers of files and objects of various sizes. While backups may once have been consigned to the slowest storage, very fast restore speeds are now required to help recover data rapidly in the event of a ransomware attack.


Fast file and object: Why the ‘and’?

The addition of fast object storage is a key innovation. File storage has been a mainstream option for decades, with scale-out NAS solutions ramping up capacity and performance to support unstructured data. Object storage, meanwhile, has grown increasingly important and well-established, driven by the web and the rise of the cloud.

Files and objects can hold the same types of contents. But, while file systems use a hierarchical directory-based system, object storage uses a ‘flat’ structure with objects assigned an individual identifier and metadata that can be used to contextualise these objects.
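The structural difference can be sketched in a few lines of Python. The contrast below uses a real filesystem path for the hierarchical case and a minimal in-memory dictionary as a stand-in for an object store — the `put_object` helper, the identifiers and the metadata keys are all illustrative, not any particular product's API:

```python
import tempfile
import uuid
from pathlib import Path

# File storage: hierarchical — identity comes from the directory path.
root = Path(tempfile.mkdtemp())
file_path = root / "projects" / "imaging" / "scan001.dat"
file_path.parent.mkdir(parents=True)
file_path.write_bytes(b"pixel data")

# Object storage (in-memory sketch): a flat namespace where each object is
# an opaque identifier plus its data plus free-form descriptive metadata.
object_store = {}

def put_object(data: bytes, metadata: dict) -> str:
    object_id = str(uuid.uuid4())  # flat, globally unique key — no directories
    object_store[object_id] = {"data": data, "metadata": metadata}
    return object_id

oid = put_object(b"pixel data", {"patient": "anon-42", "modality": "CT"})
print(object_store[oid]["metadata"]["modality"])  # CT
```

The metadata travels with the object itself, which is what lets applications contextualise billions of objects without walking a directory tree.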

Historically, object has been the least-performant storage type and has formed a quite separate product category. This view of object storage is changing, as customers increasingly need to interrogate large amounts of unstructured data that can be in object format as well as file.

Additionally, as applications and use cases evolve from file to object access, organisations require a platform that can support both access methods and ensure investment protection during and after this transition. All these factors have led to the emergence of high-performance storage solutions that combine access to file and object.

Fast file and object benefits

Unlike traditional structured data — such as a database supporting an ERP system — which tends to be fairly static, unstructured data can span many locations and access methods during its lifecycle.

Today’s emerging fast file and object storage products support network file system (NFS) and server message block (SMB) file protocols, which are compatible with the way many existing enterprise applications are written.

Fast file and object solutions can also handle unstructured data via object access protocols that grew out of the cloud, such as Amazon S3. This makes fast file and object storage ideal for hybrid clouds, where unstructured data can move between on-site and cloud locations.


What do customers need to look for in a fast file and object storage product?

Firstly, capacity. The platform needs to scale to your needs, which for many enterprises could be petabytes. Since unstructured data can grow quickly, scaling the solution also needs to be easy and not involve complex network configuration or manual data rebalancing tasks.

Secondly, it must have file and object storage access, offering the key protocols such as NFS and SMB for file and S3 for object access.

Thirdly, it must be built for fast access and high throughput. Low latency — especially for read and metadata access operations — is required to unlock the potential of AI/ML as well as many modern analytics frameworks. All-flash storage offers this fast access thanks to its solid-state nature.

Speed is key

Whether it is to analyse very large datasets or to perform a massive restore operation after a ransomware attack, unstructured data can require very high access performance. Low latency needs to be coupled with high throughput. For data analytics, this means speeds measured in tens of gigabytes per second. When it comes to restoring systems following an outage or ransomware attack, enterprise customers should look for throughput numbers that get close to 300TB per hour, to limit downtime and the financial and reputational damage that comes with this.
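To put that restore target in per-second terms (using decimal units, where 1TB = 10^12 bytes):

```python
# Convert a 300 TB/hour restore target into sustained per-second throughput.
terabytes_per_hour = 300
bytes_per_hour = terabytes_per_hour * 10**12
gigabytes_per_second = bytes_per_hour / 3600 / 10**9
print(round(gigabytes_per_second, 1))  # ~83.3 GB/s sustained
```

So a 300TB/hour restore implies sustaining roughly 83GB/s — well beyond what backup-tier disk systems were ever designed to deliver.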

Additionally, high performance both from a latency and throughput point of view must be provided by the platform automatically and without tuning. The world of unstructured data and modern analytics is evolving so quickly that it is difficult to predict what tools, file format, data set size or access methods will be required tomorrow. Any storage solution that requires manual configuration or tuning to deliver high performance for a given use case will stifle innovation and delay projects.

The world of data storage is truly embarking on the Roaring Twenties. The explosive growth of modern analytics, machine learning, video and image intelligence as well as ransomware attacks will require storage solutions built for large volumes of unstructured data, with roaring performance levels and flexibility in terms of access methods. Fast file and object storage platforms are the answer to the data challenges of both today and tomorrow, and are designed to support enterprises as they look at unlocking the value of unstructured data.