Network Storage
eBook - ePub

Tools and Technologies for Storing Your Company's Data

  1. 280 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

Network Storage: Tools and Technologies for Storing Your Company's Data explains the changes occurring in storage, what they mean, and how to negotiate the minefields of conflicting technologies that litter the storage arena, all in an effort to help IT managers create a solid foundation for coming decades.

The book begins with an overview of the current state of storage and its evolution from the network perspective, looking closely at the different protocols and connection schemes and how they differ in use case and operational behavior. The book explores the software changes that are motivating this evolution, ranging from data management to in-stream processing and storage in virtual systems, as well as changes in the decades-old OS stack.

It explores Software-Defined Storage as a way to construct storage networks, along with the impact of Big Data, high-performance computing, and the cloud on storage networking. Because networks and data integrity are intertwined, the book looks at how data is split up and moved to the various appliances holding that dataset, and at the impact this has.

Because data security is often neglected, readers will find a comprehensive discussion of security issues, with remedies that can be applied. The book concludes with a look at technologies on the horizon that will impact storage and its networks, such as NVDIMMs, the Hybrid Memory Cube, VSANs, and NAND Killers.

  • Puts all the new developments in storage networking in a clear perspective for near-term and long-term planning
  • Offers a complete overview of storage networking, serving as a go-to resource for creating a coherent implementation plan
  • Provides the details needed to understand the area, and clears a path through the confusion and hype that surrounds such a radical revolution of the industry

Network Storage by James O'Reilly is available in PDF and/or ePUB format and is listed under Computer Science & Computer Networking.

Chapter 1

Why Storage Matters

Abstract

The impact of storage on information technology is increasing, as the speed and rate of change catch up with Moore's law. This is timely, since we are starting a storage “explosion” that will increase both capacity and network load by large factors each year for the next decade or more.

Keywords

All-flash arrays; Fiber channel; HDD; IOPS; RAID; Redundant array of inexpensive disks; Solid-state drives; SSD performance; Storage appliance
In the big picture of information technology (IT), why is storage demanding so much more attention than in the past? In part, the change of emphasis in IT is a result of ending some three decades of functional stagnation in the storage industry. While we saw interfaces change and drive capacities grow nicely, the fundamentals of drive technology were stuck in a rut.
The speed of hard drives barely changed over that 30-year period. Innovations in caching and seek optimization netted perhaps a 2× performance gain, measured in IO operations per second (IOPS). This occurred over a period where CPU horsepower, following Moore’s law, improved by roughly 1 million times (Fig. 1.1).
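The "roughly 1 million times" figure can be sanity-checked with simple doubling arithmetic. The sketch below assumes the popular reading of Moore's law as a doubling of CPU performance every 18 months; that doubling period is an assumption for illustration, not a figure from the text, and the function name is hypothetical. A 24-month doubling period is shown for contrast.

```python
def growth_factor(years, doubling_months):
    """Compound growth from repeated doubling over a span of years."""
    doublings = (years * 12) / doubling_months
    return 2 ** doublings


# Assumed doubling periods (not stated in the text), over the 30-year span.
cpu_gain_18_months = growth_factor(30, 18)  # 20 doublings
cpu_gain_24_months = growth_factor(30, 24)  # 15 doublings
hdd_gain = 2                                # the ~2x IOPS gain quoted above

print(f"CPU gain, 18-month doubling: {cpu_gain_18_months:,.0f}x")
print(f"CPU gain, 24-month doubling: {cpu_gain_24_months:,.0f}x")
print(f"CPU-to-HDD gap (18-month case): {cpu_gain_18_months / hdd_gain:,.0f}x")
```

Under the 18-month assumption, 30 years gives 20 doublings, which is where a factor of about one million comes from. The result is sensitive to the doubling period chosen, so it should be read as an order-of-magnitude check only.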
We compensated for the unchanging performance by increasing the number of drives, using RAID (redundant array of inexpensive disks) to stripe data for better access speeds. The ultimate irony was reached in the late 1990s when the CTO of a large database company recommended using hundreds of 9-GB drives, with only the outside 2 GB active, to keep up with the system. At the prices the storage industry giants charged for those drives, this made for a very expensive storage farm!
In the last few years, we’ve come a long way towards remediating the performance problem. Solid-state drives (SSDs) have changed IO patterns forever. Instead of a maximum of 300 IOPS for a hard drive, we are seeing numbers ranging from 40,000 to 400,000 IOPS per SSD, and some extreme performance drives are achieving 1+ million random IOPS [1].
The result is radical. Data can really be available when a program needs it, rather than after seconds of latency. The implications of this continue to ripple through the industry. The storage industry itself is going through a profound change in structure, of a level that can fairly be called a “storage revolution,” but the implications impact all of the elements of a datacenter and reach deeply into how applications are written. Moreover, new classes of IT workload are a direct result of the speed impact of SSD. Big Data analytics would be impossible with low-speed hard drive arrays, for example, and the Internet of Things implicitly requires the storage to have SSD performance levels.
Since SSDs first hit the market in 2007, the hard disk drive (HDD) vendors have hidden behind a screen of price competitiveness. SSDs cost more per gigabyte than HDDs, though the gap has closed every year since SSDs were introduced. The HDD vendors were helped in their story by decades of comparing drives essentially on cost per gigabyte (one thing you learn in storage is that industry opinion is quite conservative) rather than comparing, say, 300 of those (really slow) 9-GB drives with one SSD on an IOPS basis.
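The IOPS-basis comparison can be made concrete with a few lines of arithmetic. This is a minimal sketch using only the figures quoted in the text (300 IOPS per hard drive, 40,000 to 400,000 IOPS per SSD, 9-GB drives with 2 GB active). It assumes every drive reaches its maximum and that striping scales perfectly, with no RAID overhead, which flatters the hard drives; the function and constant names are hypothetical.

```python
import math

HDD_MAX_IOPS = 300        # per-drive ceiling quoted in the text
SSD_IOPS_LOW = 40_000     # low end of the quoted SSD range
SSD_IOPS_HIGH = 400_000   # high end of the quoted SSD range
DRIVE_CAPACITY_GB = 9     # the 9-GB drives of the late 1990s
ACTIVE_CAPACITY_GB = 2    # only the outside 2 GB of each drive is used


def hdds_to_match(ssd_iops, hdd_iops=HDD_MAX_IOPS):
    """Hard drives needed to equal one SSD, assuming ideal linear scaling."""
    return math.ceil(ssd_iops / hdd_iops)


def short_stroked_array(drive_count):
    """Best-case totals for an array that uses only part of each drive."""
    return {
        "aggregate_iops": drive_count * HDD_MAX_IOPS,
        "raw_gb": drive_count * DRIVE_CAPACITY_GB,
        "active_gb": drive_count * ACTIVE_CAPACITY_GB,
    }


print(hdds_to_match(SSD_IOPS_LOW))   # drives to match a low-end SSD
print(hdds_to_match(SSD_IOPS_HIGH))  # drives to match a high-end SSD
print(short_stroked_array(300))      # the 300-drive farm from the text
```

Even under these generous assumptions, the 300-drive farm tops out at 90,000 IOPS, inside the range quoted for a single SSD, while leaving 2,100 GB of its 2,700 GB raw capacity idle.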

Figure 1.1 Relative performance of CPUs compared with hard drives over three decades. HDD, hard disk drive.
Today, SSD prices from distributors are lower than those of enterprise HDDs, taking much of the steam out of the price per capacity argument. In fact, it looks like we’ll achieve parity on price per gigabyte [2] even with commodity bulk drives in the 2017 timeframe. At that point, only the most conservative users will continue with HDD array buys, and HDDs will effectively be obsolete and on their way to being phased out.
I phrased that last sentence carefully. HDDs will take a while to fade away, but it won’t be like magnetic tape, still going 25 years after its predicted demise. There are too many factors against HDDs, beyond the performance question. SSDs use much less power than HDDs and require less cooling air to operate safely. It looks like SSDs will be smaller too, with 15 TB 2.5 inch SSDs already competing with 3.5 inch 10 TB HDDs, and that appeals to anyone trying to avoid building more datacenter space. We are also seeing better reliability curves for SSDs, which don’t appear to suffer from the short life cycles that occasionally plague batches of HDDs.
The most profound changes, however, are occurring in the rest of the IT arena. We have appliances replacing arrays, with fewer drives but much higher IOPS ratings. These are more compact and are Ethernet-based instead of the traditional fiber channel (FC). This has spurred tremendous growth in Ethernet speeds. The old model of 10× improvement every 10 years went out the window a couple of years back, and we now have 10 GbE challenged by 25 GbE and 40 GbE solutions. Clearly Ethernet is winning the race against the fiber channel SAN (storage area network), and with all the signs that FCoE (Fiber-Channel-over-Ethernet) has lost the war [3] with other Ethernet protocols, we are effectively at the end of the SAN era.
That’s a profound statement in its own right, since it means that the primary storage architecture used in most datacenters is obsolescent, at best [4]. This has major implications for datacenter roadmaps and buying decisions. It also, by implication, means that the large traditional SAN vendors will have to scramble to new markets and products to keep their revenues up.
In a somewhat different incarnation, the all-flash array (AFA), the same memory technology used in SSDs is being deployed with extreme performance, in the millions-of-IOPS range. This has triggered a rethink of the tiering of storage products in the datacenter.
We are moving from very expensive “Enterprise” HDD primary arrays (holding hot, active data) and expensive “Nearline” secondary arrays (holding colder data) to lower capacity and much faster AFAs as primary storage and cheap commodity hard drive boxes for cold storage. This is partly a result of new approaches to data integrity, such as erasure coding, but it is primarily due to the huge performance gain in the AFA.
The new tiering is often applied to existing SANs, giving them an extra boost in life. Placing one or two AFAs inline between the HDD arrays and the servers gives the needed speed boosts, and those obsolescent arrays can be used as secondary storage. This is a cheaper upgrade than new hybrid arrays with some SSDs.
Other datacenter hardware and networking are impacted by all of this too. Servers can do more in less time and that means fewer servers. Networks need to be architected to give more throughput to the storage farms, as well as getting more data bandwidth to hungry servers. Add in virtualization, and especially the cloud, and the need for automated orchestration extends across into the storage farm, propelling us inexorably towards software-defined storage.
Applications are impacted too. Code written with slow disks in mind won’t hold up against SSD. Think of it like the difference between a Ferrari and a go-kart! The extra speed, low latencies, new protocols, and new tiering all profoundly affect the way apps should be written to get the best advantage from the new storage world.
Examples abound. Larry Ellison at Oracle was (rightly) proud enough to boast at Oracle OpenWorld in 2014 that in-memory databases, with clustered solutions that distributed storage and caching, had achieved a 100× performance boost [5] over the traditional systems approach to analytics.
Even operating systems are being changed drastically. The new NVMe protocol is designed to reduce the software overhead associated with very high IOPS rates, so that the CPUs can actually do much more productive work.
Another major force in storage today is the growth of so-called “Big Data” [6]. This is typically data from numerous sources and as a result lacks a common structure, hence the other name “unstructured data.” This data is profoundly important to the future both of storage and IT. We can expect to see unstructured data outgrow current structured data by somewhere between 10 and 50 times over the next decade, fueled in part by the sensor explosion of the Internet of Things.
Unstructured data is pushing the industry towards highly scalable storage appliances and away from the traditional RAID arrays. Together with open-sourced storage software [7], these new boxes use commodity hardware and drives so as a result they are very inexpensive compared with traditional RAID ar...

Table of contents

  1. Cover image
  2. Title page
  3. Table of Contents
  4. Copyright
  5. Acknowledgment
  6. Introduction
  7. Chapter 1. Why Storage Matters
  8. Chapter 2. Storage From 30,000 Feet
  9. Chapter 3. Network Infrastructure Today
  10. Chapter 4. Storage Software
  11. Chapter 5. Software-Defined Storage
  12. Chapter 6. Today’s Hot Issues
  13. Chapter 7. Tuning the Network
  14. Chapter 8. Big Data
  15. Chapter 9. High-Performance Computing
  16. Chapter 10. The Cloud
  17. Chapter 11. Data Integrity
  18. Chapter 12. Data Security
  19. Chapter 13. On the Horizon
  20. Chapter 14. Just Over the Horizon
  21. Conclusion
  22. A Brief History of Storage Networking
  23. Glossary
  24. Index