@CloudExpo: Blog Post

The Paradox of Ephemeral Cloud Storage | @CloudExpo [#Cloud]

The moral of the story here is simple: if you put anything beyond your base OS on ephemeral storage, you are at great risk

The very name is kind of ridiculous, don't you think? The word "ephemeral" means it can go away. It's temporary. Fleeting, even. So why would I want to depend on storing something in a medium that can disappear without warning? And why am I forced to buy more of it when all I want is more CPUs or RAM?

Welcome to the paradox of ephemeral storage from cloud computing providers.

Origins and Explanations
Ephemeral storage exists only because of how first-generation cloud providers chunk up servers. The business model is simple: they buy a physical server and try to sell as many virtual machines (VMs) as possible on top of that physical server. Since the VMs are trapped on physical machines in this approach, first-generation providers dictate cookie-cutter sizes that make that stacking game easier for themselves.

In the process, though, these providers can't do anything to improve the redundancy of the disk on the physical servers, and are thus unable to offer guarantees on its availability. Instead they tell you not to trust it. It can evaporate. "Code around it instead" is what we are told.

If I can't trust it, why am I forced to buy more of it whenever I want a VM that is bigger in other dimensions, when I probably only need 10 GB for my operating system anyway? Consider the sizing chart from PlanForCloud (chart not reproduced here):

Take a look at that largest size. Who wants a 1.6 TB cloud storage liability?
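To make the bundling problem concrete, here is a minimal sketch of how fixed sizes strand ephemeral disk. The size ladder, the names `VmSize`, `smallest_fit`, and `stranded_ephemeral_gb` are all hypothetical and invented for illustration; only the 1.6 TB figure for the largest size and the 10 GB operating system estimate come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSize:
    name: str
    vcpus: int
    ram_gb: float
    ephemeral_gb: int


# Hypothetical cookie-cutter ladder, ordered smallest first. These numbers are
# NOT any provider's real price list; only the 1600 GB (1.6 TB) top size is
# taken from the article.
SIZES = [
    VmSize("small", 1, 2, 150),
    VmSize("medium", 2, 4, 400),
    VmSize("large", 4, 8, 800),
    VmSize("xlarge", 8, 16, 1600),
]


def smallest_fit(vcpus, ram_gb, sizes=SIZES):
    """Return the first size (smallest first) that meets the CPU and RAM need."""
    for size in sizes:
        if size.vcpus >= vcpus and size.ram_gb >= ram_gb:
            return size
    raise ValueError("no size is large enough")


def stranded_ephemeral_gb(size, os_disk_gb=10):
    """Ephemeral capacity bundled with the size beyond what the OS needs."""
    return max(size.ephemeral_gb - os_disk_gb, 0)


# Asking only for more CPU and RAM still drags along the biggest disk.
choice = smallest_fit(vcpus=8, ram_gb=12)
print(choice.name, stranded_ephemeral_gb(choice))  # xlarge 1590
```

Under these assumed numbers, a workload that needs 8 vCPUs and nothing more than an OS disk ends up holding 1,590 GB of storage it cannot trust.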

Google Compute Engine and ProfitBricks Bring Sanity
One of the great features of Google Compute Engine is its approach to ephemeral storage. Google refers to this as Scratch Storage and in many cases limits each machine to 10 GB of it. That's just enough to build a base operating system upon, and that's obviously on purpose. Kudos to them.

ProfitBricks takes this a step further by not offering ephemeral storage at all. Instead, the physical servers housing the CPU cores and the RAM are in a separate pool of resources from the disk array that provides the block storage. Good IOPS are maintained by connecting the two with an 80 Gbps InfiniBand network. In the ProfitBricks model, all storage is akin to highly available, redundant block storage.

What You Really Want Is Block Storage
One of the things that public cloud noobs have a hard time getting their heads around at first is the difference between ephemeral storage and block storage. The latter, which every IaaS vendor offers, has some level of redundancy built into it and is where data should really be stored. Below are examples of how several vendors approach that redundancy, with better resulting availability:

| Vendor | Block Volume Redundancy | Max Volume Size |
|---|---|---|
| AWS | "multiple servers in an Availability Zone" | 1 TB |
| Azure | Offer both locally redundant and geographically redundant | 1 TB |
| GCE | "replicated for additional redundancy" | 10 TB |
| ProfitBricks | Double redundant RAID 10 across two Availability Zones | 16 TB |

Lessons Learned
The moral of the story here is simple: if you put anything beyond your base OS on ephemeral storage, you are at great risk. That data could go away at any time. You can't depend on it, so don't use it unless you add in an additional form of redundancy at your own engineering expense. Data you care about belongs on block storage: it has built-in redundancy and improved availability, which ensure that the data you care about will be there when you need it.
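One way to act on this advice is to treat ephemeral disk purely as scratch space and copy anything worth keeping to a block storage volume as soon as it is finished. The sketch below is a minimal illustration and not a prescribed method: the function name `persist` and the idea that the durable directory is a mounted block volume are assumptions, and the demo uses temporary directories as stand-ins for both mount points.

```python
import os
import shutil
import tempfile


def persist(scratch_path, durable_dir):
    """Copy a file finished on scratch storage into a durable directory.

    The copy is written under a temporary name, flushed to disk, and only
    then renamed into place, so readers never see a half-written file.
    """
    os.makedirs(durable_dir, exist_ok=True)
    final_path = os.path.join(durable_dir, os.path.basename(scratch_path))
    # Create the temporary file in the destination directory so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=durable_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as dst, open(scratch_path, "rb") as src:
            shutil.copyfileobj(src, dst)
            dst.flush()
            os.fsync(dst.fileno())
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the partial copy if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path


def demo():
    # Temporary directories stand in for an ephemeral disk and a block volume.
    with tempfile.TemporaryDirectory() as scratch, \
            tempfile.TemporaryDirectory() as durable:
        work_file = os.path.join(scratch, "report.csv")
        with open(work_file, "wb") as f:
            f.write(b"id,total\n1,42\n")
        print(os.path.basename(persist(work_file, durable)))


if __name__ == "__main__":
    demo()
```

The scratch copy can then disappear without consequence. Note that this only moves the data onto whatever redundancy the block volume provides; it does not add redundancy of its own.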

More Stories By Pete Johnson

Pete Johnson is senior director of product marketing at CliQr Technologies, where he focuses on the support of applications running on OpenStack-based clouds. He is interested in the long-term management of applications in public and private clouds, and in avoiding vendor lock-in. Prior to joining CliQr, Pete was senior director of platform evangelism at ProfitBricks after spending 19 years with HP as a heads-down developer, technical lead and chief architect.
