
Storage performance is core to application performance and data access. When we talk about storage performance, we typically talk about IOPS and throughput, but there is a third variable: latency. Latency measures the time it takes for storage to respond to a request or instruction issued by the CPU. The lowest latency is achieved by delivering data from memory at the speed of memory, typically around 1us. If data could be delivered at that latency, we would have a highly efficient server architecture, but there are a number of factors that prevent us from seeing latency at that level; a rough budget is sketched in the example after the list below.

  • Application latency – the inherent architecture of the application may make it impossible to achieve microsecond latency; every operation the application performs adds to the overall latency.
  • Local file system – since DRAM is volatile, data that requires persistence must be committed to persistent media before an acknowledgement is returned. The local file system is responsible for taking blocks from DRAM and copying them to other media on the I/O bus. A common Linux file system such as XFS or EXT4 adds as much as 250us. Even when DRAM is replaced with NVDIMM (persistent memory), latencies remain at a minimum of 250us. Though 250us may seem like nothing, in a typical database environment eliminating that 250us alone would increase IOPS and throughput per core by 350% and 410%, respectively.
  • Network – When data travels over the network, whether FC or IP, there are added latencies. Most SSD/Flash arrays deliver performance at 1ms or more of latency. If the SSD/Flash sits on the PCIe bus, that latency may be reduced to a range between 500 and 800 microseconds. Recently, a new protocol has been developed to allow shared storage (SAN) to deliver latency comparable to storage on the PCIe bus: the NVMe standard and its NVMe over Fabrics extension.
  • Drive media – Flash has a lower latency profile than HDD; this is not surprising, since an HDD is a mechanical device where the speed at which the platters spin correlates to the time it takes for data to be pulled off the drive. Flash is not a mechanical medium and doesn't have the same delays built in.
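
To put these factors together, here is a minimal sketch of a latency budget for a single acknowledged write. The figures are illustrative only: the file system number comes from the text above, while the network and media numbers are assumptions within the ranges discussed.

```python
# Rough latency budget for one write acknowledged by shared storage.
# All figures are illustrative microsecond values; the file system number
# comes from the text, the others are assumptions within the ranges above.
latency_us = {
    "memory access": 1,          # data served from DRAM
    "local file system": 250,    # XFS/EXT4 commit to persistent media
    "network (FC/IP)": 300,      # transport to a shared array (assumed)
    "drive media (flash)": 100,  # flash access time (assumed; HDD is far higher)
}

total = sum(latency_us.values())
for component, value in latency_us.items():
    print(f"{component:22s} {value:5d} us  ({value / total:6.1%})")
print(f"{'total':22s} {total:5d} us")
```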

Of course we can’t leave out IOPS and throughput. IOPS measures how many operations can be performed per second, while throughput is how much data can be transferred through a given pipe. Depending on the application, one or the other of these metrics will be more relevant.

Applications that stream data sequentially require more bandwidth and are therefore more concerned with throughput. Throughput may be calculated from the total bandwidth of the drives in a given system, the controllers, and the network. Even if you have a system capable of delivering gigabytes of data, it still needs the network to carry that data, and there is often an imbalance between the network and system capabilities. Recently a client expressed a concern that exemplifies this issue. As a research institution, they have a lot of data created by the labs and then processed by the investigators. The challenge they face is that the amount of data being created and moved to a centralized location is much greater than what the network can handle. As a result, they are unable to transfer data over the wire; some use tape or don't move data at all.
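
A back-of-the-envelope way to see where the bottleneck sits is to take the minimum across the stages in the path. This is a simplified sketch with assumed numbers, not a sizing tool.

```python
# Effective streaming throughput is capped by the slowest stage in the path.
# All figures are assumptions for illustration, expressed in MB/s.
drive_bw = 24 * 200        # e.g. 24 drives at ~200 MB/s each
controller_bw = 2 * 1600   # e.g. two controllers at ~1.6 GB/s each
network_bw = 1250          # a single 10 GbE link is roughly 1.25 GB/s

effective = min(drive_bw, controller_bw, network_bw)
print(f"drives {drive_bw} MB/s, controllers {controller_bw} MB/s, network {network_bw} MB/s")
print(f"effective throughput is limited to about {effective} MB/s by the slowest stage")
```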

IOPS measures the number of operations a drive or a system can perform. We have seen huge gains with the adoption of SSD/Flash. Where a 15K RPM drive can deliver around 180 IOPS, a flash drive can deliver thousands of IOPS. About 10-15 years ago, storage administrators were forced to over-provision capacity in order to get enough drives in a RAID set to deliver the required number of IOPS. As an example: if your application needed 1 TB of data and 1,500 IOPS, using 15K drives with 300 GB of capacity each, an administrator would have to provision 4 drives to reach the required capacity but 9 drives to reach the required IOPS. Today, capacity and IOPS can be balanced.
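
Here is the same arithmetic as a small sketch. The 15K RPM figures (300 GB, roughly 180 IOPS) come straight from the example above; the flash drive figures are assumptions for comparison.

```python
import math

def drives_needed(capacity_tb, iops, drive_capacity_gb, drive_iops):
    """Drives required to satisfy both the capacity and the IOPS requirement."""
    for_capacity = math.ceil(capacity_tb * 1024 / drive_capacity_gb)
    for_iops = math.ceil(iops / drive_iops)
    return for_capacity, for_iops, max(for_capacity, for_iops)

# The example from the text: 1 TB and 1,500 IOPS on 300 GB / ~180 IOPS 15K drives.
cap, io, total = drives_needed(capacity_tb=1, iops=1500, drive_capacity_gb=300, drive_iops=180)
print(f"15K RPM: {cap} drives for capacity, {io} for IOPS -> provision {total}")

# The same workload on flash (assumed 960 GB drives at ~50,000 IOPS each).
cap, io, total = drives_needed(capacity_tb=1, iops=1500, drive_capacity_gb=960, drive_iops=50000)
print(f"flash:   {cap} drives for capacity, {io} for IOPS -> provision {total}")
```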

Not all applications require microsecond latency, thousands of IOPS, and gigabytes of throughput, but a higher-performance system, when properly designed, can run at a much higher level of efficiency, both operational and financial. Next time we talk about performance, let's make sure we are clear about which performance we need.


One of my biggest challenges every day is to cut through the industry noise and get to the bottom of what vendors are selling and what customers are buying. It is a challenge because vendors message to what they think customers want to buy (not necessarily what they have to sell), and customers want to buy what the industry tells them they need. The reality is a lot simpler: what customers want to buy hasn't changed in decades.

Enterprises want to leverage their IT resources to drive more business, more revenue, and more profitability. This means that IT must be more efficient, effective, differentiating, agile, and responsive. These are the high-level wants and needs. Each organization translates these requirements into technical specifications based on criteria such as performance, scalability, cost, simplicity, and risk. How these are prioritized depends mostly on the person or organization making the decision.

The noise complicates the conversation.

Enterprises need to become operationally more efficient and cut costs. This doesn't mean they want to buy cheap stuff; it is about price only when all other variables are equal. The industry has instilled in users the idea that cloud is cheaper and more flexible: you pay only for what you use. There are many ways to define what cloud is, but if we take cloud infrastructure offerings, once you really look, they may not be cheaper or more flexible. Here are two examples to demonstrate:

  • Company XYZ needs to store 1 PB of data for 7 years. It is not clear whether the data will be accessed regularly or not, but it needs to be secure. Option 1 is to use cloud storage (S3, Glacier, Google Nearline). A single location in a public storage cloud averages $0.01 per GB per month. Without accounting for egress and transaction costs, that equates to $123 per TB per year. Over 4 years, the cost of keeping a PB in the lowest tier of cloud, in a single location, would be $503,808. Keep in mind that depending on where the cloud data center is located, you might need to concern yourself with Mother Nature. If you store two copies for geographic distribution, your cost doubles to over a million dollars in 4 years. Conversely, you may procure an object storage system to host 1 PB of data for $400 per TB over 4 years. The total cost of this solution would be $409,600. Some object storage vendors support geo-dispersal, which allows you to stretch the system across 3 sites with the ability to sustain a site failure without data loss; the cost of such a deployment would not differ from the roughly $410k already stated. The facility costs may be offset by the absence of egress and transaction costs.


  • Another example: company ABC is running a marketing campaign and requires compute and storage resources for the duration of the program, which is 9 months. Provisioning a decent server in the cloud with a few TB of data and snapshots may cost $210/month, which equates to $1,890 for the duration of the project. You might need to add a backup client for the data, but that could be another few hundred dollars. If you had to purchase a server, it could cost you $4,500. The arithmetic behind both examples is sketched below.
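
Here is that arithmetic as a quick sketch. The prices are the ones quoted above; egress, transaction, and facility costs are left out, as in the text.

```python
TB_PER_PB = 1024

# Example 1: 1 PB kept for 4 years in a single-site cloud tier vs. an object storage system.
cloud_per_tb_year = 123      # ~$0.01/GB/month, as quoted in the text
onprem_per_tb_4yr = 400      # object storage figure quoted in the text
years = 4

cloud_one_region = cloud_per_tb_year * TB_PER_PB * years
cloud_two_regions = cloud_one_region * 2               # a second copy doubles the bill
onprem_geo_dispersed = onprem_per_tb_4yr * TB_PER_PB   # 3-site dispersal at the same price, per the text

print(f"cloud, one region:     ${cloud_one_region:,}")
print(f"cloud, two regions:    ${cloud_two_regions:,}")
print(f"object store, 3 sites: ${onprem_geo_dispersed:,}")

# Example 2: a 9-month project on a rented cloud server vs. buying one.
rented = 210 * 9
purchased = 4500
print(f"rented for 9 months: ${rented:,}  vs. purchased: ${purchased:,}")
```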

No one wakes up and says, I want to go cloud. What they really want is a faster and simpler way to deploy IT resources and applications, to pay only for the resources they consume rather than paying up front, and to simplify the management of their infrastructure. Some will be willing to pay more to achieve these results; others may not.

There is a way for some to achieve these goals on premise or in a hybrid configuration. First, identify applications that are not core to your business and can be better served via a service provider; this could be CRM, email, or SCM. Then evaluate your environment for places where resources can be shared among departments. The more an organization centralizes IT services, the more efficiency can be achieved and the greater the opportunity for flexibility in how resources are assigned and consumed. The private cloud concept is exactly this: centralized IT services where end users can select the resources they need, with an orchestration and management layer that simplifies provisioning, allocation of resources, and tracking of consumption.

Though there are many variables that go into any buying decision, the conversation has to start with what the business needs. Messaging the market that cloud is the only way, that cloud is cheaper and faster, that all-SSD or Flash is the answer to all your prayers, or that you need 32 Gbps FC when you can barely fill an 8 Gbps pipe doesn't help users make good decisions. Instead of the hype and the noise, let's build, package, and deliver products and services that will move the enterprise forward. I may have an idealistic view of the world, but a girl can dream.

I spend a lot of time talking to end users about their needs, what is working and what is not. What often surprises me is the view they have of the cloud: cloud is cheaper, it is more agile, it is deployed instantly. There is no argument that, conceptually, using a public cloud is easier than provisioning servers on premise, though outsourcing an application to a SaaS provider is even easier. And yet, there are gotchas in each scenario. Here are a few things I learned recently:

  • SaaS providers today offer application availability SLAs, not data integrity or data availability SLAs. This means that data loss or an accidental anything has no effect on the service provider's compliance with their promises. In other words, if the data is that important to you, you need to back it up. Seems like a simple concept, except that you don't have a dedicated server or an application instance; this is a multi-tenant environment and there is nothing to put a backup agent on.
  • Putting data in the cloud seems like the safest thing to do with it; the cloud provider says so. You pay $x per GB per month and the provider stores your data. Data placed in the cloud is stored in a RAID, mirrored, or erasure-coded configuration within the chosen data center location. If you used to replicate your data between sites for business continuity or disaster recovery…well, you don't automatically get that with cloud. Providers only store data in a single location, and if you want your data in a separate location, you have to pay a separate fee. This means the $0.01/GB/month you are paying, which is about $120/TB/year, only applies to one data center; if you want a second location, that will be an additional $120/TB/year.
  • We love the idea that we can provision whatever resources we need, both compute and storage. It sounds really good: I can provision what I want and need, and it is available to me immediately, unlike when I have to ask my IT folks to give me a virtual machine. That is not exactly how it works. Most cloud providers offer a variety of templates that can be selected; these are machines that have already been designed with a fixed amount of CPU, memory, cache, and storage. If you need more of one thing and less of another, you still have to use what is given to you. At times, this means that your machines may be over-provisioned in some areas or under-provisioned in others (a toy illustration follows this list). Though there is always a cost attached to each resource, it might be insignificant compared to the value the end user sees in the service.
  • We often look at other companies using cloud services and say to ourselves, well, if they are using it for all their IT needs, why shouldn't I? One common example is Netflix. Here is a question to ask oneself: what is my business model, and what are the dependencies and drivers of my business? This is a really important question, because whether you can benefit economically and operationally from the cloud will depend on your business. As an example: if you are Netflix and you are providing a streaming service, you need to support as many streams as possible, both for a single asset and across many different assets. If each stream equates to a user and each user represents revenue, paying on the fly for more resources is covered by the value those resources create. On the other hand, a less dynamic business like pharma or oil and gas conducts numerous studies that may become revenue-producing over time; their investment must go as far as possible in order to contain costs. The business driver for Netflix is agility; the business driver for oil and gas is cost containment. Speaking of costs, did you know that IaaS is not less expensive than infrastructure on premise?
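
As a toy illustration of the template point above, here is a sketch that checks a workload's requirements against a few made-up instance templates. The template names and sizes are assumptions, not any provider's actual catalog.

```python
# Hypothetical instance templates: (vCPUs, memory in GB, storage in GB).
templates = {
    "small":  (2, 8, 100),
    "medium": (4, 16, 500),
    "large":  (8, 64, 2000),
}

# What the workload actually needs: few cores, lots of memory, modest storage.
needed = (2, 48, 200)

def fits(spec, requirement):
    """A template fits only if it meets or exceeds every dimension."""
    return all(have >= want for have, want in zip(spec, requirement))

candidates = [name for name, spec in templates.items() if fits(spec, needed)]
print("templates that fit:", candidates or "none")
# Only 'large' fits, so the workload ends up paying for 8 vCPUs and 2 TB of
# storage to satisfy a 48 GB memory requirement: over-provisioned twice over.
```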

It may not seem like I am a fan of cloud, but I am. I remember back in 2000 when we were trying to figure out how to better utilize resources by sharing them across departments and even organizations. We didn't have the right technology then, but we are on our way to having it now. What cloud really offers is the promise of even greater efficiency than virtualization alone, and with greater efficiency come lower cost and more productivity per dollar spent. If we change the conversation from cloud first to what drives my business, then we can come up with an architecture that consists of on-premise and cloud environments, where the decision to use one or the other is based on what serves our needs in the most cost-effective, relevant way.

My job requires me to be at the intersection of customers buying products and services and the industry creating and bringing technology to market. I have found that there is a great disconnect between what the industry is hyping and what is really possible.

For a number of years now we have been touting the cloud as the answer to all our infrastructure aches and pains. “If you go cloud, you will have more flexible, just-what-you-need, less expensive services,” the trade magazines and pundits claim. The reality, though, is “it depends.”

The concept of utility computing has been around for some time. Back in the dot-com boom there were a number of companies attempting to provide storage as a service, shared infrastructure, and the like; I actually worked for one of these companies, Genuity. What really defines utility-based services is the delivery of a service just in time and payment for that service based on consumption. That is how electrical service works. If we all needed the exact same service, varying only in quantity, we would be set, but application infrastructure doesn't run that way. Take organizations that have standardized on VMware as an example: they may all run applications such as MS SQL Server or MySQL or other common applications, but the demands those applications place on the infrastructure will be different in every situation.

When customers ask me about cloud or how to get there, usually because someone higher up has decided that cloud is the way to go, I first ask them what it means to them. I then try to understand the drivers behind wanting to go to the cloud. Here are some reasons that make sense: spikes in demand, seasonal applications or projects, not wanting to manage an application, or not having a secondary site for backups or DR. The most common way to embrace the cloud actually aligns with traditional business concepts such as ‘focus on your core competency and outsource secondary services’. This means that if an application or service is not core to your organization's business objectives, consider outsourcing it. The best examples include email (Office365, Gmail, other email services), email archiving, CRM, telephony and conferencing, backups, and file sharing. It also makes sense that if you need some resources for a short period of time, it is more likely to be cost effective to go to the cloud than to procure them in house.

Of course we should keep in mind that not all clouds are the same and that not all applications are the same. Traditional enterprise applications are highly dependent on the underlying infrastructure to perform, while newer cloud-centric applications have built much of that dependency handling into the application itself. This means that your Oracle database may not work well in EC2, but your MongoDB will have no issues.
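
As a small illustration of that difference, a cloud-centric application typically handles redundancy at the application or database layer rather than expecting it from the infrastructure. Here is a minimal sketch using pymongo to connect to a MongoDB replica set spread across ordinary cloud instances; the host names and replica set name are placeholders, not a real deployment.

```python
# A cloud-centric design: redundancy lives in the database layer, not in the
# underlying infrastructure.  Host names and the replica set name are placeholders.
from pymongo import MongoClient

client = MongoClient(
    "mongodb://node1.example.internal,node2.example.internal,node3.example.internal"
    "/?replicaSet=rs0",
    w="majority",        # acknowledge writes only once a majority of nodes have them
    retryWrites=True,    # retry transparently if the primary fails over
)

db = client.get_database("campaign")
db.events.insert_one({"type": "click", "source": "landing-page"})
# If one instance disappears, the driver and the replica set handle the failover;
# a traditional database would lean on the SAN or cluster beneath it instead.
```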

Finally, if we are talking about utility, we are talking about operational costs. If the goal is OpEx rather than CapEx, cloud is not the only answer. There are traditional outsourcing offerings in the market that allow you to consume infrastructure as OpEx, even if that infrastructure is dedicated to you. There are specialty service providers that offer services for specific applications where the infrastructure is shared but the application is yours and yours alone. Underlying all of these options is operational leasing.

I am not saying that cloud is not great or that it is not reality. What I am saying is that we have to be careful when we refer to cloud; we need to qualify what we mean and what we expect. The technology continues to evolve; there is a lot of innovation in the industry today, and we are making great progress toward making cloud more ubiquitous. Part of it is designing and building applications that run better on commodity infrastructure; part is enabling quality of service and custom service delivery in a multi-tenant environment. If you think you want cloud, just make sure you have a clear idea of what that means to you.