At first glance, cloud costs look simple. Vendors publish transparent rate cards, you pick what you want, and get an itemized bill. But real life is more complicated…
I downloaded the rate card of one hyperscaler last week and it contained over 200,000 different rates. Each service had numerous variants, and rates differed according to region, tier, and volume.
Then every month, the provider kindly sends you an itemized bill, with tens of thousands of detailed line items on it. Most organizations have 2-3 public cloud providers, so they’re getting 2-3 of these bills, and the terminology is different on each one.
For example, do you know the difference in cost between these two Fast Healthcare Interoperability Resources (FHIR) API line items?
- API for FHIR – 100 Provisioned Throughput RU/s – CH North
- API for FHIR – Standard – Structured De Identification – NO West
Offhand, neither do I.
So, what should you do? Just pay the bill? Of course, the overall bill will be accurate – the providers are honest. But suppose you see one line item going up unexpectedly month after month? Any responsible organization will want to understand what’s going on. Who requested this resource? Do they really need it? Are they actually using it, or only some of it?
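As a rough illustration of spotting a line item that rises month after month, here is a minimal Python sketch. It assumes the bill has already been exported into (month, line item, cost) rows; that layout, the function name, and the sample figures are all hypothetical, and real provider exports look different.

```python
from collections import defaultdict


def rising_line_items(rows, min_months=3):
    """Return line items whose cost rose in each of the last `min_months` months.

    `rows` is an iterable of (month, line_item, cost) tuples, where month is a
    sortable string such as "2024-03". This layout is an assumption made for
    the sketch, not any provider's actual export format.
    """
    history = defaultdict(dict)
    for month, item, cost in rows:
        # Sum costs in case a line item appears more than once in a month.
        history[item][month] = history[item].get(month, 0.0) + cost

    flagged = {}
    for item, by_month in history.items():
        costs = [by_month[m] for m in sorted(by_month)]
        # min_months consecutive increases need min_months + 1 data points.
        recent = costs[-(min_months + 1):]
        if len(recent) == min_months + 1 and all(
            earlier < later for earlier, later in zip(recent, recent[1:])
        ):
            flagged[item] = recent
    return flagged


# Hypothetical sample data: one item climbs steadily, the other is flat.
sample_rows = [
    ("2024-01", "vm-standard-eu", 100.0),
    ("2024-02", "vm-standard-eu", 140.0),
    ("2024-03", "vm-standard-eu", 190.0),
    ("2024-04", "vm-standard-eu", 260.0),
    ("2024-01", "blob-storage-eu", 50.0),
    ("2024-02", "blob-storage-eu", 48.0),
    ("2024-03", "blob-storage-eu", 51.0),
    ("2024-04", "blob-storage-eu", 50.0),
]
flagged_items = rising_line_items(sample_rows)
```

A check like this only tells you which line item to look at. Who requested the resource and whether they still need it remain questions for people.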
Excessive cloud costs are easily incurred
One of the main reasons cloud costs escalate is that the people incurring them are often unaware of, or unconcerned about, what they are spending.
A developer, for instance, is rightly focused on developing software and may not be measured on, or incentivized to monitor, the costs involved. They may write a script that issues thousands of API calls that create cloud resources, and so incur potentially large costs.
In the old days, to make a significant purchase, you’d have to build an investment case and get the capital expenditure signed off in advance. Now a developer can go directly to a cloud console and spin up a server that costs €1,000/day, and it could be the end of the month before finance realizes that an additional service has been purchased.
It’s also very difficult for resource requesters to be sure they’re buying the right resource. If a budget owner wants to build a resource forecast, for instance, when they go to the provider’s console, they’ll immediately be faced with lots of technical questions that make a big difference to cost:
- Do you want a dedicated host or dedicated instances?
- Do you want to pay on demand, or for spot instances, or reserve capacity?
- Do you want elastic block storage volume type gp2 or gp3?
As if all this wasn’t enough, organizations can get hit with huge cloud costs if criminals hijack their environment for coin mining – and they have no qualms about using the highest grade, most expensive resources they can access.
I know of one example where hijacked cloud resources were costing the organization €35,000/day and it took them 20 days to realize because security monitoring and financial monitoring were not integrated. In a situation like this, you could ask the provider to waive the bill, but since the resource was used, they’re entitled to insist on payment.
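To put that example in numbers: €35,000/day over 20 days comes to €700,000. The sketch below shows how even a crude daily spend check could raise the alarm much sooner. The normal baseline of €1,000/day, the threshold factor, and the function name are assumptions for illustration only, not figures from the incident.

```python
from statistics import median


def first_alert_day(daily_spend, baseline_days=7, factor=3.0):
    """Return the index of the first day whose spend exceeds factor x baseline.

    The baseline is the median of the first `baseline_days` days. Returns None
    if no later day crosses the threshold.
    """
    baseline = median(daily_spend[:baseline_days])
    for day, spend in enumerate(daily_spend):
        if day >= baseline_days and spend > factor * baseline:
            return day
    return None


# Assumed normal spend of EUR 1,000/day for 10 days (hypothetical), followed
# by 20 days with the extra EUR 35,000/day from the hijacked resources.
normal_days = [1_000.0] * 10
hijacked_days = [1_000.0 + 35_000.0] * 20
daily_spend = normal_days + hijacked_days

alert_day = first_alert_day(daily_spend)  # zero-based day index
undetected_cost = 35_000 * 20             # cost of 20 days without an alert
```

With these assumed numbers the check fires on the first hijacked day. In practice the threshold would need tuning so that legitimate spikes don't drown the signal.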
The costs of unnecessary or unused resources
Eliminating waste is easier said than done. Even a well-managed organization with a mature cloud financial management (CFM) practice can have something like 2% waste. I’ve seen large organizations, which are competent in many other ways, wasting 50% of their cloud resources in development environments.
Even assuming staff are cost-aware, they often simply forget to cancel services when they no longer need them. When they launch an app, they fire up a server, but if use of that app declines or stops completely a year later, do they always scale down or cancel the server?
It’s also quite common to find orphaned disk storage, where people cancel the server but forget to cancel the disk. In fact, it is best practice to wait a few weeks after canceling a server before canceling the disk, in case you need that data to recover from an incident. But 50% of people forget to cancel the disk.
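A minimal sketch of how that grace period could be turned into a routine check is shown below. The inventory structure, the field names, and the 21-day grace period are assumptions for illustration; a real inventory would come from your provider's API or export.

```python
from datetime import date, timedelta


def disks_to_review(disks, today, grace_days=21):
    """List unattached disks whose grace period has expired.

    Each disk is a dict with hypothetical keys: "name", "attached_to"
    (None when orphaned) and "detached_on" (a date, or None if attached).
    """
    cutoff = today - timedelta(days=grace_days)
    return [
        disk["name"]
        for disk in disks
        if disk["attached_to"] is None
        and disk["detached_on"] is not None
        and disk["detached_on"] <= cutoff
    ]


# Hypothetical inventory: one disk in use, one long orphaned, one recent.
inventory = [
    {"name": "disk-a", "attached_to": "vm-1", "detached_on": None},
    {"name": "disk-b", "attached_to": None, "detached_on": date(2024, 1, 5)},
    {"name": "disk-c", "attached_to": None, "detached_on": date(2024, 3, 1)},
]
stale_disks = disks_to_review(inventory, today=date(2024, 3, 10))
```

The sketch only produces a review list. Deleting storage should stay a deliberate decision, since the data may still be needed for recovery.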
Other common sources of unused resources include:
- Test resources that aren’t canceled following the test
- Resources evaluated in a proof of concept (PoC) that aren’t turned off afterwards
- Capacity that is over-ordered without anyone realizing it
Internal cost allocation isn’t easy either
Actually, at the top level, it needn’t be difficult to allocate some cloud costs by department, but it requires the organization to set up separate subscriptions for each department in advance using the company’s own naming protocol. And it relies on the discipline of requesters to follow the protocol.
However, complexity soon kicks in when the organization wants to allocate costs across departments, by region, or by business unit. For example, how should you allocate shared resources, like networking, that all departments use? Is a simple linear allocation fair, or would a pro rata allocation more accurately reflect actual usage?
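To make the difference concrete, here is a small sketch comparing the two approaches for a shared networking cost. It assumes "linear" means an equal split across departments and "pro rata" means a split proportional to measured usage; the department names, usage figures, and the €9,000 cost are hypothetical.

```python
def even_split(shared_cost, departments):
    """Give every department the same share of the shared cost."""
    share = shared_cost / len(departments)
    return {dept: share for dept in departments}


def pro_rata(shared_cost, usage):
    """Split the shared cost in proportion to each department's usage."""
    total = sum(usage.values())
    if total == 0:
        raise ValueError("cannot allocate pro rata when total usage is zero")
    return {dept: shared_cost * used / total for dept, used in usage.items()}


# Hypothetical monthly network usage in GB per department.
usage_gb = {"sales": 100, "engineering": 700, "finance": 200}
shared_network_cost = 9_000  # EUR, hypothetical

even = even_split(shared_network_cost, list(usage_gb))
weighted = pro_rata(shared_network_cost, usage_gb)
```

With these numbers, sales pays €3,000 under the equal split but only €900 pro rata, while engineering goes from €3,000 to €6,300. Which result counts as fair is a business decision, but showing both side by side makes the trade-off easier to discuss.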
It also isn’t easy to prove to paying departments that their cost allocation is fair. Your show-back report will inform them of what they’re going to be charged, but will they understand it?
The report is usually composed of several subsets of those huge, incomprehensible provider bills I mentioned before. Unless someone translates it into their business language, the paying department won’t know if they’re being charged correctly.
There is an answer
Cloud cost control may be complex, but there is a way to simplify it. Eviden’s financial management services can help you create a cost-conscious culture in your organization. We can do your cost monitoring for you as a managed service and provide understandable reports, helping you avoid waste and reduce costs.
For a deeper dive, you can connect with me to arrange a free demo.