I never got a surprise bill myself, but reading a few cases like this motivated me to cancel my GCS account and remove my CC. Now, if I try to use it, it fails immediately with an error.
As author of HashBackup, I know people are using it with GCS, and I'd like to be able to test against it, but not enough to swallow a large surprise Google bill.
Why doesn’t GCP provide a way to say “shut down all my services if my cap is reached”?
Suddenly turning off services when the billing cap is reached is a big reliability risk for customers. All workloads using that billing account would be impacted immediately.
If they weren't turned off at the billing cap, but were given some leeway instead, either that becomes the new hard limit, or GCP will have to give away the difference.
And there's no "middle ground" you could implement that makes sense either - like a "frozen" state. Preventing new writes to a GCS bucket breaks the writer app. Freezing VMs serving web traffic takes the site down.
Even if the service was shut down once the billing limit is hit, how long would GCP wait for the user to add funds or raise the limit? GCP would need to either keep the services in a hidden/frozen state or not turn them off / freeze them at all (in which case GCP would be giving away resources for free).
Maybe GCP can give users a heads-up when they're about to hit the limit? GCP already does - billing alerts do exist. It's just possible to blow past them if your usage is a massive spike.
Moreover, getting the hundreds of GCP services to implement a "frozen" state is difficult. It's hard enough getting everyone to listen to the "billing account disabled" signal, and (soft-)delete the resources (based on the resource, after some time interval). Given these billing overruns happen for smaller customers, it's not really worth solving the problem - which I don't think has a great solution to begin with.
> Suddenly turning off services when the billing cap is reached is a big reliability risk for customers.
Those customers that would rather get a surprise $18,000 bill than have their servers stopped or data nuked, can simply not enable the spending limit.
Well, they could offer a nuclear option that freezes everything. And they could send an email every day for, say, 7 days?
There are better options here, but Google won't offer them, as that would hurt their incentive to take more money from customers.
They have an $18,000 reason for that.
Well since they waived the fees, it sounds to me like they have an $18,000 reason to stop this kind of thing from happening in the first place.
I understand that $18k is probably a drop in the bucket, but surely there's a middle ground here.
They can waive 18k pretty easily because they'll make it up in 18,000 other overcharges of varying amounts that aren't contested. Or that are billed to corporate clients that just eat it.
Things like this are the exact reason that companies end up having to comply with all kinds of regulations. It's just easier to screw the customer first.
Same reason that you can't cancel a gym membership by walking up to a person and saying "cancel my membership".
Predation, pure and simple.
Google doesn't care, no one is holding them responsible for predatory behavior. It's profitable to steal from your customers and there's no downside to doing so.
"Google automatically upgraded the tier": this is a Google scam. Don't pay, and sue them.
Do yourself a favour and automate unlinking the billing account. Set it to fire when your budget is hit or whatever your risk tolerance is.
Yes, it's the nuclear option, but I’ll take an hour of downtime over a $100k unexpected bill.
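For anyone who wants to try this, here is a minimal sketch of the decision logic, not a drop-in implementation. It assumes the budget notification arrives as a Pub/Sub-style event whose base64-encoded JSON payload carries `costAmount` and `budgetAmount` fields (that is how I remember the notification format, so check the current docs). `unlink_billing` is a hypothetical callable you supply yourself; in a real deployment I believe it would call the Cloud Billing API to set the project's billing account to empty.

```python
import base64
import json
from typing import Callable


def should_unlink(notification: dict, tolerance: float = 1.0) -> bool:
    """Return True once the reported cost reaches tolerance * budget."""
    cost = float(notification["costAmount"])      # assumed field name
    budget = float(notification["budgetAmount"])  # assumed field name
    return cost >= tolerance * budget


def handle_budget_event(
    event: dict,
    unlink_billing: Callable[[], None],  # hypothetical: you provide the real API call
    tolerance: float = 1.0,
) -> bool:
    """Decode a Pub/Sub-style event and unlink billing if the budget is hit."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    if should_unlink(payload, tolerance):
        unlink_billing()
        return True
    return False
```

Setting `tolerance` below 1.0 (say 0.8) fires earlier, which matters because the notifications themselves lag behind actual usage. This limits the damage from a spike; it does not make the budget a hard cap.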
You still owe a debt, even with no card on file.
And Google owes you honoring the budget limits, no?
Sounds like a great mistake for Google to find a way to repeat. Why innovate when you can abuse users and hide behind "complexity" as "plausible deniability"?
You'll all keep using them either way.
We got a $12,000 bill a few hours ago on a presumably leaked Gemini API key, which I very much doubt, and we are trying to resolve the issue with a real support agent. I think they messed something up internally and customers are getting these bills.
That is quite hostile to their consumers, no matter how they spin it. If you put a budget on something it should be capped.
I am the last person to defend Google Cloud and its awful UX.
With that said, when you go to set a budget it warns you "Setting a budget does not cap resource or API consumption. Learn more." with a hyperlink to https://docs.cloud.google.com/billing/docs/how-to/budgets?_g...
The only way I can read that warning is 'setting a budget does nothing', though reading the linked page tells me it only turns on email notifications. Not any better, really. It's simply not a cap. It's an alarm.
Yes, there is no way to set a budget for Google Cloud. And alarms are delayed up to two hours (!)
Agreed. I don't agree with the underlying design decisions either, but you literally set a "Budget Alert" on Google Cloud. It's designed to be an alarm, not a cap. I was just trying to point that out.
I am fairly sure that this is an anti-pattern on purpose. If you ask a thousand people on the street what a budget means, they will coalesce on: the money I am willing to spend on something.
The fact that Google redefines what budget means and puts up a warning doesn't make it OK.
Even if US clouds make more money per capita in the US, they get more total income and political advantage from the other 95% of people. US law will be as pro-consumer as Delaware law. (And it will be as restricted to the jurisdiction that decides it as IP law is.)
Click here to let the puppy live*
* By clicking here you agree to kill it
And you're defending that?
No, I think Google should provide easy tools to actually cap spending, instead of recommending you set quota limits on your APIs.
The article, and the comment I was replying to, make it seem like an error in the Google Budget system. I'm simply trying to say this system is working as designed and documented.
I think I read somewhere that calculating and limiting cloud usage costs is a really hard problem. But I feel that if Google were motivated to do it, they can do it. It's hard, not impossible. They just don't care to solve this particular problem.
If they can COUNT it and charge based on that, that means they can count it and react.
If I, not having their budget or engineers, can have pretty much instant Prometheus event reacting to metrics, surely it wouldn't be too hard for them to have triggers like this -- somehow their AI can automatically ban people based on something, can't they do something for the customers?
They can, they just don't want to.
In the article it states that this person had an account that would have been limited to $2000 in usage.
And the system automatically upgraded them to higher spending limits when they crossed the $1000 in usage costs.
They could definitely make that an opt-in feature.
Yeah, it makes no sense for it to be opt-out. Otherwise it just means there are no limits.
It's the same fundamental problem as view counters, something Google is famously good at solving. Eventually consistent solutions are well-understood, and wouldn't have these kinds of massive cost-overruns.
Depends on latency. 24 hour delays on an eventually consistent counter used for billing absolutely would cause this problem.
It seems hard to believe that a one-hour delay on such a counter is impossible to achieve, and one hour would reduce the risk from "catastrophic" to "serious problem" in most cases.
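To put rough numbers on that: if spending during a spike is roughly constant, the amount you can overshoot before a delayed counter reacts is just burn rate times delay. A small sketch, where the $750/hour burn rate is a made-up illustrative figure (chosen so that a full day adds up to $18,000), not a number from the article:

```python
def worst_case_overrun(burn_rate_per_hour: float, delay_hours: float) -> float:
    """Upper bound on spend that accrues before a delayed counter can react,
    assuming a constant burn rate during the spike."""
    if burn_rate_per_hour < 0 or delay_hours < 0:
        raise ValueError("burn rate and delay must be non-negative")
    return burn_rate_per_hour * delay_hours


# Hypothetical burn rate; compare a 1-hour, 2-hour and 24-hour reporting delay.
for delay in (1, 2, 24):
    print(f"{delay:>2} h delay -> up to ${worst_case_overrun(750.0, delay):,.0f} over the cap")
```

Under that assumption a one-hour delay bounds the overshoot at $750, while a 24-hour delay lets the whole $18,000 through.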
Also, if implementing a cap is a desired feature that justifies trade-offs to be made, then it is possible to translate the budget cap (in terms of money) back into service-specific caps that are easier to keep consistent. Such as "autoscale this set of VMs" and "my budget cap is $1000/hour", with the VM type being priced at $10/hour, translated to "autoscale to at most 100 instances". That would need dev work (i.e. this feature being considered important) and would not respect the budget cap in a cross-service way automatically, but still it is another piece in the puzzle.
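The translation in that example is a single division, rounded down so the derived limit never exceeds the budget. A sketch with a hypothetical helper name, covering only the single-service case described above:

```python
def max_instances(budget_per_hour: float, price_per_instance_hour: float) -> int:
    """Translate an hourly money cap into an autoscaling ceiling.

    Rounds down, so the instance limit can only undershoot the budget.
    """
    if price_per_instance_hour <= 0:
        raise ValueError("instance price must be positive")
    if budget_per_hour < 0:
        raise ValueError("budget must be non-negative")
    return int(budget_per_hour // price_per_instance_hour)


# The numbers from the comment: $1000/hour budget, $10/hour per VM.
print(max_instances(1000, 10))
```

With several services sharing one budget you would have to split the budget between them first, which is the cross-service part that does not come for free.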
Yeah, there was an implicit assumption of reasonableness.
But a big part of the value in large clouds like GCP is the network's interconnectedness. Plus even if there was some global event that made communications impossible only for the billing service, I'd still expect charges to top out roughly proportional to the number of partitions as they each independently exceed the threshold. GCP only has 120ish zones.
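A toy simulation of that worst case, assuming each partition can only see its own counter and stops serving once its local spend reaches the threshold. The function and its parameters are hypothetical and only illustrate the bound, not how GCP meters anything:

```python
def partitioned_spend(
    threshold: float,
    partitions: int,
    requests_per_partition: int,
    cost_per_request: float,
) -> float:
    """Total spend when every partition enforces the cap independently."""
    total = 0.0
    for _ in range(partitions):
        local = 0.0  # this partition cannot see what the others have spent
        for _ in range(requests_per_partition):
            if local >= threshold:
                break  # local cap reached, stop serving in this partition
            local += cost_per_request
        total += local
    return total


# 120 fully isolated partitions, each with a $100 local threshold.
print(partitioned_spend(100.0, 120, 10_000, 1.0))
```

Even with every partition cut off from the others, total spend tops out at about the threshold times the number of partitions rather than growing without bound.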
It's more a problem they are incentivized to have. Open Router allows fixed wallets and doesn't run into the same problem, since it would be their money on the line if they let a user overspend their limits.
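The fixed-wallet idea is easy to sketch: debit before doing the work, and refuse when the balance does not cover it. This is a generic illustration with hypothetical names, not a description of how OpenRouter actually implements it:

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class InsufficientFunds(Exception):
    """Raised when a request would take the wallet below zero."""


class Wallet:
    def __init__(self, balance_cents: int) -> None:
        self.balance_cents = balance_cents

    def charge(self, amount_cents: int) -> None:
        """Debit the wallet, refusing any charge the balance cannot cover."""
        if amount_cents < 0:
            raise ValueError("charge must be non-negative")
        if amount_cents > self.balance_cents:
            raise InsufficientFunds(
                f"need {amount_cents} cents, have {self.balance_cents}"
            )
        self.balance_cents -= amount_cents


def serve(wallet: Wallet, cost_cents: int, handler: Callable[[], T]) -> T:
    """Charge first, then run the work, so spend can never exceed the wallet."""
    wallet.charge(cost_cents)
    return handler()
```

The catch is that the cost has to be known (or bounded) before the request runs, which is easier for per-request APIs than for resources billed by the hour.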
It’s hard on AWS as well, but I agree. There’s just no incentive for the billing experience to be better.
AWS, GCP, Azure (the ones I work with): none of them provide an off-the-shelf way to block usage after some budget amount. This is not acceptable.
They charge for a lot of things "by the hour". Things like S3, load balancers, storage.
Deleting those when a customer hits a limit will lose customer data or remove things that might be hard to add back. The "I hit my AWS limit and they deleted all my data" headlines will result.
and excluding those things makes the limit soft again..
I mean yes, look at Corey Quinn [1] for example. He has built an entire career out of the fact that cloud billing trips people up.
(Generally, tech seems to skate by on creating insanely complicated things, knowing that given enough pain, people will start blogging about their solutions, i.e. effectively outsourcing the cost and effort of doing something about it.)
[1] https://www.lastweekinaws.com/