Kubernetes Maturity, Cost Optimization, Automation & AI with Viktor Farcic

Date: May 13, 2025
Duration: 39:16
Tags: Kubernetes, Viktor Farcic, Cost Optimization, AI
TL;DR

Viktor Farcic, a noted Kubernetes thought leader and self-described "technology critic," explains why Kubernetes has matured into a consolidation phase where the ecosystem is narrowing rather than expanding. He argues that resource management should be automated rather than human-specified, highlights the fundamental incompatibility between VPA and HPA, and shares his recent shift from AI skeptic to cautious optimist with the emergence of agents and MCPs.

Summary

The State of Kubernetes Maturity

Viktor observes that KubeCon has fundamentally changed. In the early years (a period of roughly 5-7 years), he would leave each conference with months' worth of new ideas to explore. Now the ecosystem has entered a consolidation phase. Rather than producing 10-50 competing solutions for a problem like service mesh, the industry is asking "which one should we actually use?"

Kubernetes is no longer a novelty - it's mainstream. The challenge now is making it "boring" in the sense that not everyone needs to become a Kubernetes ninja to use it. Viktor compares this to building PCs: in the old days, you'd buy individual components and assemble them yourself. Today, only gaming enthusiasts do that.

The Missing Abstraction Layer

A key unsolved problem: Kubernetes hasn't expanded beyond operations teams. In a company of thousands, only about five people understand Kubernetes deeply. Viktor argues we need services on top of Kubernetes that hide the complexity, similar to how Azure Container Apps or Heroku work. Developers should be able to say "here's my image, run it" without understanding the underlying infrastructure.
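As a rough illustration of what such an abstraction might hide, here is a minimal sketch that turns "here's my image, run it" into a bare-bones Deployment manifest. The function name `minimal_deployment` and the example image are hypothetical and not something described in the talk; a real platform would also handle networking, scaling, and resources.

```python
import json


def minimal_deployment(image: str, name: str, replicas: int = 1) -> dict:
    """Build a bare-bones apps/v1 Deployment from just an image and a name.

    Hypothetical helper for illustration: everything except the image and
    the name is filled in with defaults the developer never sees.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML, so the output can be applied as-is.
    print(json.dumps(minimal_deployment("nginx:1.27", "web"), indent=2))
```

The developer-facing input is two strings; the selector, labels, and pod template are the kind of detail the abstraction layer would own.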

Helm and GitOps

Viktor's nuanced take on Helm: it's "the best solution we have for creating templates" because it outputs structured YAML data. However, he's "very negative towards the Go templating" and prefers alternatives like KCL, ytt, or CUE for defining your own configurations.

For GitOps, Viktor stores pure YAML (the output of Helm template) rather than Helm charts directly. This ensures the desired state is human-readable without mental calculations about conditionals and value extrapolation.
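One way this workflow could be scripted is sketched below: render the chart with `helm template` and write the plain YAML to a file that is then committed to Git. The function names, release name, and paths are assumptions for illustration, and the sketch assumes the `helm` CLI is installed and on the PATH; it is not Viktor's actual tooling.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def helm_template_command(
    release: str, chart: str, values_file: Optional[str] = None
) -> List[str]:
    """Build the `helm template` invocation that renders a chart to plain YAML."""
    cmd = ["helm", "template", release, chart]
    if values_file:
        cmd += ["--values", values_file]
    return cmd


def render_to_file(
    release: str, chart: str, out_path: str, values_file: Optional[str] = None
) -> Path:
    """Run `helm template` and store its output where GitOps tooling can read it."""
    result = subprocess.run(
        helm_template_command(release, chart, values_file),
        check=True,           # fail loudly if rendering fails
        capture_output=True,
        text=True,
    )
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(result.stdout)
    return out


if __name__ == "__main__":
    # Hypothetical release, chart path, and output location.
    print(" ".join(helm_template_command("web", "./chart", "values-prod.yaml")))
```

Reviewers then diff the rendered manifests directly, rather than evaluating template conditionals in their heads.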

The VPA/HPA Incompatibility Problem

VPA (Vertical Pod Autoscaler) isn't widely used for a critical reason: it fundamentally conflicts with HPA (Horizontal Pod Autoscaler). VPA adjusts a pod's CPU and memory requests, while HPA typically scales on utilization measured against those same requests. If VPA changes the value that HPA scales on, you get uncontrollable scaling loops. Since organizations must choose one, they pick HPA for elasticity and hope they guessed CPU/memory correctly.
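The coupling can be seen in a small sketch based on the replica formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), where CPU utilization is usage divided by the pod's request. The workload numbers are made up, and the sketch ignores HPA's tolerance band and stabilization windows.

```python
import math


def hpa_desired_replicas(
    current_replicas: int,
    usage_per_pod_m: float,
    request_m: float,
    target_utilization: float,
) -> int:
    """Simplified HPA calculation for a CPU utilization target.

    Utilization is measured relative to the pod's CPU request, which is
    exactly the value VPA rewrites.
    """
    utilization = usage_per_pod_m / request_m
    return max(1, math.ceil(current_replicas * utilization / target_utilization))


if __name__ == "__main__":
    # Illustrative workload: 4 pods, each using 500m CPU, HPA target 50%.
    before = hpa_desired_replicas(4, usage_per_pod_m=500, request_m=500, target_utilization=0.5)
    # Same load, but VPA has raised the request from 500m to 1000m.
    after = hpa_desired_replicas(4, usage_per_pod_m=500, request_m=1000, target_utilization=0.5)
    print(before, after)
```

With identical load, the request change alone moves HPA's answer from 8 replicas to 4. Once the replica count changes, per-pod usage changes too, which can prompt a new VPA recommendation, and so on.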

"Nobody knows how much memory and CPU and other resources applications need. I never met that person. I met people who confidently say that they know, but I never met people who really actually do know."

The Real Cost Problem

People are "very aware" that Kubernetes doesn't magically overcommit resources like VMware did - that's why they allocate 5x more than needed. This manifests as massive cloud bills with 7-8% utilization. The result? Companies conclude "cloud is too expensive" and consider repatriation, when the real problem is over-provisioning.
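A back-of-the-envelope sketch of why low utilization makes the cloud look expensive: if you pay for allocated capacity but use only a fraction of it, the effective price of each unit you actually use is the list price divided by utilization. The function name and the normalized price of 1.0 are assumptions for illustration; only the 5x and 7-8% figures come from the discussion.

```python
def cost_per_used_unit(price_per_allocated_unit: float, utilization: float) -> float:
    """Effective price of one unit of capacity that is actually used.

    `utilization` is the fraction of allocated capacity in use (0 < u <= 1).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    return price_per_allocated_unit / utilization


if __name__ == "__main__":
    # Allocating 5x what is needed means 20% utilization.
    print(round(cost_per_used_unit(1.0, 0.20), 2))
    # The 7-8% utilization mentioned above, taken here as 7.5%.
    print(round(cost_per_used_unit(1.0, 0.075), 2))
```

At 7.5% utilization, each unit of compute that does real work costs roughly 13 times its list price, which is the over-provisioning effect rather than the cloud's pricing.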

Viktor's AI Evolution

Viktor describes himself as a former "AI hater" (or "critic" - like the Muppets on the balcony). What changed his mind? The emergence of agents and MCPs (Model Context Protocol servers).

He now uses AI like a junior developer: "Hey, it would be nice if you can do this, but don't push it directly to mainline - I need to see that PR." It gets him to a point faster, but still requires review.

AI's Memory Problem

Viktor's biggest frustration with current AI: it doesn't learn between sessions. Every new conversation starts from scratch. In observability scenarios, AI will detect the same Jenkins restart issue every day without remembering that "we know this, it's not a problem." He wants AI that maintains internal memory and doesn't repeat mistakes.
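The kind of cross-session memory described here could be as simple as a persisted list of acknowledged findings. The sketch below is a toy illustration under that assumption; the class name `KnownIssues`, the fingerprint strings, and the JSON file layout are all hypothetical and do not reflect how any particular AI tool works.

```python
import json
import tempfile
from pathlib import Path


class KnownIssues:
    """Tiny persistent store of findings a human has already dismissed."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        # Load earlier acknowledgements so a new session does not start from scratch.
        self.notes = json.loads(self.path.read_text()) if self.path.exists() else {}

    def acknowledge(self, fingerprint: str, note: str) -> None:
        """Record that a finding is known and why it can be ignored."""
        self.notes[fingerprint] = note
        self.path.write_text(json.dumps(self.notes, indent=2))

    def is_new(self, fingerprint: str) -> bool:
        """Return True only for findings nobody has acknowledged yet."""
        return fingerprint not in self.notes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store_path = str(Path(tmp) / "known_issues.json")

        # Day 1: the restart is flagged, and a human marks it as expected.
        day1 = KnownIssues(store_path)
        day1.acknowledge("jenkins-restart", "we know this, it's not a problem")

        # Day 2: a fresh session reloads the store instead of resetting.
        day2 = KnownIssues(store_path)
        print(day2.is_new("jenkins-restart"))
```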

"It's like that movie 50 First Dates... or Groundhog Day. Every morning it's reset."

GPU Costs Amplify Everything

The same over-provisioning problems become catastrophic with GPUs. Viktor argues this actually makes cloud more valuable - being able to scale from 0 to 50 GPUs based on actual need is something only cloud providers can offer. The irony: people hoard GPUs because of supply constraints, defeating the elasticity promise.

Notable Quotes

"I think that Kubernetes is mature and now we're in consolidation phase. In the past, for every given problem like service mesh, we would get 10, 20, 30, 50 different solutions all competing with different ideas. Now we're in a phase of 'which one should we use really?'"

"I hope to see the future in which actually there is no option to specify memory and CPU. Maybe we shouldn't be dealing with those things at all."

"The machine is far more qualified to figure out what it needs. Especially because Kubernetes is a really complicated system with a lot of different workloads from different users."

"With AI, I'm acting more and more as a manager rather than developer. I still don't trust it - most of the time it almost never does everything 100% right. But it does help me get to some point faster."

"Of course cloud is expensive. If I buy five cars and drive only one, of course I'm going to spend much more money than I need. But the price of the car is not the problem - it's me buying five of them."
