Kubernetes is hard

  • > Kubernetes is complex and I think they are partially right

    Kubernetes is a distributed, centralized operating system that itself depends on a distributed, decentralized database. It has a varying network topology, permissions system, plugins, scheduler, storage, and much more, depending on how and where it was built, and it runs applications as independent containerized environments (often deeply dependent on Linux kernel features), each of which can have its own base operating system. All of this must be maintained, upgraded, patched, and secured, separately and frequently.

    Kubernetes is literally the most complex single system that almost anyone in the world will ever use. It is the Katamari Damacy of the cloud.

    > It allows dev teams to not worry about all these things; all they must do is to write a simple YAML file.

    cackles, then sobs

    > More importantly, teams no longer need to ask DevOps/infra folks to add DNS entry and create a Load Balancer just to expose a service.

    more sobbing

    > Should you use Kubernetes?

    Should you change a tire with a crowbar, a can of WD-40, and a lighter? Given an alternative, the alternative is usually better, but sometimes you don't have an alternative.

  • I agree with the point that production is hard. There are so many things you just don't think about as a developer that end up being important: log storage, certificate renewal, etc.

    I think how "hard" kubernetes is depends on how deep you go. If you're building a cluster from scratch, on your own hardware, setting up the control plane yourself etc. it's very very hard. On the other hand, if you're using a hosted service like EKS and you can hand off the hardware and control plane management to someone else, IMO it's actually very easy to use; I actually find it a lot easier than working with the constellation of services amazon has to offer for instance.

    I do think there are parts of it where "best practices" are still being worked out, though, like managing YAML files. There are also definitely some rough edges. Helm charts are great... to use. They're an absolute nightmare to write, and there are all sorts of delightful corner cases, like not being able to reliably upgrade things that use StatefulSets (last I checked, anyway). It's not perfect, but honestly, if you learn the core concepts and use a hosted service, you can get a lot out of it.
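
    To give a flavor of the chart-writing pain, this is the kind of template plumbing you end up hand-rolling; a made-up snippet, assuming the usual _helpers.tpl boilerplate, not taken from any real chart:

      # templates/deployment.yaml (illustrative; selector/labels trimmed for brevity)
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: {{ include "mychart.fullname" . }}
        labels:
          {{- include "mychart.labels" . | nindent 4 }}
      spec:
        replicas: {{ .Values.replicaCount }}
        template:
          spec:
            containers:
              - name: app
                image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
                {{- with .Values.resources }}
                # one wrong nindent and the rendered manifest breaks
                resources:
                  {{- toYaml . | nindent 12 }}
                {{- end }}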

  • Cost aside, I wonder how far you can get with something like a managed NewSQL database (Spanner, CockroachDB, Vitess, etc.) and serverless.

    Most providers at this point offer ephemeral containers or serverless functions.

    Does a product-focused, non-infra startup even need k8s? In my honest opinion, people should be using Cloud Run. It’s by far Google’s best cloud product.
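
    To be fair to that claim, a Cloud Run deploy really is about one command (the service, project, and region below are placeholders):

      gcloud run deploy my-service \
        --image gcr.io/my-project/app:latest \
        --region us-central1 \
        --allow-unauthenticated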

    Anyway, going back to the article: k8s is hard if you’re doing hard things. It’s pretty trivial to do easy things using k8s, which only leads to the question: why not use the cloud equivalents of all the “easy” things? Monitoring, logging, pub/sub, etc.: basically all of these things have cloud equivalents as services.

    The question is, cost aside, why use k8s? Of course, if you are cost-constrained you might do bare metal, or cheaper colocation, or maybe even a cheap cloud like DigitalOcean. Regardless, you will bear the cost one way or another.

    If it were really so easy to use k8s to productionize services and then offer them as a SaaS, everyone would do it. Therefore I assert: unless those things are your service, you should use the cloud services. Don’t use cloud VMs; use cloud services, and preserve your sanity. After all, if you’re not willing to pay someone else to be on call, that implies the arbitrage isn’t there enough to drive the cost down to where you would pay, which might imply it isn’t worth your time either (infra companies aside).

  • 37signals is not like the typical large-scale startup. They have an extremely small team (around 30 people?), and just a couple of products.

    Large-scale startups use dynamically scheduled cloud services in part to reduce coupling between teams. Every service (and there are dozens) is scheduled independently, and new teams can get spun up to roll out new services without too much intervention from other teams.

    When you've got a couple of products that have been in maintenance mode for 10+ years and just two primary products, both of which are on the same stack, and you can predict your workloads way out into the future (because you charge money for your services, don't do viral go-to-markets, and don't have public services), there simply isn't much of a win to dynamic scheduling. You can, in fact, just have a YAML file somewhere with all your hosts in it, and write some shell-grade tooling to roll new versions of your apps out.
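
    A sketch of what that shell-grade tooling could look like; every name here (hosts.txt, the registry, the health endpoint) is made up:

      #!/bin/sh
      # Naive static-scheduling rollout: one host at a time, pull the
      # new image, restart the container, wait for health before moving on.
      set -e
      VERSION="$1"
      for host in $(cat hosts.txt); do
        echo "deploying $VERSION to $host"
        ssh "$host" "docker pull registry.example.com/app:$VERSION \
          && docker rm -f app \
          && docker run -d --name app -p 8080:8080 registry.example.com/app:$VERSION"
        until curl -fsS "http://$host:8080/healthz" >/dev/null; do sleep 1; done
      done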

    A lot of the reflexive pushback to not using k8s seemed to come from people who either didn't understand that 37signals was doing something closer to static scheduling, or who don't understand that most of what makes k8s complicated is dynamic scheduling.

  • I think the post should have been titled “Production is hard” (as the author gets at later on). Pick any technology out there: from Python, to C++, to K8s, to Linux… Do the analogous “Hello world” using such technologies and run the program on your laptop. Easy. You congratulate yourself and move on.

    Production is another story. Suddenly, your program that wasn’t checking for errors breaks. The memory that you didn’t manage properly now becomes a problem. Your algorithm doesn’t cut it anymore. Etc.

  • Kubernetes is hard because it's over-complicated and poorly designed. A lot of people don't want to hear that because it was created by The Almighty Google and people have made oodles of money being k8s gurus.

    After wasting two years chasing config files, constant deprecations, and a swamp of supposedly "blessed" third-party dependencies (all of which led to unnecessary downtime and stress), I swapped it all out for an HAProxy load balancer in front of some vanilla instances and a few scripts to handle auto-scaling. Since then I've had zero downtime, and scaling is region-specific and chill (and could scale to an arbitrary number of instances). It just works.
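
    The shape of that setup is a single backend block plus a script that re-renders the server list; all names and addresses below are invented:

      # /etc/haproxy/haproxy.cfg (illustrative)
      frontend web
          bind *:80
          default_backend app

      backend app
          balance roundrobin
          option httpchk GET /healthz
          # the auto-scaling script rewrites these lines and reloads haproxy
          server app1 10.0.1.11:8080 check
          server app2 10.0.1.12:8080 check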

    The punchline: just because it's popular, doesn't mean it's the best way to do it.

  • I am consistently confused by all of the talk about how "hard" Kubernetes is.

    We spin up EKS. We install the New Relic and Datadog log-ingestion pods onto it, provided in a nice Helm chart format.

    We install a few other resources via Helm, like External Secrets and ExternalDNS, among others.
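
    Each of those is a couple of Helm commands; I'm writing the ExternalDNS one from memory, so check the chart's docs for the exact repo and values:

      helm repo add external-dns https://kubernetes-sigs.github.io/external-dns/
      helm install external-dns external-dns/external-dns \
        --namespace external-dns --create-namespace \
        --set provider=aws   # value name may differ by chart version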

    Kubernetes on EKS runs like a champ. My company saves $100k/mo by dynamically scaling our cloud services, all of which run on Kubernetes, to make more efficient use of compute.

    My company has over 50 million unique users monthly. We have massive scale. Kubernetes just works for us and we only have 2 people maintaining it.

    What we gain is a unified platform with a consistent API for developing our services. And if we wanted to migrate elsewhere, it is one less thing to worry about.

    ¯\_(ツ)_/¯

    Feels like some kind of hipster instinct to dislike the "cool new thing"... even though k8s has been around for years now and has been battle-tested to the bone.

  • While I understand where the author is coming from, my opinion of Kubernetes (and production deployment in general) isn't that it is hard per se, but that it involves many components.

    I liken it to Lego. Each component separately isn't hard to work with, and once you figure out how to connect it to other components, you can do it 100 times easily. And like Lego, a typical Kubernetes environment may consist of several dozen or several hundred pieces.

    So, I wouldn't describe Kubernetes as hard; I would describe it as large (i.e., composed of multiple interconnected components). And by being large, there is a fair amount of time and effort necessary to learn it and maintain it, which may make it seem hard. But in the end, it's just Lego.

  • As an infra person, reading k8s posts on Hacker News has got to be one of the most frustrating and pointless things to do on here. You all just regurgitate the same things every post. It's even the same people, over and over again.

    30% of you are developers who think K8s is the devil and too complex and difficult, 30% of you like it and enjoy using it, and another 20% of you have never touched it but have strong opinions on it.

  • I would not use k8s unless convinced it will benefit us in the long run (think about the constant effort that needs to be put in to keep things running). k8s is not magic. I would just stick with docker-compose or DigitalOcean for a small startup, or rent a VM on Azure, or, if you really, really need k8s, use a managed k8s.
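
    At that stage the entire "platform" really can be one file; a minimal, purely hypothetical docker-compose.yml:

      # docker-compose.yml (illustrative)
      services:
        web:
          image: registry.example.com/app:latest
          ports:
            - "8080:8080"
          depends_on:
            - db
        db:
          image: postgres:15
          volumes:
            - dbdata:/var/lib/postgresql/data
      volumes:
        dbdata: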

  • > [K8s] allows dev teams to not worry about all these things; all they must do is to write a simple YAML file. More importantly, teams no longer need to ask DevOps/infra folks to add DNS entry and create a Load Balancer just to expose a service. They can do it on their own, in a declarative manner, if you have an operator to do it.

    Yeah, as opposed to CloudFormation or Terraform, where you...uhhh...

    Don't get me wrong, it requires work to set up your corporate infrastructure in your Favourite Cloud Provider(tm) to make those things available for developers to manage. But it takes work in k8s too - even the author says "if you have an operator to do it". Kubernetes is great for what it's great for, but these are terrible arguments in favour of it.
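
    For context, the declarative exposure being argued over looks roughly like this; the annotation assumes an external-dns-style operator is running in the cluster, and the hostname is made up:

      # service.yaml (illustrative)
      apiVersion: v1
      kind: Service
      metadata:
        name: my-app
        annotations:
          external-dns.alpha.kubernetes.io/hostname: app.example.com
      spec:
        type: LoadBalancer
        selector:
          app: my-app
        ports:
          - port: 80
            targetPort: 8080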

  • Kubernetes has been a total failure at defining a simple "just works" devops workflow, but I don't think that is due to any deficiency in the product itself. The basic premise behind its common use case (automating away the SRE/ops role at a company) is what is flawed. Companies that blindly make the switch are painfully finding out that the job of their system operators wasn't just to follow instruction checklists but to apply reasoning and logic to solve problems, much like any software engineer. And that's not something you can replace with Kubernetes or any other such tool.

    On the other hand, there's still a lot of value in having a standard configuration and operating language for a large distributed system. It doesn't have to be easy to understand or use. Even if you still have to hire the same number of SREs, you can at least filter on Kubernetes experience rather than having them onboard to your custom stack. And on the other side, your ops skills and years of experience are now going to be a lot more transferable if you want to move on from the company.

  • Luckily it's been a few years since I had to work directly with Kubernetes. But ...

    > Forget the hype, there’s a reason why Kubernetes is being adopted by so many companies.

    I've never worked with it because it was the right solution, but only because some senior engineer or management bought into the hype.

    > It allows dev teams to not worry about all these things; all they must do is to write a simple YAML file.

    I've never found these yaml files simple.

  • I think setting up something similar without k8s is, like, 100 times harder? I was never deeply into DevOps, but a single short video and a few doc pages taught me how to bring up a highly available, load-balanced 2-node cluster, and how to roll out new services and versions in minutes with zero downtime. I can also precisely control it and monitor all the logs and resources without leaving my terminal for a minute. I would never have been able to set up infra like this without kube in the span of one day with little prior DevOps knowledge. The complexity beast it tames into structure is just mind-blowing, and it's a virtue that it comes out being only a bit "hard".
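
    For example, a zero-downtime rolling update is basically two kubectl commands (the deployment and container names here are placeholders):

      # roll a new version out, then watch the rollout complete
      kubectl set image deployment/web web=registry.example.com/app:v2
      kubectl rollout status deployment/web
      # and if it goes sideways:
      kubectl rollout undo deployment/web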

  • I'm working on making it easier, or at least providing the tools to make working with it easier!

    https://github.com/TorbFoundry/torb

    "Torb is a tool for quickly setting up best practice development infrastructure on Kubernetes along with development stacks that have reasonably sane defaults. Instead of taking a couple hours to get a project started and then a week to get your infrastructure correct, do all of that in a couple minutes."

  • Perhaps put more simply: operating in production has a lot of intrinsic complexity that will probably surface in the tooling, and if you constantly reinvent things to "fix" the complexity, you'll eventually end up putting it back.

    That's how you end up with the modern JavaScript tooling hellscape, where it looks like no one was around to tell a bright young developer "no".

  • the good news is, for the 95% of projects that can tolerate it, aws the good parts are actually both simple and easy[1].

    it’s hard to find things you can’t build on s3, dynamo, lambda, and ec2.

    if either compliance or a 5% project demands it, complicated solutions should be explored.

    1. https://github.com/nathants/libaws

  • I have a stack which runs with a single docker-compose file: 13 services, nothing too fancy. I tried to transform it to Kubernetes (using kompose), and my file was converted into almost 24 YAML files. I'm not even looking at those little gremlins, and will stick with my simple docker-compose setup and Docker Swarm.
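
    For the curious, the conversion itself is one command; the file explosion comes from kompose emitting (roughly) a Deployment plus a Service per compose service:

      # convert a compose file into Kubernetes manifests
      kompose convert -f docker-compose.yml
      # => web-deployment.yaml, web-service.yaml, db-deployment.yaml, ...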

  • You need a decent-sized team to run an on-premises k8s infra. If you’re in the cloud, use a managed k8s.

    It’s not for everyone, and on that I agree with the point the author makes. But if you have multiple teams doing app development, k8s can be really nice. We do data movement and ML services; AKS has proven great for our use.

  • I feel that with AI, maybe in a couple of years it's going to be trivial to deploy things on current infra stacks. AI can probably create a whole range of Terraform scripts, deploy k8s and Docker containers, and scale them automatically, with maybe a few humans as supervisors.

  • The best reason to use k8s is to take advantage of the wide array of open-source k8s resources, including Helm charts, custom operators, articles, tutorials, etc. It's become an open standard.

    Like with many other technologies, k8s' advantage is its ecosystem.

  • Maybe you need to use something like sealos: https://github.com/labring/sealos

  • Most implementations are just slower because they have not been optimised whatsoever, and optimising is not that easy, even for a senior.

  • Agreed. Reminds me of front-end dev, where it’s become laughably complex, with every wheel seemingly re-invented so many times.

  • This is the motivation that I need as someone who is currently learning Kubernetes.

  • We've started using Argo CD which I think has helped both dev and prod for us.

  • The funny thing is that the blog post rants about mrsk, which I thought was a Kubernetes implementation but is not. The Docker ecosystem is just a mess.

  • It's not hard.