I really wish managed Airflow instances were cheaper for smaller companies. I built my own using spot instances and it's so affordable compared to Astronomer and the others.
So far it's been very low maintenance, apart from a few random scares where DAG logs filled up my server. I then did some research and found maintenance DAGs that prevent a lot of those issues.
Wondering when my next outage will be is always fun, but it's been pretty stable so far.
E: I know and appreciate the tech they put into it. It's just too high a price for me once I add the workers. I still want to migrate mine to Fargate workers at some point though.
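For anyone hitting the same disk-full scare: below is a rough sketch of what a log-cleanup maintenance task might do. It is not the maintenance DAGs mentioned above (those aren't named here), just a plain-Python illustration. The function name `delete_old_logs`, the 30-day default, and the idea of pointing it at your Airflow log directory are all assumptions for the example.

```python
import os
import time
from pathlib import Path


def delete_old_logs(log_dir, max_age_days=30, now=None):
    """Delete files under log_dir whose modification time is older than
    max_age_days. Returns the sorted list of deleted paths as strings.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day

    removed = []
    # Materialize the listing first so we don't delete while still walking.
    for path in list(Path(log_dir).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(str(path))
    return sorted(removed)
```

You could call something like this on a schedule (from a small DAG, or just cron) against whatever directory your task logs land in. It only removes files, so empty per-run directories would be left behind.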
But why Airflow? It has so many weird quirks. I hope it is dethroned soon.
There are now at least 4 different implementations of every data/app related technology: the OSS/original version, the AWS version, the Azure version and the GCP version.
Is this a good idea? I don't think so.
I wonder how it compares to Astronomer.io and Google's managed Airflow thing.
How does this differ from Google's Cloud Composer, which is also managed Airflow?
Want to shout out an alternative Python open source workflow orchestration tool, Prefect https://www.prefect.io/
It has a server/UI component that you can deploy relatively easily on something like Kubernetes. It then makes it easy to configure your flows to run on varying amounts of compute resources.
I continue to be confused about AWS offerings other than EC2 and their data storage solutions (S3/Redshift/Spectrum/Aurora), which are undoubtedly amazing products. The thing with workflow scheduling and orchestration is that it's complex and nontrivial. I'd rather buy a product built by a company with a focus in this area (e.g. Prefect) than go with yet another poorly executed but highly marketed AWS product. When I buy a product for my team, I buy support and integration first; technology is less important. A lot of AWS products are poorly supported and mostly integrate with other AWS stuff. On top of all this, we are supposed to be running containers, yet AWS keeps building all this serverless stuff, which defeats the whole purpose of moving applications to containers that run smoothly in the cloud.
Thank you AWS. I think they just saved me $85K.
There's no mention of AWS's own existing workflow management systems - Step Functions and AWS Glue. Would be interested to see AWS's own advice on when to use one over the other.
Confluent seems to be thriving ever since AWS launched their own Kafka managed service. With that as an example, this could be a good thing for Astronomer.io.
I just glanced at our own Airflow instance in AWS (not on this service). We run one t3.xlarge instance (4 vCPU) for the scheduler and web server and one t3.xlarge instance (4 vCPU) for the workers. At $0.33 per hour (on demand) for the pair, this seems to most closely match the resources of their medium or large offering, at $0.74-$0.99 per hour (roughly 2x to 3x).
I realize you are buying not just the compute but the management too, but that ends up costing something in the range of $300-$500 or so per month for the Airflow management part of it. Seems a bit steep. $50-$100/mo would be a no-brainer for us. For some orgs I can see this being a great solution, but it's not really friendly for the little guy (with a min price of $350/mo).
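Here's the back-of-the-envelope math behind those numbers, as a small sketch. The hourly rates are the ones quoted above; the 730 hours per month figure and the function names are my own assumptions for illustration.

```python
HOURS_PER_MONTH = 730  # assumption: average month, 8760 hours / 12

SELF_HOSTED_HOURLY = 0.33  # two t3.xlarge instances, on demand (from above)
MANAGED_HOURLY = {"medium": 0.74, "large": 0.99}  # managed rates from above


def monthly_premium(managed_hourly, self_hosted_hourly=SELF_HOSTED_HOURLY):
    """Extra dollars per month paid for the managed service over raw compute."""
    return round((managed_hourly - self_hosted_hourly) * HOURS_PER_MONTH, 2)


def price_ratio(managed_hourly, self_hosted_hourly=SELF_HOSTED_HOURLY):
    """How many times more the managed hourly rate is than self-hosting."""
    return round(managed_hourly / self_hosted_hourly, 2)


for size, rate in MANAGED_HOURLY.items():
    print(size, monthly_premium(rate), price_ratio(rate))
```

Under those assumptions the management premium works out to roughly $300 to $480 a month, which is where the $300-$500 estimate comes from.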