GitHub Actions limitations and gotchas

  • Thanks for taking the time to share your experience. I work on GitHub Actions and am familiar with the limitations you're experiencing. Below is more info on where we're at with each of these issues, using the numbers in your Table of Contents. Also, we have a lot of other new things planned for Actions as you can see in our roadmap. https://github.com/github/roadmap/projects/1?card_filter_que...

    2.1 We're starting work on this in the next couple of months. We plan to ship it in early 2022.

    2.2 We want to speed up the pace of GitHub Enterprise Server releases, but I don't have more info to share.

    2.3 We're looking at ways to not require a GHEC account or "unified" license.

    2.4 The limits are much higher with the GitHub hosted runners, but this is a current limit of self-hosted runners.

    3.1 It's on our backlog. No date to share.

    3.2 I haven't heard this before and thanks for sharing the scenario. We'll think about it.

    3.3 This will ship in October.

    3.4 We're doing some performance optimizations for GHES 3.4 that should fix this.

    3.5 This shipped recently - https://github.blog/changelog/2021-08-25-github-actions-redu...

    3.6 We have a couple API improvements coming later this year.

    3.7 We're looking into this, but no dates to share.

    We're dedicated to making Actions a great experience. As you would assume, I'm very excited about the future of Actions and getting feedback like this helps us make it better.

  • At the company where I work, we’ve decided to use GitHub Actions for our CI pipelines instead of deploying an “on prem” solution. I’ve worked a lot in the past with Jenkins and Travis, and I also played a bit at home with GitHub Actions. Now that I am using it in real-world scenarios, I have to say that I am a bit disappointed. In my opinion, as soon as you try to do something a bit complex you end up having to implement some nasty hack. I also found that the GitHub Actions marketplace is a bit of a sink. You have to spend a good amount of time browsing it to find something decent that hasn’t been discontinued, isn’t a pointless fork, and is at least actively maintained. This happens even for basic functionality.

    I know it is a fairly recent platform, but I was expecting much more compared to what other services offer.

  • The biggest limitation I'm hitting with Github Actions right now is that there's no real support for queueing jobs up. Github actions do support a 'concurrency group' primitive that will prevent two actions in the same concurrency group from running at the same time, but this only allows you to have one item executing and one item queued up. If you try to queue up another one, the first queued item will be cancelled.

    In our case we've got some cypress tests that we want to run on a specific on-premise server every time we create a pull request. They take about 20 minutes to run, and we're creating a lot of pull requests, so you have to carefully check what github actions are executing before you create a pull request or push new changes to it. I'd love support for proper queues like what teamcity and other CI systems have.
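
    A minimal sketch of the behaviour described above, as a toy model only: it assumes a concurrency group holds at most one running run and one pending run, and that a newly queued run cancels whatever was pending. The class and the run names are hypothetical and are not part of any GitHub API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConcurrencyGroup:
    """Toy model of one concurrency group: one running run, one pending run."""

    running: Optional[str] = None
    pending: Optional[str] = None
    cancelled: List[str] = field(default_factory=list)

    def submit(self, run: str) -> None:
        """Queue a run; a newer run displaces the one already waiting."""
        if self.running is None:
            self.running = run
            return
        if self.pending is not None:
            self.cancelled.append(self.pending)
        self.pending = run

    def finish(self) -> None:
        """The running run completes and the pending run (if any) starts."""
        self.running, self.pending = self.pending, None


group = ConcurrencyGroup()
for run in ["pr-101", "pr-102", "pr-103"]:
    group.submit(run)

print(group.running, group.pending, group.cancelled)
# prints: pr-101 pr-103 ['pr-102']
```

    With three pull requests arriving while the first is still testing, the middle one never runs, which is why a real FIFO queue would help here.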

  • The cache action section focuses on GHES which is a shame, since the GitHub.com cache action is also pretty bad.

    - can't manually delete cache (although they're considering changing that[0])

    - only saves cache at the end of a workflow, and only if the workflow succeeds. This could be solved with CircleCI's approach of having a save-cache step and a restore-cache step.

    - Cache is super slow for self-hosted runners, so it makes more sense to have a local cache instead of using the action

    - only 5gb of storage size. This was supposed to be increased via billable cache storage[1], but it's been on the backburner since July 2020

    - In addition to the above, you can't use a different storage backend on the official action (which would allow storing over 5GB of cache via your own storage). The best workaround is to use a user-provided action which utilizes the s3 api[2].

    0: https://github.com/actions/cache/issues/632

    1: https://github.com/github/roadmap/issues/66

    2: https://github.com/actions/cache/issues/354#issuecomment-854...

  • Wow, reading this and all the comments makes me realise GHA is really far behind Gitlab CI. They've at least been playing catch-up since the MS acquisition, having stagnated for years before that.

    I really wonder why anyone would self-host GitHub. Gitlab has a much more feature rich, mature and cheap (there's a perfectly usable free version) offering. Yeah, someone might prefer Github's UX, but is it really worth it to pay for a worse product?

  • GitLab team-member here. Obviously coming in with a lot of bias, but I wanted to address how each point relates to GitLab CI/CD’s view of the world. I’m also thinking about writing a longer post with more details as I have a lot of thoughts (™) about this topic.

    2.1 Caching isn’t available: GitLab has this everywhere.

    2.2 GitHub Enterprise Server is behind GitHub Enterprise Cloud: GitLab ships the same code to GitLab.com as it does to our self-managed customers. This was a tough decision but has a lot of benefits...the central being feature parity and scalability for self-managed folks

    2.3 Using Public GitHub.com Actions: This is a symptom more than the problem itself - relying on third-party plugins for build jobs is scary, and leads to many of the same issues we’ve seen in the Jenkins ecosystem - easy to get started, hard to maintain.

    2.4 Dockerhub pull rate limiting: for self-hosted runners, you can use a registry mirror or Dependency Proxy to reduce your number of pulls from Docker Hub. The key is the entire platform has to be there to enable the right workflows.

    3.1 No dropdowns for manually triggered jobs: GitLab also doesn’t have drop downs, but does have the ability to pre-fill these values.

    3.2 Self-hosted runner default labels: I think this is also more of a symptom than a problem. 3.3 Being able to tag and use runners for specific tasks is key - so I understand the frustration and we’ve spent a lot of time on this.

    3.4 You can’t restart a single job of a workflow: You can do this with GitLab.

    3.5 Slow log output: I haven’t seen this be a problem, and is a benefit of our scalability features being built into the self-managed code.

    3.6 You can’t have actions that call other actions: There are lots of ways to relate pipelines (parent/child, triggers, etc.) in GitLab.

    3.7 Metrics and observability: The GitLab runner has Prometheus built in, and the dashboards we use to manage GitLab.com are partially public: https://dashboards.gitlab.com

    3.8 Workflow YAML syntax can be confusing: This can be really hard to get right. I learned to stop worrying and love the YAML long ago, and I know we’ve gone through a lot of iterations to try and get this right.

    I'd love to know where folks think I got this assessment wrong. And is there value in writing more about it?

    (edited for line spacing)

  • Also, scheduled jobs are a joke, they are routinely one to several hours late. I have to use the API to manually trigger jobs (workflow_dispatch) whenever I need scheduling that is remotely time sensitive.
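
    A hedged sketch of the workaround mentioned above: an external scheduler (cron, for example) calls the REST endpoint that creates a workflow_dispatch event. The owner, repository, workflow file name and token below are placeholders, and the target workflow is assumed to declare the workflow_dispatch trigger. The snippet only builds the request; sending it is left commented out.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, workflow, ref, token):
    """Build (but do not send) a request that creates a workflow_dispatch event."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow}/dispatches"
    )
    body = json.dumps({"ref": ref}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# All values here are placeholders for illustration.
req = build_dispatch_request(
    "example-org", "example-repo", "nightly.yml", "main", "<token>"
)
print(req.get_method(), req.full_url)

# To actually trigger the run from your own scheduler:
# urllib.request.urlopen(req)
```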

  • GitHub Actions are a fantastic experience for serverless applications. I am working on a serverless project where we use GitHub Actions exclusively for CI/CD as well as running automated tests. We rely heavily on Lambda, S3, and DynamoDB. Our client app is static JS files we serve over CloudFront. GitHub Actions make our pipelines accessible to any developer on the team. Since we only pay for what we use with our serverless infrastructure, we can even deploy each pull request to the cloud rather inexpensively and leverage GitHub's environments to help manage the cleanup for us. This allows our team members to review and test changes in their browser before we pull them into our development branch. We additionally can run Playwright E2E tests to verify that none of our critical user workflow scenarios have broken as a result of the PR changes. I love this development experience and would have a hard time going back to anything else.

  • Disclaimer - incoming self promotion.

    Actually, most points in the article are the basis on why we created BuildJet.

    We initially tried to solve these annoyances by creating a CI with speed and the YAML config as a USP. We got 4x speed and a much better YAML config structure, but despite these improvements we noticed that people had a mental barrier to migrating to a new, unknown CI.

    But like OP we always enjoyed the experience of using GitHub Actions, so with this in mind we decided to build BuildJet for GitHub Actions[1], which uses the same infrastructure but plugs right into GitHub Actions as a "self-hosted" runner that is automatically set up for you with OAuth. This resulted in, on average, a 2x speed improvement for half the cost (due to us being close to the metal). Easy to install and easy to revert.

    [1] https://buildjet.com/for-github-actions

  • I think GH Actions is a pretty cool idea. I don't use them, myself, because, every time I count myself, I keep coming up "1."

    When I left my last job, and started working on my own, I set up things like CI/CD, JIRA, Jenkins, etc. These were the bread and butter for development in my old shop.

    But they are "Concrete Galoshes"[0], and work very well for teams, as opposed to ICs. As a single developer, working alone, the infrastructure overhead just slowed me down, and, ironically, interfered with Quality.

    When GH Actions were first announced (I can't remember, but they may have been beta, then), I set up several of them, on my busier projects. They worked great, until I started to introduce some pivots, and I realized that there was actually no advantage to them. I ran the tests manually, anyway, and the Actions just gave me one more thing to tweak. It was annoying, getting the failure messages, when I knew damn well, the project was fine. I'd just forgotten to tweak the Action. I introduce frequent changes, in my work, and that is great.

    [0] https://littlegreenviper.com/miscellany/concrete-galoshes/

  • I really like Github Actions but it still only feels appropriate for small projects. Unless I'm missing something, I didn't see a good mechanism for monorepos. I'm thinking in terms of there being shared pieces that get built/tested and then products that sit on top. The complication comes in test avoidance. I don't remember why I didn't like doing this all in one workflow with jobs (though it was going to require an orchestrator job setting variables to choose which downstream jobs to run). For chaining workflows/pipelines, when I looked, you could only trigger other workflows for master, defeating the point.

    Among my small, open source work, probably my biggest complaint is actions running in forks. Wastes a lot of resources on their side and limits my concurrent runners for projects in my personal space. For companies, depending on the setup, this would eat their compute minutes.

    Also annoying that PR actions can't post to the PR. I can understand there are security limitations but it makes it so a lot of nice features don't exist for most people.

  • "You can’t have actions that call other actions" - I think it's possible to use the repository_dispatch trigger described at https://docs.github.com/en/actions/reference/events-that-tri... for this - you'd need a separate GitHub personal access token, but using that it should be possible to trigger a workflow in any other repository you own from an API call in another action.
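
    A minimal sketch of that idea, assuming the receiving repository has a workflow that listens for repository_dispatch. The repository names, the event type and the payload are made up for illustration, and the token stands in for the separate personal access token mentioned above. The request is only built, not sent.

```python
import json
import urllib.request


def build_repository_dispatch(owner, repo, event_type, payload, token):
    """Build (but do not send) a request that creates a repository_dispatch event."""
    url = f"https://api.github.com/repos/{owner}/{repo}/dispatches"
    body = json.dumps(
        {"event_type": event_type, "client_payload": payload}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical names: "downstream-repo" and "upstream-build-finished" are examples only.
dispatch = build_repository_dispatch(
    "example-org",
    "downstream-repo",
    "upstream-build-finished",
    {"sha": "abc123"},
    "<personal-access-token>",
)
print(dispatch.get_method(), dispatch.full_url)

# Send it from a step in the calling workflow with:
# urllib.request.urlopen(dispatch)
```

    The receiving workflow can read the payload from the event context, so the caller can pass along things like the commit that was built.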

  • My experience with GHA is that it can be awesome for things like small projects that want to enforce linters, unit tests, etc.

    Once you get into more complex things - like building Docker images, storing into an artifact repository, baking AMIs, running integration or end-to-end tests, etc. - it can be a pain.

    It was a great place for us to start but we've since moved to BuildKite.

  • The only thing we've ever trusted GH actions with is enforcing check builds pass before a PR is allowed to merge.

    Everything else is managed via a custom tool we use for packaging & deploying our product.

    Even our simple "run this 1 build command and ensure exit code == 0" action seems to have a semi-weekly issue like stuck "waiting for status" and other unexplained failures throughout. We don't want to put any more eggs into that particular basket right now.

  • We've been loving GHA: CI/CD-as-code, pull requests capture deploy history (CI, stage/restage, deploy), and labeling a PR with 'release' is enough to generate our on-prem + multi-cloud artifacts.

    Our main gotchas are roughly:

    - GH-hosted runners have too little RAM/HD for big docker software. They push you to self-hosted runners for that, which is fine in theory, but GHA/Azure doesn't actually support serverless runners, so that falls flat in practice. We don't want to be turning machines on/off, that's GHA's job. We experimented with GHA -> Packer -> Azure for serverless, but it was slow and Packer frequently leaves zombie machines, so we went back to tweaking the low-RAM runners provided by our enterprise plan.

    - Security: We want contractors etc. to be able to run limited GHA CI jobs and use that quota, but not higher-trust GHA CD ones. This is tricky at a configuration level. Ex: It seems like we'd need to do funny things like a main repo for CI w/ CI secrets and a separate repo for CD w/ CD secrets, and only give untrusted folks access to the CI-cred repo. We've thought of other possibilities as well, but in general, it's frightening.

    - Big Docker images: We do spend more time than I'd like messing with optimizing Docker caching as GPU containers are embarrassingly big (we use dockerhub vs github's due to sizes/pricing/etc), think both multi-stage containers + multi-step jobs (monorepo/microservices). I think they're in a good position to speed that up!

    I'm optimistic about these, but tricky to align with MS/GH PM personal team priorities :)

  • I've been testing out Github Actions for a few weeks now, for the most part I really like it, there are a few features missing but I think the fundamentals of the product are solid with the public catalogue of actions being the killer feature.

    The biggest issue I have is around self-hosted runners.

    1. There's no official auto-scaling runner option, so even if you're paying Github (aka Microsoft) for Enterprise - they're not going to support your auto-scaling EKS/GKE/EC2/whatever runners.

    2. You can't register self-hosted runners without a Personal Access Token - the key word being _Personal_. Your automation code for provisioning runners should not rely on an individual's GitHub access token just to register; they need to have a system like GitLab has, where you can generate a registration token per-organisation/team/repo that allows you to programmatically register runners.

  • Surprised there's no mention of the inability to view logs generated before the page was loaded until the job completes. This one drives me crazy when I have long running, silent activity.

    My wishlist item would be more variants of Windows server versions so that we could build Windows containers for more versions of Windows. I realize the fault lies with Windows containers pinning the container base version to the host version, but I'm still stuck with the burden.

    I think GitHub Actions got the model correct, using everything as events to trigger any number of workflows. This is far simpler to maintain than a single workflow with conditionals and wait states that you see with other systems.

  • Build engineer at a large Fortune 500 shop here - the largest impediment to us even enabling GitHub Actions is that you either disable them entirely for your org or allow repo admins to enable them _per repo_. We have several hundred repos in one org alone and we cannot simply enable them for everyone with admin access (for us if you create a repo, you get admin access).

  • > For the vast majority of use cases, the YAML syntax is sane and is similar to other CI systems. It gets super clumsy when you want to assign an output of a step to a variable that you can refer to later.

    The verbosity of accessing output has the added benefit of making it much clearer that the 2 workflow steps are tightly interdependent on each other.
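
    For illustration, a sketch of how a script run by a step can publish an output. It assumes the GITHUB_OUTPUT file mechanism, which is newer than this thread (at the time a `::set-output` workflow command was used instead). The output name `image_tag` is hypothetical, and the temp-file fallback exists only so the snippet runs outside a runner.

```python
import os
import tempfile


def set_output(name: str, value: str) -> None:
    """Append a single-line name=value pair to the file named by $GITHUB_OUTPUT."""
    with open(os.environ["GITHUB_OUTPUT"], "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")


# Outside a runner the variable is unset, so point it at a temp file for the demo.
if "GITHUB_OUTPUT" not in os.environ:
    fd, path = tempfile.mkstemp()
    os.close(fd)
    os.environ["GITHUB_OUTPUT"] = path

set_output("image_tag", "1.2.3")
```

    A later step would then refer to the value as `steps.<step id>.outputs.image_tag`, which is the verbose access being discussed.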

  • > You can’t have actions that call other actions

    As of August 25, you can! https://github.blog/changelog/2021-08-25-github-actions-redu...

  • I’ve been loving GH Actions a ton but one issue I ran into is that I have a couple tests that get run and then if it’s a merge to prod branch, an image build action.

    The problem is that I cannot make the image build action happen IFF both testing actions pass. I had to combine all three actions into one.

    It works. But now there’s a “skipped” step that’s skipped 99% of the time and makes no sense for a lot of PRs. It also means I have a Frankenstein monster action that does three long lists of very different things. All just so I can make 3 depend on 1 and 2.

    The other problem is that to develop and test an action, I have to just push to origin a thousand times. My kingdom for a CI system that _trivially_ enables me to install a single one-liner program that lets me locally test my actions at near 1:1 compatibility.

  • I was planning to migrate to Github from Gitlab, but hearing about all these missing basic features (no caching, no restarting individual jobs?) I think I’ll stick with Gitlab for the foreseeable future.

  • One extremely painful "gotcha" we encountered was trying to push code to a protected branch from inside a workflow.

    With the default GITHUB_TOKEN, you can't push to protected branches. If you decide to use personal access tokens, you can push to protected branches, BUT that will trigger other workflows. That can cause an infinite loop of workflows.

    We still couldn't figure out how to push code to a protected branch without triggering the same/other workflows.

  • Is it possible yet to (without hoops) have an action workflow that runs on pull requests to <default_branch> without having to name that branch explicitly so that you can have a write-once-run-everywhere-forever process across repos that might change their default branch name? Last I saw there was a template parameter, but those get populated at creation not at runtime.

  • I've yet to come across a complete tutorial for setting up cached docker builds (one involving a rust compilation) within a Github Actions workflow. I've been figuring it out from pieces of info scattered across blog posts and github repos. How is anyone managing this today? How/where are you persisting cached objects?

  • Has anyone got experience or resources for running a big iOS application on GHA? Is it possible, or is it pretty much a toy right now? Say you have a big CI pipeline like:

    * Full clean build including dependencies (support Carthage, Cocoapods, SPM)

    * Running multiple test suites that take maybe 5+ hours for the full suite?

    * Running simulators for screenshot testing

  • If you use Jenkins and want to try actions then check out https://github.com/DontShaveTheYak/jenkins-std-lib

    It lets you run actions on top of Jenkins.

  • Does anyone know if on GHE I can bypass actions and just use the built-in git receive hooks? I really enjoy playing with post-commit scripts.

  • > migrate every engineering team at Venmo to GitHub Actions

    Oh jesus christ. I feel for you dude.

    We evaluated GHA, and we still are trying to use it, but there is a barrage of problems and limitations, including cost, lack of functionality, and technical issues. It's really only suitable (at scale) for linting, or generating Changelogs, or something else trivial. I use it in my OSS projects to run tests, and it's okay for that (though impossible to just tail a build log when it's large)

    Drone.io is still an amazingly effective system that matches GHA (and has _crazy_ features like build parameters) but is more flexible. Of course you'll have to pay for commercial licenses, but if it's between paying for GHA or Drone, I highly recommend Drone instead. Drone is stupidly easy to maintain (infrastructure-wise).

  • We use GitHub Actions and we like it. Though the current version feels like a very lukewarm compromise from the original one they had before the Microsoft acquisition. I am rather disappointed to see it become "just another CI" when the original release was so much more. I know, I know, I know, Microsoft bought GitHub so what should I expect, but either way, disappointed when the original had such amazing potential.