Fine-tune Google's Gemma 3

  • I'm interested to know if anyone is using fine-tuning to train a model on proprietary or in-house codebases and documentation.

    RAG solutions seem to have their limitations, and fine-tuning might be a more effective approach.

    How much effort is required to turn code into something one can use for fine-tuning?
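
    Just to make the question concrete, here is a minimal sketch of one way the data preparation could look, assuming a simple prompt/completion JSONL format (Gemma's chat format would need a further conversion step); the repo path, file extensions, and prompt template are placeholders, and a real pipeline would also chunk long files, strip secrets, and deduplicate:

      import json
      from pathlib import Path

      REPO_ROOT = Path("./our-internal-repo")   # hypothetical repo location
      OUTPUT = Path("train.jsonl")
      EXTENSIONS = {".py", ".md"}                # whatever the codebase uses

      with OUTPUT.open("w", encoding="utf-8") as out:
          for path in REPO_ROOT.rglob("*"):
              if not path.is_file() or path.suffix not in EXTENSIONS:
                  continue
              text = path.read_text(encoding="utf-8", errors="ignore")
              record = {
                  # one naive shape: ask for a file, answer with its contents
                  "prompt": f"Show the contents of {path.relative_to(REPO_ROOT)}.",
                  "completion": text,
              }
              out.write(json.dumps(record) + "\n")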

  • Is there a version of Gemma 3 that has tool calling? Google's blog claimed it supports tools but it doesn't seem like it actually does.

  • Are people fine-tuning LLMs on their local machines with a single GPU? What are people using to scale their training to multiple nodes / GPUs? I've been playing around with Hugging Face Estimators in sagemaker.huggingface, but I'm not sure whether there are better options for this.
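
    For context, this is roughly what that Estimator path looks like; a minimal sketch, with the role ARN, training script, instance type, and container versions as placeholders that have to match an actual SageMaker Deep Learning Container:

      from sagemaker.huggingface import HuggingFace

      estimator = HuggingFace(
          entry_point="train.py",          # a standard HF Trainer / TRL script
          source_dir="./scripts",
          role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
          instance_type="ml.p4d.24xlarge", # 8 GPUs per node
          instance_count=2,                # 2 nodes -> 16 GPUs
          transformers_version="4.36",
          pytorch_version="2.1",
          py_version="py310",
          # SageMaker's data-parallel launcher; torchrun / accelerate launch
          # are the usual equivalents outside SageMaker
          distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
          hyperparameters={"model_id": "google/gemma-3-4b-it", "epochs": 1},
      )

      estimator.fit({"train": "s3://my-bucket/gemma-finetune/train/"})

    For a single local GPU, QLoRA-style parameter-efficient fine-tuning (e.g. via PEFT/TRL) is the usual way to make the smaller Gemma sizes fit.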

  • Is anyone outside of the research labs fine-tuning models for production use cases? I've been seeing more people just using foundation models off the shelf, especially in light of new advancements that seem to arrive every few months.

  • Instead of version numbers, these things should be labeled by their release date, since this kind of training starts from a dataset snapshot at a point in time, colloquially (and not entirely accurately) called the knowledge-cutoff date.

    We are optimizing these along several dimensions at once, with multiple branches of evolution from each model,

    so a successor version name doesn't really convey that.

  • Great article, but I didn't see anything about the costs.

    I'm particularly interested in this aspect because we're considering fine-tuning Gemma 3, but our budget is tight. We're looking for (real-world) cost estimates for this approach.

  • It likely makes sense to use more expensive frontier models as teachers or architects for smaller fine-tuned ones that generate the majority of tokens (though training on their outputs is possibly against the ToS).
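
    In practice that often reduces to generating a distillation dataset from the teacher and fine-tuning the small model on it. A minimal sketch, with the teacher model, prompts, and file layout purely illustrative (and worth checking against the provider's terms first):

      import json
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment
      prompts = [
          "Summarize this incident report in three bullet points: ...",
          "Write a SQL query that returns last week's signups by region.",
      ]

      with open("distill.jsonl", "w", encoding="utf-8") as out:
          for prompt in prompts:
              resp = client.chat.completions.create(
                  model="gpt-4o",   # the "teacher"; any frontier model
                  messages=[{"role": "user", "content": prompt}],
              )
              answer = resp.choices[0].message.content
              # prompt/answer pairs become training examples for the small model
              out.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")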

  • Has anyone used these small models in a production environment?

    If so, what are they good and bad at?

  • Please try to enjoy each Gemma tuning equally, and not show preference for any over the others