Fine-tune Google's Gemma 3

  • I'm interested to know if anyone is using fine-tuning to train a model on proprietary or in-house codebases and documentation.

    RAG solutions seem to have their limitations, and fine-tuning might be a more effective approach.

    How much effort is required to turn code into something one can use for fine-tuning?
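
    Just to make the question concrete, here is a minimal sketch of one way the data preparation could look, assuming a simple prompt/completion JSONL format (Gemma's chat format would need a further conversion step); the repo path, file extensions, and prompt template are placeholders, and a real pipeline would also chunk long files, strip secrets, and deduplicate:

      import json
      from pathlib import Path

      REPO_ROOT = Path("./our-internal-repo")   # hypothetical repo location
      OUTPUT = Path("train.jsonl")
      EXTENSIONS = {".py", ".md"}                # whatever the codebase uses

      with OUTPUT.open("w", encoding="utf-8") as out:
          for path in REPO_ROOT.rglob("*"):
              if not path.is_file() or path.suffix not in EXTENSIONS:
                  continue
              text = path.read_text(encoding="utf-8", errors="ignore")
              record = {
                  # one naive shape: ask for a file, answer with its contents
                  "prompt": f"Show the contents of {path.relative_to(REPO_ROOT)}.",
                  "completion": text,
              }
              out.write(json.dumps(record) + "\n")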

  • Is there a version of Gemma 3 that has tool calling? Google's blog claimed it supports tools but it doesn't seem like it actually does.

  • Are people fine-tuning LLMs on their local machines with a single GPU? What are people using to scale their training to multiple nodes / GPUs? I've been playing around with Hugging Face Estimators in sagemaker.huggingface, but I'm not sure whether there are better options for this.
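
    For context, this is roughly what that Estimator path looks like; a minimal sketch, with the role ARN, training script, instance type, and container versions as placeholders that have to match an actual SageMaker Deep Learning Container:

      from sagemaker.huggingface import HuggingFace

      estimator = HuggingFace(
          entry_point="train.py",          # a standard HF Trainer / TRL script
          source_dir="./scripts",
          role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
          instance_type="ml.p4d.24xlarge", # 8 GPUs per node
          instance_count=2,                # 2 nodes -> 16 GPUs
          transformers_version="4.36",
          pytorch_version="2.1",
          py_version="py310",
          # SageMaker's data-parallel launcher; torchrun / accelerate launch
          # are the usual equivalents outside SageMaker
          distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
          hyperparameters={"model_id": "google/gemma-3-4b-it", "epochs": 1},
      )

      estimator.fit({"train": "s3://my-bucket/gemma-finetune/train/"})

    For a single local GPU, QLoRA-style parameter-efficient fine-tuning (e.g. via PEFT/TRL) is the usual way to make the smaller Gemma sizes fit.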

  • Is anyone outside of the research labs fine-tuning models for production use cases? I've been seeing more people just using foundation models off the shelf, especially in light of new advancements that seem to arrive every few months.

  • Instead of version numbers, these things should be labeled by their release date, since this kind of training starts from a dataset snapshot at a point in time, colloquially (and not entirely accurately) called the knowledge-cutoff date.

    We are optimizing these along several dimensions at once, with multiple branches of evolution from each model,

    so a successor version name doesn't really convey that.

  • Great article, but I didn't see anything about the costs.

    I'm particularly interested in this aspect because we're considering fine-tuning Gemma 3, but our budget is tight. We're looking for (real-world) cost estimates for this approach.

  • It likely makes sense to use more expensive frontier models as teachers or architects for smaller fine-tuned ones that generate the majority of tokens (though training on their outputs is possibly against the ToS).
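
    In practice that often reduces to generating a distillation dataset from the teacher and fine-tuning the small model on it. A minimal sketch, with the teacher model, prompts, and file layout purely illustrative (and worth checking against the provider's terms first):

      import json
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment
      prompts = [
          "Summarize this incident report in three bullet points: ...",
          "Write a SQL query that returns last week's signups by region.",
      ]

      with open("distill.jsonl", "w", encoding="utf-8") as out:
          for prompt in prompts:
              resp = client.chat.completions.create(
                  model="gpt-4o",   # the "teacher"; any frontier model
                  messages=[{"role": "user", "content": prompt}],
              )
              answer = resp.choices[0].message.content
              # prompt/answer pairs become training examples for the small model
              out.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")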

  • Has anyone used these small models in a production environment?

    If so, what are they good and bad at?

  • Please try to enjoy each Gemma tuning equally, and not show preference for any over the others