Ask HN: Cheaper or similar setup to an Asus ROG G16 for local LLM development?

  • So running models on an M1 Mac with 32GB+ works very well: the CPU and GPU share the same RAM (unified memory), so you can run some really significant models with it.
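
    As a back-of-envelope check on what fits in that unified RAM: quantized weights take roughly (parameter count x bits per weight / 8) bytes, plus KV cache and OS overhead on top. A minimal sketch in Python; the parameter counts and the ~4.5 bits/weight density (roughly a Q4_K_M quant) are approximations, not measurements:

      # Rough estimate of whether a quantized GGUF model fits in unified RAM.
      # Parameter counts and bits/weight are approximations, not measurements.
      def approx_gguf_gib(params_billions, bits_per_weight=4.5):
          return params_billions * 1e9 * bits_per_weight / 8 / 2**30

      for name, params_b in [("7B", 7), ("Mixtral 8x7B", 46.7), ("70B", 70)]:
          print(f"{name}: ~{approx_gguf_gib(params_b):.0f} GiB of weights")

    On a 32GB machine that puts Mixtral-class models in range at 4-bit, while 70B-class models need a lower-bit quant or more RAM.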

    Earlier this year, I also went down the path of looking into building a machine with dual 3090s. Doing it for <$1,000 is fairly challenging once you add case, motherboard, CPU, RAM, etc.

    What I ended up doing was getting a used rackmount server capable of handling dual GPUs, plus two Nvidia Tesla P40s.

    Examples: https://www.ebay.com/itm/284514545745?itmmeta=01HRJZX097EGBP... https://www.ebay.com/itm/145655400112?itmmeta=01HRJZXK512Y3N...

    The total here was ~$600, and there was essentially no effort in building/assembling the machine, except that I needed to order some Molex power adapters, which were cheap.
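
    Once the cards are in, it's worth confirming the host actually sees both of them; nvidia-smi does this, or programmatically via the nvidia-ml-py (pynvml) bindings. A minimal sketch, assuming the NVIDIA driver is installed and working:

      # List the GPUs the driver exposes; expect two Tesla P40s at ~24 GiB each.
      import pynvml

      pynvml.nvmlInit()
      for i in range(pynvml.nvmlDeviceGetCount()):
          handle = pynvml.nvmlDeviceGetHandleByIndex(i)
          name = pynvml.nvmlDeviceGetName(handle)
          mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
          print(f"GPU {i}: {name}, {mem.total / 2**30:.0f} GiB VRAM")
      pynvml.nvmlShutdown()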

    The server is definitely compact, but it can get LOUD under heavy load, so that might be a consideration.

    It's probably not the right machine for training models, but it runs GGUF inference (via Ollama) quite well. I have been running Mixtral at zippy token rates and smaller models even faster.
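
    If you want actual numbers rather than "zippy", Ollama's API returns generation stats you can turn into tokens/sec. A minimal sketch using the official ollama Python client (pip install ollama), assuming the server is running and the model has already been pulled with 'ollama pull mixtral':

      # Run one prompt against a local Ollama server and report tokens/sec.
      # eval_count is generated tokens; eval_duration is in nanoseconds
      # (field names per Ollama's REST API).
      import ollama

      resp = ollama.generate(model="mixtral",
                             prompt="Explain GGUF quantization in one paragraph.")
      print(resp["response"])
      print(f"~{resp['eval_count'] / (resp['eval_duration'] / 1e9):.1f} tokens/sec")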