agcat, joined 10/17/2022, 9:35:34 AM, has 18 karma
Recent Posts
Unboxing Yahboom Robotic Arm with Jetson Orin Nano
by agcat on 6/16/2025, 4:42:02 PM with 1 comment
Ask HN: Anyone using Cloudflare Container platform in production?
by agcat on 6/11/2025, 9:00:40 PM with 1 comment
Three-tier storage architecture to accelerate model loading for LLM Inference
by agcat on 6/5/2025, 5:16:13 PM with 0 comments
AI Models Benchmarking for Education
by agcat on 5/26/2025, 7:37:23 PM with 2 comments
Qwen2-7B-Instruct with TensorRT-LLM: consistently high tokens/SEC
by agcat on 9/5/2024, 11:18:34 PM with 1 comment
LLM Wrapper Make Deployment with Nvidia Triton Inference Server Easier
by agcat on 7/31/2024, 11:21:59 PM with 1 comment
Show HN: Open-source tool that writes Nvidia Triton Inference Glue code for you
by agcat on 7/10/2024, 10:54:33 PM with 1 comment
Open Source CLI Tool to Generate Code for Nvidia Triton Deployment
by agcat on 7/4/2024, 2:37:28 AM with 1 comment
Real-Time Streaming Apps with Nvidia Open Source Triton Inference
by agcat on 6/5/2024, 12:25:25 AM with 1 comment
Fast Cold-starts for Serverless GPU Inference is becoming a reality
by agcat on 5/29/2024, 11:28:45 PM with 1 comment
LLMs Tokens/Second Benchmark (Mistral, Llama2, Gemma) – Independent Research
by agcat on 3/25/2024, 7:18:12 PM with 1 comment
Show HN: Scale PDF Q&A App to 10K Users with GPUs – <$250/Mo
by agcat on 3/4/2024, 7:09:36 PM with 2 comments
Finetune Phi-2 with DPO
by agcat on 2/1/2024, 1:41:02 AM with 1 comment