Ask HN: Benchmarks for models other than LLMs