OpenCode Benchmark Dashboard: Find the Best Local LLM for Your Computer

If you're looking to run large language models (LLMs) locally, finding the right balance between accuracy and speed can be challenging. OpenCode Benchmark Dashboard is a new tool that helps you compare different LLM performances directly on your hardware.

What is OpenCode Benchmark Dashboard?

OpenCode Benchmark Dashboard is a tool that allows developers to test and compare various local and remote large language models. The main goal is to help users find the best LLM for their specific use case and hardware setup.

Key Features

  • Comprehensive Testing: Test both local and remote LLM models
  • Accuracy vs Speed Analysis: Visual charts showing the tradeoff between performance metrics
  • Useful Metrics: Goes beyond just "tokens per second" to show actual problem-solving capability
  • Interactive Dashboard: Filter and compare models with an easy-to-use interface

Why Tokens Per Second Isn't Everything

The video emphasizes that tokens per second isn't always the most relevant metric: some models spend a large share of their output on "reasoning" tokens without actually reaching a correct solution any faster. The dashboard therefore reports "useful tokens", giving a more accurate picture of real-world problem-solving performance.
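To make the "useful tokens" idea concrete, here is a minimal sketch of the distinction. The `BenchmarkRun` shape and the scoring scheme are illustrative assumptions, not the project's actual API: a raw rate counts every emitted token, while a useful-token rate only credits tokens from runs that produced a correct answer, while still charging the time spent on every run.

```typescript
// Illustrative sketch: raw vs "useful" token throughput.
// BenchmarkRun and both functions are assumptions, not the project's API.
interface BenchmarkRun {
  tokensGenerated: number; // total tokens emitted (including reasoning)
  seconds: number;         // wall-clock time for the run
  correct: boolean;        // did the run solve the task?
}

function rawTokensPerSecond(runs: BenchmarkRun[]): number {
  const tokens = runs.reduce((s, r) => s + r.tokensGenerated, 0);
  const time = runs.reduce((s, r) => s + r.seconds, 0);
  return tokens / time;
}

// Only tokens from correct runs count, but all time is charged.
function usefulTokensPerSecond(runs: BenchmarkRun[]): number {
  const usefulTokens = runs
    .filter((r) => r.correct)
    .reduce((s, r) => s + r.tokensGenerated, 0);
  const time = runs.reduce((s, r) => s + r.seconds, 0);
  return usefulTokens / time;
}

// A model that generates fast but fails half the time...
const chatty: BenchmarkRun[] = [
  { tokensGenerated: 4000, seconds: 20, correct: false },
  { tokensGenerated: 4000, seconds: 20, correct: true },
];
// ...versus a slower model that always succeeds.
const steady: BenchmarkRun[] = [
  { tokensGenerated: 1500, seconds: 15, correct: true },
  { tokensGenerated: 1500, seconds: 15, correct: true },
];

console.log(rawTokensPerSecond(chatty));    // 200 tok/s raw
console.log(usefulTokensPerSecond(chatty)); // 100 tok/s useful
console.log(usefulTokensPerSecond(steady)); // 100 tok/s useful
```

The chatty model looks twice as fast on raw throughput, yet delivers no more useful work per second than the steady one, which is exactly the gap the dashboard's metric is meant to expose.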

Best Local Models Tested

Top Performers

  Model               Parameters   Performance
  Qwen 3.5 35B        3B active    Most accurate and fast
  Nemotron Nano 30B   30B          Excellent accuracy
  GPT OSS 20B         20B          Very accurate

Good for Data Extraction

  • Qwen 2.5 4B (Q4_K_M quantization): Acceptable accuracy with good speed

Remote Models Comparison

The dashboard also allows comparing local models with remote providers:

  • Step 3.5 Flash: Top performer in accuracy and speed
  • OpenCode GPT5 Nano: Good performance on tests

Key Takeaways

  1. Quantized local models like Qwen 3.5 35B (only 3B active parameters) can outperform larger models in accuracy
  2. Remote models (OpenRouter) often perform better than their quantized local counterparts
  3. The best model depends on your use case: coding, data extraction, or general knowledge tasks

How to Use It

  1. Install dependencies (Bun runtime)
  2. Configure OpenCode on your system
  3. Add your preferred models to ~/.config/opencode/opencode.json
  4. Run tests with bun run answer <model-name>
  5. Evaluate results with bun run evaluate
  6. View the dashboard with bun run dashboard
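As an illustration of step 3, a locally served model can be registered in ~/.config/opencode/opencode.json roughly as follows. This is a sketch based on OpenCode's OpenAI-compatible provider support; the provider name, model ID, and base URL here are placeholder assumptions, and field names may differ between OpenCode versions:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "options": { "baseURL": "http://localhost:11434/v1" },
      "models": {
        "qwen2.5:4b": { "name": "Qwen 2.5 4B (local)" }
      }
    }
  }
}
```

The model identifier used here is then the name you pass to bun run answer in step 4.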

Conclusion

OpenCode Benchmark Dashboard is an essential tool for developers who want to optimize their local AI setup. Whether you're using a CPU-only system or have more powerful hardware, this tool helps you make informed decisions about which LLM to use.

GitHub - grigio/opencode-benchmark-dashboard: Benchmark system for testing opencode with various LLM models, measuring speed (latency) and correctness (accuracy).