| Models Tested | Success Rate | Avg Duration | Duration Range |
|---|---|---|---|
| 6 | 0.0% | 59s | 45s - 1m 38s |

| Score | Model | Duration | Session (KB) | test.sh |
|---|---|---|---|---|
| 0.0% | openrouter/openai/gpt-oss-120b | 46s | 38.0 | ❌ |
| 0.0% | openrouter/qwen/qwen3-coder | 48s | 26.6 | ❌ |
| 0.0% | openrouter/openai/gpt-oss-20b | 1m 38s | 1115.2 | ❌ |
| 0.0% | openrouter/deepseek/deepseek-v3.1-terminus | 1m 0s | 66.7 | ❌ |
| 0.0% | litellm/GLM-4.5-Air-FP8-dev | 54s | 83.8 | ❌ |
| 0.0% | openrouter/qwen/qwen3-14b | 45s | 70.2 | ❌ |
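
The summary figures can be cross-checked against the Duration column. The following is a minimal sketch, not the tooling that produced this report: the helper names `to_seconds` and `format_duration` are hypothetical, and it assumes the reported average is the arithmetic mean rounded half up to whole seconds.

```python
import re

# Durations copied from the Duration column of the results table.
durations = ["46s", "48s", "1m 38s", "1m 0s", "54s", "45s"]


def to_seconds(text: str) -> int:
    """Convert a string such as '1m 38s' or '46s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m\s*)?(\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, secs = match.groups()
    return int(minutes or 0) * 60 + int(secs)


def format_duration(total: int) -> str:
    """Render whole seconds in the table's 'Xm Ys' / 'Ys' style."""
    minutes, secs = divmod(total, 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"


seconds = [to_seconds(d) for d in durations]
mean_seconds = sum(seconds) / len(seconds)

# Assumption: round half up. Python's built-in round() uses
# round-half-to-even, so int(x + 0.5) is used here instead.
avg_rounded = int(mean_seconds + 0.5)

print("Models tested:", len(seconds))
print("Avg duration:", format_duration(avg_rounded))
print("Duration range:", format_duration(min(seconds)), "-", format_duration(max(seconds)))
```

The exact mean of the six durations is 58.5 seconds, so the 59s shown in the summary is consistent with rounding half up; a tool using round-half-to-even would report 58s instead.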