Models Tested: 6
Success Rate: 0.0%
Avg Duration: 52s
Duration Range: 23s - 1m 35s
Score  Model                                       Duration  Session (KB)  test.sh
0.0%   openrouter/openai/gpt-oss-120b              1m 1s     822.5
0.0%   openrouter/qwen/qwen3-coder                 53s       66.2
0.0%   openrouter/openai/gpt-oss-20b               23s       158.9
0.0%   openrouter/deepseek/deepseek-v3.1-terminus  49s       78.3
0.0%   litellm/GLM-4.5-Air-FP8-dev                 1m 35s    105.8
0.0%   openrouter/qwen/qwen3-14b                   31s       70.6