ruby-skill-bench 1.0.1
ruby-skill-bench orchestrates evaluation runs of AI coding agents inside isolated git sandboxes, then scores the results using deterministic and LLM-powered judges.