Define qualitative evaluation criteria and let an LLM judge whether responses pass. Useful for testing AI agents, comparing models, and evaluating subjective qualities.

Required Ruby Version

>= 3.3.0

Authors

Eric Stiens

Versions

  1. 0.2.0 (June 29, 2026, 144 KB)
  2. 0.1.2 (April 16, 2026, 85.5 KB)
  3. 0.1.1 (January 05, 2026, 86.5 KB)
  4. 0.1.0 (December 26, 2025, 54 KB)
  5. 0.0.1 (December 25, 2025, 50 KB)
