Evaluation
MLOps
Frames model, prompt, and system evaluation as a reproducible experiment with baselines, datasets, and explicit metrics.
Enabled by default · Built In
CLI install command
elephant skills install evaluation
Bundled with the packaged Elephant Agent CLI as a built-in procedural skill.
This skill already ships inside the packaged Elephant Agent bundle. Run `elephant skills install evaluation` only when you want an explicit local materialization record.
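To illustrate what "a reproducible experiment with baselines, datasets, and explicit metrics" can look like, here is a minimal Python sketch. It is not the skill's implementation: every name in it (`Example`, `exact_match`, `evaluate`, `baseline`, `candidate`, `run_experiment`) is hypothetical, and the tiny arithmetic dataset is a stand-in for a real, versioned evaluation set.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass(frozen=True)
class Example:
    """One fixed evaluation case: an input and its expected output."""
    prompt: str
    expected: str


def exact_match(prediction: str, expected: str) -> float:
    """Explicit metric: 1.0 on a case-insensitive exact match, else 0.0."""
    return 1.0 if prediction.strip().lower() == expected.strip().lower() else 0.0


def evaluate(
    system: Callable[[str], str],
    dataset: Sequence[Example],
    metric: Callable[[str, str], float] = exact_match,
) -> float:
    """Return the mean metric score of `system` over `dataset`."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    scores = [metric(system(ex.prompt), ex.expected) for ex in dataset]
    return sum(scores) / len(scores)


# Hypothetical frozen dataset; a real one would be stored and versioned.
DATASET = (
    Example("2 + 2", "4"),
    Example("3 + 5", "8"),
    Example("10 - 7", "3"),
    Example("6 * 7", "42"),
)


def baseline(prompt: str) -> str:
    """Trivial baseline: always answers "4", whatever the prompt."""
    return "4"


def candidate(prompt: str) -> str:
    """Stand-in for the system under test: solves "a op b" prompts."""
    left, op, right = prompt.split()
    ops = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
    }
    return str(ops[op](int(left), int(right)))


def run_experiment() -> Dict[str, float]:
    """Score baseline and candidate on the same dataset with the same metric."""
    return {
        "baseline": evaluate(baseline, DATASET),
        "candidate": evaluate(candidate, DATASET),
    }


if __name__ == "__main__":
    print(run_experiment())
```

Because the dataset is fixed and both systems are deterministic, rerunning the script yields the same scores, and the candidate's result is only meaningful relative to the baseline scored under identical conditions.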