# Evaluation Prompts for This Skill Use these to verify an agent is applying CLI best practices (not just "making something that runs"). ## Scoring rubric - *Pass*: - Produces a clear CLI contract (commands, flags, IO, exit codes, examples). - Explicitly addresses stdout/stderr, exit codes, help behavior, and interactivity. - Includes stable scripting output modes (`--json`/`--plain`) when relevant. - Avoids secret leaks (no secrets via flags/env). - *Strong pass*: - Uses the checklist and/or the audit script. - Highlights trade-offs and backwards-compatibility risks. - Provides ready-to-ship help output and error message patterns. ## Prompt: design a new CLI - Task: - "Design a CLI called `logship` that tails logs from multiple sources (local files and HTTP endpoints), filters by regex, and outputs either human-friendly colored logs or machine-readable JSON." - Must include: - Subcommands or flags decision (and rationale) - `stdout` vs `stderr` behavior - `--json` output definition (shape) - Color behavior (`NO_COLOR`, `--no-color`, TTY detection) - Timeouts for HTTP, progress/status messages - Example-first help outline - Exit codes ## Prompt: review a flawed CLI help output - Task: - "Here's the current `--help` output for `acmectl`. It's 200 lines of flags, no examples, and no description. Rewrite it to be discoverable." - Must include: - Concise default help vs full help structure - Examples near the top - Group common flags first - Support path / docs link ## Prompt: fix stdout/stderr separation - Task: - "This command prints progress bars to stdout and the JSON result to stderr. Fix the output contract." - Must include: - Machine output on stdout - Human/progress on stderr - Behavior when piped/captured (no animations) ## Prompt: safe destructive action - Task: - "Add a `delete` command that can delete remote projects. Make it safe for humans but scriptable." - Must include: - Confirmation levels (moderate vs severe) - `--dry-run` - `--force` and/or `--confirm="exact-name"` - `--no-input` behavior ## Prompt: secret handling - Task: - "Add auth to the CLI. It currently accepts `--token ` and reads `MYAPP_TOKEN` env var. Fix the design." - Must include: - `--token-file` and/or `--token-stdin` - Recommendation for OS keychain / secret manager - Explain why flags/env are unsafe ## Prompt: run the audit script - Task: - "Run `scripts/cli_audit.py` against `./mycli` and address the FAIL/WARN items." - Must include: - Interpreting the audit output - Fixing highest severity issues first - Updating help text and/or flags accordingly