603 lines
20 KiB
Markdown
603 lines
20 KiB
Markdown
---
|
|
name: llm-council
|
|
description: "Run Fireworks-hosted open-weight model councils that compare responses and synthesize a final answer."
|
|
allowed-tools: Read, Write, Bash, AskUserQuestion
|
|
category: "ai-agents"
|
|
risk: "safe"
|
|
source: "official"
|
|
source_repo: "dair-ai/dair-academy-plugins"
|
|
source_type: "official"
|
|
date_added: "2026-06-19"
|
|
author: "DAIR.AI"
|
|
license: "MIT"
|
|
license_source: "https://github.com/dair-ai/dair-academy-plugins/blob/main/README.md#license"
|
|
tags:
|
|
- dair-academy
|
|
- ai
|
|
- workflow
|
|
tools:
|
|
- claude-code
|
|
- codex-cli
|
|
- cursor
|
|
---
|
|
|
|
# LLM Council (Fireworks AI)
|
|
|
|
## When to Use
|
|
|
|
Use when this workflow matches the user request: Use this skill for its documented workflow.
|
|
|
|
|
|
_Source: [dair-ai/dair-academy-plugins](https://github.com/dair-ai/dair-academy-plugins) (MIT)._
|
|
|
|
This skill implements Karpathy's LLM Council concept where multiple open-weight LLMs deliberate on a query, powered entirely by Fireworks AI:
|
|
|
|
1. **Phase 1**: All models respond to the query independently (parallel)
|
|
2. **Phase 2**: Models rank each other's anonymized responses
|
|
3. **Phase 3**: A Chairman LLM synthesizes the final answer
|
|
|
|
All inference runs through **Fireworks AI** using open-weight models. The speed and pricing of Fireworks makes it practical to run multi-model deliberation that would be slow or expensive on other providers.
|
|
|
|
## CRITICAL RULES
|
|
|
|
1. **ALWAYS use AskUserQuestion** to let the user select council models (multiselect) and the Chairman model
|
|
2. **ALWAYS save raw responses to files** - never summarize or truncate API outputs
|
|
3. **ALWAYS show full transparency** - display all individual responses, all rankings, AND the final synthesis
|
|
4. **NEVER skip the ranking phase** - it is essential to the council deliberation process
|
|
5. **Read from files for display** - ensures content is shown unmodified
|
|
6. **ALWAYS display the final output to the user** after Phase 3 completes
|
|
|
|
## Pre-flight Check
|
|
|
|
Before running any phase, verify the Fireworks API key is set:
|
|
|
|
```bash
|
|
if [ -z "$FIREWORKS_API_KEY" ]; then
|
|
echo "ERROR: FIREWORKS_API_KEY is not set."
|
|
echo "Create a Fireworks AI account at: https://fireworks.ai/"
|
|
echo "Then export it in your shell profile (~/.zshrc or ~/.bashrc):"
|
|
echo ' export FIREWORKS_API_KEY="your_api_key_here"'
|
|
exit 1
|
|
fi
|
|
echo "FIREWORKS_API_KEY is set."
|
|
```
|
|
|
|
## Available Models
|
|
|
|
Present these options to the user via AskUserQuestion (multiselect):
|
|
|
|
| Model | Fireworks ID | Provider |
|
|
|-------|-------------|----------|
|
|
| GLM 5 | accounts/fireworks/models/glm-5 | Z.ai |
|
|
| DeepSeek V3.1 | accounts/fireworks/models/deepseek-v3p1 | DeepSeek |
|
|
| DeepSeek V3.2 | accounts/fireworks/models/deepseek-v3p2 | DeepSeek |
|
|
| MiniMax M2.1 | accounts/fireworks/models/minimax-m2p1 | MiniMax |
|
|
| Kimi K2.5 | accounts/fireworks/models/kimi-k2p5 | Moonshot |
|
|
| Qwen3 235B | accounts/fireworks/models/qwen3-235b-a22b | Alibaba |
|
|
| Llama 4 Maverick | accounts/fireworks/models/llama4-maverick-instruct-basic | Meta |
|
|
|
|
## Workflow
|
|
|
|
### Step 1: Gather User Input
|
|
|
|
Use AskUserQuestion to get:
|
|
1. The query/question for the council (or accept it from the conversation)
|
|
2. Which models to include (multiselect, recommend 3-5 models)
|
|
3. Which model should be the Chairman (single select)
|
|
|
|
Note: AskUserQuestion supports max 4 options per question. Since there are 7 models, split model selection across two questions, or show the most popular 4 and let the user type "Other" for the rest. A good default is to show 4 models in the first question and note the others are available via "Other". Rotate which models are shown based on variety.
|
|
|
|
Example AskUserQuestion for model selection (show 4, mention others):
|
|
```
|
|
question: "Which models should participate in the LLM Council? (Also available via Other: Llama 4 Maverick, Qwen3 235B, GLM 5)"
|
|
header: "Models"
|
|
multiSelect: true
|
|
options:
|
|
- label: "DeepSeek V3.2"
|
|
description: "DeepSeek's newest and most capable model"
|
|
- label: "MiniMax M2.1"
|
|
description: "MiniMax's strong open-weight model"
|
|
- label: "Kimi K2.5"
|
|
description: "Moonshot's strong open-weight model"
|
|
- label: "DeepSeek V3.1"
|
|
description: "DeepSeek's proven reasoning model"
|
|
```
|
|
|
|
Example AskUserQuestion for chairman:
|
|
```
|
|
question: "Which model should be the Chairman (synthesizes the final answer)?"
|
|
header: "Chairman"
|
|
multiSelect: false
|
|
options:
|
|
- label: "DeepSeek V3.2 (Recommended)"
|
|
description: "Newest DeepSeek, strong at comprehensive analysis"
|
|
- label: "GLM 5"
|
|
description: "Strong reasoning for synthesis"
|
|
- label: "Kimi K2.5"
|
|
description: "Strong at structured synthesis"
|
|
- label: "MiniMax M2.1"
|
|
description: "Strong open-weight model for synthesis"
|
|
```
|
|
|
|
### Model Name to ID Mapping
|
|
|
|
Use this mapping to convert user selections to Fireworks model IDs:
|
|
|
|
```python
|
|
MODEL_MAP = {
|
|
"GLM 5": "accounts/fireworks/models/glm-5",
|
|
"DeepSeek V3.1": "accounts/fireworks/models/deepseek-v3p1",
|
|
"DeepSeek V3.2": "accounts/fireworks/models/deepseek-v3p2",
|
|
"MiniMax M2.1": "accounts/fireworks/models/minimax-m2p1",
|
|
"Kimi K2.5": "accounts/fireworks/models/kimi-k2p5",
|
|
"Qwen3 235B": "accounts/fireworks/models/qwen3-235b-a22b",
|
|
"Llama 4 Maverick": "accounts/fireworks/models/llama4-maverick-instruct-basic",
|
|
}
|
|
```
|
|
|
|
### Step 2: Run Phase 1 - Individual Responses
|
|
|
|
After gathering input, run this script to get responses from all selected models in parallel:
|
|
|
|
```bash
|
|
QUERY="USER_QUERY_HERE"
|
|
MODELS='["accounts/fireworks/models/glm-5", "accounts/fireworks/models/deepseek-v3p1"]'
|
|
|
|
python3 << 'PYEOF'
|
|
import os
|
|
import json
|
|
import requests
|
|
import time
|
|
from concurrent.futures import ThreadPoolExecutor, as_completed
|
|
|
|
FIREWORKS_API_KEY = os.environ.get("FIREWORKS_API_KEY")
|
|
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
|
|
|
|
QUERY = os.environ.get("QUERY", "")
|
|
MODELS = json.loads(os.environ.get("MODELS", "[]"))
|
|
|
|
# Create session directory
|
|
timestamp = time.strftime("%Y%m%d-%H%M%S")
|
|
SESSION_DIR = f"/tmp/llm-council/{timestamp}"
|
|
os.makedirs(SESSION_DIR, exist_ok=True)
|
|
|
|
# Save config
|
|
config = {"query": QUERY, "models": MODELS, "timestamp": timestamp}
|
|
with open(f"{SESSION_DIR}/config.json", "w") as f:
|
|
json.dump(config, f, indent=2)
|
|
|
|
def call_model(model_id, query):
|
|
"""Call a single model via Fireworks AI"""
|
|
try:
|
|
start = time.time()
|
|
response = requests.post(
|
|
API_URL,
|
|
headers={
|
|
"Authorization": f"Bearer {FIREWORKS_API_KEY}",
|
|
"Content-Type": "application/json"
|
|
},
|
|
json={
|
|
"model": model_id,
|
|
"messages": [
|
|
{"role": "system", "content": "You are participating in an LLM council deliberation. Provide your best, most thoughtful response to the query. Be comprehensive but focused."},
|
|
{"role": "user", "content": query}
|
|
],
|
|
"max_tokens": 4000,
|
|
"temperature": 1
|
|
},
|
|
timeout=120
|
|
)
|
|
response.raise_for_status()
|
|
elapsed = time.time() - start
|
|
data = response.json()
|
|
usage = data.get("usage", {})
|
|
return {
|
|
"success": True,
|
|
"content": data["choices"][0]["message"]["content"],
|
|
"model": model_id,
|
|
"latency_seconds": round(elapsed, 2),
|
|
"tokens": {
|
|
"prompt": usage.get("prompt_tokens", 0),
|
|
"completion": usage.get("completion_tokens", 0),
|
|
"total": usage.get("total_tokens", 0)
|
|
}
|
|
}
|
|
except Exception as e:
|
|
return {
|
|
"success": False,
|
|
"content": f"[ERROR: {str(e)}]",
|
|
"model": model_id,
|
|
"latency_seconds": 0,
|
|
"tokens": {"prompt": 0, "completion": 0, "total": 0}
|
|
}
|
|
|
|
print(f"\n{'='*60}")
|
|
print("PHASE 1: Collecting Individual Responses")
|
|
print(f"{'='*60}")
|
|
print(f"Query: {QUERY[:200]}...")
|
|
print(f"Models: {', '.join([m.split('/')[-1] for m in MODELS])}")
|
|
print(f"Session: {SESSION_DIR}")
|
|
print()
|
|
|
|
# Parallel execution
|
|
results = {}
|
|
with ThreadPoolExecutor(max_workers=len(MODELS)) as executor:
|
|
futures = {executor.submit(call_model, m, QUERY): m for m in MODELS}
|
|
for future in as_completed(futures):
|
|
model = futures[future]
|
|
result = future.result()
|
|
results[model] = result
|
|
status = "OK" if result["success"] else "FAILED"
|
|
latency = f"{result['latency_seconds']}s" if result["success"] else "N/A"
|
|
print(f" [{status}] {model.split('/')[-1]} ({latency})")
|
|
|
|
# Save raw results
|
|
with open(f"{SESSION_DIR}/phase1_responses.json", "w") as f:
|
|
json.dump(results, f, indent=2)
|
|
|
|
print(f"\nPhase 1 complete. Results saved to: {SESSION_DIR}/phase1_responses.json")
|
|
print(f"SESSION_DIR={SESSION_DIR}")
|
|
PYEOF
|
|
```
|
|
|
|
### Step 3: Run Phase 2 - Cross-Model Ranking
|
|
|
|
Each model reviews and ranks the anonymized responses from Phase 1:
|
|
|
|
```bash
|
|
SESSION_DIR="/tmp/llm-council/TIMESTAMP_HERE"
|
|
|
|
python3 << 'PYEOF'
|
|
import os
|
|
import json
|
|
import requests
|
|
import time
|
|
from concurrent.futures import ThreadPoolExecutor, as_completed
|
|
|
|
FIREWORKS_API_KEY = os.environ.get("FIREWORKS_API_KEY")
|
|
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
|
|
SESSION_DIR = os.environ.get("SESSION_DIR")
|
|
|
|
# Load Phase 1 results
|
|
with open(f"{SESSION_DIR}/config.json") as f:
|
|
config = json.load(f)
|
|
with open(f"{SESSION_DIR}/phase1_responses.json") as f:
|
|
phase1_results = json.load(f)
|
|
|
|
QUERY = config["query"]
|
|
MODELS = config["models"]
|
|
|
|
# Create anonymized mapping
|
|
labels = ["A", "B", "C", "D", "E", "F", "G"][:len(MODELS)]
|
|
model_to_label = dict(zip(MODELS, labels))
|
|
label_to_model = {v: k for k, v in model_to_label.items()}
|
|
|
|
# Format anonymized responses
|
|
anonymized_responses = []
|
|
for model_id in MODELS:
|
|
label = model_to_label[model_id]
|
|
content = phase1_results[model_id]["content"]
|
|
anonymized_responses.append(f"=== Response {label} ===\n{content}")
|
|
|
|
anonymized_text = "\n\n".join(anonymized_responses)
|
|
|
|
def get_rankings(model_id, query, anonymized, own_label):
|
|
"""Get rankings from a single model"""
|
|
ranking_prompt = f"""You are evaluating responses from multiple AI models to this query:
|
|
|
|
QUERY: {query}
|
|
|
|
Here are the anonymized responses:
|
|
|
|
{anonymized}
|
|
|
|
Please rank these responses from BEST to WORST. For each ranking:
|
|
1. State the response letter (A, B, C, etc.)
|
|
2. Give a brief reason (1-2 sentences)
|
|
3. You may skip ranking your own response (labeled {own_label}) or rank it fairly
|
|
|
|
Format your response EXACTLY as:
|
|
RANKINGS:
|
|
1. [Letter] - [Brief reason]
|
|
2. [Letter] - [Brief reason]
|
|
3. [Letter] - [Brief reason]
|
|
..."""
|
|
|
|
try:
|
|
start = time.time()
|
|
response = requests.post(
|
|
API_URL,
|
|
headers={
|
|
"Authorization": f"Bearer {FIREWORKS_API_KEY}",
|
|
"Content-Type": "application/json"
|
|
},
|
|
json={
|
|
"model": model_id,
|
|
"messages": [
|
|
{"role": "system", "content": f"You are ranking AI responses objectively. Your own response is labeled '{own_label}'."},
|
|
{"role": "user", "content": ranking_prompt}
|
|
],
|
|
"max_tokens": 1000,
|
|
"temperature": 1
|
|
},
|
|
timeout=90
|
|
)
|
|
response.raise_for_status()
|
|
elapsed = time.time() - start
|
|
return {
|
|
"success": True,
|
|
"content": response.json()["choices"][0]["message"]["content"],
|
|
"model": model_id,
|
|
"latency_seconds": round(elapsed, 2)
|
|
}
|
|
except Exception as e:
|
|
return {
|
|
"success": False,
|
|
"content": f"[ERROR: {str(e)}]",
|
|
"model": model_id,
|
|
"latency_seconds": 0
|
|
}
|
|
|
|
print(f"\n{'='*60}")
|
|
print("PHASE 2: Cross-Model Ranking")
|
|
print(f"{'='*60}")
|
|
print(f"Label mapping: {json.dumps({v: k.split('/')[-1] for k, v in model_to_label.items()})}")
|
|
print()
|
|
|
|
# Collect rankings from all models in parallel
|
|
rankings = {}
|
|
with ThreadPoolExecutor(max_workers=len(MODELS)) as executor:
|
|
futures = {
|
|
executor.submit(get_rankings, mid, QUERY, anonymized_text, model_to_label[mid]): mid
|
|
for mid in MODELS
|
|
}
|
|
for future in as_completed(futures):
|
|
model = futures[future]
|
|
result = future.result()
|
|
rankings[model] = result
|
|
status = "OK" if result["success"] else "FAILED"
|
|
latency = f"{result['latency_seconds']}s" if result["success"] else "N/A"
|
|
print(f" [{status}] {model.split('/')[-1]} ({latency})")
|
|
|
|
# Save rankings
|
|
output = {
|
|
"label_mapping": label_to_model,
|
|
"model_to_label": model_to_label,
|
|
"rankings": rankings
|
|
}
|
|
with open(f"{SESSION_DIR}/phase2_rankings.json", "w") as f:
|
|
json.dump(output, f, indent=2)
|
|
|
|
print(f"\nPhase 2 complete. Rankings saved to: {SESSION_DIR}/phase2_rankings.json")
|
|
PYEOF
|
|
```
|
|
|
|
### Step 4: Run Phase 3 - Chairman Synthesis
|
|
|
|
The Chairman model receives all responses and rankings, then produces the final synthesis:
|
|
|
|
```bash
|
|
SESSION_DIR="/tmp/llm-council/TIMESTAMP_HERE"
|
|
CHAIRMAN_MODEL="accounts/fireworks/models/glm-5"
|
|
|
|
python3 << 'PYEOF'
|
|
import os
|
|
import json
|
|
import requests
|
|
import time
|
|
|
|
FIREWORKS_API_KEY = os.environ.get("FIREWORKS_API_KEY")
|
|
API_URL = "https://api.fireworks.ai/inference/v1/chat/completions"
|
|
SESSION_DIR = os.environ.get("SESSION_DIR")
|
|
CHAIRMAN_MODEL = os.environ.get("CHAIRMAN_MODEL")
|
|
|
|
# Load all previous results
|
|
with open(f"{SESSION_DIR}/config.json") as f:
|
|
config = json.load(f)
|
|
with open(f"{SESSION_DIR}/phase1_responses.json") as f:
|
|
phase1 = json.load(f)
|
|
with open(f"{SESSION_DIR}/phase2_rankings.json") as f:
|
|
phase2 = json.load(f)
|
|
|
|
QUERY = config["query"]
|
|
label_to_model = phase2["label_mapping"]
|
|
model_to_label = phase2["model_to_label"]
|
|
|
|
# Format responses with model names revealed
|
|
responses_text = []
|
|
for model_id, result in phase1.items():
|
|
label = model_to_label.get(model_id, "?")
|
|
model_name = model_id.split("/")[-1]
|
|
responses_text.append(f"=== {label}: {model_name} ===\n{result['content']}")
|
|
|
|
# Format rankings
|
|
rankings_text = []
|
|
for model_id, result in phase2["rankings"].items():
|
|
model_name = model_id.split("/")[-1]
|
|
rankings_text.append(f"[{model_name}'s Rankings]\n{result['content']}")
|
|
|
|
synthesis_prompt = f"""You are the Chairman of an LLM Council. Your task is to synthesize the best possible answer from multiple AI responses.
|
|
|
|
ORIGINAL QUERY:
|
|
{QUERY}
|
|
|
|
INDIVIDUAL RESPONSES:
|
|
{chr(10).join(responses_text)}
|
|
|
|
MODEL RANKINGS:
|
|
{chr(10).join(rankings_text)}
|
|
|
|
As Chairman, produce a FINAL SYNTHESIS that:
|
|
1. Incorporates the strongest elements from the best-ranked responses
|
|
2. Resolves any contradictions between responses
|
|
3. Addresses aspects that multiple models agreed on
|
|
4. Corrects any errors identified through cross-ranking
|
|
5. Provides the most complete, accurate, and helpful answer
|
|
|
|
Begin your synthesis:"""
|
|
|
|
print(f"\n{'='*60}")
|
|
print("PHASE 3: Chairman Synthesis")
|
|
print(f"{'='*60}")
|
|
print(f"Chairman: {CHAIRMAN_MODEL.split('/')[-1]}")
|
|
print()
|
|
|
|
try:
|
|
start = time.time()
|
|
response = requests.post(
|
|
API_URL,
|
|
headers={
|
|
"Authorization": f"Bearer {FIREWORKS_API_KEY}",
|
|
"Content-Type": "application/json"
|
|
},
|
|
json={
|
|
"model": CHAIRMAN_MODEL,
|
|
"messages": [
|
|
{"role": "system", "content": "You are the Chairman of an LLM Council. Synthesize multiple AI perspectives into a definitive, comprehensive response."},
|
|
{"role": "user", "content": synthesis_prompt}
|
|
],
|
|
"max_tokens": 4000,
|
|
"temperature": 1
|
|
},
|
|
timeout=180
|
|
)
|
|
response.raise_for_status()
|
|
elapsed = time.time() - start
|
|
synthesis = response.json()["choices"][0]["message"]["content"]
|
|
|
|
with open(f"{SESSION_DIR}/phase3_synthesis.txt", "w") as f:
|
|
f.write(synthesis)
|
|
|
|
print(f"Phase 3 complete ({elapsed:.2f}s). Synthesis saved to: {SESSION_DIR}/phase3_synthesis.txt")
|
|
|
|
except Exception as e:
|
|
print(f"ERROR: {e}")
|
|
synthesis = f"[ERROR: {str(e)}]"
|
|
with open(f"{SESSION_DIR}/phase3_synthesis.txt", "w") as f:
|
|
f.write(synthesis)
|
|
|
|
# Update config with chairman
|
|
config["chairman"] = CHAIRMAN_MODEL
|
|
with open(f"{SESSION_DIR}/config.json", "w") as f:
|
|
json.dump(config, f, indent=2)
|
|
PYEOF
|
|
```
|
|
|
|
### Step 5: Display Full Results
|
|
|
|
Read all saved files and display the complete council deliberation:
|
|
|
|
```bash
|
|
SESSION_DIR="/tmp/llm-council/TIMESTAMP_HERE"
|
|
|
|
python3 << 'PYEOF'
|
|
import os
|
|
import json
|
|
|
|
SESSION_DIR = os.environ.get("SESSION_DIR")
|
|
|
|
# Load all data
|
|
with open(f"{SESSION_DIR}/config.json") as f:
|
|
config = json.load(f)
|
|
with open(f"{SESSION_DIR}/phase1_responses.json") as f:
|
|
phase1 = json.load(f)
|
|
with open(f"{SESSION_DIR}/phase2_rankings.json") as f:
|
|
phase2 = json.load(f)
|
|
with open(f"{SESSION_DIR}/phase3_synthesis.txt") as f:
|
|
synthesis = f.read()
|
|
|
|
model_to_label = phase2["model_to_label"]
|
|
label_to_model = phase2["label_mapping"]
|
|
|
|
# Build formatted output
|
|
output = []
|
|
output.append("=" * 70)
|
|
output.append(" LLM COUNCIL DELIBERATION")
|
|
output.append(" Powered by Fireworks AI")
|
|
output.append("=" * 70)
|
|
output.append("")
|
|
output.append(f"QUERY: {config['query']}")
|
|
output.append(f"COUNCIL: {', '.join([m.split('/')[-1] for m in config['models']])}")
|
|
output.append(f"CHAIRMAN: {config.get('chairman', 'N/A').split('/')[-1]}")
|
|
output.append("")
|
|
|
|
# Phase 1: Individual Responses
|
|
output.append("-" * 70)
|
|
output.append(" PHASE 1: INDIVIDUAL RESPONSES")
|
|
output.append("-" * 70)
|
|
output.append("")
|
|
|
|
for model_id, result in phase1.items():
|
|
model_name = model_id.split("/")[-1]
|
|
label = model_to_label.get(model_id, "?")
|
|
latency = result.get("latency_seconds", "N/A")
|
|
tokens = result.get("tokens", {})
|
|
output.append(f"[{label}] {model_name} (latency: {latency}s, tokens: {tokens.get('total', 'N/A')})")
|
|
output.append("-" * 40)
|
|
output.append(result["content"])
|
|
output.append("")
|
|
|
|
# Phase 2: Cross-Model Rankings
|
|
output.append("-" * 70)
|
|
output.append(" PHASE 2: CROSS-MODEL RANKINGS")
|
|
output.append("-" * 70)
|
|
output.append("")
|
|
output.append(f"Label mapping: {json.dumps({v: k.split('/')[-1] for k, v in model_to_label.items()}, indent=2)}")
|
|
output.append("")
|
|
|
|
for model_id, result in phase2["rankings"].items():
|
|
model_name = model_id.split("/")[-1]
|
|
output.append(f"[{model_name}'s Rankings]")
|
|
output.append(result["content"])
|
|
output.append("")
|
|
|
|
# Phase 3: Chairman Synthesis
|
|
output.append("-" * 70)
|
|
output.append(" PHASE 3: CHAIRMAN'S SYNTHESIS")
|
|
output.append("-" * 70)
|
|
output.append("")
|
|
chairman_name = config.get("chairman", "Chairman").split("/")[-1]
|
|
output.append(f"[{chairman_name} - Chairman]")
|
|
output.append("")
|
|
output.append(synthesis)
|
|
output.append("")
|
|
output.append("=" * 70)
|
|
output.append(f"Session files: {SESSION_DIR}/")
|
|
|
|
# Save formatted output
|
|
final_output = "\n".join(output)
|
|
with open(f"{SESSION_DIR}/final_output.md", "w") as f:
|
|
f.write(final_output)
|
|
|
|
print(final_output)
|
|
print(f"\nFull output saved to: {SESSION_DIR}/final_output.md")
|
|
PYEOF
|
|
```
|
|
|
|
## Important Notes
|
|
|
|
1. **Session Directory**: Each run creates a unique session in `/tmp/llm-council/{timestamp}/`
|
|
2. **Raw Data Preserved**: All API responses are saved as-is to JSON files for full transparency
|
|
3. **Cost**: Fireworks pricing is per-token. More models and longer queries cost more. Check current pricing at https://fireworks.ai/pricing
|
|
4. **Latency Tracking**: Each API call tracks latency so you can see Fireworks' speed in action
|
|
5. **Token Usage**: Phase 1 responses include token counts for cost awareness
|
|
6. **Rate Limits**: If you hit rate limits, wait briefly and retry
|
|
7. **Model Availability**: Check https://app.fireworks.ai/ for current model status
|
|
|
|
## Setup
|
|
|
|
1. Create a Fireworks AI account at https://fireworks.ai/ and grab your API key from the dashboard
|
|
2. Export it in your shell profile:
|
|
```bash
|
|
export FIREWORKS_API_KEY="your_api_key_here"
|
|
```
|
|
3. Restart your terminal or run `source ~/.zshrc`
|
|
4. Invoke this skill when you want multiple open-weight AI perspectives on a question
|
|
|
|
|
|
## Limitations
|
|
|
|
- Requires the upstream tool, account, API key, or local setup when the workflow names one.
|
|
- Does not authorize destructive, production, paid, or external-message actions without explicit user approval.
|
|
- Validate generated artifacts or recommendations against the user's real sources before treating them as final.
|