playbook/antigravity-awesome-skills/skills/hugging-face-jobs/references/hub_saving.md

353 lines
7.5 KiB
Markdown

# Saving Results to Hugging Face Hub
**⚠️ CRITICAL:** Job environments are ephemeral. ALL results are lost when a job completes unless persisted to the Hub or external storage.
## Why Persistence is Required
When running on Hugging Face Jobs:
- Environment is temporary
- All files deleted on job completion
- No local disk persistence
- Cannot access results after job ends
**Without persistence, all work is permanently lost.**
## Persistence Options
### Option 1: Push to Hugging Face Hub (Recommended)
**For models:**
```python
from transformers import AutoModel
model.push_to_hub("username/model-name", token=os.environ.get("HF_TOKEN"))
```
**For datasets:**
```python
from datasets import Dataset
dataset.push_to_hub("username/dataset-name", token=os.environ.get("HF_TOKEN"))
```
**For files/artifacts:**
```python
from huggingface_hub import HfApi
api = HfApi(token=os.environ.get("HF_TOKEN"))
api.upload_file(
path_or_fileobj="results.json",
path_in_repo="results.json",
repo_id="username/results",
repo_type="dataset"
)
```
### Option 2: External Storage
**S3:**
```python
import boto3
s3 = boto3.client('s3')
s3.upload_file('results.json', 'my-bucket', 'results.json')
```
**Google Cloud Storage:**
```python
from google.cloud import storage
client = storage.Client()
bucket = client.bucket('my-bucket')
blob = bucket.blob('results.json')
blob.upload_from_filename('results.json')
```
### Option 3: API Endpoint
```python
import requests
requests.post("https://your-api.com/results", json=results)
```
## Required Configuration for Hub Push
### Job Configuration
**Always include HF_TOKEN:**
```python
hf_jobs("uv", {
"script": "your_script.py",
"secrets": {"HF_TOKEN": "$HF_TOKEN"} # ✅ Required for Hub operations
})
```
### Script Configuration
**Verify token exists:**
```python
import os
assert "HF_TOKEN" in os.environ, "HF_TOKEN required for Hub operations!"
```
**Use token for Hub operations:**
```python
from huggingface_hub import HfApi
# Auto-detects HF_TOKEN from environment
api = HfApi()
# Or explicitly pass token
api = HfApi(token=os.environ.get("HF_TOKEN"))
```
## Complete Examples
### Example 1: Push Dataset
```python
hf_jobs("uv", {
"script": """
# /// script
# dependencies = ["datasets", "huggingface-hub"]
# ///
import os
from datasets import Dataset
from huggingface_hub import HfApi
# Verify token
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"
# Process data
data = {"text": ["Sample 1", "Sample 2"]}
dataset = Dataset.from_dict(data)
# Push to Hub
dataset.push_to_hub("username/my-dataset")
print("✅ Dataset pushed!")
""",
"flavor": "cpu-basic",
"timeout": "30m",
"secrets": {"HF_TOKEN": "$HF_TOKEN"}
})
```
### Example 2: Push Model
```python
hf_jobs("uv", {
"script": """
# /// script
# dependencies = ["transformers"]
# ///
import os
from transformers import AutoModel, AutoTokenizer
# Verify token
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"
# Load and process model
model = AutoModel.from_pretrained("base-model")
tokenizer = AutoTokenizer.from_pretrained("base-model")
# ... process model ...
# Push to Hub
model.push_to_hub("username/my-model")
tokenizer.push_to_hub("username/my-model")
print("✅ Model pushed!")
""",
"flavor": "a10g-large",
"timeout": "2h",
"secrets": {"HF_TOKEN": "$HF_TOKEN"}
})
```
### Example 3: Push Artifacts
```python
hf_jobs("uv", {
"script": """
# /// script
# dependencies = ["huggingface-hub", "pandas"]
# ///
import os
import json
import pandas as pd
from huggingface_hub import HfApi
# Verify token
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"
# Generate results
results = {"accuracy": 0.95, "loss": 0.05}
df = pd.DataFrame([results])
# Save files
with open("results.json", "w") as f:
json.dump(results, f)
df.to_csv("results.csv", index=False)
# Push to Hub
api = HfApi()
api.upload_file("results.json", "results.json", "username/results", repo_type="dataset")
api.upload_file("results.csv", "results.csv", "username/results", repo_type="dataset")
print("✅ Results pushed!")
""",
"flavor": "cpu-basic",
"timeout": "30m",
"secrets": {"HF_TOKEN": "$HF_TOKEN"}
})
```
## Authentication Methods
### Method 1: Automatic Token (Recommended)
```python
"secrets": {"HF_TOKEN": "$HF_TOKEN"}
```
Uses your logged-in Hugging Face token automatically.
### Method 2: Explicit Token
```python
"secrets": {"HF_TOKEN": "hf_abc123..."}
```
Provide token explicitly (not recommended for security).
### Method 3: Environment Variable
```python
"env": {"HF_TOKEN": "hf_abc123..."}
```
Pass as regular environment variable (less secure than secrets).
**Always prefer Method 1** for security and convenience.
## Verification Checklist
Before submitting any job that saves to Hub, verify:
- [ ] `secrets={"HF_TOKEN": "$HF_TOKEN"}` in job config
- [ ] Script checks for token: `assert "HF_TOKEN" in os.environ`
- [ ] Hub push code included in script
- [ ] Repository name doesn't conflict with existing repos
- [ ] You have write access to the target namespace
## Repository Setup
### Automatic Creation
If repository doesn't exist, it's created automatically when first pushing (if token has write permissions).
### Manual Creation
Create repository before pushing:
```python
from huggingface_hub import HfApi
api = HfApi()
api.create_repo(
repo_id="username/repo-name",
repo_type="model", # or "dataset"
private=False, # or True for private repo
)
```
### Repository Naming
**Valid names:**
- `username/my-model`
- `username/model-name`
- `organization/model-name`
**Invalid names:**
- `model-name` (missing username)
- `username/model name` (spaces not allowed)
- `username/MODEL` (uppercase discouraged)
## Troubleshooting
### Error: 401 Unauthorized
**Cause:** HF_TOKEN not provided or invalid
**Solutions:**
1. Verify `secrets={"HF_TOKEN": "$HF_TOKEN"}` in job config
2. Check you're logged in: `hf_whoami()`
3. Re-login: `hf auth login`
### Error: 403 Forbidden
**Cause:** No write access to repository
**Solutions:**
1. Check repository namespace matches your username
2. Verify you're a member of organization (if using org namespace)
3. Check token has write permissions
### Error: Repository not found
**Cause:** Repository doesn't exist and auto-creation failed
**Solutions:**
1. Manually create repository first
2. Check repository name format
3. Verify namespace exists
### Error: Push failed
**Cause:** Network issues or Hub unavailable
**Solutions:**
1. Check logs for specific error
2. Verify token is valid
3. Retry push operation
## Best Practices
1. **Always verify token exists** before Hub operations
2. **Use descriptive repo names** (e.g., `my-experiment-results` not `results`)
3. **Push incrementally** for large results (use checkpoints)
4. **Verify push success** in logs before job completes
5. **Use appropriate repo types** (model vs dataset)
6. **Add README** with result descriptions
7. **Tag repos** with relevant tags
## Monitoring Push Progress
Check logs for push progress:
**MCP Tool:**
```python
hf_jobs("logs", {"job_id": "your-job-id"})
```
**CLI:**
```bash
hf jobs logs <job-id>
```
**Python API:**
```python
from huggingface_hub import fetch_job_logs
for log in fetch_job_logs(job_id="your-job-id"):
print(log)
```
**Look for:**
```
Pushing to username/repo-name...
Upload file results.json: 100%
✅ Push successful
```
## Key Takeaway
**Without `secrets={"HF_TOKEN": "$HF_TOKEN"}` and persistence code, all results are permanently lost.**
Always verify both are configured before submitting any job that produces results.