Saving Results to Hugging Face Hub

⚠️ CRITICAL: Job environments are ephemeral. ALL results are lost when a job completes unless persisted to the Hub or external storage.

Why Persistence is Required

When running on Hugging Face Jobs:

  • Environment is temporary
  • All files deleted on job completion
  • No local disk persistence
  • Cannot access results after job ends

Without persistence, all work is permanently lost.

Persistence Options

Option 1: Push to Hugging Face Hub

For models:

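import os
from transformers import AutoModel
model = AutoModel.from_pretrained("base-model")  # placeholder: load or train your model here
model.push_to_hub("username/model-name", token=os.environ.get("HF_TOKEN"))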

For datasets:

import os
from datasets import Dataset
dataset = Dataset.from_dict({"text": ["Sample 1", "Sample 2"]})  # placeholder data
dataset.push_to_hub("username/dataset-name", token=os.environ.get("HF_TOKEN"))

For files/artifacts:

import os
from huggingface_hub import HfApi
api = HfApi(token=os.environ.get("HF_TOKEN"))
api.upload_file(
    path_or_fileobj="results.json",
    path_in_repo="results.json",
    repo_id="username/results",
    repo_type="dataset"
)
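
When a job writes several output files, uploading the whole directory in one commit can be simpler than calling upload_file per file. The following is a minimal sketch under stated assumptions: persist_output_dir is a hypothetical helper name, the "dataset" default is a choice made for illustration, and api is expected to be a huggingface_hub.HfApi instance.

```python
from pathlib import Path


def persist_output_dir(api, out_dir, repo_id, repo_type="dataset"):
    """Upload every file under `out_dir` to the Hub in a single commit.

    `api` is expected to be a `huggingface_hub.HfApi` instance. The function
    name and the default repo_type are assumptions made for this sketch.
    """
    out_dir = Path(out_dir)
    # Fail early if the job produced nothing, rather than pushing an empty commit.
    if not out_dir.is_dir() or not any(out_dir.iterdir()):
        raise FileNotFoundError(f"No output files found in {out_dir}")
    # File uploads need an existing repo; exist_ok=True makes this idempotent.
    api.create_repo(repo_id=repo_id, repo_type=repo_type, exist_ok=True)
    return api.upload_folder(
        folder_path=str(out_dir),
        repo_id=repo_id,
        repo_type=repo_type,
        commit_message="Upload job outputs",
    )
```

In a job script this could be called as persist_output_dir(HfApi(), "outputs", "username/results") once all files have been written.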

Option 2: External Storage

S3:

import boto3
s3 = boto3.client('s3')
s3.upload_file('results.json', 'my-bucket', 'results.json')

Google Cloud Storage:

from google.cloud import storage
client = storage.Client()
bucket = client.bucket('my-bucket')
blob = bucket.blob('results.json')
blob.upload_from_filename('results.json')

Option 3: API Endpoint

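import requests
results = {"accuracy": 0.95, "loss": 0.05}  # example payload
response = requests.post("https://your-api.com/results", json=results, timeout=30)
response.raise_for_status()  # fail the job if the endpoint rejects the upload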

Required Configuration for Hub Push

Job Configuration

Always include HF_TOKEN:

hf_jobs("uv", {
    "script": "your_script.py",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # ✅ Required for Hub operations
})

Script Configuration

Verify token exists:

import os
assert "HF_TOKEN" in os.environ, "HF_TOKEN required for Hub operations!"
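
Python skips assert statements when run with the -O flag, and an empty HF_TOKEN would still pass the check above. A script that wants a stricter check could use something like the following sketch; require_hf_token is a hypothetical helper, not part of any library.

```python
import os


def require_hf_token(env=None):
    """Return the HF_TOKEN value, or raise a clear error if it is missing or empty.

    `require_hf_token` is a hypothetical helper for illustration only.
    `env` defaults to os.environ and exists mainly to make testing easy.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            'HF_TOKEN is not set. Add "secrets": {"HF_TOKEN": "$HF_TOKEN"} '
            "to the job config."
        )
    return token
```

Calling require_hf_token() at the top of the script stops the job before any expensive work starts if the secret was not passed.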

Use token for Hub operations:

import os
from huggingface_hub import HfApi

# Auto-detects HF_TOKEN from environment
api = HfApi()

# Or explicitly pass token
api = HfApi(token=os.environ.get("HF_TOKEN"))

Complete Examples

Example 1: Push Dataset

hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["datasets", "huggingface-hub"]
# ///

import os
from datasets import Dataset

# Verify token
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"

# Process data
data = {"text": ["Sample 1", "Sample 2"]}
dataset = Dataset.from_dict(data)

# Push to Hub
dataset.push_to_hub("username/my-dataset")
print("✅ Dataset pushed!")
""",
    "flavor": "cpu-basic",
    "timeout": "30m",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}
})

Example 2: Push Model

hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["transformers", "torch"]
# ///

import os
from transformers import AutoModel, AutoTokenizer

# Verify token
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"

# Load and process model
model = AutoModel.from_pretrained("base-model")
tokenizer = AutoTokenizer.from_pretrained("base-model")
# ... process model ...

# Push to Hub
model.push_to_hub("username/my-model")
tokenizer.push_to_hub("username/my-model")
print("✅ Model pushed!")
""",
    "flavor": "a10g-large",
    "timeout": "2h",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}
})

Example 3: Push Artifacts

hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["huggingface-hub", "pandas"]
# ///

import os
import json
import pandas as pd
from huggingface_hub import HfApi

# Verify token
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"

# Generate results
results = {"accuracy": 0.95, "loss": 0.05}
df = pd.DataFrame([results])

# Save files
with open("results.json", "w") as f:
    json.dump(results, f)
df.to_csv("results.csv", index=False)

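# Push to Hub (upload_file takes keyword arguments and needs an existing repo)
api = HfApi()
api.create_repo("username/results", repo_type="dataset", exist_ok=True)
for name in ("results.json", "results.csv"):
    api.upload_file(path_or_fileobj=name, path_in_repo=name, repo_id="username/results", repo_type="dataset")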
print("✅ Results pushed!")
""",
    "flavor": "cpu-basic",
    "timeout": "30m",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}
})

Authentication Methods

Method 1: Automatic Token (Recommended)

"secrets": {"HF_TOKEN": "$HF_TOKEN"}

Uses your logged-in Hugging Face token automatically.

Method 2: Explicit Token

"secrets": {"HF_TOKEN": "hf_abc123..."}

Hardcodes the token in the job config. Not recommended: the token can leak through source control, shared configs, or logs.

Method 3: Environment Variable

"env": {"HF_TOKEN": "hf_abc123..."}

Pass as regular environment variable (less secure than secrets).

Always prefer Method 1 for security and convenience.

Verification Checklist

Before submitting any job that saves to Hub, verify:

  • "secrets": {"HF_TOKEN": "$HF_TOKEN"} in job config
  • Script checks for token: assert "HF_TOKEN" in os.environ
  • Hub push code included in script
  • Repository name doesn't conflict with existing repos
  • You have write access to the target namespace
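
Part of this checklist can be checked mechanically before submission. The sketch below assumes the job config is a dict with an inline "script" string, as in the complete examples; check_job_config is a hypothetical helper, and its substring tests are rough heuristics rather than proof that the script will push correctly. It cannot check repository names or write access.

```python
def check_job_config(config):
    """Return a list of problems found in an hf_jobs-style config dict.

    Hypothetical pre-submit check mirroring the checklist. It assumes
    config["script"] holds the script source, not a file path.
    """
    problems = []
    secrets = config.get("secrets") or {}
    if "HF_TOKEN" not in secrets:
        problems.append('missing "secrets": {"HF_TOKEN": "$HF_TOKEN"}')
    if "HF_TOKEN" in (config.get("env") or {}):
        problems.append("HF_TOKEN passed via env; use secrets instead")
    script = config.get("script", "")
    if "HF_TOKEN" not in script:
        problems.append("script never checks for HF_TOKEN")
    # Heuristic: look for any of the push calls used in this document.
    push_calls = ("push_to_hub", "upload_file", "upload_folder")
    if not any(call in script for call in push_calls):
        problems.append("script contains no Hub push call")
    return problems
```

An empty list means none of these particular problems were detected; submit only after reviewing the remaining checklist items by hand.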

Repository Setup

Automatic Creation

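If the repository doesn't exist, push_to_hub creates it automatically on the first push (provided the token has write permissions). HfApi.upload_file does not create repositories, so create the repository first when uploading individual files.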

Manual Creation

Create repository before pushing:

from huggingface_hub import HfApi

api = HfApi()
api.create_repo(
    repo_id="username/repo-name",
    repo_type="model",  # or "dataset"
    private=False,  # or True for private repo
    exist_ok=True,  # don't fail if the repo already exists
)

Repository Naming

Valid names:

  • username/my-model
  • username/model-name
  • organization/model-name

Invalid or discouraged names:

  • model-name (missing namespace)
  • username/model name (spaces not allowed)
  • username/MODEL (valid, but uppercase is discouraged)
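
The three cases above can be checked in a script before any push is attempted. This is a sketch of this section's guidance only: repo_id_issues is a hypothetical helper, and the Hub applies further naming rules that it does not cover.

```python
def repo_id_issues(repo_id):
    """List the ways `repo_id` departs from the naming guidance above.

    Hypothetical helper covering only the three cases listed in this section.
    """
    issues = []
    # Exactly one "/" with non-empty text on both sides.
    if repo_id.count("/") != 1 or not all(repo_id.split("/")):
        issues.append("expected the form namespace/name")
    if any(ch.isspace() for ch in repo_id):
        issues.append("spaces are not allowed")
    if repo_id != repo_id.lower():
        issues.append("uppercase is discouraged")
    return issues
```

For example, repo_id_issues("username/my-model") returns an empty list, while repo_id_issues("model-name") reports the missing namespace.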

Troubleshooting

Error: 401 Unauthorized

Cause: HF_TOKEN not provided or invalid

Solutions:

  1. Verify "secrets": {"HF_TOKEN": "$HF_TOKEN"} is in the job config
  2. Check you're logged in: hf_whoami()
  3. Re-login: hf auth login

Error: 403 Forbidden

Cause: No write access to repository

Solutions:

  1. Check repository namespace matches your username
  2. Verify you're a member of organization (if using org namespace)
  3. Check token has write permissions

Error: Repository not found

Cause: Repository doesn't exist and auto-creation failed

Solutions:

  1. Manually create repository first
  2. Check repository name format
  3. Verify namespace exists

Error: Push failed

Cause: Network issues or Hub unavailable

Solutions:

  1. Check logs for specific error
  2. Verify token is valid
  3. Retry push operation
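
Since the job environment disappears when the script exits, a transient network failure during the final push can cost the whole run. One way to retry is sketched below with exponential backoff; retry_push and its default values are assumptions for illustration, and push is any zero-argument callable that performs the upload.

```python
import time


def retry_push(push, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call `push()` until it succeeds, waiting longer after each failure.

    `retry_push` and its defaults are assumptions for this sketch. Catching
    Exception is deliberately broad; narrow it to the errors you expect.
    """
    for attempt in range(1, attempts + 1):
        try:
            return push()
        except Exception as err:
            if attempt == attempts:
                raise  # out of attempts: surface the last error in the job logs
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Push failed ({err!r}); retrying in {delay:.0f}s")
            sleep(delay)
```

A call such as retry_push(lambda: dataset.push_to_hub("username/my-dataset")) wraps an existing push. Retrying does not help with 401 or 403 errors, which need the configuration fixes described above.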

Best Practices

  1. Always verify the token exists before Hub operations
  2. Use descriptive repo names (e.g., my-experiment-results rather than results)
  3. Push incrementally for large results (use checkpoints)
  4. Verify push success in the logs before the job completes
  5. Use the appropriate repo type (model vs. dataset)
  6. Add a README describing the results
  7. Add relevant tags to each repo
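
Pushing incrementally (item 3) means a timeout or crash late in the job loses only the work since the last upload. A minimal sketch, assuming results can be written as JSON lines: process_with_checkpoints, the file name, and the every parameter are all hypothetical, and upload is any callable that takes a file path, for example a small wrapper around HfApi.upload_file.

```python
import json
from pathlib import Path


def process_with_checkpoints(items, work, upload, every=100, path="partial_results.jsonl"):
    """Apply `work` to each item, appending results to `path` and uploading every `every` items.

    `upload` is any callable taking a file path. All names here are hypothetical.
    """
    out = Path(path)
    out.write_text("")  # start from an empty file
    count = 0
    for count, item in enumerate(items, start=1):
        with out.open("a") as f:
            f.write(json.dumps(work(item)) + "\n")
        if count % every == 0:
            upload(str(out))  # checkpoint: everything so far is now on the Hub
    if count % every != 0:
        upload(str(out))  # final partial batch
    return count
```

Each upload replaces the previous copy of the file in the repo, so the Hub always holds the most recent checkpoint.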

Monitoring Push Progress

Check logs for push progress:

MCP Tool:

hf_jobs("logs", {"job_id": "your-job-id"})

CLI:

hf jobs logs <job-id>

Python API:

from huggingface_hub import fetch_job_logs
for log in fetch_job_logs(job_id="your-job-id"):
    print(log)

Look for upload progress and your script's own success message, for example:

Pushing to username/repo-name...
Upload file results.json: 100%
✅ Push successful
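
Scanning the logs for the success message can also be automated. In this sketch, push_confirmed is a hypothetical helper, and the marker is whatever your own script prints after a push; the default matches the sample output above and is not something the Hub emits on its own.

```python
def push_confirmed(log_lines, marker="✅ Push successful"):
    """Return True if any log line contains the success marker.

    The marker must match a message your script prints after pushing.
    """
    return any(marker in line for line in log_lines)
```

It accepts any iterable of strings, so the output of fetch_job_logs(job_id="your-job-id") can be passed in directly.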

Key Takeaway

Without "secrets": {"HF_TOKEN": "$HF_TOKEN"} in the job config and persistence code in the script, all results are permanently lost.

Always verify both are configured before submitting any job that produces results.