Saving Results to Hugging Face Hub

⚠️ CRITICAL: Job environments are ephemeral. ALL results are lost when a job completes unless persisted to the Hub or external storage.

Why Persistence is Required

When running on Hugging Face Jobs:

Environment is temporary
All files deleted on job completion
No local disk persistence
Cannot access results after job ends

Without persistence, all work is permanently lost.

Persistence Options

Option 1: Push to Hugging Face Hub (Recommended)

For models:

import os
from transformers import AutoModel
model = AutoModel.from_pretrained("base-model")  # or the model you trained in this job
model.push_to_hub("username/model-name", token=os.environ.get("HF_TOKEN"))

For datasets:

import os
from datasets import Dataset
dataset = Dataset.from_dict({"text": ["Sample 1", "Sample 2"]})  # or your processed data
dataset.push_to_hub("username/dataset-name", token=os.environ.get("HF_TOKEN"))

For files/artifacts:

import os
from huggingface_hub import HfApi
api = HfApi(token=os.environ.get("HF_TOKEN"))
api.upload_file(
    path_or_fileobj="results.json",
    path_in_repo="results.json",
    repo_id="username/results",
    repo_type="dataset"
)

Option 2: External Storage

S3:

import boto3
s3 = boto3.client('s3')
s3.upload_file('results.json', 'my-bucket', 'results.json')

Google Cloud Storage:

from google.cloud import storage
client = storage.Client()
bucket = client.bucket('my-bucket')
blob = bucket.blob('results.json')
blob.upload_from_filename('results.json')

Option 3: API Endpoint

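import requests
results = {"accuracy": 0.95, "loss": 0.05}  # whatever the job produced
response = requests.post("https://your-api.com/results", json=results, timeout=30)
response.raise_for_status()  # fail loudly if the endpoint rejected the upload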

Required Configuration for Hub Push

Job Configuration

Always include HF_TOKEN:

hf_jobs("uv", {
    "script": "your_script.py",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # ✅ Required for Hub operations
})

Script Configuration

Verify token exists:

import os
assert "HF_TOKEN" in os.environ, "HF_TOKEN required for Hub operations!"
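A bare assert is skipped when Python runs with -O, and it passes when the variable is set but empty. A minimal sketch of a stricter check follows; the helper name require_hf_token is hypothetical and not part of huggingface_hub.

```python
import os
from typing import Mapping, Optional


def require_hf_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the HF_TOKEN value or raise a clear error (hypothetical helper)."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        # Point the reader at the job-config fix described in this guide.
        raise RuntimeError(
            "HF_TOKEN is missing or empty; add "
            '"secrets": {"HF_TOKEN": "$HF_TOKEN"} to the job config.'
        )
    return token
```

Calling require_hf_token() at the top of the script makes a misconfigured job fail within seconds instead of after the expensive work is done.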

Use token for Hub operations:

import os
from huggingface_hub import HfApi

# Auto-detects HF_TOKEN from environment
api = HfApi()

# Or explicitly pass token
api = HfApi(token=os.environ.get("HF_TOKEN"))

Complete Examples

Example 1: Push Dataset

hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["datasets", "huggingface-hub"]
# ///

import os
from datasets import Dataset
from huggingface_hub import HfApi

# Verify token
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"

# Process data
data = {"text": ["Sample 1", "Sample 2"]}
dataset = Dataset.from_dict(data)

# Push to Hub
dataset.push_to_hub("username/my-dataset")
print("✅ Dataset pushed!")
""",
    "flavor": "cpu-basic",
    "timeout": "30m",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}
})

Example 2: Push Model

hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["transformers", "torch"]
# ///

import os
from transformers import AutoModel, AutoTokenizer

# Verify token
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"

# Load and process model
model = AutoModel.from_pretrained("base-model")
tokenizer = AutoTokenizer.from_pretrained("base-model")
# ... process model ...

# Push to Hub
model.push_to_hub("username/my-model")
tokenizer.push_to_hub("username/my-model")
print("✅ Model pushed!")
""",
    "flavor": "a10g-large",
    "timeout": "2h",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}
})

Example 3: Push Artifacts

hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["huggingface-hub", "pandas"]
# ///

import os
import json
import pandas as pd
from huggingface_hub import HfApi

# Verify token
assert "HF_TOKEN" in os.environ, "HF_TOKEN required!"

# Generate results
results = {"accuracy": 0.95, "loss": 0.05}
df = pd.DataFrame([results])

# Save files
with open("results.json", "w") as f:
    json.dump(results, f)
df.to_csv("results.csv", index=False)

# Push to Hub (upload_file requires keyword arguments)
api = HfApi()
api.create_repo("username/results", repo_type="dataset", exist_ok=True)
for name in ("results.json", "results.csv"):
    api.upload_file(
        path_or_fileobj=name, path_in_repo=name,
        repo_id="username/results", repo_type="dataset",
    )
print("✅ Results pushed!")
""",
    "flavor": "cpu-basic",
    "timeout": "30m",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}
})

Authentication Methods

Method 1: Automatic Token (Recommended)

"secrets": {"HF_TOKEN": "$HF_TOKEN"}

Uses your logged-in Hugging Face token automatically.

Method 2: Explicit Token

"secrets": {"HF_TOKEN": "hf_abc123..."}

Passes the token as a literal string. Not recommended: hardcoded tokens are easy to leak.

Method 3: Environment Variable

"env": {"HF_TOKEN": "hf_abc123..."}

Passes the token as a regular environment variable (less secure than secrets).

Always prefer Method 1 for security and convenience.

Verification Checklist

Before submitting any job that saves to Hub, verify:

secrets={"HF_TOKEN": "$HF_TOKEN"} in job config
Script checks for token: assert "HF_TOKEN" in os.environ
Hub push code included in script
Repository name doesn't conflict with existing repos
You have write access to the target namespace
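The first three checklist items can be checked mechanically before submission. The sketch below assumes the job config is a plain dict with an inline "script" string, as in the examples in this guide; preflight_problems is a hypothetical helper, and its string matching is a heuristic rather than a real parser.

```python
from typing import Any, Dict, List

# Hub calls used in this guide; extend the tuple if your script uses others.
PUSH_CALLS = ("push_to_hub", "upload_file", "upload_folder")


def preflight_problems(job_config: Dict[str, Any]) -> List[str]:
    """Return the checklist items the job config appears to miss."""
    problems = []
    script = job_config.get("script", "")
    if "HF_TOKEN" not in job_config.get("secrets", {}):
        problems.append('missing "secrets": {"HF_TOKEN": "$HF_TOKEN"}')
    if "HF_TOKEN" not in script:
        problems.append("script never checks for HF_TOKEN")
    if not any(call in script for call in PUSH_CALLS):
        problems.append("script contains no Hub push call")
    return problems
```

An empty list means the config passes these three checks. Repository name conflicts and write access still have to be verified against the Hub itself.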

Repository Setup

Automatic Creation

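If the repository doesn't exist, push_to_hub creates it automatically on the first push (provided the token has write permissions). HfApi.upload_file does not create repositories, so call create_repo first when uploading individual files.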

Manual Creation

Create repository before pushing:

from huggingface_hub import HfApi

api = HfApi()
api.create_repo(
    repo_id="username/repo-name",
    repo_type="model",  # or "dataset"
    private=False,  # or True for private repo
)
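Creation and upload can be combined so that a job works whether or not the repository already exists. This is a sketch under the assumption that api is an HfApi instance (or anything with the same create_repo and upload_file keyword arguments); upload_results is a hypothetical helper name.

```python
import os
from typing import Any, Iterable


def upload_results(api: Any, repo_id: str, paths: Iterable[str],
                   repo_type: str = "dataset") -> None:
    """Create the repo if needed, then upload each file to its root."""
    # exist_ok=True turns the call into a no-op when the repo already exists.
    api.create_repo(repo_id=repo_id, repo_type=repo_type, exist_ok=True)
    for path in paths:
        api.upload_file(
            path_or_fileobj=path,
            path_in_repo=os.path.basename(path),  # store under the bare file name
            repo_id=repo_id,
            repo_type=repo_type,
        )
```

Inside a job this would be called as upload_results(HfApi(), "username/results", ["results.json", "results.csv"]).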

Repository Naming

Valid names:

username/my-model
username/model-name
organization/model-name

Invalid or discouraged names:

model-name (missing username)
username/model name (spaces not allowed)
username/MODEL (uppercase discouraged)
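The naming rules above can be approximated in a few lines. This is a rough sketch, not the Hub's official validation (which is stricter); the function name repo_id_problems and the character pattern are assumptions made for illustration.

```python
import re
from typing import List

# Approximation: letters, digits, "-", "_" and "." in each part of the id.
_PART = re.compile(r"^[A-Za-z0-9._-]+$")


def repo_id_problems(repo_id: str) -> List[str]:
    """Return a list of problems found in a namespace/name repo id."""
    problems = []
    parts = repo_id.split("/")
    if len(parts) != 2:
        problems.append("expected exactly one '/' separating namespace and name")
    if any(not _PART.match(part) for part in parts):
        problems.append("only letters, digits, '-', '_' and '.' are allowed")
    if repo_id != repo_id.lower():
        problems.append("uppercase is discouraged")
    return problems
```

For example, repo_id_problems("username/my-model") returns an empty list, while "model-name" and "username/model name" each produce one problem.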

Troubleshooting

Error: 401 Unauthorized

Cause: HF_TOKEN not provided or invalid

Solutions:

Verify secrets={"HF_TOKEN": "$HF_TOKEN"} in job config
Check you're logged in: hf_whoami()
Re-login: hf auth login

Error: 403 Forbidden

Cause: No write access to repository

Solutions:

Check repository namespace matches your username
Verify you're a member of organization (if using org namespace)
Check token has write permissions

Error: Repository not found

Cause: Repository doesn't exist and auto-creation failed

Solutions:

Manually create repository first
Check repository name format
Verify namespace exists

Error: Push failed

Cause: Network issues or Hub unavailable

Solutions:

Check logs for specific error
Verify token is valid
Retry push operation
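Retrying can be automated with exponential backoff. The helper below is a generic sketch (retry is a hypothetical name, and the default delays are arbitrary); it wraps any zero-argument callable, so the push call itself stays unchanged.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 3, base_delay: float = 2.0,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation, retrying with exponential backoff on any exception."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:
            if attempt == attempts:
                raise  # out of attempts: let the last error reach the job logs
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Push failed ({error!r}); retrying in {delay:.0f}s")
            sleep(delay)
    raise ValueError("attempts must be at least 1")
```

A push would then be written as retry(lambda: dataset.push_to_hub("username/my-dataset")). Retrying does not help with 401 or 403 errors, which need a configuration fix instead.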

Best Practices

Always verify token exists before Hub operations
Use descriptive repo names (e.g., my-experiment-results not results)
Push incrementally for large results (use checkpoints)
Verify push success in logs before job completes
Use appropriate repo types (model vs dataset)
Add README with result descriptions
Tag repos with relevant tags
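Incremental pushing can be sketched as writing results in fixed-size chunks and uploading each chunk as soon as it is complete, so a job that times out keeps everything pushed so far. The names push_in_chunks and the part-NNNNN.json file layout are illustrative assumptions; upload is any callable that takes a file path.

```python
import json
from pathlib import Path
from typing import Any, Callable, Iterable, List


def _flush(chunk: List[Any], out_dir: Path, index: int,
           upload: Callable[[Path], None]) -> None:
    """Write one chunk to disk and hand the file to the upload callable."""
    path = out_dir / f"part-{index:05d}.json"
    path.write_text(json.dumps(chunk))
    upload(path)


def push_in_chunks(records: Iterable[Any], chunk_size: int, out_dir: str,
                   upload: Callable[[Path], None]) -> int:
    """Upload records in chunks of chunk_size; return the number of files pushed."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = 0
    chunk: List[Any] = []
    for record in records:
        chunk.append(record)
        if len(chunk) == chunk_size:
            _flush(chunk, target, written, upload)
            written += 1
            chunk = []
    if chunk:  # push the final, partially filled chunk
        _flush(chunk, target, written, upload)
        written += 1
    return written
```

In a job, upload could be a small function that calls api.upload_file with path_or_fileobj=str(path), path_in_repo=path.name, and the target repo_id and repo_type.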

Monitoring Push Progress

Check logs for push progress:

MCP Tool:

hf_jobs("logs", {"job_id": "your-job-id"})

CLI:

hf jobs logs <job-id>

Python API:

from huggingface_hub import fetch_job_logs
for log in fetch_job_logs(job_id="your-job-id"):
    print(log)

Look for:

Pushing to username/repo-name...
Upload file results.json: 100%
✅ Push successful
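Scanning the logs can also be scripted. The sketch below assumes the script prints a ✅ marker on success, as the examples in this guide do; the exact log text depends on your script, and both push_status and the default marker strings are illustrative choices.

```python
from typing import Iterable, Tuple


def push_status(
    log_lines: Iterable[str],
    success_marker: str = "✅",
    error_markers: Tuple[str, ...] = ("401 Unauthorized", "403 Forbidden", "Traceback"),
) -> str:
    """Classify job logs as "ok", "failed", or "unknown" (heuristic)."""
    status = "unknown"
    for line in log_lines:
        if any(marker in line for marker in error_markers):
            return "failed"  # any error marker overrides earlier successes
        if success_marker in line:
            status = "ok"
    return status
```

With the Python API shown above, this could be called as push_status(fetch_job_logs(job_id="your-job-id")).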

Key Takeaway

Without secrets={"HF_TOKEN": "$HF_TOKEN"} and persistence code, all results are permanently lost.

Always verify both are configured before submitting any job that produces results.
