
Getting Started with Hugging Face

Here is how I explain Hugging Face (https://huggingface.co) when friends ask: it is the open library and social hub for modern AI. You get a giant catalog of models and datasets that you can pull into your code with a couple of imports, plus a place to publish and version the things you build. It matters because of the network effect: a standard way to package models, a standard API to run them, clear model cards, reproducibility via commit hashes, and a sharing flow that feels like Git for ML. The result is faster experiments, clearer governance, and a portfolio you can point to when someone asks what you have shipped.
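To make "a couple of imports" concrete, here is a minimal sketch using the transformers pipeline API. The default model it downloads is picked by the library, and the revision pin in the comment is how you would lock to an exact commit:

from transformers import pipeline

# First call downloads a small default sentiment model from the Hub and
# caches it under ~/.cache/huggingface; later calls reuse the cache.
classifier = pipeline("sentiment-analysis")
print(classifier("Hugging Face makes sharing models feel like Git for ML"))

# Reproducibility: pin any model to an exact commit hash with revision=, e.g.
# pipeline("sentiment-analysis",
#          model="distilbert-base-uncased-finetuned-sst-2-english",
#          revision="<commit-hash-from-the-model-page>")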

Let’s get a hands-on feel on a Mac using Terminal. First, sign in locally so you can pull models now and push your own work later. Create a free Hugging Face account, then in Terminal run:

brew install git-lfs
python3 -m pip install --upgrade pip
pip install "huggingface_hub[cli]" transformers datasets torch accelerate safetensors diffusers

huggingface-cli login
...
 To log in, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible): 
Add token as git credential? (Y/n) y
Token is valid (permission: fineGrained).
The token `Personal` has been saved to /Users/doronkatz/.cache/huggingface/stored_tokens
Your token has been saved in your configured git credential helpers (osxkeychain).
Your token has been saved to /Users/doronkatz/.cache/huggingface/token
Login successful.
The current active token is: `Personal`
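You can confirm the login took by asking the Hub who you are:

huggingface-cli whoami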

Now let's try text-to-image with the diffusers library, which generates images from text prompts using models like Stable Diffusion. I used this model: https://huggingface.co/CompVis/stable-diffusion-v1-4 (the script below defaults to the faster stabilityai/sd-turbo; the --model-id flag lets you choose either).

It feels magical: you type a description, and a model paints it for you. Create a new Python file titled cool_t2i_share.py:

import argparse
import time
from pathlib import Path

import torch
from diffusers import AutoPipelineForText2Image

def slugify(text):
    """Convert text to a filename-safe string."""
    return "".join(c if c.isalnum() or c in "._-" else "_" for c in text)[:80]

def load_pipeline(model_id, device, torch_dtype):
    """Load and configure the text-to-image pipeline."""
    pipe = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch_dtype)
    if device == "cuda":
        pipe = pipe.to("cuda")
        pipe.enable_attention_slicing()
        try:
            pipe.enable_xformers_memory_efficient_attention()
        except Exception:
            pass
    return pipe

def generate(pipe, prompt, steps, guidance, width, height, seed):
    """Generate an image using the pipeline."""
    generator = None
    if seed is not None:
        generator = torch.Generator(device=pipe.device.type).manual_seed(int(seed))
    result = pipe(
        prompt=prompt,
        num_inference_steps=int(steps),
        guidance_scale=float(guidance),
        width=int(width),
        height=int(height),
        generator=generator,
    )
    # nsfw_content_detected is a list of booleans (or None), so check its
    # contents; a bare truthiness test would fire even on [False].
    if getattr(result, "nsfw_content_detected", None) and any(result.nsfw_content_detected):
        print("Warning: NSFW content was detected by the safety checker")
    return result.images[0]

def main():
    parser = argparse.ArgumentParser(description="Text to image generator (local only)")
    parser.add_argument("--prompt", type=str, required=True, help="Text description of the image")
    parser.add_argument("--model-id", type=str, default="stabilityai/sd-turbo", help="Hugging Face model ID")
    parser.add_argument("--num-inference-steps", type=int, default=4, help="Number of denoising steps")
    parser.add_argument("--guidance-scale", type=float, default=0.0, help="Guidance scale")
    parser.add_argument("--width", type=int, default=512, help="Image width")
    parser.add_argument("--height", type=int, default=512, help="Image height")
    parser.add_argument("--seed", type=int, default=None, help="Random seed")
    parser.add_argument("--output-dir", type=str, default="outputs", help="Output directory")
    args = parser.parse_args()

    if args.width % 8 != 0 or args.height % 8 != 0:
        raise ValueError("Width and height must be multiples of 8")

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    
    print(f"Using device: {device}")
    print(f"Loading model: {args.model_id}")
    
    pipe = load_pipeline(args.model_id, device, dtype)

    print(f"Generating image for prompt: '{args.prompt}'")
    start_time = time.time()
    image = generate(
        pipe=pipe,
        prompt=args.prompt,
        steps=args.num_inference_steps,
        guidance=args.guidance_scale,
        width=args.width,
        height=args.height,
        seed=args.seed,
    )
    generation_time = time.time() - start_time

    # Create timestamped output directory
    ts = time.strftime("%Y%m%d-%H%M%S")
    run_dir = Path(args.output_dir) / ts
    run_dir.mkdir(parents=True, exist_ok=True)
    
    # Generate filename
    name = f"{slugify(args.prompt)}.png" if args.prompt else f"image_{ts}.png"
    out_path = run_dir / name
    image.save(out_path)
    print(f"Saved image to {out_path}")
    print(f"Generation completed in {generation_time:.2f} seconds")

if __name__ == "__main__":
    main()

We then run it:

python3 cool_t2i_share.py --prompt "YOUR_PROMPT"

The first run downloads the model weights (a few GB); after that, the script produces an image from your prompt. You can swap the prompt for anything: "a toy robot surfing a wave," whatever comes to mind.
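Because the script exposes --seed and --num-inference-steps, runs can be made reproducible. For example (the prompt and seed here are arbitrary choices of mine):

python3 cool_t2i_share.py --prompt "a toy robot surfing a wave" --seed 42 --num-inference-steps 4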

I tried the following prompt: "an impressionist painting of Seattle". That yielded:

[Generated image: an impressionist painting of Seattle]

Sharing Your Work

Finally, let's share what we did by pushing your output to Hugging Face so others can see and reuse it. You'll set a token once, run one script, and end up with a Hub repo containing your generated image; the code itself goes to GitHub at the end.

pip install huggingface_hub

# Create a token at https://huggingface.co/settings/tokens with "write" scope
export HF_TOKEN="hf_xxx_your_token_here"
export HF_USERNAME="your-hf-username"   

Update your Python file as follows:

import os
import argparse
import time
from pathlib import Path

import torch
from diffusers import AutoPipelineForText2Image
from huggingface_hub import HfApi, create_repo

def slugify(text):
    """Convert text to a filename-safe string."""
    return "".join(c if c.isalnum() or c in "._-" else "_" for c in text)[:80]

def load_pipeline(model_id, device, torch_dtype):
    """Load and configure the text-to-image pipeline."""
    pipe = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch_dtype)
    if device == "cuda":
        pipe = pipe.to("cuda")
        pipe.enable_attention_slicing()
        try:
            pipe.enable_xformers_memory_efficient_attention()
        except Exception:
            pass
    return pipe

def generate(pipe, prompt, steps, guidance, width, height, seed):
    """Generate an image using the pipeline."""
    generator = None
    if seed is not None:
        generator = torch.Generator(device=pipe.device.type).manual_seed(int(seed))
    result = pipe(
        prompt=prompt,
        num_inference_steps=int(steps),
        guidance_scale=float(guidance),
        width=int(width),
        height=int(height),
        generator=generator,
    )
    # nsfw_content_detected is a list of booleans (or None), so check its
    # contents; a bare truthiness test would fire even on [False].
    if getattr(result, "nsfw_content_detected", None) and any(result.nsfw_content_detected):
        print("Warning: NSFW content was detected by the safety checker")
    return result.images[0]

def push_folder_to_hub(local_folder, repo_id, repo_type, public):
    """Upload a folder to Hugging Face Hub."""
    token = os.environ.get("HUGGINGFACE_TOKEN") or os.environ.get("HF_TOKEN")
    create_repo(repo_id=repo_id, repo_type=repo_type, private=(not public), exist_ok=True, token=token)
    api = HfApi()
    api.upload_folder(
        folder_path=str(local_folder),
        repo_id=repo_id,
        repo_type=repo_type,
        path_in_repo="",
        commit_message="Upload from cool_t2i_share.py",
        token=token,
    )

def main():
    parser = argparse.ArgumentParser(description="Text to image with optional Hub upload")
    parser.add_argument("--prompt", type=str, required=True, help="Text description of the image")
    parser.add_argument("--model-id", type=str, default="stabilityai/sd-turbo", help="Hugging Face model ID")
    parser.add_argument("--num-inference-steps", type=int, default=4, help="Number of denoising steps")
    parser.add_argument("--guidance-scale", type=float, default=0.0, help="Guidance scale")
    parser.add_argument("--width", type=int, default=512, help="Image width")
    parser.add_argument("--height", type=int, default=512, help="Image height")
    parser.add_argument("--seed", type=int, default=None, help="Random seed")
    parser.add_argument("--output-dir", type=str, default="outputs", help="Output directory")
    parser.add_argument("--push-to-hub", action="store_true", help="Upload to Hugging Face Hub")
    parser.add_argument("--repo-id", type=str, default=None, help="Hub repository ID (USERNAME/repo-name)")
    parser.add_argument("--repo-type", type=str, choices=["model", "dataset"], default="model", help="Repository type")
    parser.add_argument("--public", action="store_true", help="Make the repo public")
    args = parser.parse_args()

    if args.width % 8 != 0 or args.height % 8 != 0:
        raise ValueError("Width and height must be multiples of 8")

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    
    print(f"Using device: {device}")
    print(f"Loading model: {args.model_id}")
    
    pipe = load_pipeline(args.model_id, device, dtype)

    print(f"Generating image for prompt: '{args.prompt}'")
    image = generate(
        pipe=pipe,
        prompt=args.prompt,
        steps=args.num_inference_steps,
        guidance=args.guidance_scale,
        width=args.width,
        height=args.height,
        seed=args.seed,
    )

    # Create timestamped output directory
    ts = time.strftime("%Y%m%d-%H%M%S")
    run_dir = Path(args.output_dir) / ts
    run_dir.mkdir(parents=True, exist_ok=True)
    
    # Generate filename
    name = f"{slugify(args.prompt)}.png" if args.prompt else f"image_{ts}.png"
    out_path = run_dir / name
    image.save(out_path)
    print(f"Saved image to {out_path}")

    # Optionally upload to Hub
    if args.push_to_hub:
        if not args.repo_id:
            raise ValueError("When --push-to-hub is set, you must pass --repo-id USERNAME/repo-name")
        print(f"Pushing folder {run_dir} to {args.repo_id} as {args.repo_type}")
        push_folder_to_hub(run_dir, args.repo_id, args.repo_type, args.public)
        print("Upload completed")

if __name__ == "__main__":
    main()

Let's try running our changes:

python3 cool_t2i_share.py --prompt "a serene mountain landscape with a lake"

If everything looks good, we can proceed with sharing.
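To actually publish, add the upload flags. The repo name below is my choice; anything under your username works, and the script will create the repo on the Hub if it does not exist yet:

python3 cool_t2i_share.py --prompt "a serene mountain landscape with a lake" --push-to-hub --repo-id "$HF_USERNAME/cool-t2i-share" --public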

  1. Initialize your project for Git (git init), add your files, and make an initial commit.
  2. Create a new repository on GitHub and push your code up there; a sketch of the commands follows below. I shared mine here: https://github.com/doronkatz/cool-t2i-share
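A minimal sketch of those two steps; the remote URL is a placeholder for your own repository:

git init
git add cool_t2i_share.py
git commit -m "Text-to-image script with Hugging Face Hub upload"
git remote add origin https://github.com/YOUR_USERNAME/cool-t2i-share.git
git push -u origin main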

And that's it: your first foray into Hugging Face.