
HuggingFace

Run inference on 400,000+ open models. Text, image, audio, and more.

Base URL: https://api-inference.huggingface.co
Auth type: Bearer
OVERVIEW

HuggingFace's Inference API lets you run any model hosted on the Hub without managing infrastructure. Send a POST request with your input, get predictions back. Supports text generation, classification, embeddings, image generation, object detection, translation, summarization, and more.

The same endpoint pattern works for every model. Just change the model ID in the URL. Free tier includes rate-limited access to popular models. For production, Inference Endpoints provide dedicated GPUs.
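Because only the model ID in the URL changes between tasks, a single request builder covers every model. This is a minimal sketch using only the Python standard library; the helper name and model IDs below are illustrative, not part of any official client.

```python
import json
import urllib.request

BASE_URL = "https://api-inference.huggingface.co"

def build_request(model_id: str, payload: dict, token: str) -> urllib.request.Request:
    """Build a POST request for any Hub model: same shape, different URL."""
    return urllib.request.Request(
        url=f"{BASE_URL}/models/{model_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Swapping tasks is just a different model ID and payload:
req = build_request(
    "facebook/bart-large-mnli",
    {"inputs": "I loved this!", "parameters": {"candidate_labels": ["positive", "negative"]}},
    "hf_...",
)
print(req.full_url)
```

Sending the request is then `urllib.request.urlopen(req)`; the free tier may return 503 while a cold model loads, so production code should retry.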

GET YOUR API KEY

Authentication setup

  1. Go to huggingface.co and sign in
  2. Click your avatar → Settings
  3. Navigate to Access Tokens
  4. Click New token, select Read or Write scope
  5. Copy the token. It starts with hf_
Token format: hf_...
Auth type: bearer
QUICK CONNECT

Create a connection in one request.

CONNECT HUGGINGFACE
POST /connections
Authorization: Bearer $TOKEN
Content-Type: application/json

{
  "name": "HuggingFace",
  "base_url": "https://api-inference.huggingface.co",
  "auth_type": "bearer",
  "auth_config": {"token": "hf_..."}
}

→ {"id": "cn_xxxxxxxx", "name": "HuggingFace"}
KEY ENDPOINTS

What you can call via proxy.

Method  Path  Description
POST /models/{model_id} Run inference on any model
POST /models/meta-llama/Meta-Llama-3-8B-Instruct Text generation with Llama 3
POST /models/sentence-transformers/all-MiniLM-L6-v2 Get text embeddings
POST /models/facebook/bart-large-mnli Zero-shot text classification
POST /models/openai/whisper-large-v3 Transcribe audio
POST /models/stabilityai/stable-diffusion-xl-base-1.0 Generate images
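The endpoint pattern is identical across these models, but the payload shape follows each task's conventions (`inputs`, plus optional `parameters`). The payloads below are illustrative sketches for three of the endpoints above, not exhaustive parameter lists.

```python
import json

payloads = {
    # Embeddings: a list of sentences in "inputs"
    "sentence-transformers/all-MiniLM-L6-v2": {
        "inputs": ["How do I reset my password?", "Forgot login credentials"],
    },
    # Zero-shot classification: candidate labels go in "parameters"
    "facebook/bart-large-mnli": {
        "inputs": "The new update broke my workflow.",
        "parameters": {"candidate_labels": ["bug report", "feature request", "praise"]},
    },
    # Image generation: the prompt is the input; the response is image bytes
    "stabilityai/stable-diffusion-xl-base-1.0": {
        "inputs": "a watercolor fox in a misty forest",
    },
}

for model_id, payload in payloads.items():
    print(f"POST /models/{model_id}")
    print(json.dumps(payload))
```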
EXAMPLE

Text Generation

PROXY CALL
POST /proxy/cn_xxxxxxxx/models/meta-llama/Meta-Llama-3-8B-Instruct
Authorization: Bearer $TOKEN
Content-Type: application/json

{
  "inputs": "The best thing about API proxies is",
  "parameters": {"max_new_tokens": 50}
}

[{
  "generated_text": "The best thing about API proxies is that they decouple your application from the upstream service, letting you swap providers, add caching, or inject credentials without changing client code."
}]
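The text-generation endpoint returns a JSON array of candidate completions. A minimal sketch of extracting the text, using a truncated stand-in for the response shown above:

```python
import json

def first_generation(body: str) -> str:
    """Return the text of the first candidate in a generation response."""
    candidates = json.loads(body)
    return candidates[0]["generated_text"]

raw = '[{"generated_text": "The best thing about API proxies is that they decouple your application from the upstream service."}]'
print(first_generation(raw))
```

Note that `generated_text` typically echoes the prompt followed by the completion, unless the request sets a parameter to suppress that.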
USE CASES

What people build with HuggingFace.

Model Prototyping

Test any of 400K+ models with a single API call before committing to infrastructure.

NLP Pipelines

Chain classification, summarization, and translation models for document processing.

Embedding Generation

Generate embeddings with sentence-transformers for search, clustering, and RAG.
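The embedding workflow above boils down to ranking documents by similarity against a query vector. A toy sketch with hand-made 3-d vectors standing in for real model output:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

query = [0.9, 0.1, 0.0]                      # embedding of the user's question
docs = {
    "reset password guide": [0.8, 0.2, 0.1],
    "holiday schedule":     [0.1, 0.3, 0.9],
}

# Rank documents by similarity to the query
best = max(docs, key=lambda name: cosine(query, docs[name]))
print(best)  # reset password guide
```

With real embeddings you would index the document vectors once and embed only the query per request.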

Image Generation

Run Stable Diffusion and other image models for creative and design workflows.
