# Overview

> Deploy and manage Serverless endpoints using the Runpod console or REST API.

<div className="overview-page-wrapper" />

Endpoints are the foundation of Runpod Serverless and the gateway for deploying and managing your [Serverless workers](/serverless/workers/overview). Each endpoint provides a unique URL that accepts [HTTP requests](/serverless/endpoints/send-requests). The endpoint's workers process each request with your [handler function](/serverless/workers/handler-functions) and return the results.

<CardGroup cols={2}>
  <Card title="Send requests" href="/serverless/endpoints/send-requests" icon="paper-plane" horizontal>
    Learn how to send requests to your endpoints.
  </Card>

  <Card title="Endpoint settings" href="/serverless/endpoints/endpoint-configurations" icon="gear" horizontal>
    Configure scaling, timeouts, and GPU selection.
  </Card>

  <Card title="Job states" href="/serverless/endpoints/job-states" icon="chart-line" horizontal>
    Monitor job status and metrics.
  </Card>

  <Card title="Model caching" href="/serverless/endpoints/model-caching" icon="bolt" horizontal>
    Reduce cold starts with cached models.
  </Card>
</CardGroup>

## Endpoint types

|                       | Queue-based                                | Load balancing                |
| --------------------- | ------------------------------------------ | ----------------------------- |
| **Processing**        | Requests queued and processed sequentially | Direct HTTP access to workers |
| **Execution modes**   | Async (`/run`) or sync (`/runsync`)        | Custom HTTP endpoints         |
| **Retries**           | Automatic retries on failure               | No automatic retries          |
| **Handler required?** | Yes                                        | No (use any HTTP framework)   |
| **Best for**          | Batch jobs, guaranteed execution           | Real-time apps, streaming     |

Learn more about [load balancing endpoints](/serverless/load-balancing/overview).

## Create an endpoint

Before creating an endpoint, ensure you have a [handler function](/serverless/workers/handler-functions) and [Dockerfile](/serverless/workers/create-dockerfile).

<Tabs>
  <Tab title="Web">
    1. Navigate to the [Serverless section](https://www.console.runpod.io/serverless) and click **New Endpoint**.
    2. Choose your deployment source:
       * **Import Git Repository**: See [Deploy from GitHub](/serverless/workers/github-integration)
       * **Import from Docker Registry**: See [Deploy from Docker Hub](/serverless/workers/deploy)
       * **Ready-to-Deploy Repos**: Select a preconfigured endpoint
    3. Configure your endpoint:
       * **Endpoint Name** and **Type** (Queue-based or Load balancer)
       * **GPU Configuration** and worker settings
       * **Model** (optional): Enter a Hugging Face URL for [cached models](/serverless/endpoints/model-caching)
       * **Environment Variables**: See [environment variables](/serverless/development/environment-variables)
    4. Click **Deploy Endpoint**.
  </Tab>

  <Tab title="REST API">
    ```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
    # Replace RUNPOD_API_KEY with your Runpod API key before running.
    curl --request POST \
      --url https://rest.runpod.io/v1/endpoints \
      --header 'Authorization: Bearer RUNPOD_API_KEY' \
      --header 'Content-Type: application/json' \
      --data '{
        "name": "my-endpoint",
        "templateId": "30zmvf89kd",
        "gpuTypeIds": ["NVIDIA GeForce RTX 4090"],
        "workersMin": 0,
        "workersMax": 3,
        "idleTimeout": 5
      }'
    ```

    See the [Endpoint API reference](/api-reference/endpoints/POST/endpoints) for all parameters.
  </Tab>
</Tabs>

<Tip>
  Optimize cost and availability by specifying multiple GPU types in priority order. Runpod allocates your first choice if it is available; otherwise, it falls back to the next GPU type in your list.
</Tip>
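
As a rough illustration of this tip, the request body below lists two GPU types. It is a sketch under stated assumptions rather than a verified configuration: it assumes the array order of `gpuTypeIds` expresses the priority, and `NVIDIA GeForce RTX 3090` is an assumed second GPU type ID that you should check against the GPU types available to your account. The `templateId` is the same placeholder used in the REST API example, and `create_endpoint` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch: request body that lists GPU types in priority order.
# Assumption: the array order of gpuTypeIds is the priority order.
# Assumption: "NVIDIA GeForce RTX 3090" is a valid GPU type ID for your account.
PAYLOAD=$(cat <<'EOF'
{
  "name": "my-endpoint",
  "templateId": "30zmvf89kd",
  "gpuTypeIds": ["NVIDIA GeForce RTX 4090", "NVIDIA GeForce RTX 3090"],
  "workersMin": 0,
  "workersMax": 3,
  "idleTimeout": 5
}
EOF
)

# Hypothetical helper: sends the request only when you call it.
# Expects RUNPOD_API_KEY to be set in your environment.
create_endpoint() {
  curl --request POST \
    --url https://rest.runpod.io/v1/endpoints \
    --header "Authorization: Bearer ${RUNPOD_API_KEY}" \
    --header 'Content-Type: application/json' \
    --data "$PAYLOAD"
}

# Print the body so you can review it before sending.
printf '%s\n' "$PAYLOAD"
```

Run `create_endpoint` once the printed body looks right.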

After deployment, your endpoint displays a unique API URL: `https://api.runpod.ai/v2/{endpoint_id}/`
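
To show how this base URL combines with the execution modes of a queue-based endpoint, the sketch below builds the `/run` and `/runsync` URLs. `ENDPOINT_ID` is a placeholder, `send_job` is a hypothetical helper, and the `{"input": ...}` body with a `prompt` field is an assumption about what your handler expects. See [Send requests](/serverless/endpoints/send-requests) for the request format.

```bash
#!/usr/bin/env bash
# Placeholder: replace with the ID shown for your endpoint.
ENDPOINT_ID="your-endpoint-id"

BASE_URL="https://api.runpod.ai/v2/${ENDPOINT_ID}"
RUN_URL="${BASE_URL}/run"          # async execution
RUNSYNC_URL="${BASE_URL}/runsync"  # sync execution

# Hypothetical helper: posts a job to the URL passed as $1.
# Expects RUNPOD_API_KEY to be set in your environment.
# Assumption: the handler reads a "prompt" field from "input".
send_job() {
  curl --request POST \
    --url "$1" \
    --header "Authorization: Bearer ${RUNPOD_API_KEY}" \
    --header 'Content-Type: application/json' \
    --data '{"input": {"prompt": "Hello"}}'
}

# Print the URLs without sending anything.
echo "async: ${RUN_URL}"
echo "sync:  ${RUNSYNC_URL}"
```

For example, `send_job "$RUNSYNC_URL"` would submit the job in sync mode, and `send_job "$RUN_URL"` would submit it in async mode.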

## Edit an endpoint

1. Navigate to the [Serverless section](https://www.console.runpod.io/serverless).
2. Click the three dots on your endpoint and select **Edit Endpoint**.
3. Update [endpoint settings](/serverless/endpoints/endpoint-configurations) and click **Save Endpoint**.

Changes to GPU types or worker counts may require restarting active workers.
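
If you manage endpoints through the REST API instead of the console, an update might look like the sketch below. It assumes the API accepts a `PATCH` request at `/v1/endpoints/{endpoint_id}` with the same field names used when creating an endpoint. Neither is confirmed on this page, so check the API reference before relying on it. `endpoint_url` and `update_endpoint` are hypothetical helper names.

```bash
#!/usr/bin/env bash
API_BASE="https://rest.runpod.io/v1"

# Build the resource URL for the endpoint ID passed as $1.
endpoint_url() {
  printf '%s/endpoints/%s' "$API_BASE" "$1"
}

# Hypothetical helper: $1 = endpoint ID, $2 = JSON body with the fields to change.
# Assumption: the REST API accepts PATCH on this path.
# Expects RUNPOD_API_KEY to be set in your environment.
update_endpoint() {
  curl --request PATCH \
    --url "$(endpoint_url "$1")" \
    --header "Authorization: Bearer ${RUNPOD_API_KEY}" \
    --header 'Content-Type: application/json' \
    --data "$2"
}

# Print the URL that would be called, without sending anything.
endpoint_url "your-endpoint-id"; echo
```

For example, `update_endpoint "your-endpoint-id" '{"workersMax": 5}'` would request a new maximum worker count under these assumptions.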

## Delete an endpoint

1. Navigate to the [Serverless section](https://www.console.runpod.io/serverless).
2. Click the three dots on your endpoint and select **Delete Endpoint**.
3. Type the endpoint name to confirm.

<Warning>
  Deleting an endpoint permanently removes all configuration, logs, and job history.
</Warning>
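
For scripted cleanup, the sketch below mirrors the console's name confirmation before sending anything. It assumes the REST API accepts a `DELETE` request at `/v1/endpoints/{endpoint_id}`, which this page does not confirm, so verify it in the API reference first. `delete_endpoint` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: $1 = endpoint ID, $2 = endpoint name, $3 = name typed to confirm.
# Assumption: the REST API accepts DELETE on this path.
# Expects RUNPOD_API_KEY to be set in your environment.
delete_endpoint() {
  local endpoint_id="$1" endpoint_name="$2" typed_name="$3"

  # Refuse to continue unless the typed name matches, as the console does.
  if [ "$typed_name" != "$endpoint_name" ]; then
    echo "Name mismatch: endpoint not deleted." >&2
    return 1
  fi

  curl --request DELETE \
    --url "https://rest.runpod.io/v1/endpoints/${endpoint_id}" \
    --header "Authorization: Bearer ${RUNPOD_API_KEY}"
}

# A mismatched name stops before any request is sent.
delete_endpoint "your-endpoint-id" "my-endpoint" "wrong-name" || echo "aborted"
```

The guard is a design choice rather than an API requirement: because deletion is permanent, an explicit name check keeps a mistyped ID from removing the wrong endpoint.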
