
# Send API requests to deployed endpoints

> Call your deployed Flash endpoints using HTTP requests for queue-based and load-balanced configurations.

After deploying your Flash app with `flash deploy`, you can call your endpoints directly via HTTP. The request format depends on whether the endpoint is queue-based or load-balanced.

## Authentication

All deployed endpoints require authentication with your Runpod API key:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
export RUNPOD_API_KEY="your_key_here"

curl -X POST https://YOUR_ENDPOINT_URL/path \
    -H "Authorization: Bearer $RUNPOD_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"param": "value"}'
```

<Tip>
  Your endpoint URLs are displayed after running `flash deploy`. You can also view them with `flash env get <environment-name>`.
</Tip>
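If you call endpoints from Python rather than curl, the same headers can be built from the environment variable. This is a minimal sketch, not part of any Runpod SDK: the helper name `auth_headers` is hypothetical, and it assumes only the `RUNPOD_API_KEY` variable exported in the example above.

```python
import os


def auth_headers() -> dict[str, str]:
    """Build the headers used by the curl example above.

    Hypothetical helper: reads the key from the RUNPOD_API_KEY
    environment variable and fails early if it is missing.
    """
    api_key = os.environ.get("RUNPOD_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("RUNPOD_API_KEY is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Failing before the request is sent makes a missing key easier to diagnose than a `401` response from the endpoint.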

## Queue-based endpoints

Queue-based endpoints (using the `@Endpoint(name=..., gpu=...)` decorator) provide two routes for job submission: `/run` (asynchronous) and `/runsync` (synchronous).

### Asynchronous calls (`/run`)

Submit a job and receive a job ID for later status checking:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
curl -X POST https://api.runpod.ai/v2/abc123xyz/run \
    -H "Authorization: Bearer $RUNPOD_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"input": {"prompt": "Hello world"}}'
```

**Response:**

```json theme={"theme":{"light":"github-light","dark":"github-dark"}}
{
    "id": "job-abc-123",
    "status": "IN_QUEUE"
}
```

**Check job status and retrieve results:**

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
curl https://api.runpod.ai/v2/abc123xyz/status/job-abc-123 \
    -H "Authorization: Bearer $RUNPOD_API_KEY"
```

**When the job completes:**

```json theme={"theme":{"light":"github-light","dark":"github-dark"}}
{
    "id": "job-abc-123",
    "status": "COMPLETED",
    "output": {
        "generated_text": "Hello world from GPU!"
    }
}
```
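In practice you usually repeat the status check until the job finishes. The following is one possible polling loop, written as a sketch under a few assumptions: `wait_for_job` and `fetch_status` are hypothetical names, `fetch_status` is a function you supply that performs the status request shown above and returns the decoded JSON, and only `COMPLETED` and `FAILED` are treated as final states. If your endpoint reports other final states, add them to `TERMINAL_STATES`.

```python
import time
from typing import Any, Callable

# States treated as final in this sketch; any other status keeps polling.
TERMINAL_STATES = {"COMPLETED", "FAILED"}


def wait_for_job(
    fetch_status: Callable[[], dict[str, Any]],
    interval_s: float = 1.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the job reaches a terminal state or the deadline passes.

    `fetch_status` is a caller-supplied function that performs the GET
    request to /status/<job id> and returns the decoded JSON body.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status()
        if job.get("status") in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job still {job.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)
```

Passing the fetch function in keeps the loop independent of any particular HTTP library, and the `sleep` parameter lets you test it without real delays.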

### Synchronous calls (`/runsync`)

Submit a job and wait for its result in the same request, subject to a timeout:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
curl -X POST https://api.runpod.ai/v2/abc123xyz/runsync \
    -H "Authorization: Bearer $RUNPOD_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"input": {"prompt": "Hello world"}}'
```

**Response (after job completes):**

```json theme={"theme":{"light":"github-light","dark":"github-dark"}}
{
    "id": "job-abc-123",
    "status": "COMPLETED",
    "output": {
        "generated_text": "Hello world from GPU!"
    }
}
```

The `/runsync` route has a 60-second client-side timeout by default. If you've configured `execution_timeout_ms` on your endpoint, the client timeout uses that value instead. For jobs that take longer than 60 seconds, set `execution_timeout_ms` high enough to cover the job's expected runtime so that `/runsync` requests don't time out.

<Tip>
  Use `/run` for long-running jobs that you'll check later. Use `/runsync` for quick jobs where you want immediate results (with timeout protection).
</Tip>

### Queue-based request format

Queue-based endpoints expect input wrapped in an `{"input": {...}}` object:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
curl -X POST https://api.runpod.ai/v2/abc123xyz/runsync \
    -H "Authorization: Bearer $RUNPOD_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "input": {
            "param1": "value1",
            "param2": "value2"
        }
    }'
```

The structure inside `"input"` depends on your `@Endpoint` function signature.
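When building requests in code, it can help to keep the wrapping in one place. The sketch below is illustrative only: `queue_url` and `queue_body` are hypothetical helper names, and the URL pattern is taken from the example URLs on this page.

```python
import json
from typing import Any


def queue_url(endpoint_id: str, route: str) -> str:
    """Build a queue-based URL, following the example URLs on this page."""
    return f"https://api.runpod.ai/v2/{endpoint_id}/{route.lstrip('/')}"


def queue_body(**params: Any) -> bytes:
    """Wrap keyword arguments in the {"input": {...}} envelope as JSON bytes."""
    return json.dumps({"input": params}).encode("utf-8")
```

For example, `queue_body(param1="value1", param2="value2")` produces the same body as the curl command above, and `queue_url("abc123xyz", "runsync")` produces its URL.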

### Job status states

| Status        | Description                       |
| ------------- | --------------------------------- |
| `IN_QUEUE`    | Waiting for an available worker   |
| `IN_PROGRESS` | Worker is executing your function |
| `COMPLETED`   | Function finished successfully    |
| `FAILED`      | Execution encountered an error    |

## Load-balanced endpoints

Load-balanced endpoints (using the `api = Endpoint(...); @api.post("/path")` pattern) provide custom HTTP routes with direct request/response patterns.

### Calling load-balanced routes

All routes share the same base URL. Append the route path to call specific functions:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
# POST route
curl -X POST https://abc123xyz.api.runpod.ai/analyze \
    -H "Authorization: Bearer $RUNPOD_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"text": "Hello world from Flash"}'

# GET route
curl -X GET https://abc123xyz.api.runpod.ai/info \
    -H "Authorization: Bearer $RUNPOD_API_KEY"

# Another POST route (same endpoint URL)
curl -X POST https://abc123xyz.api.runpod.ai/validate \
    -H "Authorization: Bearer $RUNPOD_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"name": "Alice", "email": "alice@example.com"}'
```

### Load-balanced request format

Load-balanced endpoints accept direct JSON payloads (no `{"input": {...}}` wrapper):

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
curl -X POST https://abc123xyz.api.runpod.ai/process \
    -H "Authorization: Bearer $RUNPOD_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
        "param1": "value1",
        "param2": "value2"
    }'
```

The payload structure depends on your function signature. Each route can accept different parameters.
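The difference from the queue-based format is easiest to see in code. The following sketch uses only the Python standard library and builds a request without sending it. The helper name `lb_request` is hypothetical, and treating a call without a payload as a `GET` is an assumption made for this example, not a rule of the platform.

```python
import json
import urllib.request
from typing import Any, Optional


def lb_request(
    base_url: str,
    path: str,
    api_key: str,
    payload: Optional[dict[str, Any]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a request to one load-balanced route.

    With a payload the request is a POST whose body is the payload itself,
    with no {"input": ...} wrapper; without one it is a GET.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(payload).encode("utf-8")
    method = "GET" if data is None else "POST"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

To send the request, pass the returned object to `urllib.request.urlopen`, or adapt the same URL, headers, and body to the HTTP client you already use.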

### Multiple routes, single endpoint

A single load-balanced endpoint can serve multiple routes:

```python theme={"theme":{"light":"github-light","dark":"github-dark"}}
from runpod_flash import Endpoint

api = Endpoint(name="api-server", cpu="cpu5c-4-8", workers=(1, 5))

# All these routes share one endpoint URL
@api.post("/generate")
async def generate_text(prompt: str): ...

@api.post("/translate")
async def translate_text(text: str): ...

@api.get("/health")
async def health_check(): ...
```

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
# All use the same base URL with different paths
curl -X POST https://abc123xyz.api.runpod.ai/generate -H "..." -d '{...}'
curl -X POST https://abc123xyz.api.runpod.ai/translate -H "..." -d '{...}'
curl -X GET https://abc123xyz.api.runpod.ai/health -H "..."
```

## Quick reference

| Endpoint Type | Routes                             | Request Format      | Response                        |
| ------------- | ---------------------------------- | ------------------- | ------------------------------- |
| Queue-based   | `/run`, `/runsync`, `/status/{id}` | `{"input": {...}}`  | Job ID (async) or result (sync) |
| Load-balanced | Custom paths (e.g., `/process`)    | Direct JSON payload | Direct response                 |

## Response status codes

| Code  | Meaning                                               |
| ----- | ----------------------------------------------------- |
| `200` | Success (load-balanced) or job accepted (queue-based) |
| `400` | Bad request (invalid input format)                    |
| `401` | Unauthorized (invalid or missing API key)             |
| `404` | Route not found                                       |
| `500` | Internal server error                                 |

## Error handling

When a queue-based job fails, the status response sets `status` to `FAILED` and carries the message in an `error` field:

```json theme={"theme":{"light":"github-light","dark":"github-dark"}}
{
    "id": "job-abc-123",
    "status": "FAILED",
    "error": "Error message from your function"
}
```

Load-balanced errors return an HTTP error status code with a JSON body:

```json theme={"theme":{"light":"github-light","dark":"github-dark"}}
{
    "error": "Error message from your function",
    "detail": "Additional error context"
}
```
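Client code typically needs to turn both error shapes into something it can act on. The sketch below shows one way to do that, assuming the response shapes shown in this section. `EndpointError`, `job_output`, and `lb_error_message` are hypothetical names introduced for illustration.

```python
import json
from typing import Any


class EndpointError(Exception):
    """Hypothetical exception type for failures reported by an endpoint."""


def job_output(job: dict[str, Any]) -> Any:
    """Return the output of a finished queue-based job, or raise on failure."""
    status = job.get("status")
    if status == "FAILED":
        raise EndpointError(job.get("error", "job failed without an error message"))
    if status != "COMPLETED":
        raise EndpointError(f"job {job.get('id')} is not finished (status={status})")
    return job.get("output")


def lb_error_message(status_code: int, body: bytes) -> str:
    """Summarize a load-balanced error response as one readable line."""
    try:
        parsed = json.loads(body)
    except ValueError:
        # Body was not valid JSON; fall back to a generic message.
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    message = parsed.get("error", "unknown error")
    detail = parsed.get("detail")
    suffix = f" ({detail})" if detail else ""
    return f"HTTP {status_code}: {message}{suffix}"
```

Note that a failed queue-based job is reported in the JSON body of the status response, so checking the `status` field matters even when the HTTP request itself succeeds.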

## Using SDKs

For programmatic access, use the Runpod Python SDK:

```python theme={"theme":{"light":"github-light","dark":"github-dark"}}
import runpod

# Set API key
runpod.api_key = "your_api_key"

# Connect to endpoint
endpoint = runpod.Endpoint("YOUR_ENDPOINT_ID")

# Async call (returns job object immediately)
run_request = endpoint.run({"prompt": "Hello world"})
status = run_request.status()  # Check status
output = run_request.output()  # Get result once complete

# Sync call (blocks until complete)
result = endpoint.run_sync({"prompt": "Hello world"})
```

See the [Runpod SDK documentation](/sdks/python/endpoints) for complete SDK usage.

## Next steps

<CardGroup cols={2}>
  <Card title="Deploy apps" href="/flash/apps/deploy-apps" icon="rocket" horizontal>
    Deploy your Flash app to get endpoint URLs.
  </Card>

  <Card title="Configuration reference" href="/flash/configuration/parameters" icon="layer-group" horizontal>
    View all endpoint configuration parameters.
  </Card>

  <Card title="Runpod SDK" href="/sdks/python/endpoints" icon="code" horizontal>
    Use the Python SDK for programmatic access.
  </Card>
</CardGroup>
