# Set up Ollama on a Pod

> Install and run Ollama on a Pod with HTTP API access.

export const PodEnvironmentVariablesTooltip = () => {
  return <Tooltip headline="Environment variables" tip="Key-value pairs that you can set in your Pod template and access within your code, allowing you to configure your application without hardcoding credentials or settings." cta="Learn more about Pod environment variables" href="/pods/templates/environment-variables">environment variables</Tooltip>;
};

export const PodTooltip = () => {
  return <Tooltip headline="Pod" tip="A dedicated GPU or CPU instance for containerized AI/ML workloads." cta="Learn more about Pods" href="/pods/overview">Pod</Tooltip>;
};

This tutorial shows you how to set up [Ollama](https://ollama.com), a platform for running large language models, on a Runpod GPU <PodTooltip />. By the end, you'll have Ollama running with HTTP API access for external requests.

## Requirements

* A Runpod account with credits.

## Step 1: Deploy a Pod

1. Navigate to [Pods](https://www.console.runpod.io/pods) and select **Deploy**.
2. Choose a GPU (for example, A40).
3. Select the latest **PyTorch** template.
4. Under **Pod Template**, select **Edit**:

   * Under **Expose HTTP Ports (Max 10)**, add port `11434`.
   * Under **<PodEnvironmentVariablesTooltip />**, add a variable with key `OLLAMA_HOST` and value `0.0.0.0`. This makes Ollama listen on all network interfaces so it can be reached through the exposed port.

5. Click **Set Overrides** and then **Deploy On-Demand**.

## Step 2: Install Ollama

1. Once the Pod is running, click it to open the connection options panel. Select **Enable Web Terminal**, then **Open Web Terminal**.

2. Update packages and install dependencies:

   ```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
   apt update && apt install -y lshw zstd
   ```

3. Install Ollama, then start the server in the background with its output written to `ollama.log`:

   ```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
   (curl -fsSL https://ollama.com/install.sh | sh && ollama serve > ollama.log 2>&1) &
   ```
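Because the install and the server start both run in the background, the server may not accept requests right away. The following is a minimal sketch of a readiness check, assuming the default port `11434` and Ollama's `/api/version` endpoint. `wait_for_ollama` is a hypothetical helper name used for illustration, not part of Ollama:

```bash
# Hypothetical helper: poll the local Ollama server until it answers
# or until roughly `timeout` seconds have passed.
wait_for_ollama() {
  local url="${1:-http://localhost:11434/api/version}"
  local timeout="${2:-60}"
  local waited=0

  # curl -f makes HTTP errors count as failures; -sS hides the progress meter.
  until curl -fsS "$url" > /dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Ollama did not respond within ${timeout}s; check ollama.log" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done

  echo "Ollama is responding at $url"
}
```

Run `wait_for_ollama` before moving on to the next step. You can pass a different URL and timeout (in seconds) as the first and second arguments.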

## Step 3: Run a model

Download and run a model using the `ollama run` command:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
ollama run llama2
```

Replace `llama2` with any model from the [Ollama library](https://ollama.com/library). You can now interact with the model directly from the terminal.
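If you only plan to use the HTTP API, you may prefer to download a model without opening an interactive session. The sketch below uses the `ollama list` and `ollama pull` commands. `ensure_model` is a hypothetical helper name, and the script assumes that `ollama list` prints a header row followed by one model per line with the name (including its tag) in the first column:

```bash
# Hypothetical helper: download a model only if it is not already present.
ensure_model() {
  local model="$1"
  local wanted

  # Ollama uses the "latest" tag when no tag is given.
  case "$model" in
    *:*) wanted="$model" ;;
    *)   wanted="$model:latest" ;;
  esac

  # Skip the header row and look for an exact match in the first column.
  if ollama list | awk -v w="$wanted" 'NR > 1 && $1 == w { found = 1 } END { exit !found }'; then
    echo "$wanted is already downloaded"
  else
    ollama pull "$model"
  fi
}
```

For example, `ensure_model llama2` downloads the model on the first call and does nothing on later calls.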

## Step 4: Make HTTP API requests

With Ollama running, you can make HTTP requests to your Pod from external clients. Try running the following commands, replacing `OLLAMA_POD_ID` with your actual Pod ID:

**List available models:**

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
curl https://OLLAMA_POD_ID-11434.proxy.runpod.net/api/tags
```
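The URLs in this step follow the pattern `https://<pod-id>-<port>.proxy.runpod.net`. To avoid retyping your Pod ID in every command, you could build the base URL once. This is a small sketch; `ollama_proxy_url` is a hypothetical helper name, and `abc123xyz` is a placeholder Pod ID:

```bash
# Hypothetical helper: build the proxy base URL for a Pod ID and port.
# The port defaults to 11434, the port exposed in Step 1.
ollama_proxy_url() {
  local pod_id="$1"
  local port="${2:-11434}"
  printf 'https://%s-%s.proxy.runpod.net\n' "$pod_id" "$port"
}

# Placeholder Pod ID; replace it with your own.
OLLAMA_URL="$(ollama_proxy_url abc123xyz)"
echo "$OLLAMA_URL"
```

You can then call the API with commands such as `curl "$OLLAMA_URL/api/tags"`.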

**Generate a response:**

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
curl -X POST https://OLLAMA_POD_ID-11434.proxy.runpod.net/api/generate -d '{
  "model": "llama2",
  "prompt": "Tell me a story about llamas"
}'
```

Ollama streams responses by default. To get a single, non-streaming response, add `"stream": false` to the request body:

```bash theme={"theme":{"light":"github-light","dark":"github-dark"}}
curl -X POST https://OLLAMA_POD_ID-11434.proxy.runpod.net/api/generate -d '{
  "model": "llama2",
  "prompt": "Tell me a story about llamas",
  "stream": false
}'
```
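Both response styles carry the generated text in a `response` field: a streamed reply is a series of JSON objects, one per line, while a non-streaming reply is a single JSON object. The sketch below prints only the generated text in either case. It assumes `python3` is available (as it is on the PyTorch template) and that each JSON object arrives on its own line. `collect_response` is a hypothetical helper name:

```bash
# Hypothetical helper: read JSON objects from stdin (one per line)
# and print the concatenated "response" text.
collect_response() {
  python3 -c '
import json
import sys

for line in sys.stdin:
    line = line.strip()
    if line:
        # Each line is one JSON object; append its "response" text, if any.
        sys.stdout.write(json.loads(line).get("response", ""))
sys.stdout.write("\n")
'
}

# Example usage (replace OLLAMA_POD_ID with your Pod ID):
# curl -s -X POST https://OLLAMA_POD_ID-11434.proxy.runpod.net/api/generate \
#   -d '{"model": "llama2", "prompt": "Tell me a story about llamas"}' | collect_response
```

Note that this helper waits for the whole reply before printing anything, so it trades the live streaming effect for cleaner output.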

<Check>
  Congratulations! You've set up Ollama on a Runpod Pod and made HTTP API requests to it.
</Check>

For more API options, see the [Ollama API documentation](https://github.com/ollama/ollama/blob/main/docs/api.md).

## Next steps

* Learn about [exposing ports](/pods/configuration/expose-ports) on Pods.
* Connect [VS Code to Runpod](https://blog.runpod.io/how-to-connect-vscode-to-runpod/) for remote development.
* Explore more models in the [Ollama library](https://ollama.com/library).
