Zero GPU Pods on restart

When you restart a stopped Pod, you might see a message telling you that there are “Zero GPU Pods.” This is because there are no GPUs available on the machine where your Pod was running.

Why does this happen?

When you deploy a Pod, it’s assigned to a GPU on a specific physical machine. This creates a link between your Pod and that particular piece of hardware. As long as your Pod is running, that GPU is exclusively reserved for you.

When you stop your Pod, you release that specific GPU, allowing other users to rent it. Your Pod’s volume storage remains on the physical machine, but the GPU slot becomes available. If another user rents that GPU while your Pod is stopped, the GPU will be occupied when you try to restart. Because your Pod is still tied to that original machine, it cannot start with a GPU.

When this happens, Runpod gives you the option to start the Pod with zero GPUs. This is primarily a data recovery feature, allowing you to access your Pod’s volume disk without access to the GPU.
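Because the zero-GPU start exists mainly for data recovery, a common first step once the Pod is up is to pack the volume directory into a single archive that is easy to transfer elsewhere. The following is a minimal sketch using only the Python standard library. The function name `archive_directory` and the paths in the usage note are illustrative assumptions, not part of any Runpod tooling.

```python
import tarfile
from pathlib import Path


def archive_directory(source_dir: str, archive_path: str) -> int:
    """Pack source_dir into a gzip-compressed tarball.

    Returns the number of files written to the archive.
    """
    source = Path(source_dir)
    if not source.is_dir():
        raise NotADirectoryError(f"{source} is not a directory")

    archive = Path(archive_path).resolve()

    # Collect the file list before creating the archive, and skip the archive
    # itself in case it is written inside the directory being backed up.
    files = [
        path
        for path in sorted(source.rglob("*"))
        if path.is_file() and path.resolve() != archive
    ]

    with tarfile.open(archive, "w:gz") as tar:
        for path in files:
            # Store paths relative to the source so the archive unpacks
            # cleanly on a different machine.
            tar.add(path, arcname=path.relative_to(source).as_posix())

    return len(files)
```

For example, `archive_directory("/workspace", "/tmp/workspace-backup.tar.gz")` would archive a volume mounted at `/workspace`, assuming that is where your Pod's volume is mounted and that the destination has enough free space. You would still need to copy the resulting file off the machine before terminating the Pod.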

What are my options?

If you encounter this situation, you have three choices:

Start with zero GPUs for data access: Start the Pod without a GPU to access its local storage. This is useful for retrieving files, but the Pod will have limited CPU resources and is not suitable for compute tasks. You should use this option to back up or transfer your data before terminating the Pod.
Wait and retry: You can wait and try to restart the Pod again later. The GPU may become available if the other user stops their Pod, but there is no guarantee of when that will happen.
Terminate and redeploy: If you need a GPU immediately, terminate the current Pod and deploy a new one with the same configuration. The new Pod will be scheduled on any machine in the Runpod network with an available GPU of your chosen type.
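If you choose to wait and retry, the retries can be automated with a simple backoff loop. The sketch below is generic and does not call any Runpod API: `try_start` is a hypothetical callable that you would supply yourself, returning `True` once the Pod has started with a GPU. The delay values are arbitrary assumptions, and since there is no guarantee the GPU will ever free up, the loop gives up after a fixed number of attempts.

```python
import time
from typing import Callable


def retry_until_started(
    try_start: Callable[[], bool],
    max_attempts: int = 5,
    initial_delay: float = 60.0,
    backoff: float = 2.0,
    max_delay: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call try_start until it returns True or max_attempts is reached.

    try_start is a user-supplied (hypothetical) function that attempts to
    start the Pod and reports whether it came up with a GPU.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        if try_start():
            return True
        if attempt < max_attempts:
            # Wait before the next attempt, growing the delay up to max_delay.
            sleep(delay)
            delay = min(delay * backoff, max_delay)
    return False
```

If the function returns `False`, the GPU was still occupied after every attempt, and terminating and redeploying is likely the faster path. The `sleep` parameter exists only so the loop can be exercised without real waiting.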

How do I prevent this?

The most effective way to avoid this issue is to use network volumes. Network volumes decouple your data from a specific physical machine. Your /workspace data is stored on a separate, persistent volume that can be attached to any Pod. If you need to terminate a Pod, you can simply deploy a new one and attach the same network volume, giving you immediate access to your data on a new machine with an available GPU.
