Create a custom endpoint
Learn how to create, deploy, and test your first custom Serverless endpoint.
For an even faster start, you can clone the worker-basic repository to get a pre-configured template for building and deploying Serverless endpoints. After cloning the repository, you can skip to step 6 of this tutorial to deploy and test the endpoint.
What you'll learn
In this tutorial you'll learn how to:
- Set up your development environment.
- Create a handler file.
- Test your endpoint locally.
- Build a Docker image for deployment.
- Deploy and test your endpoint on the RunPod console.
Requirements
- You've created a RunPod account.
- You've installed Python 3.x and Docker on your local machine and configured them for your command line.
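If you want to confirm the local requirements before starting, a short script like the following can help. This is a minimal sketch: the function name `check_requirements` is made up for illustration, and it only checks that the interpreter is Python 3 and that a `docker` executable is on your PATH, not that Docker is running.

```python
import shutil
import sys


def check_requirements():
    """Report whether the tutorial's local requirements appear to be met."""
    return {
        # The tutorial asks for Python 3.x
        "python3": sys.version_info.major == 3,
        # shutil.which returns None when the executable is not on PATH
        "docker": shutil.which("docker") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```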
Step 1: Create a Python virtual environment
First, set up a virtual environment to manage your project dependencies.
- Run this command in your local terminal:
# Create a Python virtual environment
python3 -m venv venv
- Then activate the virtual environment:
macOS/Linux:
source venv/bin/activate
Windows:
venv\Scripts\activate
- Finally, install the RunPod SDK:
pip install runpod
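To double-check that the virtual environment is active and that the SDK landed inside it, you could run a small check like this one. It is a sketch under simple assumptions: the helper names `in_virtualenv` and `sdk_installed` are hypothetical, and the check relies only on the standard library.

```python
import importlib.util
import sys


def in_virtualenv():
    # Inside a venv, sys.prefix points at the venv directory,
    # while sys.base_prefix still points at the base interpreter.
    return sys.prefix != sys.base_prefix


def sdk_installed(package="runpod"):
    # find_spec returns None when a top-level package cannot be imported
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print(f"Virtual environment active: {in_virtualenv()}")
    print(f"runpod importable: {sdk_installed()}")
```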
Step 2: Create a handler file
Create a file named rp_handler.py and add the following code:
import runpod
import time


def handler(event):
    """
    This function processes incoming requests to your Serverless endpoint.

    Args:
        event (dict): Contains the input data and request metadata

    Returns:
        Any: The result to be returned to the client
    """
    print("Worker Start")

    # Extract input data (named job_input to avoid shadowing the built-in input())
    job_input = event['input']
    prompt = job_input.get('prompt')
    seconds = job_input.get('seconds', 0)

    print(f"Received prompt: {prompt}")
    print(f"Sleeping for {seconds} seconds...")

    # You can replace this sleep call with your Python function to generate images, text, or run any machine learning workload
    time.sleep(seconds)

    return prompt


# Start the Serverless function when the script is run
if __name__ == '__main__':
    runpod.serverless.start({'handler': handler})
This is a bare-bones handler that processes a JSON object and outputs the prompt string contained in the input object. You can replace the time.sleep(seconds) call with your own Python code for generating images, text, or running any machine learning workload.
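Once you replace the sleep call with real work, you will probably want to reject malformed input early. The following is one possible sketch, not part of the tutorial's handler: `validate_input` is a hypothetical helper, and returning a dict with an "error" key is an assumption made for illustration, so check the SDK documentation for how failures should be reported. The SDK import and `runpod.serverless.start` call are omitted so the logic can be run on its own.

```python
def validate_input(job_input):
    """Return (cleaned_input, error_message); exactly one of the two is None."""
    if not isinstance(job_input, dict):
        return None, "'input' must be a JSON object"

    prompt = job_input.get("prompt")
    if not isinstance(prompt, str) or not prompt:
        return None, "'prompt' must be a non-empty string"

    seconds = job_input.get("seconds", 0)
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(seconds, bool) or not isinstance(seconds, (int, float)) or seconds < 0:
        return None, "'seconds' must be a non-negative number"

    return {"prompt": prompt, "seconds": seconds}, None


def handler(event):
    cleaned, error = validate_input(event.get("input"))
    if error is not None:
        # Assumption: the error is reported as a dict with an "error" key
        return {"error": error}
    return cleaned["prompt"]
```

Keeping validation in a separate function means you can test it without starting a worker.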
Step 3: Create a test input file
You'll need to create an input file to properly test your handler locally. Create a file named test_input.json and add the following JSON:
{
    "input": {
        "prompt": "Hey there!"
    }
}
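The handler also reads an optional seconds value, which the file above leaves at its default of 0. If you'd like to generate test inputs that exercise it, a small script along these lines would work. The function names `build_test_input` and `write_test_input` are hypothetical; only the file name and the "input", "prompt", and "seconds" keys come from this tutorial.

```python
import json
from pathlib import Path


def build_test_input(prompt, seconds=0):
    # Mirrors the structure the handler expects: everything lives under "input"
    return {"input": {"prompt": prompt, "seconds": seconds}}


def write_test_input(prompt, seconds=0, path="test_input.json"):
    payload = build_test_input(prompt, seconds)
    Path(path).write_text(json.dumps(payload, indent=4) + "\n", encoding="utf-8")
    return payload


if __name__ == "__main__":
    write_test_input("Hey there!", seconds=2)
```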
Step 4: Test your handler locally
Run your handler to verify that it works correctly:
python rp_handler.py
You should see output similar to this:
--- Starting Serverless Worker | Version 1.7.9 ---
INFO | Using test_input.json as job input.
DEBUG | Retrieved local job: {'input': {'prompt': 'Hey there!'}, 'id': 'local_test'}
INFO | local_test | Started.
Worker Start
Received prompt: Hey there!
Sleeping for 0 seconds...
DEBUG | local_test | Handler output: Hey there!
DEBUG | local_test | run_job return: {'output': 'Hey there!'}
INFO | Job local_test completed successfully.
INFO | Job result: {'output': 'Hey there!'}
INFO | Local testing complete, exiting.
Step 5: Create a Dockerfile
Create a file named Dockerfile with the following content:
FROM python:3.10-slim
WORKDIR /
# Install dependencies
RUN pip install --no-cache-dir runpod
# Copy your handler file
COPY rp_handler.py /
# Start the container
CMD ["python3", "-u", "rp_handler.py"]
Step 6: Build and push your Docker image
- Build your Docker image, specifying the platform for RunPod deployment. Replace [YOUR_USERNAME] with your Docker username:
docker build --platform linux/amd64 --tag [YOUR_USERNAME]/serverless-test .
Note: When building your Docker image, you must specify the platform as linux/amd64, or the image won't work on Serverless.
- Then push the image to your container registry:
docker push [YOUR_USERNAME]/serverless-test:latest
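If you rebuild often, you might script the build and push steps. The sketch below assembles the same two commands and runs them with the standard library's subprocess module. The function names are hypothetical, and the script assumes you run it from the directory that contains your Dockerfile and that you are already logged in to your registry.

```python
import subprocess


def docker_commands(username, image="serverless-test", tag="latest"):
    """Return the build and push commands as argument lists."""
    ref = f"{username}/{image}:{tag}"
    # The trailing "." is the build context (the current directory)
    build = ["docker", "build", "--platform", "linux/amd64", "--tag", ref, "."]
    push = ["docker", "push", ref]
    return build, push


def build_and_push(username):
    for cmd in docker_commands(username):
        # check=True raises CalledProcessError, so a failed build is never pushed
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Replace the placeholder with your Docker username before running
    build_and_push("[YOUR_USERNAME]")
```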
Step 7: Deploy your endpoint using the web interface
- Go to the Serverless section of the RunPod web interface.
- Click New Endpoint.
- Under Custom Source, select Docker Image, then click Next.
- In the Container Image field, enter your Docker image URL:
docker.io/[YOUR_USERNAME]/serverless-test:latest
- (Optional) Enter a custom name for your endpoint, or use the randomly generated name.
- Under Worker Configuration, check the box for 16 GB GPUs.
- Leave the rest of the settings at their defaults.
- Click Create Endpoint.
You should be automatically redirected to a dedicated detail page for your new endpoint.
Step 8: Test your endpoint on RunPod
To test your endpoint, click the Requests tab on the endpoint detail page.
On the left you should see the default test request:
{
    "input": {
        "prompt": "Hello World"
    }
}
Leave the default input as is and click Run. It will take some time for your workers to initialize.
When the workers have finished processing your request, you should see output on the right side of the page similar to this:
{
    "delayTime": 15088,
    "executionTime": 60,
    "id": "04f01223-4aa2-40df-bdab-37e5caa43cbe-u1",
    "output": "Hello World",
    "status": "COMPLETED",
    "workerId": "uhbbfre73gqjwh"
}
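You can also send the same request from code instead of the console. The sketch below uses only the standard library and rests on assumptions this tutorial does not cover: that synchronous requests go to https://api.runpod.ai/v2/[ENDPOINT_ID]/runsync, that you authenticate with a bearer API key, and that the response has the shape shown above. The function names are hypothetical; verify the URL and authentication against the API documentation before relying on them.

```python
import json
import urllib.request


def build_runsync_request(endpoint_id, api_key, prompt):
    # Assumption: synchronous runs are served at /v2/<endpoint_id>/runsync
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    body = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: the API key is passed as a bearer token
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def extract_output(response):
    """Return the handler's output, or raise if the job did not complete."""
    status = response.get("status")
    if status != "COMPLETED":
        raise RuntimeError(f"Job did not complete (status: {status})")
    return response["output"]


def send(request, timeout=120):
    # Performs the network call; the first request may be slow while workers initialize
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A typical call would be extract_output(send(build_runsync_request(endpoint_id, api_key, "Hello World"))), with your own endpoint ID and API key filled in.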
Congratulations, you've successfully deployed and tested your first Serverless endpoint!
Next steps
Now that you've learned the basics, you're ready to: