If you’re new to Serverless, we recommend learning how to build your first worker before exploring this page.
Understanding job input
Before building a handler function, you should understand the structure of job requests. At minimum, the input will have this format:
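For example, a request sent to your endpoint might carry a body like the following (the `prompt` field is only an illustration; your input schema is up to you):

```json
{
    "input": {
        "prompt": "The quick brown fox jumps"
    }
}
```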
Your handler will take this object and use the `input` field to process the request data.
To learn more about endpoint requests, see Send requests.
Basic handler implementation
Here’s a simple handler function that processes an endpoint request:
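A minimal `handler.py` might look like this sketch, where the prompt handling is a placeholder for your own processing logic:

```python
import runpod


def handler(job):
    # The handler receives the full job object; the request data is under "input".
    job_input = job["input"]
    prompt = job_input.get("prompt", "")

    # Placeholder for your actual processing logic.
    return f"Processed prompt: {prompt}"


# Start the Serverless worker with the handler function.
runpod.serverless.start({"handler": handler})
```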
The `runpod.serverless.start()` function launches your serverless application with the specified handler.
Local testing
To test your handler locally, you can:
- Create a `test_input.json` file (see the example below).
- Run your handler, which picks up the test file automatically.
- Or provide test input directly in the command line.
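For example, a `test_input.json` for the handler above might contain (again, `prompt` is only an illustration):

```json
{
    "input": {
        "prompt": "The quick brown fox jumps"
    }
}
```

With that file in place, running `python handler.py` executes the handler once against the test input. To skip the file, you can pass input inline using the SDK's `--test_input` flag, for example `python handler.py --test_input '{"input": {"prompt": "The quick brown fox jumps"}}'`.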
Handler types
You can create several types of handler functions depending on the needs of your application.
Standard handlers
The simplest handler type, standard handlers process inputs synchronously and return results directly.
Streaming handlers
Streaming handlers stream results incrementally as they become available. Use these when your application requires real-time updates, for example when streaming results from a language model. Streamed output is available from the `/stream` endpoint. Set `return_aggregate_stream` to `True` to make outputs available from the `/run` and `/runsync` endpoints as well.
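A rough sketch of a streaming handler, where the loop stands in for a model emitting tokens:

```python
import runpod


def handler(job):
    prompt = job["input"].get("prompt", "")

    # Yield partial results as they become available instead of
    # returning everything at the end.
    for word in prompt.split():
        yield word


runpod.serverless.start(
    {
        "handler": handler,
        # Aggregate the streamed chunks so the full output is also
        # available from the /run and /runsync endpoints.
        "return_aggregate_stream": True,
    }
)
```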
Asynchronous handlers
Asynchronous handlers process operations concurrently for improved efficiency. Use these for tasks involving I/O operations, API calls, or processing large datasets.
When implementing async handlers, ensure proper use of the `async` and `await` keywords throughout your code to maintain truly non-blocking operations and prevent performance bottlenecks, and consider using the `yield` statement to generate outputs progressively over time.
Always test your async code thoroughly so that asynchronous exceptions and edge cases are handled properly, as async error patterns can be more complex than in synchronous code.
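A minimal sketch of an asynchronous handler, where `asyncio.sleep` stands in for a real non-blocking operation such as an external API call:

```python
import asyncio

import runpod


async def handler(job):
    job_input = job["input"]
    results = []
    for step in range(3):
        # Await a non-blocking operation (placeholder for real I/O or an API call).
        await asyncio.sleep(1)
        results.append(f"completed step {step} for input {job_input}")
    return results


runpod.serverless.start({"handler": handler})
```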
Concurrent handlers
Concurrent handlers process multiple requests simultaneously with a single worker. Use these for small, rapid operations that don’t fully utilize the worker’s GPU. When increasing concurrency, monitor memory usage carefully and test thoroughly to determine the optimal concurrency level for your specific workload. Implement proper error handling to prevent one failing request from affecting others, and continuously monitor and adjust concurrency parameters based on real-world performance. Learn how to build a concurrent handler by following this guide.
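As a sketch, concurrency is typically configured by passing a `concurrency_modifier` function when starting the worker; the cap of 4 below is only an illustrative value:

```python
import asyncio

import runpod


async def handler(job):
    # A small, rapid operation that doesn't fully utilize the worker's GPU.
    await asyncio.sleep(0.1)
    return {"echo": job["input"]}


def concurrency_modifier(current_concurrency):
    # Called by the SDK to decide how many requests this worker should
    # handle at once; raise the level gradually up to an illustrative cap of 4.
    return min(current_concurrency + 1, 4)


runpod.serverless.start(
    {
        "handler": handler,
        "concurrency_modifier": concurrency_modifier,
    }
)
```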
Error handling
When an exception occurs in your handler function, the Runpod SDK automatically captures it, marks the job status as `FAILED`, and returns the exception details in the job results.
For custom error responses:
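For instance, a handler might validate its input and return an error rather than raising an exception; the `seed` field here is a hypothetical input parameter:

```python
import runpod


def handler(job):
    job_input = job["input"]

    # Return a custom error response for invalid input (hypothetical "seed" parameter).
    if job_input.get("seed", 0) < 0:
        return {"error": "seed must be a non-negative integer"}

    return {"result": f"processed with seed {job_input.get('seed', 0)}"}


runpod.serverless.start({"handler": handler})
```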
Be careful when adding your own `try/except` blocks to avoid unintentionally suppressing errors. Either return the error for a graceful failure or raise it to flag the job as `FAILED`.
Advanced handler controls
Use these features to fine-tune your Serverless applications for specific use cases.
Progress updates
Send progress updates during job execution to inform clients about the current state of processing:
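A minimal sketch using the SDK's `progress_update` helper, assuming a job made up of a few discrete steps:

```python
import runpod


def handler(job):
    total_steps = 3
    for step in range(1, total_steps + 1):
        # ... do the work for this step ...

        # Send an intermediate status update that clients can poll for.
        runpod.serverless.progress_update(job, f"Finished step {step}/{total_steps}")

    return "All steps complete"


runpod.serverless.start({"handler": handler})
```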
Worker refresh
For long-running or complex jobs, you may want to refresh the worker after completion so the next job starts with a clean state. Enabling worker refresh clears all logs and wipes the worker state after a job is completed. For example, you can set a `refresh_worker` flag in your handler’s return value. This flag will be removed before the remaining job output is returned.
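A sketch of a handler that requests a refresh after every job; the returned output is a placeholder:

```python
import runpod


def handler(job):
    job_input = job["input"]

    # ... perform a long-running or stateful operation here ...

    return {
        # Ask Runpod to refresh this worker once the job completes;
        # the flag is stripped before the remaining output is returned.
        "refresh_worker": True,
        "output": {"received": job_input},
    }


runpod.serverless.start({"handler": handler})
```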
Handler function best practices
A short list of best practices to keep in mind as you build your handler function:
- Initialize outside the handler: Load models and other heavy resources outside your handler function to avoid repeated initialization (see the sketch after this list).
- Input validation: Validate inputs before processing to avoid errors during execution.
- Local testing: Test your handlers locally before deployment.
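For example, heavy resources can be loaded once at module level so that every job reuses them; `FakeModel` below is a stand-in for your own model-loading code:

```python
import runpod


class FakeModel:
    # Stand-in for a real model; replace with your own loading and inference code.
    def generate(self, prompt):
        return f"generated text for: {prompt}"


# Load heavy resources once, when the worker starts,
# instead of inside the handler where they would be re-created on every job.
model = FakeModel()


def handler(job):
    prompt = job["input"].get("prompt", "")
    # Reuse the already-initialized model inside the handler.
    return model.generate(prompt)


runpod.serverless.start({"handler": handler})
```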
Payload limits
Be aware of payload size limits when designing your handler:
- `/run` endpoint: 10 MB
- `/runsync` endpoint: 20 MB