Runpod provides comprehensive logging capabilities for Serverless endpoints and workers to help you monitor, debug, and troubleshoot your applications. Understanding the different types of logs and their persistence characteristics is crucial for effective application management.

Endpoint logs

Endpoint logs are automatically collected from your worker instances and streamed to Runpod’s centralized logging system. They are retained for 90 days, after which they are automatically removed from the system. If you need to retain logs for longer, write them to a network volume or an external service (see Persistent log storage below).
Endpoint logs include:
  • Standard output (stdout) from your handler functions.
  • Standard error (stderr) from your applications.
  • System messages related to worker lifecycle events.
  • Framework logs from the Runpod SDK. To learn more about the Runpod logging library, see the Runpod SDK documentation.
Logs are streamed in near real-time with only a few seconds of lag.
If workers generate excessive output, logs may be throttled and dropped to prevent system overload.
To access endpoint logs:
  1. Navigate to your Serverless endpoint in the Runpod console.
  2. Click on the Logs tab.
  3. View real-time and historical logs.
  4. Use the search and filtering capabilities to find specific log entries.
  5. Download logs as text files for offline analysis.

Worker logs

Worker logs are temporary logs that exist only on the specific server where the worker is running. Unlike endpoint logs, worker logs are not throttled, but they are not persistent: they are removed when the worker terminates. To access worker logs:
  1. Navigate to your Serverless endpoint in the Runpod console.
  2. Click on the Workers tab.
  3. Click on a worker to view its logs and request history.
  4. Use the search and filtering capabilities to find specific log entries.
  5. Download logs as text files for offline analysis.

Logging levels

Runpod supports standard logging levels to help you control the verbosity and importance of log messages generated by your Serverless workers. Using appropriate logging levels makes it easier to filter and analyze logs, especially when troubleshooting or monitoring your application. The logging levels available for Serverless logs are:
  • DEBUG: Detailed information, typically of interest only when diagnosing problems.
  • INFO: Confirmation that things are working as expected.
  • WARNING: Used to indicate that something unexpected happened, or to warn of a potential problem in the near future (e.g., low disk space).
  • ERROR: Used for more serious problems, where the application has not been able to perform some function.
  • FATAL: Used for very serious errors, indicating that the program itself may be unable to continue running. In Python’s logging module this level is named CRITICAL (FATAL is an alias).
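For example, with Python’s standard logging library you set a threshold level on a logger, and only messages at or above that level are emitted. A minimal sketch of how the levels map to logging calls:
import logging

logging.basicConfig(level=logging.INFO)  # suppress DEBUG; emit INFO and above
log = logging.getLogger("level-demo")

log.debug("Not emitted: below the INFO threshold.")
log.info("Emitted: confirmation that things are working as expected.")
log.warning("Emitted: something unexpected happened, but execution continues.")
log.error("Emitted: a function could not be performed.")
log.critical("Emitted: the program may be unable to continue.")  # CRITICAL == FATAL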

Writing logs to the console

The easiest way to write logs to the Runpod console is to use Python’s standard logging library:
import logging

import runpod

def setup_logger(log_level=logging.DEBUG):
    """
    Configures and returns a logger that writes only to the console.

    This function should be called once when the worker initializes.
    """
    # Define the format for log messages. We include a placeholder for 'request_id'
    # which will be added contextually for each job.
    log_format = logging.Formatter(
        '%(asctime)s - %(levelname)s - [Request: %(request_id)s] - %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
    
    # Get a named logger for this worker
    logger = logging.getLogger("runpod_worker")
    logger.setLevel(log_level)
    
    # --- Console Handler ---
    # This handler sends logs to standard output, which Runpod captures as worker logs.
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(log_format)
    
    # Add the console handler to the logger
    # Check if handlers are already added to avoid duplication on hot reloads
    if not logger.handlers:
        logger.addHandler(console_handler)
    
    return logger

# --- Global Logger Initialization ---
# Set up the logger when the script is first loaded by the worker.
# We pass a default request_id that will be used for any logs generated
# outside of the handler function.
# Set log level to DEBUG to capture all levels of logs.
logger = setup_logger(log_level=logging.DEBUG)
logger = logging.LoggerAdapter(logger, {"request_id": "N/A"})

logger.info("Logger initialized. Ready to process jobs.")

def handler(job):
    """
    This is the main handler function for the Serverless worker.
    """
    # Extract the request ID from the job payload for traceability.
    request_id = job.get('id', 'unknown')
    
    # Create a new logger adapter for this specific job. This injects the
    # current request_id into all log messages created with this adapter.
    job_logger = logging.LoggerAdapter(logging.getLogger("runpod_worker"), {"request_id": request_id})
    
    job_logger.info("Received job. Now demonstrating all log levels.")
    
    try:
        # --- Demonstrate all log levels sequentially ---
        job_logger.debug("This is a debug message. Use this for detailed information for diagnosing problems.")
        job_logger.info("This is an info message. Use this for general information about program execution.")
        job_logger.warning("This is a warning message. Use this to indicate when something unexpected has occurred, but the program should continue.")
        job_logger.error("This is an error message. Use this for serious but recoverable issues.")
        job_logger.critical("This is a critical message. Use this for very serious and potentially unrecoverable issues.")
        # --- End of demonstration ---

        result = "Successfully demonstrated all log levels."
        job_logger.info("Job completed successfully.")
        
        return {"output": result}

    except Exception as e:
        # This block is only hit by unexpected errors, not by the demonstration above.
        job_logger.error("Job failed with an unexpected exception.", exc_info=True)
        return {"error": f"An unexpected error occurred: {str(e)}"}


# Start the Serverless worker
if __name__ == "__main__":
    runpod.serverless.start({"handler": handler})
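To check the handler and its log output locally before deploying, you can run the script with the Runpod SDK’s test input flag (assuming the file is saved as handler.py; the payload is only an example):
python handler.py --test_input '{"input": {"prompt": "hello"}}'
The log lines print to your terminal in the same format that Runpod captures as worker logs.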

Persistent log storage

If you need to retain endpoint logs beyond the 90-day period or worker logs beyond the worker lifecycle, you can implement custom logging within your handler functions to write logs to a network volume or an external service (like Elasticsearch or Datadog).

Writing logs to a network volume

The most straightforward approach for persistent logging is to write logs to a network volume attached to your endpoint. Here’s a modified version of the console logging code above that also writes each log entry to /runpod-volume/logs/worker.log, where /runpod-volume is the mount point of the attached network volume:
import logging
import os

import runpod

def setup_logger(log_dir="/runpod-volume/logs", log_level=logging.DEBUG):
    """
    Configures and returns a logger that writes to both the console and a
    file on a network volume.

    This function should be called once when the worker initializes.
    """
    # Ensure the log directory exists on the network volume
    os.makedirs(log_dir, exist_ok=True)

    # Define the format for log messages. We include a placeholder for 'request_id'
    # which will be added contextually for each job.
    log_format = logging.Formatter(
        '%(asctime)s - %(levelname)s - [Request: %(request_id)s] - %(message)s',
        datefmt='%Y-%m-%d %H:%M:%S'
    )
    
    # Get a named logger for this worker
    logger = logging.getLogger("runpod_worker")
    logger.setLevel(log_level)
    
    # --- Console Handler ---
    # This handler sends logs to standard output, which Runpod captures as worker logs.
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(log_format)
    
    # --- File Handler ---
    # This handler writes logs to a single file on the persistent network volume.
    # Note: This handler does not rotate logs.
    log_file_path = os.path.join(log_dir, "worker.log")
    file_handler = logging.FileHandler(log_file_path)
    file_handler.setFormatter(log_format)

    # Add both handlers to the logger
    # Check if handlers are already added to avoid duplication on hot reloads
    if not logger.handlers:
        logger.addHandler(console_handler)
        logger.addHandler(file_handler)
    
    return logger

# --- Global Logger Initialization ---
# Set up the logger when the script is first loaded by the worker.
# We pass a default request_id that will be used for any logs generated
# outside of the handler function.

# Set log level to DEBUG to capture all levels of logs.
logger = setup_logger(log_level=logging.DEBUG)
logger = logging.LoggerAdapter(logger, {"request_id": "N/A"})

logger.info("Logger initialized. Ready to process jobs.")


def handler(job):
    """
    This is the main handler function for the Serverless worker.
    It processes a single job and demonstrates all logging levels.
    """
    # Extract the request ID from the job payload for traceability.
    request_id = job.get('id', 'unknown')
    
    # Create a new logger adapter for this specific job. This injects the
    # current request_id into all log messages created with this adapter.
    job_logger = logging.LoggerAdapter(logging.getLogger("runpod_worker"), {"request_id": request_id})
    
    job_logger.info("Received job. Now demonstrating all log levels.")
    
    try:
        # --- Demonstrate all log levels sequentially ---
        job_logger.debug("This is a debug message. Use this for detailed information for diagnosing problems.")
        job_logger.info("This is an info message. Use this for general information about program execution.")
        job_logger.warning("This is a warning message. Use this to indicate when something unexpected has occurred, but the program should continue.")
        job_logger.error("This is an error message. Use this for serious but recoverable issues.")
        job_logger.critical("This is a critical message. Use this for very serious and potentially unrecoverable issues.")
        # --- End of demonstration ---

        result = "Successfully demonstrated all log levels."
        job_logger.info("Job completed successfully.")
        
        return {"output": result}

    except Exception as e:
        # This block is only hit by unexpected errors, not by the demonstration above.
        job_logger.error("Job failed with an unexpected exception.", exc_info=True)
        return {"error": f"An unexpected error occurred: {str(e)}"}


# Start the Serverless worker
if __name__ == "__main__":
    runpod.serverless.start({"handler": handler})

Accessing stored logs

To access logs stored in network volumes:
  1. Use the S3-compatible API to programmatically access log files (a boto3 sketch follows this list).
  2. Use the web terminal or SSH to connect to a Pod with the same network volume attached.
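As a sketch of the first option, the boto3 snippet below lists and downloads the worker.log file written by the handler above. The endpoint URL, credentials, and bucket name are placeholders, not real values; check the Runpod S3-compatible API documentation for the endpoint for your network volume’s datacenter and for creating S3 API keys:
import boto3

# All values below are placeholders; replace them with the endpoint, S3 API key,
# and network volume ID from your Runpod account.
s3 = boto3.client(
    "s3",
    endpoint_url="https://<datacenter-s3-endpoint>",
    aws_access_key_id="<S3_API_ACCESS_KEY>",
    aws_secret_access_key="<S3_API_SECRET_KEY>",
)

NETWORK_VOLUME_ID = "<your-network-volume-id>"  # the network volume acts as the bucket

# List the log files written under the logs/ directory of the volume.
response = s3.list_objects_v2(Bucket=NETWORK_VOLUME_ID, Prefix="logs/")
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])

# Download worker.log for offline analysis.
s3.download_file(NETWORK_VOLUME_ID, "logs/worker.log", "worker.log")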

Best practices for persistent logging

  1. Use request IDs: Include the RUNPOD_REQUEST_ID environment variable or job ID in log entries for traceability.
  2. Structured logging: Use JSON format for easier parsing and analysis.
  3. Log rotation: Implement log rotation to prevent disk space issues (the sketch after this list combines rotation with JSON formatting).
  4. Separate log files: Create separate log files per request or time period for better organization.
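As an illustration of the structured logging and log rotation recommendations, here is a minimal sketch that replaces the plain FileHandler used above with a rotating handler and formats each record as a single JSON line. It assumes the same /runpod-volume mount; the JsonFormatter class is illustrative, not part of the Runpod SDK:
import json
import logging
import logging.handlers

class JsonFormatter(logging.Formatter):
    """Illustrative formatter that emits each record as one JSON line."""
    def format(self, record):
        return json.dumps({
            "time": self.formatTime(record, "%Y-%m-%d %H:%M:%S"),
            "level": record.levelname,
            "request_id": getattr(record, "request_id", "N/A"),
            "message": record.getMessage(),
        })

# Rotate at roughly 10 MB, keeping 5 backups, so the network volume does not fill up.
rotating_handler = logging.handlers.RotatingFileHandler(
    "/runpod-volume/logs/worker.jsonl",
    maxBytes=10 * 1024 * 1024,
    backupCount=5,
)
rotating_handler.setFormatter(JsonFormatter())
logging.getLogger("runpod_worker").addHandler(rotating_handler)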

Troubleshooting

Missing logs

If logs are not appearing in the Logs tab:
  1. Check log throttling: Excessive logging may trigger throttling.
  2. Verify output streams: Ensure you’re writing to stdout or stderr (see the example after this list).
  3. Check worker status: Logs only appear for successfully initialized workers.
  4. Review retention period: Logs older than 90 days are automatically removed.
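For the second point, anything written to stdout or stderr is captured, and flushing explicitly avoids output sitting in a buffer when a worker exits:
import sys

print("This line goes to stdout and appears in the endpoint logs.", flush=True)
print("This line goes to stderr and also appears.", file=sys.stderr, flush=True)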

Log throttling

To avoid log throttling:
  1. Reduce log verbosity in production environments.
  2. Use structured logging to make logs more efficient.
  3. Implement log sampling for high-frequency events (a sketch follows this list).
  4. Store detailed logs in network volumes instead of console output.
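A simple way to sample high-frequency events is to emit only a fraction of them. The sampled_debug helper below is illustrative, not a Runpod API:
import logging
import random

logger = logging.getLogger("runpod_worker")

def sampled_debug(message, sample_rate=0.01):
    """Illustrative helper: emit roughly 1% of high-frequency debug messages."""
    if random.random() < sample_rate:
        logger.debug(message)

for step in range(10_000):
    sampled_debug(f"Processed step {step}")  # only about 100 of these are logged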