# IBM Granite 4.0

> A 32B parameter long-context instruct model for text generation.

IBM Granite-4.0-H-Small is a 32B parameter long-context instruct model. It excels at general text generation, instruction following, and conversational AI tasks with support for extended context lengths.

Test IBM Granite 4.0 in the Runpod Hub playground.

|              |                                                        |
| ------------ | ------------------------------------------------------ |
| **Endpoint** | `https://api.runpod.ai/v2/granite-4-0-h-small/runsync` |
| **Pricing**  | \$10.00 per 1M tokens                                  |
| **Type**     | Text generation                                        |

## Request

All parameters are passed within the `input` object in the request body.

| Parameter                     | Description                                                                                    |
| ----------------------------- | ---------------------------------------------------------------------------------------------- |
| `messages`                    | Array of message objects with role and content.                                                |
| `messages[].role`             | The role of the message author. Use `system`, `user`, or `assistant`.                          |
| `messages[].content`          | The content of the message.                                                                    |
| `sampling_params.max_tokens`  | Maximum number of tokens to generate.                                                          |
| `sampling_params.temperature` | Controls randomness in generation. Lower values make output more deterministic. Range: 0.0-1.0. |
| `sampling_params.seed`        | Seed for reproducible results. Set to -1 for random.                                           |
| `sampling_params.top_k`       | Restricts sampling to the top K most probable tokens.                                          |
| `sampling_params.top_p`       | Nucleus sampling threshold. Range: 0.0-1.0.                                                    |

```bash cURL
curl -X POST "https://api.runpod.ai/v2/granite-4-0-h-small/runsync" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant. Please ensure responses are professional, accurate, and safe."
        },
        {
          "role": "user",
          "content": "What is Runpod?"
        }
      ],
      "sampling_params": {
        "max_tokens": 512,
        "temperature": 0.7,
        "seed": -1,
        "top_k": -1,
        "top_p": 1
      }
    }
  }'
```

```python Python
import os

import requests

# Read the API key from the environment, as the cURL example does.
RUNPOD_API_KEY = os.environ["RUNPOD_API_KEY"]

response = requests.post(
    "https://api.runpod.ai/v2/granite-4-0-h-small/runsync",
    headers={
        "Authorization": f"Bearer {RUNPOD_API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "input": {
            "messages": [
                {
                    "role": "system",
                    "content": "You are a helpful assistant. Please ensure responses are professional, accurate, and safe.",
                },
                {"role": "user", "content": "What is Runpod?"},
            ],
            "sampling_params": {
                "max_tokens": 512,
                "temperature": 0.7,
                "seed": -1,
                "top_k": -1,
                "top_p": 1,
            },
        }
    },
)

result = response.json()
print(result["output"])
```

```javascript JavaScript
// Assumes a runtime with global fetch and top-level await
// (for example, Node.js 18+ running this file as an ES module).
const RUNPOD_API_KEY = process.env.RUNPOD_API_KEY;

const response = await fetch(
  "https://api.runpod.ai/v2/granite-4-0-h-small/runsync",
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${RUNPOD_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      input: {
        messages: [
          {
            role: "system",
            content:
              "You are a helpful assistant. Please ensure responses are professional, accurate, and safe.",
          },
          { role: "user", content: "What is Runpod?" },
        ],
        sampling_params: {
          max_tokens: 512,
          temperature: 0.7,
          seed: -1,
          top_k: -1,
          top_p: 1,
        },
      },
    }),
  }
);

const result = await response.json();
console.log(result.output);
```

## Response

| Field            | Description                                                                   |
| ---------------- | ----------------------------------------------------------------------------- |
| `id`             | Unique identifier for the request.                                            |
| `status`         | Request status. Returns `COMPLETED` on success, `FAILED` on error.            |
| `delayTime`      | Time in milliseconds the request spent in queue before processing began.      |
| `executionTime`  | Time in milliseconds the model took to generate the response.                 |
| `workerId`       | Identifier of the worker that processed the request.                          |
| `output`         | The generation result containing the text and usage information.              |
| `output.choices` | Array containing the generated text.                                          |
| `output.cost`    | Cost of the generation in USD.                                                |
| `output.usage`   | Token usage information.                                                      |
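
A client usually wants the generated text as a single string and a clear failure when the request does not complete. The following is a minimal sketch of that handling, assuming the response has the shape shown in this section's examples (`output.choices[].tokens` on success, a top-level `error` string on failure). The helper name `extract_text` and the shortened `sample` payload are hypothetical and not part of the Runpod API.

```python
from typing import Any


def extract_text(result: dict[str, Any]) -> str:
    """Return the generated text from a runsync response, or raise on failure.

    Hypothetical helper: assumes the response shape documented on this page.
    """
    status = result.get("status")
    if status != "COMPLETED":
        # Failed responses carry an "error" string instead of "output".
        raise RuntimeError(
            f"Request {result.get('id')} ended with status {status}: "
            f"{result.get('error')}"
        )

    # Each choice holds a "tokens" array of text fragments; join them all.
    return "".join(
        fragment
        for choice in result["output"]["choices"]
        for fragment in choice["tokens"]
    )


# Shortened sample payload for illustration only.
sample = {
    "id": "sync-a1b2c3d4-e5f6-7890-abcd-ef1234567890-u1",
    "status": "COMPLETED",
    "output": {
        "choices": [{"tokens": ["Runpod is a cloud computing platform."]}],
        "cost": 0.00185,
        "usage": {"input": 35, "output": 150},
    },
}

print(extract_text(sample))
```

Checking `status` before touching `output` keeps a failed request from surfacing as a confusing `KeyError`.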
```json 200
{
  "id": "sync-a1b2c3d4-e5f6-7890-abcd-ef1234567890-u1",
  "status": "COMPLETED",
  "delayTime": 15,
  "executionTime": 2345,
  "workerId": "oqk7ao1uomckye",
  "output": {
    "choices": [
      {
        "tokens": [
          "Runpod is a cloud computing platform that provides GPU resources for AI and machine learning workloads..."
        ]
      }
    ],
    "cost": 0.00185,
    "usage": {
      "input": 35,
      "output": 150
    }
  }
}
```

```json 400
{
  "id": "sync-a1b2c3d4-e5f6-7890-abcd-ef1234567890-u1",
  "status": "FAILED",
  "error": "Invalid messages format"
}
```

## Cost calculation

IBM Granite 4.0 charges \$10.00 per 1M tokens.

Example costs:

| Tokens           | Cost    |
| ---------------- | ------- |
| 1,000 tokens     | \$0.01  |
| 10,000 tokens    | \$0.10  |
| 100,000 tokens   | \$1.00  |
| 1,000,000 tokens | \$10.00 |
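
The table above is a linear rate, so a cost estimate is a one-line calculation. The sketch below assumes input and output tokens are billed at the same \$10.00 per 1M rate. That assumption is inferred from the sample response, where 35 input plus 150 output tokens correspond to a cost of 0.00185, and is not stated explicitly on this page. The names `estimate_cost` and `cost_from_usage` are hypothetical.

```python
from decimal import Decimal

# USD per 1M tokens, taken from the pricing table on this page.
PRICE_PER_MILLION_TOKENS = Decimal("10.00")


def estimate_cost(tokens: int) -> Decimal:
    """Estimate the USD cost of a given total token count."""
    return PRICE_PER_MILLION_TOKENS * tokens / Decimal(1_000_000)


def cost_from_usage(usage: dict[str, int]) -> Decimal:
    """Estimate cost from a response's usage object.

    Assumption: input and output tokens are billed at the same rate.
    """
    return estimate_cost(usage["input"] + usage["output"])


for tokens in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{tokens:>9,} tokens: ${estimate_cost(tokens)}")

print(cost_from_usage({"input": 35, "output": 150}))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when aggregated over many calls.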