curl -X POST "https://api.runpod.ai/v2/cogito-671b-v2-1-fp8-dynamic/runsync" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "prompt": "Write a detailed analysis of the economic impacts of renewable energy adoption:",
      "max_tokens": 1024,
      "temperature": 0.7
    }
  }'
{
  "delayTime": 45,
  "executionTime": 8234,
  "id": "sync-a1b2c3d4-e5f6-7890-abcd-ef1234567890-u1",
  "output": [
    {
      "choices": [
        {
          "tokens": [
            "The economic impacts of renewable energy adoption are multifaceted and far-reaching. Here's a comprehensive analysis:\n\n1. Job Creation and Labor Markets..."
          ]
        }
      ],
      "cost": 0.0005,
      "usage": {
        "input": 20,
        "output": 980
      }
    }
  ],
  "status": "COMPLETED"
}
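The generated text in the example response above sits several levels deep, under `output[*].choices[*].tokens`. The following is a minimal sketch of pulling it out in Python, assuming the response always follows the layout of that single example; the helper names `extract_text` and `total_usage` are hypothetical and not part of any Runpod SDK, and the sample dictionary is a trimmed copy of the example response.

```python
# Trimmed copy of the example /runsync response shown above.
sample = {
    "delayTime": 45,
    "executionTime": 8234,
    "output": [
        {
            "choices": [
                {"tokens": ["The economic impacts of renewable energy adoption are multifaceted..."]}
            ],
            "cost": 0.0005,
            "usage": {"input": 20, "output": 980},
        }
    ],
    "status": "COMPLETED",
}


def extract_text(response: dict) -> str:
    """Join every string found under output[*].choices[*].tokens (hypothetical helper)."""
    if response.get("status") != "COMPLETED":
        raise RuntimeError(f"job did not complete: {response.get('status')!r}")
    parts = []
    for item in response.get("output", []):
        for choice in item.get("choices", []):
            parts.extend(choice.get("tokens", []))
    return "".join(parts)


def total_usage(response: dict) -> tuple:
    """Sum input and output token counts across all output items (hypothetical helper)."""
    items = response.get("output", [])
    tokens_in = sum(item.get("usage", {}).get("input", 0) for item in items)
    tokens_out = sum(item.get("usage", {}).get("output", 0) for item in items)
    return tokens_in, tokens_out


text = extract_text(sample)
usage = total_usage(sample)
print(text)
print(usage)
```

Checking `status` before reading `output` keeps a failed or still-running job from being mistaken for an empty completion.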
Cogito 671B v2.1
Deep Cogito’s 671B parameter Mixture-of-Experts model with FP8 dynamic quantization for efficient inference.
Cogito 671B v2.1 is a 671B parameter Mixture-of-Experts (MoE) language model from Deep Cogito. This endpoint serves it with FP8 dynamic quantization, which makes inference more efficient while maintaining high-quality outputs across reasoning, coding, and general knowledge tasks.
Cogito 671B v2.1 is fully compatible with the OpenAI API format. You can use the OpenAI Python client to interact with this endpoint.
Python (OpenAI SDK)
import os

from openai import OpenAI

# Read the API key from the environment (the same variable the curl example uses).
client = OpenAI(
    api_key=os.environ["RUNPOD_API_KEY"],
    base_url="https://api.runpod.ai/v2/cogito-671b-v2-1-fp8-dynamic/openai/v1",
)

response = client.chat.completions.create(
    model="cogito-671b-v2-1-fp8-dynamic",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant with expertise in economics and analysis.",
        },
        {
            "role": "user",
            "content": "Analyze the economic impacts of renewable energy adoption.",
        },
    ],
    max_tokens=1024,
)

print(response.choices[0].message.content)
For streaming responses, add stream=True:
Python (Streaming)
# Uses the same client as the previous example.
response = client.chat.completions.create(
    model="cogito-671b-v2-1-fp8-dynamic",
    messages=[
        {"role": "user", "content": "Explain the principles of machine learning."}
    ],
    max_tokens=1024,
    stream=True,
)

for chunk in response:
    # Skip chunks with no choices or no text in the delta.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
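The streaming loop above prints each piece as it arrives and then discards it. When the complete text is also needed afterwards, the deltas can be accumulated instead. This is a minimal sketch: `collect_stream` is a hypothetical helper, not part of the OpenAI SDK, and the stand-in chunks built with `SimpleNamespace` only mimic the `choices[0].delta.content` shape so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Concatenate the text deltas from a chat-completions stream (hypothetical helper)."""
    parts = []
    for chunk in chunks:
        # Some chunks may carry no choices or an empty delta; skip those.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streamed chunk (illustration only)."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


# Stand-in stream: two text deltas, one empty delta, one chunk without choices.
demo_stream = [
    fake_chunk("Machine "),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk("learning"),
]

result = collect_stream(demo_stream)
print(result)
```

With a real stream, the call would be `collect_stream(response)`, where `response` is the object returned by `client.chat.completions.create(..., stream=True)`.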