# Chat API

The Chat API lets you interact with Fluence's conversational AI models. The endpoint is designed for chat-based interactions and supports features such as streaming and function calling.
## Endpoint

```
POST /v1/chat/completions
```
## Request Format

```json
{
  "model": "string",
  "messages": [
    {
      "role": "string",
      "content": "string"
    }
  ],
  "temperature": number,
  "max_tokens": number,
  "stream": boolean
}
```
## Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the model to use |
| messages | array | Yes | Array of message objects forming the conversation so far |
| temperature | number | No | Sampling temperature between 0 and 2; higher values produce more random output |
| max_tokens | number | No | Maximum number of tokens to generate |
| stream | boolean | No | If true, the response is streamed back as it is generated (see the sketch below) |
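When `stream` is true, the response arrives incrementally rather than as a single JSON body. This reference does not specify the wire format for streamed chunks, so the sketch below is an assumption: it presumes SSE-style `data:` lines and a `[DONE]` sentinel, a common convention among chat completion APIs. Adjust the chunk handling to the actual format.

```python
import json
import requests

# Streaming sketch. The "data:" line framing, chunk shape, and "[DONE]"
# sentinel below are assumptions; this reference does not document them.
response = requests.post(
    "https://api.fluence.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "fluence-chat",
        "messages": [{"role": "user", "content": "Hello, how are you?"}],
        "stream": True,
    },
    stream=True,  # tell requests not to buffer the whole body
)

for line in response.iter_lines():
    if not line:
        continue
    text = line.decode("utf-8")
    if text.startswith("data: "):
        payload = text[len("data: "):]
        if payload == "[DONE]":  # assumed end-of-stream marker
            break
        chunk = json.loads(payload)
        # Chunk structure is an assumption; inspect real chunks to confirm.
        print(chunk, flush=True)
```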
### Message Object

Each element of `messages` has a `role` identifying the speaker (`user` or `assistant` in the examples below) and the message `content`:

```json
{
  "role": "string",
  "content": "string"
}
```
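The API is stateless, so a multi-turn conversation is expressed by resending the full message history with each request. A minimal sketch in Python follows; note that only the `user` and `assistant` roles appear in this reference's examples, and the `system` role here is an assumption borrowed from similar chat APIs.

```python
# Multi-turn conversation: the full history is sent with every request.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},  # assumed role
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing well, thank you for asking!"},
    {"role": "user", "content": "Can you summarize our chat so far?"},
]
```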
## Response Format

```json
{
  "id": "string",
  "object": "chat.completion",
  "created": number,
  "model": "string",
  "choices": [
    {
      "index": number,
      "message": {
        "role": "string",
        "content": "string"
      },
      "finish_reason": "string"
    }
  ],
  "usage": {
    "prompt_tokens": number,
    "completion_tokens": number,
    "total_tokens": number
  }
}
```
## Example Request

```bash
curl https://api.fluence.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fluence-chat",
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ],
    "temperature": 0.7
  }'
```
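The same request in Python, using the bearer-token auth shown in the curl example (a sketch using the third-party `requests` library; any HTTP client works):

```python
import requests

API_KEY = "YOUR_API_KEY"

response = requests.post(
    "https://api.fluence.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "model": "fluence-chat",
        "messages": [{"role": "user", "content": "Hello, how are you?"}],
        "temperature": 0.7,
    },
)
response.raise_for_status()  # surface HTTP errors (see Error Codes below)
data = response.json()
```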
## Example Response

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "fluence-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'm doing well, thank you for asking! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
```
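Following the response schema above, the assistant's reply lives at `choices[0].message.content` and token accounting under `usage`:

```python
# `data` is the parsed JSON body, e.g. from `response.json()` in the
# Python request example above.
reply = data["choices"][0]["message"]["content"]
finish_reason = data["choices"][0]["finish_reason"]
total_tokens = data["usage"]["total_tokens"]

print(f"Assistant: {reply}")
print(f"Finish reason: {finish_reason}; tokens used: {total_tokens}")
```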
## Error Codes

| Status Code | Error Code | Description |
|---|---|---|
| 400 | invalid_request | The request body was malformed or contained invalid parameters |
| 401 | authentication_error | The API key was missing or invalid |
| 429 | rate_limit_exceeded | Too many requests; see Rate Limits below |
| 500 | server_error | An internal server error occurred |
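One way to branch on the documented status codes is sketched below. The shape of the error response body is not specified in this reference, so only the HTTP status is inspected; the exception types are illustrative.

```python
import requests

def call_chat_api(payload: dict, api_key: str) -> dict:
    """Map the documented status codes to exceptions (illustrative only)."""
    response = requests.post(
        "https://api.fluence.ai/v1/chat/completions",
        headers={"Authorization": f"Bearer {api_key}"},
        json=payload,
    )
    if response.status_code == 400:
        raise ValueError(f"invalid_request: {response.text}")
    if response.status_code == 401:
        raise PermissionError("authentication_error: check your API key")
    if response.status_code == 429:
        raise RuntimeError("rate_limit_exceeded: slow down or back off")
    if response.status_code == 500:
        raise RuntimeError("server_error: retry with backoff")
    response.raise_for_status()  # any other non-2xx status
    return response.json()
```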
## Rate Limits

- 100 requests per minute
- 1000 requests per hour

Requests beyond these limits return a `429 rate_limit_exceeded` error.
## Best Practices

- Always include error handling in your implementation
- Use streaming for real-time responses
- Implement retry logic with exponential backoff (see the sketch below)
- Cache responses when appropriate
- Monitor your token usage via the `usage` fields in each response
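A minimal backoff sketch for the retry recommendation above: it retries on `429` and `5xx` responses with exponentially growing delays plus jitter. The retry budget, delay schedule, and the decision to treat all `5xx` codes as retryable are illustrative choices, not prescribed by this reference.

```python
import random
import time

import requests

def post_with_backoff(payload: dict, api_key: str, max_retries: int = 5) -> dict:
    """Retry on 429 and 5xx with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        response = requests.post(
            "https://api.fluence.ai/v1/chat/completions",
            headers={"Authorization": f"Bearer {api_key}"},
            json=payload,
        )
        # Only 500 is documented above; other 5xx are retried as a precaution.
        if response.status_code != 429 and response.status_code < 500:
            response.raise_for_status()
            return response.json()
        # Exponential backoff: 1s, 2s, 4s, ... plus up to 1s of jitter.
        delay = (2 ** attempt) + random.random()
        time.sleep(delay)
    raise RuntimeError(f"gave up after {max_retries} attempts")
```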