POST /search/qna

Example response:
{
  "answer": "<answer>",
  "success": true,
  "chunks": [],
  "graph_context": {
    "query_paths": [],
    "chunk_relations": [],
    "chunk_id_to_group_ids": {}
  },
  "model_used": "<string>",
  "timing": {}
}
โš ๏ธ Deprecating: This endpoint is being deprecated.

Examples

curl --request POST \
  --url https://api.usecortex.ai/search/qna \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "question": "What is Cortex AI",
  "session_id": "chat_session_1234",
  "tenant_id": "tenant_1234",
  "sub_tenant_id": "sub_tenant_4567",
  "highlight_chunks": false,
  "stream": false,
  "search_alpha": 0.8,
  "recency_bias": 0.2,
  "ai_generation": true,
  "user_name": "John Doe",
  "user_instructions": "",
  "multi_step_reasoning": true,
  "auto_agent_routing": true
}'
Ask questions and get AI-generated answers based on your tenant's knowledge base with conversational responses and citations.
Default Sub-Tenant Behavior: If you don't specify a sub_tenant_id, the QnA will search within the default sub-tenant created when your tenant was set up. This searches across organization-wide documents and knowledge.

QnA Capabilities

The QnA endpoint provides intelligent question-answering with several powerful features:

AI-Generated Responses

  • Natural Language Processing: Understands complex questions and context
  • Citation-Based Answers: Every answer includes source references with exact locations
  • Conversational Context: Maintains conversation history through session management
  • Multi-Step Reasoning: Can break down complex questions into logical steps

Advanced Search Integration

  • Hybrid Search: Combines semantic and keyword search for optimal results
  • Context-Aware Retrieval: Finds relevant information based on question context
  • Source Highlighting: Identifies and highlights the most relevant content chunks

Customization Options

  • User Instructions: Provide custom instructions to guide AI behavior
  • Metadata Filtering: Filter results by source type, title, or other metadata
  • Streaming Support: Get real-time responses for better user experience
  • Auto-Agent Routing: Automatically route queries to specialized agents

Key Parameters

Core Parameters

Question & Session Management

  • question: The question you want answered (required)
  • session_id: Unique identifier for maintaining conversation context
  • user_name: Optional user identifier for personalized responses

Search Configuration

  • search_alpha: Balance between semantic and keyword search (0.0-1.0); see the tuning sketch after this list
  • top_n: Number of relevant chunks to retrieve for context
  • recency_bias: Prioritize recent content (0.0-1.0)
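
For example, a request that leans toward keyword matching while favoring recent documents might look like the sketch below. It reuses the request shape from the Examples section above and assumes search_alpha follows the same direction as the documented alpha field (0.0 = keyword, 1.0 = semantic); note that the body schema documented later on this page lists alpha and max_chunks instead of search_alpha and top_n, so verify the names against your API version.

curl --request POST \
  --url https://api.usecortex.ai/search/qna \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "question": "What changed in the latest release?",
  "tenant_id": "tenant_1234",
  "search_alpha": 0.4,
  "top_n": 10,
  "recency_bias": 0.6
}'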

AI Generation Control

  • ai_generation: Enable/disable AI response generation
  • multi_step_reasoning: Enable complex reasoning for difficult questions
  • user_instructions: Custom instructions to guide AI behavior
  • auto_agent_routing: Automatically route to specialized agents

Response Formatting

  • stream: Enable streaming responses for real-time output (see the sketch after this list)
  • highlight_chunks: Include highlighted relevant chunks in response
  • context_list: Provide additional context for the AI
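
As a sketch, a streaming request could look like the following; curl's --no-buffer flag prints output as it arrives. The exact wire format of the stream is not documented on this page, so treat the transport as an assumption to verify.

curl --no-buffer --request POST \
  --url https://api.usecortex.ai/search/qna \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "question": "What is Cortex AI",
  "tenant_id": "tenant_1234",
  "stream": true,
  "highlight_chunks": true
}'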

Advanced Features

Metadata Filtering

Use the metadata object to filter results:
{
  "metadata": {
    "source_title": "Specific Document Title",
    "source_type": "file",
    "custom_field": "value"
  }
}
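
For instance, a complete request that restricts retrieval to a single uploaded file by title could look like this sketch (the metadata keys mirror the object above; the question and document title are hypothetical placeholders):

curl --request POST \
  --url https://api.usecortex.ai/search/qna \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "question": "What is the refund policy?",
  "tenant_id": "tenant_1234",
  "metadata": {
    "source_title": "Customer Refund Policy 2024",
    "source_type": "file"
  }
}'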

Session Management

  • Persistent Context: Maintain conversation history across multiple questions
  • Context Accumulation: Build understanding over multiple interactions
  • User Personalization: Adapt responses based on user preferences

Multi-Step Reasoning

When enabled (see the request sketch after this list), the AI can:
  • Break down complex questions into smaller parts
  • Analyze multiple sources of information
  • Synthesize information from different documents
  • Provide step-by-step explanations

Use Cases

Customer Support

  • FAQ Automation: Answer common customer questions automatically
  • Product Information: Provide detailed product specifications and features
  • Troubleshooting: Guide users through problem-solving steps

Knowledge Management

  • Document Q&A: Ask questions about uploaded documents and manuals
  • Research Assistance: Find and synthesize information from multiple sources
  • Training Support: Answer questions about company policies and procedures

Content Discovery

  • Information Retrieval: Find specific information within large document collections
  • Contextual Search: Get answers that understand the broader context
  • Citation Tracking: See exactly where information comes from

Best Practices

Question Formulation

  • Be Specific: Ask clear, specific questions for better results
  • Provide Context: Include relevant background information when needed
  • Use Natural Language: Ask questions as you would to a human expert

Session Management

  • Maintain Context: Use consistent session IDs for related questions, as shown in the sketch after this list
  • Build Understanding: Ask follow-up questions to deepen the conversation
  • Reset When Needed: Start new sessions for unrelated topics
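
For example, two related questions sharing one session (a sketch; the session_id value is an arbitrary string you generate and reuse):

# First question opens the session
curl --request POST \
  --url https://api.usecortex.ai/search/qna \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{"question": "What is Cortex AI", "session_id": "support_session_42", "tenant_id": "tenant_1234"}'

# Follow-up in the same session; the first exchange supplies the context for "it"
curl --request POST \
  --url https://api.usecortex.ai/search/qna \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{"question": "How does it cite its sources?", "session_id": "support_session_42", "tenant_id": "tenant_1234"}'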

Response Optimization

  • Enable Highlighting: Use highlight_chunks to see source relevance
  • Adjust Search Alpha: Experiment with different values for your content type
  • Use Metadata Filtering: Narrow results to specific document types when needed

Response

Returns a JSON object containing the AI-generated answer and supporting source chunks with layout information for creating bounding boxes around cited sources.
{
  "answer": "Based on the uploaded knowledge, here is the answer to your question...",
  "session_id": "session_123",
  "sources": [
    {
      "id": "source_123",
      "url": "https://example.com/document.pdf",
      "title": "Document Title",
      "timestamp": "2024-01-15T10:30:00Z",
      "context": "This is the relevant text chunk from the document...",
      "source": "document",
      "layout": {
        "page": 1,
        "coordinates": {
          "x": 100,
          "y": 200,
          "width": 200,
          "height": 50
        }
      },
      "hybrid_score": 0.85
    }
  ],
  "highlight_chunks": [
    {
      "source_id": "CortexDoc1234",
      "subject": "Document Title",
      "timestamp": "1750697263.2323804",
      "context": "Highlighted text chunk...",
      "source": "document",
      "hybrid_score": 0.85,
      "layout": {
        "page": 1,
        "coordinates": {
          "x": 100,
          "y": 200,
          "width": 200,
          "height": 50
        }
      }
    }
  ],
  "source_id_map": {
    "s0": {
      "chunk_uuid": "05631d4e-4d24-4a9c-9e7a-e61a3100cafb",
      "source_id": "CortexDoc8156c6834c304451ab569d264da5f38a1750697087"
    },
    "s1": {
      "chunk_uuid": "f7c98cae-566b-4bf1-8d66-9bb556c090b0",
      "source_id": "CortexDoc8156c6834c304451ab569d264da5f38a1750697087"
    },
    "s2": {
      "chunk_uuid": "47a30767-0a79-4960-bb16-e817833f42e8",
      "source_id": "CortexDoc1c8e2cdc1b924cd0951b940ae8f511c91750697218"
    },
    "s3": {
      "chunk_uuid": "419b52e2-65c7-43d1-a4ed-109662c0d130",
      "source_id": "CortexDoc1c8e2cdc1b924cd0951b940ae8f511c91750697218"
    },
    "s4": {
      "chunk_uuid": "7af5563f-04c0-4c3a-b633-815945f5cc2e",
      "source_id": "CortexDocc77864b3aba34e40afcced8b0257f26b1750697112"
    }
  }
}
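
To resolve the citation markers (s0, s1, ...) back to their chunks, a small jq sketch over this response shape could look like the following (assumes jq is installed; field names as shown above):

# Map each citation marker to its chunk UUID and source document
curl --request POST \
  --url https://api.usecortex.ai/search/qna \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{"question": "What is Cortex AI", "tenant_id": "tenant_1234"}' \
  | jq -r '.source_id_map | to_entries[] | "\(.key): chunk \(.value.chunk_uuid) from \(.value.source_id)"'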

๐Ÿ“ Layout Field

The layout field provides coordinates for creating bounding boxes around cited sources:
Note: For PowerPoint (PPT) and Excel (XLSX) files, the page field will be returned as an empty string since these file formats don't use traditional page numbering.
  • page (number): The page number where the content appears
  • coordinates (object): The bounding-box position and size, with:
    • x (number): Left position
    • y (number): Top position
    • width (number): Width of the bounding box
    • height (number): Height of the bounding box
This layout information enables you to highlight or create visual indicators around the exact location of cited content within documents.
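
As a sketch, you could pull the first source's page and bounding box with jq and hand them to your document viewer (response shape as shown above; the highlighting itself is viewer-specific):

# Extract the page number and bounding box of the first cited source
curl --request POST \
  --url https://api.usecortex.ai/search/qna \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{"question": "What is Cortex AI", "tenant_id": "tenant_1234"}' \
  | jq '.sources[0] | {page: .layout.page, box: .layout.coordinates}'
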
Important Note: Cortex internally uses the Cortex Metadata Agent, which specializes in metadata-specific search. Reserve the metadata filter for cases where you need to deterministically fetch results from specific documents based on their metadata.

Error Responses

All endpoints return consistent error responses following the standard format. For detailed error information, see our Error Responses documentation.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

Request model for the QnA search API.

tenant_id
string
required

Unique identifier for the tenant/organization

Example:

"tenant_1234"

question
string
required

The question to answer based on indexed sources

Minimum string length: 1
Example:

"What is Cortex AI"

sub_tenant_id
string | null

Optional sub-tenant identifier used to organize data within a tenant. If omitted, the default sub-tenant created during tenant setup will be used.

Example:

"sub_tenant_4567"

max_chunks
integer
default:10

Maximum number of context chunks to retrieve

Required range: 1 <= x <= 50
Example:

1

mode
enum<string>
default:fast

Retrieval mode: 'fast' for single query, 'accurate' for multi-query with reranking

Available options:
fast,
accurate
alpha
number
default:0.8

Hybrid search alpha (0.0 = sparse/keyword, 1.0 = dense/semantic)

Required range: 0 <= x <= 1
Example:

1

search_mode
enum<string>
default:sources

What to search: 'sources' for documents or 'memories' for user memories

Available options:
sources,
memories
include_graph_context
boolean
default:true

Whether to include knowledge graph context for enhanced answers

Example:

true

extra_context
string | null

Additional context to guide retrieval and answer generation

llm_provider
enum<string>
default:groq

LLM provider for answer generation

Available options:
groq,
cerebras,
openai,
anthropic,
gemini
model
string | null

Specific model to use (defaults to provider's default model)

temperature
number
default:0.3

LLM temperature for answer generation (lower = more focused)

Required range: 0 <= x <= 2
Example:

1

max_tokens
integer
default:4096

Maximum tokens for the generated answer

Required range: 100 <= x <= 16000
Example:

4096
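
Putting the documented body fields together, a request against this schema might look like the sketch below (values chosen to illustrate the defaults and ranges above):

curl --request POST \
  --url https://api.usecortex.ai/search/qna \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "tenant_id": "tenant_1234",
  "question": "What is Cortex AI",
  "sub_tenant_id": "sub_tenant_4567",
  "max_chunks": 10,
  "mode": "accurate",
  "alpha": 0.8,
  "search_mode": "sources",
  "include_graph_context": true,
  "llm_provider": "openai",
  "temperature": 0.3,
  "max_tokens": 4096
}'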

Response

Successful Response

Response model for the QnA search API.

answer
string
required

The AI-generated answer based on retrieved context

Example:

"<answer>"

success
boolean
default:true
Example:

true

chunks
VectorStoreChunk · object[]

Retrieved context chunks used to generate the answer

Example:
[]
graph_context
GraphContext · object

Knowledge graph context (entity paths and chunk relations)

model_used
string | null

The LLM model used for answer generation

timing
Timing · object

Timing information (retrieval_ms, answer_generation_ms, total_ms)