Ask a question and get an AI-generated answer based on your indexed sources or memories.
The response includes both the AI answer and the source chunks used to generate it, enabling full transparency and citation capabilities.
Use search_mode to specify what to search:
Use mode to control retrieval quality:
If you don't provide a sub_tenant_id, the QnA will search within the default sub-tenant created when your tenant was set up. This searches across organization-wide documents and knowledge.

question: The question you want answered (required)
session_id: Unique identifier for maintaining conversation context
user_name: Optional user identifier for personalized responses
search_alpha: Balance between semantic and keyword search (0.0-1.0)
top_n: Number of relevant chunks to retrieve for context
recency_bias: Prioritize recent content (0.0-1.0)
ai_generation: Enable/disable AI response generation
multi_step_reasoning: Enable complex reasoning for difficult questions
user_instructions: Custom instructions to guide AI behavior
auto_agent_routing: Automatically route to specialized agents
stream: Enable streaming responses for real-time output
highlight_chunks: Include highlighted relevant chunks in the response
context_list: Provide additional context for the AI

Use the metadata object to filter results:
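As a hedged sketch, a request body combining these parameters with a metadata filter might look like the following. The specific metadata key and value ("file_name", "2024-report.pdf") and the session id are illustrative, not part of the documented schema.

```python
# Minimal QnA request payload, assuming the documented parameter names.
# The metadata filter key/value and session_id are illustrative placeholders.
payload = {
    "question": "What is Cortex AI",   # required
    "session_id": "session_abc",        # hypothetical session identifier
    "search_alpha": 0.5,                # 0.0 = keyword, 1.0 = semantic
    "top_n": 10,
    "recency_bias": 0.2,
    "ai_generation": True,
    "highlight_chunks": True,
    "metadata": {                       # deterministic filter on document metadata
        "file_name": "2024-report.pdf",
    },
}
```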
Use highlight_chunks to see source relevance.
Note: For PowerPoint (PPT) and Excel (XLSX) files, the page field will be returned as an empty string, since these file formats don't use traditional page numbering.
page (number): The page number where the content appears
coordinates (object): Alternative coordinate format with:
- x (number): Left position
- y (number): Top position
- width (number): Width of the bounding box
- height (number): Height of the bounding box
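The coordinates object describes a bounding box, so a client can derive the box corners directly, for example to draw a highlight overlay. A minimal sketch; the position values below are illustrative, not from a real response:

```python
# Hypothetical chunk position, shaped by the fields documented above.
position = {
    "page": 3,
    "coordinates": {"x": 72.0, "y": 140.5, "width": 310.0, "height": 48.0},
}

# Derive the bounding-box corners from (x, y) + (width, height).
c = position["coordinates"]
top_left = (c["x"], c["y"])
bottom_right = (c["x"] + c["width"], c["y"] + c["height"])
```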
Important Note: Cortex internally uses the Cortex Metadata Agent, which is an expert at performing metadata-specific search. Reserve the metadata field as a filter for cases where you want to deterministically fetch results from specific documents based on their metadata.
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
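Concretely, the authentication header might be assembled like this; the token value is a placeholder you replace with your own auth token:

```python
token = "<token>"  # placeholder: substitute your auth token
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json",
}
```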
Request model for the QnA search API.
Unique identifier for the tenant/organization
"tenant_1234"
The question to answer based on indexed sources
"What is Cortex AI"
Optional sub-tenant identifier used to organize data within a tenant. If omitted, the default sub-tenant created during tenant setup will be used.
"sub_tenant_4567"
Maximum number of context chunks to retrieve
1 <= x <= 50
Retrieval mode: 'fast' for single query, 'accurate' for multi-query with reranking
fast, accurate

Hybrid search alpha (0.0 = sparse/keyword, 1.0 = dense/semantic)
0 <= x <= 1
What to search: 'sources' for documents or 'memories' for user memories
sources, memories

Whether to include knowledge graph context for enhanced answers
true
Additional context to guide retrieval and answer generation
LLM provider for answer generation
groq, cerebras, openai, anthropic, gemini

Specific model to use (defaults to provider's default model)
LLM temperature for answer generation (lower = more focused)
0 <= x <= 2
Maximum tokens for the generated answer
100 <= x <= 16000
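The numeric ranges above can be checked client-side before sending a request. A minimal sketch; it reads the documented ranges as 1-50 for top_n, 0-1 for search_alpha, 0-2 for temperature, and 100-16000 for max tokens:

```python
def validate_qna_params(top_n, search_alpha, temperature, max_tokens):
    """Raise ValueError if any value falls outside its documented range."""
    checks = [
        ("top_n", top_n, 1, 50),
        ("search_alpha", search_alpha, 0, 1),
        ("temperature", temperature, 0, 2),
        ("max_tokens", max_tokens, 100, 16000),
    ]
    for name, value, lo, hi in checks:
        if not (lo <= value <= hi):
            raise ValueError(f"{name}={value} outside [{lo}, {hi}]")

# Passes silently for in-range values.
validate_qna_params(top_n=10, search_alpha=0.5, temperature=0.2, max_tokens=1024)
```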
Successful Response
Response model for the QnA search API.
The AI-generated answer based on retrieved context
"<answer>"
true
Retrieved context chunks used to generate the answer
[]

Knowledge graph context (entity paths and chunk relations)
The LLM model used for answer generation
Timing information (retrieval_ms, answer_generation_ms, total_ms)
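Putting the response fields together, a client might unpack them as follows. The dict below is a mocked response shaped by the fields documented above; the top-level key names (answer, chunks, kg_context, model, timings) and all values are illustrative assumptions, except the timing keys, which are documented:

```python
# Mocked QnA response; key names and values are illustrative placeholders,
# apart from the documented timing keys.
response = {
    "answer": "<answer>",
    "chunks": [],          # retrieved context chunks used to generate the answer
    "kg_context": None,    # knowledge graph context, if requested
    "model": "<model>",    # LLM model used for answer generation
    "timings": {"retrieval_ms": 120, "answer_generation_ms": 900, "total_ms": 1020},
}

answer = response["answer"]
sources = response["chunks"]            # cite these for transparency
total_ms = response["timings"]["total_ms"]
```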