QnA
Ask a question and get an AI-generated answer based on your indexed sources or memories.
The response includes both the AI answer and the source chunks used to generate it, enabling full transparency and citation capabilities.
Use search_mode to specify what to search:
- "sources" (default): Search over indexed documents
- "memories": Search over user memories
Use mode to control retrieval quality:
- "fast" (default): Single query, faster response
- "accurate": Multi-query generation with reranking, higher quality
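The two options above can be combined in a single request body. The following is a minimal sketch, not an official client: `build_qna_payload` is a hypothetical helper, and `tenant_id` is an assumed field name for the tenant identifier. The other keys are the parameter names used on this page.

```python
from typing import Any, Dict, Optional

# Values documented above for the two mode parameters.
ALLOWED_SEARCH_MODES = {"sources", "memories"}
ALLOWED_MODES = {"fast", "accurate"}


def build_qna_payload(
    tenant_id: str,
    question: str,
    search_mode: str = "sources",
    mode: str = "fast",
    sub_tenant_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a QnA request body, rejecting unknown mode values early."""
    if search_mode not in ALLOWED_SEARCH_MODES:
        raise ValueError(f"search_mode must be one of {sorted(ALLOWED_SEARCH_MODES)}")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}")
    payload: Dict[str, Any] = {
        "tenant_id": tenant_id,  # assumed field name
        "question": question,
        "search_mode": search_mode,
        "mode": mode,
    }
    # Leave sub_tenant_id out entirely so the default sub-tenant is used.
    if sub_tenant_id is not None:
        payload["sub_tenant_id"] = sub_tenant_id
    return payload


# Search user memories with the higher-quality retrieval mode.
payload = build_qna_payload(
    "tenant_1234", "What is Cortex AI", search_mode="memories", mode="accurate"
)
```

Validating the two values on the client side is a design choice in this sketch; the API may perform its own validation.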
If you omit sub_tenant_id, QnA searches within the default sub-tenant created when your tenant was set up. This searches across organization-wide documents and knowledge.
QnA Capabilities
The QnA endpoint provides intelligent question-answering with several features:
AI-Generated Responses
- Natural Language Processing: Understands complex questions and context
- Citation-Based Answers: Every answer includes source references with exact locations
- Conversational Context: Maintains conversation history through session management
- Multi-Step Reasoning: Can break down complex questions into logical steps
Advanced Search Integration
- Hybrid Search: Combines semantic and keyword search for optimal results
- Context-Aware Retrieval: Finds relevant information based on question context
- Source Highlighting: Identifies and highlights the most relevant content chunks
Customization Options
- User Instructions: Provide custom instructions to guide AI behavior
- Metadata Filtering: Filter results by source type, title, or other metadata
- Streaming Support: Get real-time responses for better user experience
- Auto-Agent Routing: Automatically route queries to specialized agents
Key Parameters
Core Parameters
Question & Session Management
- question: The question you want answered (required)
- session_id: Unique identifier for maintaining conversation context
- user_name: Optional user identifier for personalized responses
Search Configuration
- search_alpha: Balance between semantic and keyword search (0.0-1.0)
- top_n: Number of relevant chunks to retrieve for context
- recency_bias: Prioritize recent content (0.0-1.0)
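This page does not say how search_alpha is applied internally. One common formulation in hybrid search is a linear blend of a semantic (dense) score and a keyword (sparse) score, sketched below purely as an illustration. `hybrid_score` and the sample scores are hypothetical and do not describe the actual ranking code.

```python
def hybrid_score(dense_score: float, sparse_score: float, alpha: float) -> float:
    """Blend two relevance scores: alpha=1.0 is fully semantic, alpha=0.0 is fully keyword."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return alpha * dense_score + (1.0 - alpha) * sparse_score


# Illustrative scores for one chunk (made-up numbers).
dense, sparse = 0.8, 0.2
keyword_only = hybrid_score(dense, sparse, alpha=0.0)   # equals the sparse score
semantic_only = hybrid_score(dense, sparse, alpha=1.0)  # equals the dense score
balanced = hybrid_score(dense, sparse, alpha=0.5)       # midpoint of the two
```

Under this assumed model, content that relies on exact terms (part numbers, error codes) tends to benefit from lower values, while conceptual questions tend to benefit from higher ones.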
AI Generation Control
- ai_generation: Enable/disable AI response generation
- multi_step_reasoning: Enable complex reasoning for difficult questions
- user_instructions: Custom instructions to guide AI behavior
- auto_agent_routing: Automatically route to specialized agents
Response Formatting
- stream: Enable streaming responses for real-time output
- highlight_chunks: Include highlighted relevant chunks in response
- context_list: Provide additional context for the AI
Advanced Features
Metadata Filtering
Use the metadata object to filter results.
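As a rough sketch of what a filtered request body might look like: the keys inside metadata (`source_type`, `title`) are hypothetical names chosen to mirror the "source type" and "title" filters mentioned earlier on this page, and `with_metadata_filter` is an illustrative helper, not part of any SDK.

```python
import copy
from typing import Any, Dict


def with_metadata_filter(body: Dict[str, Any], **filters: Any) -> Dict[str, Any]:
    """Return a copy of the request body with a metadata filter attached."""
    filtered = copy.deepcopy(body)  # leave the caller's dict untouched
    filtered["metadata"] = dict(filters)
    return filtered


base = {"question": "What is the refund policy?"}
# Hypothetical filter keys; check your indexed metadata for the real ones.
request_body = with_metadata_filter(base, source_type="pdf", title="Refund Policy")
```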
Session Management
- Persistent Context: Maintain conversation history across multiple questions
- Context Accumulation: Build understanding over multiple interactions
- User Personalization: Adapt responses based on user preferences
Multi-Step Reasoning
When enabled, the AI can:
- Break down complex questions into smaller parts
- Analyze multiple sources of information
- Synthesize information from different documents
- Provide step-by-step explanations
Use Cases
Customer Support
- FAQ Automation: Answer common customer questions automatically
- Product Information: Provide detailed product specifications and features
- Troubleshooting: Guide users through problem-solving steps
Knowledge Management
- Document Q&A: Ask questions about uploaded documents and manuals
- Research Assistance: Find and synthesize information from multiple sources
- Training Support: Answer questions about company policies and procedures
Content Discovery
- Information Retrieval: Find specific information within large document collections
- Contextual Search: Get answers that understand the broader context
- Citation Tracking: See exactly where information comes from
Best Practices
Question Formulation
- Be Specific: Ask clear, specific questions for better results
- Provide Context: Include relevant background information when needed
- Use Natural Language: Ask questions as you would to a human expert
Session Management
- Maintain Context: Use consistent session IDs for related questions
- Build Understanding: Ask follow-up questions to deepen the conversation
- Reset When Needed: Start new sessions for unrelated topics
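The session practices above can be sketched as follows. This assumes a session_id can be any unique string generated by the client (a UUID here); that format and the helper names are assumptions, not something this page specifies.

```python
import uuid
from typing import Dict, List


def new_session_id() -> str:
    """Generate a fresh identifier for a new, unrelated conversation."""
    return str(uuid.uuid4())


def bodies_for_session(session_id: str, questions: List[str]) -> List[Dict[str, str]]:
    """Attach the same session_id to every related question."""
    return [{"question": q, "session_id": session_id} for q in questions]


# Related follow-up questions share one session.
billing_session = new_session_id()
billing_requests = bodies_for_session(
    billing_session,
    ["How is usage billed?", "Does that include archived documents?"],
)

# An unrelated topic starts a new session.
onboarding_session = new_session_id()
```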
Response Optimization
- Enable Highlighting: Use highlight_chunks to see source relevance
- Adjust Search Alpha: Experiment with different values for your content type
- Use Metadata Filtering: Narrow results to specific document types when needed
Response
Returns a JSON object containing the AI-generated answer and the supporting source chunks, with layout information for drawing bounding boxes around cited sources.
Layout Field
The layout field provides coordinates for creating bounding boxes around cited sources:
Note: For PowerPoint (PPT) and Excel (XLSX) files, the page field is returned as an empty string, since these file formats don't use traditional page numbering.
- page (number): The page number where the content appears
- coordinates (object): Alternative coordinate format with:
  - x (number): Left position
  - y (number): Top position
  - width (number): Width of the bounding box
  - height (number): Height of the bounding box
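A minimal sketch of turning the layout fields above into a drawable rectangle. It assumes y increases downward from the top of the page and that a layout object nests `page` and `coordinates` as shown; the sample values are made up, and the coordinate units are not stated on this page.

```python
from typing import Any, Dict, Optional, Tuple


def bounding_box(coordinates: Dict[str, float]) -> Tuple[float, float, float, float]:
    """Convert x/y/width/height into (left, top, right, bottom) edges."""
    left = coordinates["x"]
    top = coordinates["y"]
    right = left + coordinates["width"]
    bottom = top + coordinates["height"]
    return (left, top, right, bottom)


def page_number(layout: Dict[str, Any]) -> Optional[int]:
    """Return the page as an int, or None when it is an empty string (PPT/XLSX)."""
    page = layout.get("page", "")
    return None if page == "" else int(page)


# Illustrative layout object (made-up values).
layout = {"page": 3, "coordinates": {"x": 72.0, "y": 140.5, "width": 300.0, "height": 48.0}}
box = bounding_box(layout["coordinates"])  # (72.0, 140.5, 372.0, 188.5)
page = page_number(layout)                 # 3
```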
Important Note: Cortex internally uses the Cortex Metadata Agent, which specializes in metadata-specific search. Use the metadata field as a filter only when you want to deterministically fetch results from specific documents based on their metadata.
Error Responses
All endpoints return consistent error responses following the standard format. For detailed error information, see our Error Responses documentation.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
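The header format above can be built as in this short sketch. `auth_headers` is a hypothetical helper, and the JSON Content-Type is an assumption based on the request being sent as a JSON body.

```python
from typing import Dict


def auth_headers(token: str) -> Dict[str, str]:
    """Build request headers carrying the bearer token."""
    if not token:
        raise ValueError("token must be a non-empty string")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",  # assumed: body is sent as JSON
    }


headers = auth_headers("my-auth-token")  # placeholder token, not a real credential
```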
Body
Request model for the QnA search API.
Unique identifier for the tenant/organization
"tenant_1234"
The question to answer based on indexed sources
Minimum length: 1
"What is Cortex AI"
Optional sub-tenant identifier used to organize data within a tenant. If omitted, the default sub-tenant created during tenant setup will be used.
"sub_tenant_4567"
Maximum number of context chunks to retrieve
1 <= x <= 50
Retrieval mode: 'fast' for single query, 'accurate' for multi-query with reranking
Allowed values: fast, accurate
Hybrid search alpha (0.0 = sparse/keyword, 1.0 = dense/semantic)
0 <= x <= 1
What to search: 'sources' for documents or 'memories' for user memories
Allowed values: sources, memories
Whether to include knowledge graph context for enhanced answers
true
Additional context to guide retrieval and answer generation
LLM provider for answer generation
Allowed values: groq, cerebras, openai, anthropic, gemini
Specific model to use (defaults to provider's default model)
LLM temperature for answer generation (lower = more focused)
0 <= x <= 2
Maximum tokens for the generated answer
100 <= x <= 16000
Response
Successful Response
Response model for the QnA search API.
The AI-generated answer based on retrieved context
"<answer>"
true
Retrieved context chunks used to generate the answer
[]
Knowledge graph context (entity paths and chunk relations)
The LLM model used for answer generation
Timing information (retrieval_ms, answer_generation_ms, total_ms)
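The timing keys listed above (retrieval_ms, answer_generation_ms, total_ms) can be used to see where a request spends its time. A small sketch follows; the sample numbers are invented for illustration, and `generation_share` is a hypothetical helper.

```python
from typing import Dict


def generation_share(timing: Dict[str, float]) -> float:
    """Fraction of total time spent generating the answer (0.0 if total is not positive)."""
    total = timing["total_ms"]
    if total <= 0:
        return 0.0
    return timing["answer_generation_ms"] / total


# Made-up timing object using the documented key names.
timing = {"retrieval_ms": 120, "answer_generation_ms": 880, "total_ms": 1000}
share = generation_share(timing)
print(f"Answer generation took {share:.0%} of the request")
```

If generation dominates, switching between the "fast" and "accurate" retrieval modes is unlikely to change latency much, since mode is described as a retrieval setting.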