Hybrid Search
Search for relevant content within your indexed sources or user memories.
Results are ranked by relevance and can be customized with parameters like result limits, alpha weighting, and recency preferences.
Use search_mode to specify what to search:
- “sources” (default): Search over indexed documents
- “memories”: Search over user memories (uses inferred content)
Use mode to control retrieval quality:
- “fast” (default): Single query, faster response
- “accurate”: Multi-query generation with reranking, higher quality
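A minimal sketch of a request body that sets both switches. The values for search_mode and mode come from the lists above; the field names `tenant_id` and `query` and the helper `build_search_payload` are assumptions made for illustration, not confirmed parts of the API.

```python
# Allowed values, taken from the lists above.
SEARCH_MODES = {"sources", "memories"}
RETRIEVAL_MODES = {"fast", "accurate"}


def build_search_payload(tenant_id, query, search_mode="sources", mode="fast"):
    """Return a JSON-serializable body for a hybrid search request.

    `tenant_id` and `query` are assumed field names (hypothetical).
    """
    if search_mode not in SEARCH_MODES:
        raise ValueError(f"search_mode must be one of {sorted(SEARCH_MODES)}")
    if mode not in RETRIEVAL_MODES:
        raise ValueError(f"mode must be one of {sorted(RETRIEVAL_MODES)}")
    return {
        "tenant_id": tenant_id,
        "query": query,
        "search_mode": search_mode,
        "mode": mode,
    }


# Search user memories with the slower, higher-quality retrieval mode.
payload = build_search_payload(
    "tenant_1234",
    "Which mode does user prefer",
    search_mode="memories",
    mode="accurate",
)
```

Rejecting unknown values before the request is sent keeps typos such as "memory" from reaching the server.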
If you omit sub_tenant_id, the search is performed within the default sub-tenant created when your tenant was set up. This searches across organization-wide documents.
Search Modes
The Hybrid Search endpoint combines multiple search strategies to provide the most relevant results:
Semantic Search
- Purpose: Finds content based on meaning and context, not just exact keywords
- Best for: Conceptual queries, finding related content, understanding intent
- Example: Searching for “machine learning” will also find content about “AI”, “neural networks”, “deep learning”
Keyword Search
- Purpose: Finds content containing specific terms or phrases
- Best for: Exact term matching, technical specifications, proper nouns
- Example: Searching for “TensorFlow 2.0” will find documents mentioning this specific version
Hybrid Approach
- Purpose: Combines semantic understanding with keyword precision
- Best for: Most use cases where you want both relevance and accuracy
- Example: “Python data analysis libraries” finds both semantic matches (pandas, numpy) and exact keyword matches
Search Parameters
Alpha Parameter
Controls the balance between semantic and keyword search:
- 0.0: Pure keyword search only
  - Best for: Exact term matching, technical specifications
  - Use when: You need precise keyword matches
- 1.0: Pure semantic search only
  - Best for: Conceptual queries, finding related content
  - Use when: You want to discover related concepts
- 0.8: Default balanced approach (recommended)
  - Best for: Most general use cases
  - Provides optimal balance of precision and recall
- "auto": Intelligent auto-selection
  - Cortex analyzes your query and chooses the optimal alpha
  - Best for: When you’re unsure which approach to use
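To make the effect of alpha concrete, here is a sketch that blends two per-result scores with a convex combination. This is a common way to implement such a weight and is only an assumption here: the text does not state the formula Cortex uses, and `blend_scores` is a hypothetical helper.

```python
def blend_scores(semantic, keyword, alpha=0.8):
    """Weight the semantic score by alpha and the keyword score by 1 - alpha.

    Illustrative only; not Cortex's documented ranking formula.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return alpha * semantic + (1.0 - alpha) * keyword


# A result that matches strongly on meaning but weakly on exact terms.
semantic_score, keyword_score = 0.9, 0.2

keyword_only = blend_scores(semantic_score, keyword_score, alpha=0.0)   # 0.2
semantic_only = blend_scores(semantic_score, keyword_score, alpha=1.0)  # 0.9
balanced = blend_scores(semantic_score, keyword_score)                  # between the two
```

At the two extremes the blend reduces to one of the input scores, which matches the "pure keyword" and "pure semantic" descriptions above.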
Recency Bias
Controls how much recent content is prioritized:
- 0.0: No recency bias (default)
- 0.1-0.5: Light to moderate recency preference
- 0.6-1.0: Strong recency preference
  - Best for: News, documentation updates, time-sensitive information
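The following toy model shows one way a recency weight could shift rankings. How Cortex actually applies recency is not described here, so the exponential freshness decay, the `half_life_days` setting, and the function name are all assumptions chosen for illustration.

```python
def apply_recency_bias(relevance, age_days, recency_bias=0.0, half_life_days=30.0):
    """Mix a relevance score with a freshness score.

    Freshness starts at 1.0 and halves every `half_life_days` (assumed model).
    `recency_bias` in [0.0, 1.0] sets how much of the result comes from freshness.
    """
    if not 0.0 <= recency_bias <= 1.0:
        raise ValueError("recency_bias must be between 0.0 and 1.0")
    freshness = 0.5 ** (age_days / half_life_days)
    return (1.0 - recency_bias) * relevance + recency_bias * freshness


# Two equally relevant chunks: one from yesterday, one from last year.
new_chunk = apply_recency_bias(0.5, age_days=1, recency_bias=0.5)
old_chunk = apply_recency_bias(0.5, age_days=365, recency_bias=0.5)
```

With a bias of 0.0 the relevance score passes through unchanged; as the bias grows, the newer chunk pulls ahead of the older one.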
Max Chunks
Controls the number of results returned:
- Range: 1-1001 chunks
- Default: System limit
- Recommendation: Start with 10-20 for most use cases
Personalise Search
Enables personalized search results based on user memories from the corresponding tenant and sub-tenant combination:
- true: Enable personalized search results
  - Leverages user memories stored in the tenant/sub-tenant combination
  - Provides more relevant and tailored search results
  - Considers the user’s historical interactions and preferences
- false: Standard search without personalization (default)
  - Returns results based purely on content relevance
  - No user-specific context applied
Knowledge Graph Context
Search results are automatically enriched with knowledge graph context, providing entity relationships extracted from your content. What’s included in responses:
- extra_context.chunk_relations: Entities and relationships found within each chunk
- extra_graph_context: Additional entity relationships extracted from your query

Learn More: Knowledge Graphs
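As a rough sketch of reading these fields, the helper below gathers relations from a response that has already been parsed into a dict. Only the names extra_context.chunk_relations and extra_graph_context come from the text above; the top-level `chunks` list, the list-valued fields, and the shape of each relation entry are assumptions.

```python
def collect_relations(response):
    """Return (per-chunk relations, query-level relations) from a response dict.

    Assumed shape (hypothetical): response["chunks"] is a list of chunk dicts,
    each optionally carrying extra_context.chunk_relations as a list, and
    response["extra_graph_context"] is a list at the top level.
    """
    chunk_relations = []
    for chunk in response.get("chunks", []):
        extra = chunk.get("extra_context") or {}
        chunk_relations.extend(extra.get("chunk_relations") or [])
    query_relations = list(response.get("extra_graph_context") or [])
    return chunk_relations, query_relations


# Hand-written sample data, not real API output.
sample_response = {
    "chunks": [
        {"extra_context": {"chunk_relations": [{"source": "pandas", "target": "Python"}]}},
        {"extra_context": None},
    ],
    "extra_graph_context": [{"source": "Python", "target": "data analysis"}],
}
from_chunks, from_query = collect_relations(sample_response)
```

Missing or empty fields are treated as "no relations" so the helper does not fail on chunks without graph data.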
Search Optimization Tips
For Better Precision
- Use lower alpha values (0.2-0.4) for exact term matching
- Include specific terminology in your queries
- Set higher max_chunks to get more comprehensive results
For Better Recall
- Use higher alpha values (0.6-0.8) for broader semantic matching
- Try synonyms and related terms in your queries
- Use conceptual language rather than specific terms
- Enable recency bias for time-sensitive content
For Complex Queries
- Use “auto” alpha to let Cortex optimize automatically
- Combine specific terms with conceptual language
- Adjust recency bias based on content type
- Experiment with different alpha values to find optimal results
- Enable personalise_search for user-specific contexts and preferences
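These tips can be captured as reusable presets, sketched below. The alpha values are picked from inside the ranges suggested above; the preset names, the `with_preset` helper, and the `query` field name are assumptions for illustration.

```python
PRESETS = {
    # Lower alpha leans on keyword matching (tips suggest 0.2-0.4).
    "precision": {"alpha": 0.3},
    # Higher alpha leans on semantic matching (tips suggest 0.6-0.8).
    "recall": {"alpha": 0.7},
    # Let Cortex choose alpha for complex queries.
    "complex": {"alpha": "auto"},
}


def with_preset(payload, preset, **overrides):
    """Return a copy of `payload` with preset values, then overrides, applied."""
    if preset not in PRESETS:
        raise KeyError(f"unknown preset: {preset!r}")
    return {**payload, **PRESETS[preset], **overrides}


base = {"query": "TensorFlow 2.0 release notes"}  # `query` is an assumed field name
precise = with_preset(base, "precision", max_chunks=50)
complex_query = with_preset(base, "complex", personalise_search=True)
```

Overrides are applied last, so a caller can start from a preset and still tune max_chunks or personalise_search per request.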
Alpha Parameter
The alpha parameter controls the balance between semantic and keyword search:
- 0.0 = keyword search only
- 1.0 = semantic search only
- 0.8 = default balanced approach
- "auto" = Cortex intelligently decides the optimal alpha value based on the query
Error Responses
All endpoints return consistent error responses following the standard format. For detailed error information, see our Error Responses documentation.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
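A small sketch of building that header in Python. The Authorization format follows the description above; the `auth_headers` helper and the JSON Content-Type are assumptions.

```python
def auth_headers(token):
    """Build request headers carrying a Bearer token."""
    if not token:
        raise ValueError("token must be a non-empty string")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",  # assumed: the body is sent as JSON
    }


headers = auth_headers("YOUR_AUTH_TOKEN")  # placeholder, not a real token
```

In practice the token would be read from an environment variable or secret store rather than written into source code.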
Body
- Unique identifier for the tenant/organization. Example: "tenant_1234"
- Search terms to find relevant content. Example: "Which mode does user prefer"
- Optional sub-tenant identifier used to organize data within a tenant. If omitted, the default sub-tenant created during tenant setup will be used. Example: "sub_tenant_4567"
- Maximum number of results to return
- Retrieval mode to use ('fast' or 'accurate'). Allowed values: fast, accurate
- Search ranking algorithm parameter (0.0-1.0 or 'auto')
- Preference for newer content (0.0 = no bias, 1.0 = strong recency preference). Example: 1
- Enable personalized search results based on user preferences. Example: true
- Enable graph context for search results. Example: true
- Additional context provided by the user to guide retrieval
- What to search: 'sources' for documents or 'memories' for user memories. Allowed values: sources, memories
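The body fields can be modeled as a small typed object that checks the documented ranges before a request is sent. This is a sketch under assumptions: `tenant_id`, `query`, and `recency_bias` are guessed field names, the graph-context and additional-context fields are left out because their names are not given, and the defaults mirror the values stated in the sections above.

```python
from dataclasses import asdict, dataclass
from typing import Optional, Union


@dataclass
class SearchRequest:
    tenant_id: str                        # assumed field name
    query: str                            # assumed field name
    sub_tenant_id: Optional[str] = None   # omitted -> default sub-tenant
    max_chunks: Optional[int] = None      # omitted -> system limit
    mode: str = "fast"
    alpha: Union[float, str] = 0.8
    recency_bias: float = 0.0             # assumed field name
    personalise_search: bool = False
    search_mode: str = "sources"

    def to_body(self):
        """Validate documented ranges and drop optional fields left unset."""
        if self.max_chunks is not None and not 1 <= self.max_chunks <= 1001:
            raise ValueError("max_chunks must be between 1 and 1001")
        if self.alpha != "auto":
            if isinstance(self.alpha, str) or not 0.0 <= self.alpha <= 1.0:
                raise ValueError('alpha must be in [0.0, 1.0] or "auto"')
        if not 0.0 <= self.recency_bias <= 1.0:
            raise ValueError("recency_bias must be between 0.0 and 1.0")
        return {k: v for k, v in asdict(self).items() if v is not None}


request = SearchRequest(
    tenant_id="tenant_1234",
    query="Which mode does user prefer",
    sub_tenant_id="sub_tenant_4567",
    max_chunks=10,
    alpha="auto",
)
body = request.to_body()
```

Leaving sub_tenant_id and max_chunks out of the body when unset lets the server-side defaults described above apply.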