Upload Content

Hit the Try it button to try this API now in our playground. It’s the best way to check the full request and response in one place, customize your parameters, and generate ready-to-use code snippets.

Examples

API Request

curl -X 'POST' \
  'https://api.usecortex.ai/ingestion/upload-content' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer <token>' \
  -H 'Content-Type: application/json' \
  -d '{
    "content": {
      "contents": [
        {
          "file_id": "string",
          "content": "# Introduction\n\nThis is the document content.",
          "is_markdown": false,
          "tenant_metadata": "",
          "document_metadata": "",
          "relations": false
        }
      ]
    },
    "tenant_id": "string",
    "sub_tenant_id": "",
    "upsert": true
  }'

TypeScript

const result = await client.upload.uploadText({
  tenant_id: "tenant_1234",
  sub_tenant_id: "sub_tenant_4567",
  body: {
    content: "Your text content here",
    file_id: "text_doc_123456",
    tenant_metadata: {},
    document_metadata: {}
  }
});

Python (Sync)

# Async usage is similar: use async_client and await the call
result = client.upload.upload_text(
    tenant_id="tenant_1234",
    sub_tenant_id="sub_tenant_4567",
    content="Your text content here",
    file_id="text_doc_123456",
    tenant_metadata={},
    document_metadata={}
)

Upload text content directly to your tenant’s knowledge base. The text will be processed, chunked, and indexed for search and retrieval.
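To make the request shape concrete, here is a minimal Python sketch that assembles the same JSON body as the API Request example, using only the standard library. The helper name build_upload_payload is hypothetical and not part of any SDK; the field names are copied from the curl example, and nothing is sent over the network.

```python
import json


def build_upload_payload(tenant_id, items, sub_tenant_id="", upsert=True):
    """Assemble the JSON body from the curl example (hypothetical helper)."""
    return {
        "content": {"contents": list(items)},
        "tenant_id": tenant_id,
        "sub_tenant_id": sub_tenant_id,  # "" is the documented default
        "upsert": upsert,                # true is the documented default
    }


# One item, with field names taken from the curl example.
item = {
    "file_id": "text_doc_123456",
    "content": "# Introduction\n\nThis is the document content.",
    "is_markdown": False,
}

payload = build_upload_payload("tenant_1234", [item])
body = json.dumps(payload)  # string you would pass as the POST body
```

Because contents is a list, the same helper can carry several items in one request by passing more than one dictionary.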

Text Processing Pipeline

When you upload text content, it goes through a streamlined processing pipeline optimized for direct text input:

1. Immediate Upload & Queue

Your text content is immediately accepted and stored securely
It’s added to our processing queue for background processing
You receive a confirmation response with a file_id for tracking

2. Text Processing Phase

Our system automatically handles:

Content Validation: Ensuring text content is properly formatted and accessible
Format Detection: Identifying markdown, plain text, or structured content
Text Normalization: Cleaning and standardizing text formatting

3. Intelligent Chunking

Text is split into semantically meaningful chunks
Chunk size is optimized for both context preservation and search accuracy
Overlapping boundaries ensure no information is lost between chunks
Metadata is preserved and associated with each chunk
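The idea of overlapping chunk boundaries can be illustrated with a small sketch. This is not the service's chunker, which is not documented here: it swaps in fixed-size word windows for semantic chunking, and the names chunk_words and attach_metadata as well as the chunk_size and overlap values are assumptions chosen for illustration.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into word windows that share `overlap` words with the previous window."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def attach_metadata(chunks, file_id, metadata):
    """Pair every chunk with its source file_id and a copy of the document metadata."""
    return [
        {"file_id": file_id, "chunk_index": i, "text": chunk, "metadata": dict(metadata)}
        for i, chunk in enumerate(chunks)
    ]


sample = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(sample, chunk_size=5, overlap=2)
records = attach_metadata(chunks, "text_doc_123456", {"source": "example"})
```

Each window repeats the last two words of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.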

4. Embedding Generation

Each chunk is converted into high-dimensional vector embeddings
Embeddings capture semantic meaning and context
Vectors are optimized for similarity search and retrieval
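As a rough illustration of what similarity search over embeddings means, the sketch below ranks chunks by cosine similarity to a query vector. The three-dimensional vectors are made-up toy values, not real embeddings, and the function names are hypothetical; the embedding model and similarity metric the service uses are not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(query_vector, chunk_vectors):
    """Return chunk ids ordered from most to least similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), chunk_id)
        for chunk_id, vector in chunk_vectors.items()
    ]
    return [chunk_id for _, chunk_id in sorted(scored, reverse=True)]


# Toy vectors standing in for real embeddings.
chunk_vectors = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.7, 0.7, 0.0],
    "chunk_c": [0.0, 0.0, 1.0],
}
ranking = rank_chunks([1.0, 0.1, 0.0], chunk_vectors)
```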

5. Indexing & Database Updates

Embeddings are stored in our vector database for fast similarity search
Full-text search indexes are created for keyword-based queries
Metadata is indexed for filtering and faceted search
Cross-references are established for related content

6. Quality Assurance

Automated quality checks ensure processing accuracy
Content validation verifies text completeness
Embedding quality is assessed for optimal retrieval performance

Processing Time: Text content is typically processed and searchable within 1-3 minutes. Large text blocks (10,000+ words) may take up to 5 minutes. You can check processing status using the file_id returned in the response.

Default Sub-Tenant Behavior: If you don’t specify a sub_tenant_id, the text content will be uploaded to the default sub-tenant created when your tenant was set up. This is perfect for organization-wide content that should be accessible across all departments.

File ID Management: The system uses a priority-based approach for file ID assignment:

First Priority: If you provide a file_id as a direct body parameter, that specific ID will be used

Second Priority: If no direct file_id is provided, the system checks for a file_id in the document_metadata object

Auto-Generation: If neither source provides a file_id, the system will automatically generate a unique identifier
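The priority order above can be sketched as a small function. This is an illustration of the rule rather than the service's implementation: resolve_file_id is a hypothetical name, document_metadata is assumed to be a dictionary, empty strings are assumed to count as "not provided", and uuid4 stands in for whatever generator the service actually uses.

```python
import uuid


def resolve_file_id(file_id=None, document_metadata=None):
    """Pick a file_id using the documented priority order."""
    # First priority: file_id passed as a direct body parameter.
    if file_id:
        return file_id
    # Second priority: file_id found inside document_metadata.
    metadata_id = (document_metadata or {}).get("file_id")
    if metadata_id:
        return metadata_id
    # Auto-generation: uuid4 is only a stand-in for the service's generator.
    return uuid.uuid4().hex


direct = resolve_file_id("text_doc_123456", {"file_id": "from_metadata"})
from_metadata = resolve_file_id(None, {"file_id": "from_metadata"})
generated = resolve_file_id()
```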

Duplicate File ID Behavior

When you upload text content with a file_id that already exists in your tenant:

Overwrite Behavior: The existing text content with the same file_id will be completely replaced with the new content
Processing: The new text content will go through the full processing pipeline (validation, chunking, embedding generation, indexing)
Search Results: Previous search results and embeddings from the old content will be replaced with the new content
Idempotency: Uploading the same text content with the same file_id multiple times is safe and will result in the same final state

Important: When overwriting existing text content, all previous chunks, embeddings, and search indexes associated with that file_id will be permanently removed and replaced. This action cannot be undone.
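The overwrite and idempotency rules can be modeled with a toy in-memory index. InMemoryIndex is a hypothetical stand-in for the knowledge base, and splitting on ". " is a placeholder for real chunking; the sketch only shows that re-uploading under the same file_id replaces every stored chunk and that repeating an identical upload leaves the same final state.

```python
class InMemoryIndex:
    """Toy stand-in for the knowledge base: maps file_id to a list of chunks."""

    def __init__(self):
        self._chunks_by_file = {}

    def upload(self, file_id, text):
        overwritten = file_id in self._chunks_by_file
        # Assigning a new list discards every chunk stored for the old content.
        self._chunks_by_file[file_id] = text.split(". ")
        return {"file_id": file_id, "overwritten": overwritten}

    def chunks(self, file_id):
        return list(self._chunks_by_file.get(file_id, []))


index = InMemoryIndex()
first = index.upload("text_123456", "Old intro. Old body")
second = index.upload("text_123456", "New intro")   # replaces the old chunks
third = index.upload("text_123456", "New intro")    # same content, same final state
final_chunks = index.chunks("text_123456")
```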

Example Success Response for Duplicate File ID:

{
  "message": "Text content uploaded successfully. Existing content with file_id 'text_123456' has been overwritten.",
  "file_id": "text_123456",
  "status": "success"
}

Processing Status & Monitoring

After uploading, you can monitor your text content’s processing status:

Immediate Response

Upon successful upload, you’ll receive:

{
  "message": "Text content uploaded successfully",
  "file_id": "doc_123456"
}

Processing States

Your text content will progress through these states:

queued: Text content is in the processing queue, waiting to be processed
in_progress: Text content is actively being processed (includes validation, chunking, embedding generation, and indexing)
success: Text content is fully processed and searchable
errored: Processing encountered an error (rare occurrence)

In-Progress Details: While the status shows in_progress, the system is actually performing multiple steps: content validation, format detection, intelligent chunking, embedding generation, and database indexing. These happen sequentially but are all part of the single in_progress state.
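A client typically waits for a terminal state before querying. The sketch below polls until the state is success or errored. The status endpoint is not described in this section, so fetch_status is a hypothetical callable you would supply; the 10-second interval is an arbitrary choice, and the 300-second timeout mirrors the "up to 5 minutes" figure given above.

```python
import time

TERMINAL_STATES = {"success", "errored"}
KNOWN_STATES = {"queued", "in_progress"} | TERMINAL_STATES


def wait_until_processed(fetch_status, file_id, poll_seconds=10.0,
                         timeout_seconds=300.0, sleep=time.sleep):
    """Poll fetch_status(file_id) until a terminal state or the timeout is reached."""
    waited = 0.0
    while True:
        state = fetch_status(file_id)
        if state not in KNOWN_STATES:
            raise ValueError(f"unexpected processing state: {state!r}")
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(f"{file_id} still {state} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Demo with a fake status source instead of a real API call.
fake_states = iter(["queued", "in_progress", "in_progress", "success"])
final_state = wait_until_processed(
    lambda file_id: next(fake_states),
    "doc_123456",
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
```

Passing sleep as a parameter keeps the loop testable without real delays; in production code the default time.sleep applies.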

When Your Text is Ready

Once processing is complete, your text content will be:

✅ Searchable via semantic search and Q&A endpoints
✅ Retrievable through our retrieval APIs
✅ Available for AI-powered applications
✅ Indexed for fast query performance

Important: Don’t attempt to search or retrieve your text content immediately after upload. Wait for processing to complete (typically 1-3 minutes) to ensure optimal results.

Error Responses

All endpoints return consistent error responses following the standard format. For detailed error information, see our Error Responses documentation.

Authorizations

Authorization (string, header, required)

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
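As a sketch of how the Bearer header attaches to a request, the snippet below builds a POST request with Python's standard urllib.request module without sending it. The helper name build_authorized_request and the token value YOUR_TOKEN are placeholders, and the payload is a trimmed example body.

```python
import json
import urllib.request


def build_authorized_request(url, payload, token):
    """Create (but do not send) a JSON POST request carrying a Bearer token."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "accept": "application/json",
        },
        method="POST",
    )


request = build_authorized_request(
    "https://api.usecortex.ai/ingestion/upload-content",
    {"tenant_id": "tenant_1234", "content": {"contents": []}},
    token="YOUR_TOKEN",  # placeholder, replace with your auth token
)
# Sending would be: urllib.request.urlopen(request)
```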

Body

application/json

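content (ContentUploadRequest · object, required)

The content to upload. As in the API Request example, it wraps a contents array whose items carry file_id, content, is_markdown, tenant_metadata, document_metadata, and relations.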

tenant_id (string, required)

Unique identifier for the tenant/organization

Example: "tenant_1234"

sub_tenant_id (string, default: "")

Optional sub-tenant identifier used to organize data within a tenant. If omitted, the default sub-tenant created during tenant setup will be used.

Example: "sub_tenant_4567"

upsert (boolean, default: true)

If true, existing sources with the same source_id are updated.

Example: true

Response

Successful Response

success (boolean, default: true)

Example: true

message (string, default: "Upload initiated successfully")

results (SourceUploadResultItem · object[])

List of upload results for each source.

Example: []

success_count (integer, default: 0)

Number of sources successfully queued.

Example: 1

failed_count (integer, default: 0)

Number of sources that failed to upload.

Example: 1
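One way to consume this response is to load it into a small typed container and check the counters. The class below is a hypothetical client-side sketch, not part of any SDK: field names and defaults come from the schema above, the shape of each results item is left untyped because its child attributes are not listed here, and all_queued is an assumed convenience check.

```python
from dataclasses import dataclass, field, fields


@dataclass
class UploadResponse:
    """Client-side view of the response fields listed above (hypothetical)."""
    success: bool = True
    message: str = "Upload initiated successfully"
    results: list = field(default_factory=list)
    success_count: int = 0
    failed_count: int = 0

    @classmethod
    def from_json(cls, data):
        # Keep only documented fields so unknown keys do not break parsing.
        names = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in data.items() if key in names})

    def all_queued(self):
        """True when the call succeeded and no source failed to upload."""
        return self.success and self.failed_count == 0


ok = UploadResponse.from_json(
    {"success": True, "results": [], "success_count": 1, "failed_count": 0}
)
partial = UploadResponse.from_json(
    {"success": True, "success_count": 1, "failed_count": 1, "unknown_key": "ignored"}
)
```

Checking failed_count as well as success guards against a request that is accepted overall while some of its items fail.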
