Knowledge Base - Convosphere AI

Overview

A knowledge base is a collection of information that your agent uses to answer questions. You can add:

Documents (PDF, Word, Text files)
Web pages and websites
Notion pages and databases

Adding Documents

Supported Formats

PDF (.pdf)
Microsoft Word (.doc, .docx)
Text Files (.txt)
Markdown (.md)
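
A quick local check before uploading can catch unsupported files early. The sketch below is illustrative only: `check_upload` is a hypothetical helper, not part of Convosphere AI, and it assumes the 10 MB per-file limit from Limits and Quotas means 10 * 1024 * 1024 bytes.

```python
from pathlib import Path

# Extensions taken from the Supported Formats list.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".txt", ".md"}
# Assumption: the documented 10 MB limit is interpreted as 10 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 10 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower()  # case-insensitive extension match
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(no extension)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems
```

For example, `check_upload("handbook.PDF", 2_000_000)` returns an empty list, while a `.pptx` file is reported as an unsupported format.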

Upload Process

Go to your agent dashboard
Navigate to Knowledge Base or Sources
Click Upload Document
Select one or more files
Wait for processing to complete

Document Processing

Documents go through these stages:

Upload: File is uploaded to storage
Extraction: Text is extracted from the file
Chunking: Content is split into manageable chunks
Embedding: Chunks are converted to embeddings
Indexing: Embeddings are stored in the vector database
Ready: Document is ready for use
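
To make the chunking, embedding, and indexing stages concrete, here is a minimal sketch. It is not Convosphere AI's actual pipeline: the chunk size, the overlap, and the hash-based `toy_embed` function are all assumptions chosen so the example runs without external services. A real system would use a trained embedding model and a vector database.

```python
import hashlib
import math


def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of max_words words; neighbours share `overlap` words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


def toy_embed(text: str, dims: int = 64) -> list[float]:
    """Stand-in embedding: hash each word into a bucket, then L2-normalize."""
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def index_document(text: str) -> list[tuple[str, list[float]]]:
    """In-memory stand-in for the vector database: (chunk, embedding) pairs."""
    return [(chunk, toy_embed(chunk)) for chunk in chunk_text(text)]
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of storing some words twice.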

Document Status

Pending: Waiting to be processed
Processing: Currently being processed
Indexed: Successfully processed and ready
Failed: Processing failed (check error message)
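
If you track these statuses in your own tooling, a small state model can help. The sketch below is an assumption about how the statuses relate, inferred from this page; the transition table and the helper names are hypothetical and not a documented API.

```python
from enum import Enum


class DocStatus(Enum):
    PENDING = "Pending"
    PROCESSING = "Processing"
    INDEXED = "Indexed"
    FAILED = "Failed"


# Assumed transitions. The edges back to PENDING model a re-process request.
ALLOWED = {
    DocStatus.PENDING: {DocStatus.PROCESSING},
    DocStatus.PROCESSING: {DocStatus.INDEXED, DocStatus.FAILED},
    DocStatus.INDEXED: {DocStatus.PENDING},
    DocStatus.FAILED: {DocStatus.PENDING},
}


def can_transition(current: DocStatus, target: DocStatus) -> bool:
    """True if the assumed state model allows moving from current to target."""
    return target in ALLOWED[current]


def is_settled(status: DocStatus) -> bool:
    """True when waiting can stop: the document is usable or needs attention."""
    return status in (DocStatus.INDEXED, DocStatus.FAILED)
```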

Adding URLs

Single Page Scraping

Scrape a single web page:

Go to Knowledge Base
Click Add URL
Enter the URL
Leave Crawl Depth at 0
Click Add

Site Crawling

Crawl an entire website:

Go to Knowledge Base
Click Add URL
Enter the homepage URL
Configure options:
- Crawl Depth: How many levels to crawl (1-3 recommended)
- Max Pages: Maximum pages to scrape (50-500)
- Follow External Links: Whether to follow links outside the domain
- Respect robots.txt: Follow robots.txt rules
- Follow Sitemap: Use sitemap.xml if available
Click Add
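
The way Crawl Depth, Max Pages, and Follow External Links interact can be sketched as a breadth-first walk. This is a simplified model, not the crawler's real implementation: `plan_crawl` is a hypothetical function, it works on an in-memory link map instead of fetching pages, and it assumes that depth 0 means the start page only.

```python
from collections import deque
from urllib.parse import urlparse


def plan_crawl(start_url, links, crawl_depth=2, max_pages=50, follow_external=False):
    """Breadth-first walk over `links` (url -> list of linked urls).

    Returns the urls that would be scraped, in visit order.
    """
    start_host = urlparse(start_url).netloc
    visited = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        if depth >= crawl_depth:
            continue  # links on this page would exceed the depth limit
        for target in links.get(url, []):
            if target in seen:
                continue
            if not follow_external and urlparse(target).netloc != start_host:
                continue  # stay on the starting domain
            seen.add(target)
            visited.append(target)
            queue.append((target, depth + 1))
            if len(visited) >= max_pages:
                break
    return visited
```

Because the walk is breadth-first, a low Max Pages value favours pages close to the homepage over deeply nested ones.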

Advanced Scraping Options

Include Paths: Only crawl URLs matching these patterns
Exclude Paths: Skip URLs matching these patterns
Content Selectors: CSS selectors for main content
Exclude Selectors: CSS selectors to exclude (nav, footer, etc.)
Wait For Selector: Wait for element before scraping
Delay: Delay between requests (milliseconds)
Timeout: Request timeout (milliseconds)
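
As an illustration of Include Paths and Exclude Paths, the sketch below filters URLs by path. The pattern syntax is not specified on this page, so glob-style patterns (via Python's `fnmatch`) are an assumption, as is the rule that an exclude match wins over an include match. `should_crawl` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def should_crawl(url, include_paths=(), exclude_paths=()):
    """Decide whether a URL passes the path filters (glob-style patterns assumed)."""
    path = urlparse(url).path or "/"
    # Assumption: exclusions take priority over inclusions.
    if any(fnmatchcase(path, pattern) for pattern in exclude_paths):
        return False
    # With no include patterns, everything not excluded is allowed.
    if include_paths:
        return any(fnmatchcase(path, pattern) for pattern in include_paths)
    return True
```

With `include_paths=["/docs/*"]` and `exclude_paths=["/docs/archive/*"]`, the URL `https://example.com/docs/setup` passes and `https://example.com/docs/archive/2019` does not.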

URL Processing

URLs move through statuses similar to those for documents, with an additional scraping step:

Pending: Queued for scraping
Scraping: Currently being scraped
Processing: Content is being processed
Indexed: Ready for use
Failed: Scraping or processing failed

Notion Integration

Connecting Notion

Go to Knowledge Base
Click Connect Notion
Authorize the connection in Notion
Select pages and databases to sync
Click Allow

Syncing Notion Content

After connecting:

Automatic Sync: Updated pages are synced automatically (see Notion Sync Limits for the frequency)
Manual Sync: Click Sync Now to force a sync
Selective Sync: Choose which pages to sync

Notion Pages

Sync individual pages:

Go to Notion Sources
Click Add Page
Select pages to sync
Click Add

Notion Databases

Sync entire databases:

Go to Notion Sources
Click Add Database
Select databases to sync
Configure sync options
Click Add

Managing Knowledge Base

Viewing Documents

Go to Knowledge Base
View all documents with their status
Click on a document to see details

Document Details

Name: Document name
Type: File type or URL
Status: Processing status
Size: File size or page count
Chunks: Number of text chunks
Created: When it was added
Updated: Last update time

Re-processing Documents

If a document fails or needs updating:

Go to document details
Click Re-process or Re-scrape
Wait for processing to complete

Deleting Documents

Go to Knowledge Base
Find the document
Click Delete
Confirm deletion

Deleting a document removes it from the knowledge base permanently.

Best Practices

Document Quality

Clear Structure: Use headings and sections
Accurate Content: Ensure information is correct and up-to-date
Relevant Topics: Focus on topics your agent needs to know
Avoid Duplicates: Don’t add the same content multiple times
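
One way to spot duplicates before uploading is to compare content fingerprints. This is a minimal sketch, assuming you already have the extracted text of each file; the helper names are hypothetical, and the check only catches copies that differ in whitespace or letter case, not reworded content.

```python
import hashlib
import re


def content_fingerprint(text: str) -> str:
    """SHA-256 of the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(docs: dict[str, str]) -> list[tuple[str, str]]:
    """Return (duplicate_name, original_name) pairs, in insertion order."""
    first_seen = {}
    duplicates = []
    for name, text in docs.items():
        fingerprint = content_fingerprint(text)
        if fingerprint in first_seen:
            duplicates.append((name, first_seen[fingerprint]))
        else:
            first_seen[fingerprint] = name
    return duplicates
```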

URL Scraping

Start Small: Test with a single page before crawling an entire site
Respect Limits: Stay within your plan's monthly scraping quota
Monitor Status: Check scraping status regularly

Notion Sync

Organize Content: Use clear page and database names
Regular Updates: Keep Notion content updated
Selective Sync: Only sync relevant pages
Monitor Syncs: Check sync status regularly

Training Your Agent

RAG (Retrieval Augmented Generation)

With RAG enabled, your agent:

Receives a user question
Searches the knowledge base for relevant content
Uses that content as context
Generates an answer based on the context

Enabling RAG

Go to agent settings
Set Training Mode to RAG
Ensure the knowledge base has content
Save settings

Testing RAG

Go to agent chat
Ask questions related to your knowledge base
Verify that the agent uses knowledge base content
Review response quality

Limits and Quotas

Document Limits

File Size: Maximum 10 MB per file
Total Documents: Varies by plan
Processing Time: Depends on file size and complexity

URL Scraping Limits

Free Plan: 10 pages per month
Starter Plan: 100 pages per month
Business Plan: 1,000 pages per month
Pro Plan: 10,000 pages per month
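
Before starting a crawl, you can compare its Max Pages setting with what is left of your monthly quota. The sketch below encodes the plan limits listed above; the function names and the idea of tracking `pages_used` yourself are assumptions, not a documented API.

```python
# Monthly page quotas from the list above, keyed by lowercase plan name.
MONTHLY_PAGE_QUOTA = {"free": 10, "starter": 100, "business": 1_000, "pro": 10_000}


def pages_remaining(plan: str, pages_used: int) -> int:
    """Pages still available this month (never negative)."""
    return max(MONTHLY_PAGE_QUOTA[plan.lower()] - pages_used, 0)


def fits_quota(plan: str, pages_used: int, max_pages: int) -> bool:
    """True if a crawl capped at max_pages cannot exceed the remaining quota."""
    return max_pages <= pages_remaining(plan, pages_used)
```

For instance, a Starter plan with 80 pages already used has 20 left, so a crawl with Max Pages set to 50 could exhaust the quota partway through.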

Notion Sync Limits

Pages per Sync: 1,000 pages
Sync Frequency: Once per hour (automatic)
Manual Syncs: Unlimited

Troubleshooting

Documents Not Processing

Check document format is supported
Verify file size is within limits
Review error messages
Try re-uploading

URLs Not Scraping

Verify URL is accessible
Check scraping quota
Review scraping configuration
Check for robots.txt restrictions
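
To check whether robots.txt is the reason a URL is skipped, you can test the rules locally with Python's standard `urllib.robotparser`. This is a sketch under assumptions: the crawler's real user agent is not documented here, so the wildcard `*` is used, and the robots.txt content is passed in as text instead of being downloaded.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Paste the contents of the site's `/robots.txt` into `robots_txt` and pass the URL that failed. A result of `False` suggests disabling Respect robots.txt (if you are permitted to) or choosing a different page.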

Notion Not Syncing

Verify Notion connection is active
Check selected pages are accessible
Review sync status
Try manual sync

Poor Agent Responses

Ensure knowledge base has relevant content
Verify RAG is enabled
Check document quality
Review knowledge base coverage

Next Steps

Agent Configuration

Learn about agent setup

Notion Integration

Detailed Notion setup

API Reference

Programmatic access

Widget Embed

Deploy your agent
