Add a source
Open Knowledge -> Websites and click Add Website. You can add a source in three modes:
- Crawl links: discover pages by following internal links
- Sitemap: import URLs from your sitemap
- Individual link: index one specific page
If you enter a sitemap.xml URL, BoundBot detects it and treats it as a sitemap source automatically.
Advanced options
For crawl and sitemap sources, you can also set:
- Include path prefix to limit the crawl to part of a site
- Exclude path prefix to skip sections such as /admin
- Max links to control crawl size
- Auto Recrawl if you want the source refreshed on a schedule
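To make the interaction between these options concrete, here is a minimal Python sketch of prefix filtering with a link cap. It is an illustration only, not BoundBot's implementation: the function name `filter_links`, its parameters, and the example.com URLs are all hypothetical, and the real crawler may match prefixes differently.

```python
from urllib.parse import urlparse


def filter_links(urls, include_prefix=None, exclude_prefix=None, max_links=None):
    """Hypothetical helper: keep URLs under the include prefix, drop excluded ones, cap the total."""
    kept = []
    for url in urls:
        if max_links is not None and len(kept) >= max_links:
            break  # crawl size limit reached
        path = urlparse(url).path or "/"
        if include_prefix and not path.startswith(include_prefix):
            continue  # outside the part of the site we want
        if exclude_prefix and path.startswith(exclude_prefix):
            continue  # inside a section we want to skip
        kept.append(url)
    return kept


# Assumed sample data for illustration only.
urls = [
    "https://example.com/docs/intro",
    "https://example.com/docs/admin/settings",
    "https://example.com/blog/post",
    "https://example.com/docs/setup",
    "https://example.com/docs/faq",
]
selected = filter_links(
    urls, include_prefix="/docs", exclude_prefix="/docs/admin", max_links=2
)
print(selected)
```

In this sketch the include prefix is checked first, then the exclude prefix, and the cap applies only to links that pass both checks.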
Manage a source
Each source card shows:
- current status
- number of discovered links
- number of scraped links
- total stored size
- last crawl time
- last error, if one occurred
You can run these actions on a source:
- Fetch & Crawl to rediscover links and crawl pending pages in one run
- Fetch Links to start the same background discovery-and-crawl workflow from the source card or detail page
- Crawl Pending to scrape only links that are still pending
- Retry Failed to retry links that failed on the last pass
- Retrain Agent to mark every discovered link for recrawl and scrape the whole source again
- Crawl Selected to crawl only the links you choose from the table
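The difference between the crawl actions comes down to which links each one picks up. The Python sketch below models that selection under stated assumptions: the status names ("pending", "failed", "scraped") and the function `links_for_action` are hypothetical, and the discovery step of Fetch & Crawl and Fetch Links is left out.

```python
def links_for_action(action, links, selected_urls=()):
    """Hypothetical model: return the URLs an action would crawl.

    `links` maps each discovered URL to an assumed status string.
    """
    if action == "Crawl Pending":
        # Only links that have not been scraped yet.
        return [url for url, status in links.items() if status == "pending"]
    if action == "Retry Failed":
        # Only links that failed on the last pass.
        return [url for url, status in links.items() if status == "failed"]
    if action == "Retrain Agent":
        # Every discovered link is marked for recrawl.
        return list(links)
    if action == "Crawl Selected":
        # Only the links chosen in the table.
        chosen = set(selected_urls)
        return [url for url in links if url in chosen]
    raise ValueError(f"Unknown action: {action}")


# Assumed sample data for illustration only.
links = {
    "https://example.com/a": "scraped",
    "https://example.com/b": "pending",
    "https://example.com/c": "failed",
}
print(links_for_action("Crawl Pending", links))
print(links_for_action("Retrain Agent", links))
```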
Watch your limits
The top summary shows:
- website source count
- total links
- stored data size
- crawl credit cost per page
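Because the summary lists a per-page crawl cost, you can roughly estimate what a run will consume before starting it. The Python sketch below assumes cost scales linearly with page count. The function name and the numbers are hypothetical, so check your plan for actual values.

```python
def estimate_crawl_credits(page_count, credits_per_page):
    """Hypothetical estimate: total credits = pages to crawl x cost per page."""
    if page_count < 0 or credits_per_page < 0:
        raise ValueError("page_count and credits_per_page must be non-negative")
    return page_count * credits_per_page


# Assumed figures for illustration only.
pending_pages = 120
cost_per_page = 2
estimated = estimate_crawl_credits(pending_pages, cost_per_page)
print(f"Estimated credits for this run: {estimated}")
```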
Best practices
- Start with the smallest useful section of your site.
- Exclude duplicate or low-value pages such as admin routes, login pages, and legal archives if they do not help customers.
- Use Fetch & Crawl after major navigation or sitemap changes.
- Use Retrain Agent when existing pages changed heavily and you want the full source scraped again.
- Review errors early so broken pages do not quietly degrade your answers.
Related pages
Knowledge base
See how website sources fit with FAQs, files, products, and MCP tools.
Files
Use uploaded documents when the source of truth does not live on a public site.
Products
Keep structured catalog data separate from crawled website content.
Plans and limits
Review website source, crawl, and storage limits by tier.

