Sources

Sources are the foundation of your Agent's knowledge. Every URL, text snippet, and file you add here is processed and indexed so the Agent can generate accurate, contextual answers for your users.

Overview

The Sources tab has two sections: Auto retraining frequency at the top and the Sources to train list below.

Dashboard path: Agent > Knowledge > Sources


Auto retraining frequency

Controls how often Jimo automatically re-indexes all your content sources. This keeps the Agent's answers in sync with changes to your documentation.

Choose one of three options:

  • Never: Sources are only retrained when you manually trigger a refresh. Best for stable content that rarely changes.

  • Daily: Jimo re-indexes all sources every day. Recommended during active product development or when your docs change frequently.

  • Weekly: A good balance for most teams. Sources are re-indexed once a week.

You can always trigger a one-off retrain for any individual source using the refresh icon (⟳) in the source list.
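
To make the three options concrete, here is a minimal sketch of how the next automatic retrain could be derived from the selected frequency. It is an illustration only: the `RETRAIN_INTERVALS` mapping and the `next_auto_retrain` function are hypothetical names, and the time of day at which Jimo actually runs a retrain is not specified on this page.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical mapping from the three dashboard options to an interval.
# "never" has no interval: retraining only happens on a manual refresh.
RETRAIN_INTERVALS = {
    "never": None,
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def next_auto_retrain(frequency: str, last_trained: datetime) -> Optional[datetime]:
    """Return when the next automatic retrain would be due, or None for Never."""
    interval = RETRAIN_INTERVALS[frequency.lower()]
    if interval is None:
        return None
    return last_trained + interval
```

Under these assumptions, a source last trained on 1 January with Weekly selected would be due again on 8 January, while Never returns `None` and leaves the source untouched until you click the refresh icon.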


Sources to train

The main table lists every source you have added. Use the search bar, Type filter, and Status filter to find specific sources quickly.

| Column | Description |
| --- | --- |
| Content | The name you gave the source (or the URL/filename if no name was set). |
| Type | The source type: Site Crawl, Individual URLs, Text, or File. |
| Status | Current training state (see Status reference below). |
| Updated | When the source was last successfully trained. |

Row actions:

  • ⟳ Refresh: Retrain this source immediately without waiting for the next auto-retrain cycle.

  • 🗑 Delete: Permanently remove the source from your knowledge base.

Click any row to open the Content Details drawer, where you can inspect the indexed data chunks and metadata.


Adding a source

Click + Add Content (top-right of the Sources to train section) to open a dropdown with three source types: URL, Text, and File.

URL sources let you feed entire websites or specific pages to the Agent. Choose between two modes: Site Crawl and Individual URLs.

Site Crawl

Crawl all pages starting from a root URL. Jimo follows internal links and extracts text from every reachable page.

Fields:

  • Name (optional): A label for this source (e.g. "Help Center", "Product Docs").

  • Type: Toggle between Site Crawl and Individual URLs.

  • Start URL: The root URL where the crawl begins (e.g. https://help.yourapp.com).

  • Retrieve page: Controls which pages are included in the crawl.

    • All pages from this starting URL (default): Follows every internal link from the start URL.

    • Only filtered URLs: Restricts the crawl to pages matching specific rules.
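
As a rough mental model of what "follows internal links" means, the sketch below walks a small in-memory site breadth-first from a start URL. It is not Jimo's crawler: `crawl` and `links_by_page` are hypothetical names, the dictionary stands in for fetching a page and extracting its links, and "internal" is assumed to mean "same host as the start URL".

```python
from collections import deque
from urllib.parse import urljoin, urlparse


def crawl(start_url, links_by_page):
    """Breadth-first walk over internal links, starting at start_url.

    links_by_page is a hypothetical stand-in for fetching a page and
    extracting its links: it maps a URL to the hrefs found on that page.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for href in links_by_page.get(url, []):
            target = urljoin(url, href)  # resolve relative links
            # Assumption: "internal" means same host as the start URL.
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return order


site = {
    "https://help.yourapp.com/": ["/docs/setup", "https://blog.example.com/news"],
    "https://help.yourapp.com/docs/setup": ["/docs/billing", "/"],
}
pages = crawl("https://help.yourapp.com/", site)
```

With this sample data, `pages` contains the start page, `/docs/setup`, and `/docs/billing`. The link to `blog.example.com` is skipped because it points to a different host, and the link back to `/` is skipped because that page was already visited.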

Filtering with rules

When you select Only filtered URLs, a Rules section appears where you can define path-based filters.

Each rule has two parts:

  • Condition (dropdown): Starts with, Contains, Equals, etc.

  • Path pattern: The URL path segment to match (e.g. /docs/, ?category=help).

You can build rules in two ways:

  • + Add rule manually: Add conditions one by one.

  • ✨ Generate rules with AI: Let Jimo AI analyze the start URL and suggest relevant filtering rules automatically.

Tip: Use path filters to focus the crawl on your help center or product documentation and exclude marketing pages, blog posts, or legal content that could dilute the Agent's answers.
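
The sketch below shows one way such path rules could be evaluated against a URL. It is a simplified illustration, not Jimo's matching logic: it covers only the three conditions named above, it matches the pattern against the path plus query string, and it assumes a page is kept when any rule matches. The names `CONDITIONS`, `path_and_query`, and `is_included` are hypothetical.

```python
from urllib.parse import urlsplit

# The three conditions named above; the dashboard may offer more.
CONDITIONS = {
    "Starts with": lambda target, pattern: target.startswith(pattern),
    "Contains": lambda target, pattern: pattern in target,
    "Equals": lambda target, pattern: target == pattern,
}


def path_and_query(url):
    """Return the part of the URL that rules are matched against."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


def is_included(url, rules):
    """Assumption: a page is kept if ANY rule matches."""
    target = path_and_query(url)
    return any(CONDITIONS[cond](target, pattern) for cond, pattern in rules)


rules = [("Starts with", "/docs/"), ("Contains", "?category=help")]
```

Under these assumptions, `https://help.yourapp.com/docs/setup` and `https://help.yourapp.com/articles?category=help` would be included, while `https://help.yourapp.com/blog/launch` would be left out of the crawl.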

Click Train to start indexing. The source appears in the list with a Training status, which changes to Trained once processing is complete.

Individual URLs

Add specific pages manually when you only need a few pages indexed, or when the pages you need are spread across different domains.

Fields:

  • Name (optional): A label for this source.

  • Type: Toggle to Individual URLs.

  • List of URLs: Paste one URL per line. Each URL is fetched and indexed independently.
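
If you assemble the list from a spreadsheet or sitemap, a quick pre-check before pasting can catch blank lines, duplicates, and entries missing a scheme. The helper below is a hypothetical sketch of such a check; how Jimo itself validates the pasted list is not described here.

```python
from urllib.parse import urlparse


def parse_url_list(text):
    """Split pasted text into unique, well-formed http(s) URLs, one per line.

    Returns (urls, rejected): valid URLs in their original order, and the
    lines that do not look like absolute http(s) URLs.
    """
    urls, rejected = [], []
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate:
            continue  # ignore blank lines
        parsed = urlparse(candidate)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            if candidate not in urls:  # drop exact duplicates
                urls.append(candidate)
        else:
            rejected.append(candidate)
    return urls, rejected
```

For example, a line such as `help.yourapp.com/pricing` ends up in `rejected` because it lacks `https://`, which is a reminder to paste full URLs.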

Click Train to start indexing.


Status reference

| Status | Meaning |
| --- | --- |
| Trained | Source is fully indexed and the Agent uses it to generate answers. |
| Training | Source is being processed. The Agent cannot use it yet. |
| Failed ⚠️ | Something went wrong during indexing. Hover over the status icon to see the error message, then retry. |


Content Details drawer

Click any source row to open its detail panel and inspect what was indexed.

The drawer shows:

  • Header: Source name, type, starting URL or filename, status, date added, and who added it.

  • Data Chunks: For crawls, a list of every crawled path with its extracted text. For files, each section or page of the document. Expand any chunk to preview the indexed text.

Best practices

1. Start broad, then refine. Begin with a full site crawl of your help center, then review data chunks and add path filters to exclude irrelevant pages.

2. Name your sources clearly. Labels like "Help Center - Product Docs" or "API Reference v2" make the source list easier to manage as it grows.

3. Use Daily retraining during active development. Switch to Weekly once your documentation stabilizes.

4. Keep text sources focused. One topic per text source produces better results than a single massive text block.

5. Avoid noisy content. Exclude marketing pages, blog posts, customer testimonials, and legal disclaimers unless users frequently ask about them.

6. Monitor statuses. Check for Failed sources regularly, especially after changing your website structure.
