Data Spaces

Data Spaces are named containers in the Data Pool that hold files or URLs. These sources are indexed using vector embeddings so your agents can retrieve and use them through retrieval-augmented generation (RAG).

What makes Data Spaces useful

Once indexed, your sources become part of your agent’s contextual knowledge. This allows agents to answer questions, summarise content, and reference details from those documents in real time.

How to Create a Data Space

  1. Open the Data Pool from the left navigation
  2. Click Add Data Space
  3. Enter a name and confirm

Each Data Space represents a dedicated container of searchable documents.

Space created

The new space appears in your list. It will initially show No data source and Not Indexed.

How to Add Sources

Each Data Space can include one or more sources:

  1. Click the Add Data Source icon next to a space
  2. Enter a label
  3. Choose one of the following:
    • Upload a file (.pdf, .docx, .xml)
    • Add a URL or sitemap

Files and links are transformed into searchable vector embeddings.

Label clearly for easier reference

Use titles like “2024 Pricing Guide” instead of generic names like “Document 1.”

How Indexing Works

Indexing starts automatically once a source is added. You can track progress in the Indexing Status column. Agents can only retrieve from sources marked Indexed.
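
The rule that agents only retrieve from sources marked Indexed can be pictured as a simple filter. The sketch below is illustrative only: the `Source` class and `retrievable` helper are hypothetical names, not part of any SmythOS API, and the status strings are the two labels shown on this page.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical record for one source in a Data Space."""
    label: str
    status: str  # e.g. "Indexed" or "Not Indexed", as displayed in the Indexing Status column


def retrievable(sources: list[Source]) -> list[str]:
    """Return labels of the sources an agent could retrieve from (status is exactly "Indexed")."""
    return [s.label for s in sources if s.status == "Indexed"]


space = [
    Source("2024 Pricing Guide", "Indexed"),
    Source("Onboarding Handbook", "Not Indexed"),
]
ready = retrievable(space)  # only the pricing guide qualifies
```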

Behind the scenes

Indexing converts your files and URLs into vector embeddings. These are stored in the default storage or in your Custom Storage setup, and agents query them through semantic search.
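
To make the idea of semantic search over embeddings concrete, here is a minimal sketch. It assumes a toy `embed` function based on word counts as a stand-in for a real embedding model, and it says nothing about how the platform actually stores or ranks vectors. All function names and sample texts are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts (an assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, index: dict[str, Counter], top_k: int = 1) -> list[str]:
    """Rank indexed source labels by similarity to the query and return the top_k labels."""
    q = embed(query)
    ranked = sorted(index, key=lambda label: cosine(q, index[label]), reverse=True)
    return ranked[:top_k]


# Hypothetical source texts, keyed by label.
chunks = {
    "2024 Pricing Guide": "the pro plan pricing is billed monthly per seat",
    "Onboarding Handbook": "new hires complete security training in week one",
}
index = {label: embed(text) for label, text in chunks.items()}  # the "indexing" step
best = search("how is the pro plan billed", index)
```

A real embedding model captures meaning rather than exact word overlap, but the flow is the same: embed the sources once at indexing time, embed the query at request time, and return the closest matches.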

Manage and Organize Spaces

From the Data Pool overview:

  • Click a row to view or edit a Data Space
  • Add or remove sources at any time
  • Duplicate or delete a Data Space

Deleting a space is permanent

Removing a Data Space also deletes all its sources. Agents depending on this data will no longer have access to it.

Use Data Spaces in Agents

To connect a Data Space to an agent:

  1. Open your Project Space
  2. Select or create an agent
  3. Attach one or more Data Spaces

Once connected, the agent uses semantic retrieval to reference indexed content during execution.
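
As a rough picture of what retrieval-augmented generation does with retrieved content, the sketch below assembles passages into a prompt. The `build_prompt` helper and the prompt wording are assumptions made for illustration; the page does not describe how an agent formats its context internally.

```python
def build_prompt(question: str, retrieved: list[tuple[str, str]]) -> str:
    """Combine (source label, passage) pairs and a question into one prompt string."""
    context = "\n".join(f"[{label}] {text}" for label, text in retrieved)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical passage, as if returned by semantic retrieval from an attached Data Space.
prompt = build_prompt(
    "How is the Pro plan billed?",
    [("2024 Pricing Guide", "The Pro plan is billed monthly per seat.")],
)
```

Keeping the source label next to each passage is one reason clear labels help: the label carries through to whatever the agent cites.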

Instant RAG support

After attaching a space, your agent can immediately start pulling context from its indexed sources in responses and workflows.

FAQ

What types of sources are supported?

You can upload .pdf, .docx, and .xml files, or add URLs and full sitemaps.

Can a single space contain both files and URLs?

Yes. Data Spaces can mix multiple file types and links.

Is there a limit to how many sources I can add?

There is no enforced limit. For better performance and organisation, group related content together and split unrelated content into separate spaces.

What if indexing fails?

Check that the file is a supported type (.pdf, .docx, or .xml) and that its size is reasonable, then retry. If indexing keeps failing, consider using Custom Storage for more control.
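
A file can be checked locally before upload. The sketch below tests the extension against the three file types listed on this page. The size ceiling is not documented here, so `max_bytes` is a caller-supplied assumption, and `check_source_file` is a hypothetical helper rather than a platform API.

```python
from pathlib import Path

# File types listed on this page as supported uploads.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".xml"}


def check_source_file(path: str, size_bytes: int, max_bytes: int) -> list[str]:
    """Return a list of problems found; an empty list means the file passes both checks.

    max_bytes is an assumed limit chosen by the caller, not a documented value.
    """
    problems = []
    ext = Path(path).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > max_bytes:
        problems.append(f"file is {size_bytes} bytes, above the assumed limit of {max_bytes}")
    return problems


ok = check_source_file("2024-pricing-guide.PDF", size_bytes=10, max_bytes=100)
bad = check_source_file("notes.txt", size_bytes=10, max_bytes=100)
```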
