
Data Pool

The Data Pool is the central interface for managing indexed data in SmythOS. It lets you create Data Spaces, upload files or paste raw text, and connect to multiple vector database providers: SmythOS-managed Pinecone, your own Pinecone instance, or Milvus. This enables agents to use retrieval-augmented generation (RAG) and answer queries using relevant context from your documents.

What the Data Pool enables

Agents in SmythOS can ground their responses in your files and text content. The Data Pool makes this possible by converting your data into embeddings and storing them, ready to be searched, in the vector database provider you choose.

How the Data Pool supports RAG

RAG stands for retrieval-augmented generation. When an agent is connected to a Data Space, it can search through indexed content and use the most relevant snippets to generate informed answers. This makes your agents more accurate, grounded, and useful for knowledge-based tasks.
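
The retrieval step described above can be sketched in a few lines of Python. This is a toy illustration, not SmythOS's implementation: the `embed` function is a bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and the names `retrieve` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, chunks: list[str]) -> str:
    """Place the retrieved snippets ahead of the question (the 'augmented' part)."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical indexed content, standing in for a Data Space.
chunks = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Shipping is free for orders over 50 dollars.",
]

print(build_prompt("How long do refunds take?", chunks))
```

A real deployment replaces `embed` with the Data Space's embedding model and the sorted list with a similarity query against the vector database, but the flow is the same: embed the query, fetch the closest snippets, and pass them to the model as context.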

RAG gives your agent real knowledge

Without indexed data, your agent relies on the model's built-in knowledge and whatever you include in the prompt. With RAG, it retrieves relevant information from your own sources before generating output.

Key features of the Data Pool

| Feature | Description | Learn More |
| --- | --- | --- |
| Data Spaces | Containers where you upload files or paste raw text for indexing | Create Data Spaces |
| Multi-Provider Support | Connect SmythOS Pinecone, your own Pinecone instance, or Milvus for vector storage | Set Up Providers |
| Data source preview | Review uploaded files, chunking configuration, metadata, and content from the interface | Managed inside each Data Space |
| Provider Management | Create and organize multiple provider connections for flexible infrastructure management | Manage Providers |

Use one Data Pool, many agents

You can reuse the same Data Spaces across multiple agents. This makes your content modular and efficient to maintain.

Using the Data Pool

To access the Data Pool:

  1. Open the SmythOS Studio sidebar
  2. Click Data Pool
  3. View your existing Data Spaces in a table with these columns:
    • Data Space Name – displays the name and embedding model
    • Provider – shows which vector database provider is being used
    • Actions – add data sources or delete the Data Space
  4. Click Add Data Space to create a new container
  5. Use the Actions column to add data sources to a Data Space or delete it

The Data Pool displays your Data Spaces, their providers, and quick actions for managing content.

Storage options

The Data Pool supports multiple vector storage providers for embeddings, giving you flexibility to choose what works best for your needs.

| Storage Type | Hosted By | Setup Required | Best For | Docs |
| --- | --- | --- | --- | --- |
| SmythOS Managed Pinecone | SmythOS | None | Quick setups, default RAG usage, no configuration | Data Spaces |
| Your Own Pinecone | You (Pinecone) | API key, index name, connection setup | Full control, enterprise use, scalable vectors | Custom Storage |
| Milvus | You (Milvus) | Address, token, connection setup | Open-source deployment, self-hosted infrastructure | Custom Storage |

Vector dimension requirements

Make sure the index or collection in your vector database is created with the same vector dimension that your embedding model outputs. If the dimensions do not match, indexing will fail.
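
A quick pre-flight check can catch a mismatch before you upload anything. The sketch below is illustrative only: `check_dimensions` is a hypothetical helper rather than part of SmythOS, and the numbers 1536 and 768 are example values. Use the dimension documented for your embedding model and the one configured on your index.

```python
def check_dimensions(embedding: list[float], index_dimension: int) -> None:
    """Raise ValueError if the vector length differs from the index dimension."""
    if len(embedding) != index_dimension:
        raise ValueError(
            f"Embedding has {len(embedding)} dimensions, "
            f"but the index expects {index_dimension}."
        )


# Illustrative values only (assumptions, not taken from SmythOS documentation).
MODEL_DIMENSION = 1536
vector = [0.0] * MODEL_DIMENSION

# Matching dimensions: the check passes silently.
check_dimensions(vector, index_dimension=1536)

# Mismatched dimensions: the check reports the problem up front.
try:
    check_dimensions(vector, index_dimension=768)
except ValueError as err:
    print(f"Indexing would fail: {err}")
```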

Provider flexibility

With Data Pool, you can:

  • Use SmythOS-managed Pinecone – No setup needed, start indexing immediately
  • Bring your own Pinecone – Use your existing Pinecone account and indexes for enterprise deployments
  • Deploy with Milvus – Self-host or use managed Milvus for complete infrastructure control
  • Organize multiple providers – Create multiple provider connections from the same service or account for better organization

Mix and match providers

You can use different providers for different Data Spaces. Create one space with SmythOS Pinecone for testing, and another with your own Pinecone for production.
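
One way to picture this setup is as a small mapping from Data Spaces to provider connections. The sketch below is a hypothetical model, not SmythOS's configuration format or API. The required fields per provider type follow the storage options table (API key and index name for your own Pinecone, address and token for Milvus), while every class, key, and space name here is an assumption made for illustration.

```python
from dataclasses import dataclass, field

# Required settings per provider kind, per the storage options table.
# The kind names themselves are hypothetical labels.
REQUIRED_FIELDS = {
    "smythos_pinecone": (),                 # managed by SmythOS: no setup
    "pinecone": ("api_key", "index_name"),  # your own Pinecone
    "milvus": ("address", "token"),         # your own Milvus
}


@dataclass
class Provider:
    name: str
    kind: str
    settings: dict[str, str] = field(default_factory=dict)

    def missing_fields(self) -> list[str]:
        """List required settings that are absent or empty."""
        if self.kind not in REQUIRED_FIELDS:
            raise ValueError(f"Unknown provider kind: {self.kind}")
        return [f for f in REQUIRED_FIELDS[self.kind] if not self.settings.get(f)]


providers = {
    "default": Provider("default", "smythos_pinecone"),
    "prod-pinecone": Provider(
        "prod-pinecone", "pinecone",
        {"api_key": "YOUR_API_KEY", "index_name": "YOUR_INDEX_NAME"},
    ),
}

# Each Data Space points at exactly one provider connection.
data_spaces = {"faq-testing": "default", "faq-production": "prod-pinecone"}

for space, provider_name in data_spaces.items():
    missing = providers[provider_name].missing_fields()
    status = "ready" if not missing else f"missing {', '.join(missing)}"
    print(f"{space} -> {provider_name}: {status}")
```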

What's Next