Everything under the hood
A deep dive into LoamDB's architecture, tools, and capabilities. Built for developers who need a data layer that agents can reason over.
Multi-model architecture
Four data models on one Postgres instance. Each model has dedicated indexing, query patterns, and tooling.
Relational Model
Fully normalized schemas with Drizzle ORM. Type-safe queries, migrations, and schema evolution. Tables for people, clients, deals, compensation, and custom entities.
- Drizzle ORM with full TypeScript inference
- Automatic migration generation
- Foreign key relationships between all entity types
- Composite indexes for common query patterns
Graph Model
Org charts, reporting lines, and relationship networks stored as adjacency lists. BFS traversal for hierarchy queries with configurable depth limits.
- BFS traversal for manager hierarchy
- Configurable depth limits (default: 10)
- Bidirectional relationship traversal
- Cycle detection in org structures
Vector Model
pgvector extension with HNSW indexing for fast approximate nearest-neighbor search. Embeddings generated at ingest time for all text content.
- pgvector with HNSW indexing
- 1536-dimension embeddings (OpenAI compatible)
- Cosine similarity scoring
- Hybrid search: vector + keyword
Document Model
JSONB columns for flexible metadata storage. Schema.org-aligned ontology detection for automatic field mapping across domains.
- JSONB with GIN indexing
- Schema.org property alignment
- Nested document queries
- Flexible schema per entity type
Pipeline stages
From raw data to agent-ready answers. Each stage is independently measurable and tunable.
| Stage | Description | Latency |
|---|---|---|
| Chunking | Documents split into overlapping chunks optimized for embedding quality and retrieval precision. | ~50ms |
| Embedding | Vector embeddings generated for each chunk and stored alongside the source content. | ~200ms |
| Indexing | HNSW index updated with new vectors. Background process ensures read availability. | ~100ms |
| Retrieval | Multi-strategy retrieval: exact match, semantic similarity, or hybrid keyword+vector. | 20–150ms |
| Reranking | Retrieved results scored and reranked by relevance to the original query. | ~50ms |
| Formatting | Results formatted for the consuming agent's context window with source attribution. | ~10ms |
17 MCP tools
Structured operations exposed via the Model Context Protocol. Your agent uses these tools to read, write, search, and manage data autonomously.
| Tool | Description |
|---|---|
| query_entities | Query any entity type with filters, sorting, and pagination |
| get_entity | Retrieve a single entity by ID with full metadata |
| search_semantic | Vector similarity search across all entity types |
| traverse_graph | BFS/DFS traversal of relationship graphs |
| get_org_chart | Retrieve org structure for a given root entity |
| import_csv | Ingest CSV data with automatic schema detection |
| import_status | Check status of an ongoing import job |
| manage_permissions | Create, update, and delete permission rules |
| check_access | Verify if a user can access a specific entity |
| get_schema | Retrieve the current database schema and entity types |
| list_imports | List all import jobs with status and metadata |
| get_statistics | Database-level stats: row counts, storage, index health |
| create_entity | Insert a new entity with validation and indexing |
| update_entity | Update entity fields with audit logging |
| delete_entity | Soft-delete an entity with cascade handling |
| bulk_operations | Batch create, update, or delete operations |
| export_data | Export query results in CSV, JSON, or Parquet format |
7 function-calling tools
Optimized for LLM tool-use patterns. Minimal token overhead with latency benchmarks for each operation.
| Tool | Description | Latency |
|---|---|---|
| lookup_person | Direct entity lookup by email or name | ~20ms |
| search_knowledge | Semantic search with relevance scoring | ~150ms |
| get_org_chart | Manager hierarchy traversal | ~40ms |
| query_pipeline | Deal pipeline queries with stage filters | ~30ms |
| check_permissions | Verify access before data retrieval | ~5ms |
| get_client | Client profile with deal history | ~25ms |
| generate_report | Full RAG synthesis across multiple sources | ~600ms |
Domain-aware schema mapping
LoamDB auto-detects your data domain and applies the right ontology. Upload a CSV and the system maps columns to known schemas — no manual configuration required.
Sales CRM
Detects people, clients, deals, and pipeline stages. Maps to standard CRM ontology with relationship inference.
- People + Org chart
- Clients + ARR
- Deals + Pipeline stages
Real Estate
Recognizes property listings, agents, transactions, and market data. Schema.org RealEstateListing alignment.
- Properties + Listings
- Agents + Brokerages
- Transactions + Closings
Generic
Falls back to flexible document storage when no domain is detected. JSONB with automatic field indexing.
- Custom entities
- Flexible schemas
- Auto-indexed fields
Isolation by design
Every table has an organizationId column. Row-level security ensures tenants never see each other's data. GATEKEEPER integrates at the query layer for fine-grained access control within a tenant.
- organizationId on every table — no shared rows
- Row-level security policies enforced at the database layer
- GATEKEEPER integration for role-based filtering
- Isolated vector indexes per organization
Analyze, map, deduplicate, import
Upload a CSV and LoamDB handles the rest. Domain detection, column mapping, deduplication, and relationship inference — all automated.
Analyze
Detect domain, identify columns, infer data types and relationships.
Map
Map columns to ontology fields. Auto-detect email, name, currency, date formats.
Deduplicate
Fuzzy match against existing entities. Merge duplicates, flag conflicts for review.
Import
Insert entities, generate embeddings, build relationships, update indexes.