System Architecture

Understanding Almanac's architecture helps you make informed decisions about deployment, scaling, and optimization.

High-Level Overview

┌─────────────────────────────────────────────────────────────┐
│                         Client Layer                         │
│  (Web UI, CLI, SDKs, Custom Applications)                   │
└─────────────────────┬───────────────────────────────────────┘
                      │
                      ▼
┌─────────────────────────────────────────────────────────────┐
│                      REST API Server                         │
│  (Express.js, TypeScript, Port 3000)                        │
└─────────┬───────────────────────────────┬───────────────────┘
          │                               │
          ▼                               ▼
┌──────────────────────┐      ┌──────────────────────────────┐
│   MCP Client Manager │      │    Indexing Engine           │
│  (Data Source Layer) │      │  (Vector + Graph Indexing)   │
└──────────┬───────────┘      └────────┬─────────────────────┘
           │                           │
           ▼                           ▼
┌─────────────────────────────────────────────────────────────┐
│                      Storage Layer                           │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │ MongoDB  │  │  Qdrant  │  │ Memgraph │  │  Redis   │   │
│  │(Metadata)│  │ (Vectors)│  │ (Graph)  │  │ (Cache)  │   │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘   │
└─────────────────────────────────────────────────────────────┘

Core Components

1. REST API Server

Technology: Express.js + TypeScript

Port: 3000 (configurable)

Responsibilities:

  • Accept query requests

  • Route API calls

  • Manage authentication/authorization

  • Handle rate limiting

  • Coordinate between services

Key Files:

  • packages/server/src/server.ts - Main server

  • packages/server/src/api/ - API routes
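
A minimal sketch of how such a server can be wired together with Express.js and TypeScript. The route path, auth check, and rate-limit values are illustrative assumptions, not the actual implementation in packages/server/src/server.ts:

```typescript
import express from 'express';
import rateLimit from 'express-rate-limit';

const app = express();
app.use(express.json());

// Rate limiting (window and limit are illustrative values).
app.use(rateLimit({ windowMs: 60_000, max: 100 }));

// Authentication/authorization middleware (header check is illustrative).
app.use((req, res, next) => {
  if (!req.header('authorization')) {
    return res.status(401).json({ error: 'unauthorized' });
  }
  next();
});

// Query endpoint: accepts a query plus a retrieval mode, then coordinates the
// services described below (MCP Client Manager, Indexing Engine, Query Engine).
app.post('/api/query', async (req, res) => {
  const { query, mode } = req.body as { query: string; mode?: string };
  res.json({ query, mode: mode ?? 'mix', answer: '...' });
});

const PORT = Number(process.env.PORT ?? 3000);
app.listen(PORT, () => console.log(`Almanac API listening on port ${PORT}`));
```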

2. MCP Client Manager

Purpose: Manages connections to Model Context Protocol servers

Responsibilities:

  • Connect/disconnect MCP servers

  • Execute tools (fetch data)

  • Access resources

  • Handle OAuth flows

  • Cache tool responses

Architecture:

Key Files:

  • packages/server/src/mcp/client.ts

  • packages/server/src/mcp/initialization.ts
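
A hedged sketch of connecting to an MCP server and executing a tool with the official @modelcontextprotocol/sdk TypeScript client (exact signatures vary by SDK version; the server command, tool name, and arguments are illustrative). Almanac's own manager in packages/server/src/mcp/client.ts layers OAuth handling and response caching on top of calls like these:

```typescript
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

async function fetchViaMcp(): Promise<unknown> {
  // Connect to an MCP server over stdio (command/args are illustrative).
  const transport = new StdioClientTransport({
    command: 'npx',
    args: ['-y', 'some-mcp-server'],
  });
  const client = new Client({ name: 'almanac', version: '1.0.0' });
  await client.connect(transport);

  try {
    // Execute a tool to fetch data from the source.
    return await client.callTool({
      name: 'search_issues',
      arguments: { query: 'indexing' },
    });
  } finally {
    await client.close();
  }
}
```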

3. Indexing Engine

Purpose: Transform raw data into searchable vectors and knowledge graphs

Phases:

  1. Sync Phase

    • Fetch data from MCP servers

    • Store in MongoDB

    • Track sync state

  2. Vector Indexing

    • Generate embeddings

    • Store in Qdrant

    • Enable semantic search

  3. Graph Indexing

    • Extract entities

    • Extract relationships

    • Build knowledge graph in Memgraph

Key Files:

  • packages/indexing-engine/src/ - Core indexing logic

  • packages/server/src/services/indexing/ - Service layer
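
A simplified sketch of how the three phases compose; the dependency interface below is illustrative, not the actual indexing-engine API:

```typescript
interface IndexingDeps {
  fetchFromMcp(serverId: string): Promise<Array<{ id: string; text: string }>>;
  saveRecords(records: Array<{ id: string; text: string }>): Promise<void>;    // MongoDB
  embed(text: string): Promise<number[]>;                                      // 3072-d embedding
  storeVector(id: string, vector: number[], payload: object): Promise<void>;   // Qdrant
  extractGraph(text: string): Promise<{
    entities: string[];
    relations: Array<[string, string, string]>; // subject, relation, object
  }>;
  storeGraph(entities: string[], relations: Array<[string, string, string]>): Promise<void>; // Memgraph
}

async function indexDataSource(serverId: string, deps: IndexingDeps): Promise<void> {
  // 1. Sync phase: pull raw data via MCP, persist it, and track sync state.
  const records = await deps.fetchFromMcp(serverId);
  await deps.saveRecords(records);

  // 2. Vector indexing: embed each record and store the vector for semantic search.
  for (const r of records) {
    const vector = await deps.embed(r.text);
    await deps.storeVector(r.id, vector, { text: r.text, serverId });
  }

  // 3. Graph indexing: LLM-based entity/relationship extraction into the knowledge graph.
  for (const r of records) {
    const { entities, relations } = await deps.extractGraph(r.text);
    await deps.storeGraph(entities, relations);
  }
}
```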

4. Query Engine (LightRAG)

Purpose: Answer queries using hybrid vector + graph retrieval

Query Modes:

Key Files:

  • packages/server/src/services/search/lightrag-query.ts

  • packages/server/src/services/llm/reranker.ts
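
A hedged example of issuing queries in different modes; the endpoint path and request shape are assumptions, and only the modes referenced elsewhere on this page (naive, hybrid, mix) are shown:

```typescript
type QueryMode = 'naive' | 'hybrid' | 'mix';

async function ask(query: string, mode: QueryMode = 'mix'): Promise<unknown> {
  const res = await fetch('http://localhost:3000/api/query', {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ query, mode }),
  });
  return res.json();
}

// naive is the fastest mode; mix is the most thorough (see Performance Characteristics below).
// Example: ask('Who works on the indexing engine?', 'mix').then(console.log);
```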

Storage Architecture

Why Four Databases?

Each database serves a specific purpose optimized for its access patterns:

MongoDB (Document Database)

Use Case: Primary data storage

What It Stores:

  • Raw synced records

  • MCP server configurations

  • Indexing configurations

  • User settings

  • Metadata

Why MongoDB:

  • Flexible schema (different data sources have different fields)

  • Fast writes for bulk sync operations

  • Rich querying for management operations

  • Horizontal scalability

Collections:
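
As an illustration of the flexible-schema storage described above, a raw synced record might be modeled with Mongoose roughly like this (model and field names are hypothetical):

```typescript
import { Schema, model } from 'mongoose';

// Different MCP sources return different fields, so the payload is kept
// schemaless and `strict: false` allows extra top-level fields as well.
const SyncedRecordSchema = new Schema(
  {
    serverId: { type: String, index: true },
    externalId: String,
    data: Schema.Types.Mixed, // raw payload from the data source
    syncedAt: Date,
  },
  { strict: false }
);

export const SyncedRecord = model('SyncedRecord', SyncedRecordSchema);

// Bulk sync operations favor insertMany for write throughput:
// await SyncedRecord.insertMany(records, { ordered: false });
```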

Qdrant (Vector Database)

Use Case: Semantic search via embeddings

What It Stores:

  • Document embeddings (vectors)

  • Text chunks

  • Metadata for filtering

Why Qdrant:

  • Optimized for high-dimensional vectors (3072-d)

  • Sub-50ms search on millions of vectors

  • Advanced filtering capabilities

  • Distributed architecture for scale

Structure:
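
As an illustration (collection name, payload fields, and filter are hypothetical), storing a point and running a filtered semantic search with the Qdrant JavaScript client might look like this:

```typescript
import { QdrantClient } from '@qdrant/js-client-rest';

const qdrant = new QdrantClient({ url: 'http://localhost:6333' });

async function storeAndSearch(queryVector: number[]) {
  // A stored point: a 3072-dimensional embedding, the text chunk itself, and
  // metadata used for filtering.
  await qdrant.upsert('almanac_chunks', {
    points: [
      {
        id: 1, // Qdrant point ids are unsigned integers or UUIDs
        vector: new Array(3072).fill(0), // real values come from the embedding model
        payload: { text: 'Example chunk text', source: 'github', recordId: 'rec-1' },
      },
    ],
  });

  // Semantic search restricted to one source via payload filtering.
  return qdrant.search('almanac_chunks', {
    vector: queryVector,
    limit: 10,
    filter: { must: [{ key: 'source', match: { value: 'github' } }] },
  });
}
```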

Memgraph (Graph Database)

Use Case: Knowledge graph for entity/relationship queries

What It Stores:

  • Entities (people, concepts, projects)

  • Relationships (works_on, depends_on, discussed_in)

  • Properties (types, timestamps, scores)

Why Memgraph:

  • Optimized for graph traversal (follow relationships)

  • In-memory for speed

  • Cypher query language

  • Real-time analytics

Structure:
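
Memgraph speaks the Bolt protocol, so the standard neo4j-driver can be used against it. As an illustration (the Entity label and property names are hypothetical; works_on is one of the relationship types listed above), a traversal query might look like this:

```typescript
import neo4j from 'neo4j-driver';

// Memgraph is Bolt-compatible; credentials default to empty in a local setup.
const driver = neo4j.driver('bolt://localhost:7687', neo4j.auth.basic('', ''));

async function whoWorksOn(project: string): Promise<string[]> {
  const session = driver.session();
  try {
    const result = await session.run(
      `MATCH (person:Entity)-[:works_on]->(p:Entity {name: $project})
       RETURN person.name AS name`,
      { project }
    );
    return result.records.map((r) => r.get('name') as string);
  } finally {
    await session.close();
  }
}
```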

Redis (Cache)

Use Case: Performance optimization

What It Stores:

  • MCP tool responses (30 min TTL)

  • Query results (5 min TTL)

  • Rate limiting counters

  • Session data

Why Redis:

  • Sub-millisecond access

  • Automatic expiration (TTL)

  • Atomic operations

  • Pub/sub for real-time updates

Keys:
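
As an illustration (key patterns are hypothetical), the TTLs above map directly onto Redis SET ... EX and atomic INCR operations, shown here with ioredis:

```typescript
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379');

// Cache an MCP tool response for 30 minutes.
async function cacheToolResponse(serverId: string, tool: string, argsHash: string, response: unknown) {
  await redis.set(`mcp:tool:${serverId}:${tool}:${argsHash}`, JSON.stringify(response), 'EX', 30 * 60);
}

// Cache a query result for 5 minutes.
async function cacheQueryResult(queryHash: string, result: unknown) {
  await redis.set(`query:${queryHash}`, JSON.stringify(result), 'EX', 5 * 60);
}

// Rate-limit counter: atomic increment within a one-minute window.
async function countRequest(userId: string): Promise<number> {
  const key = `ratelimit:${userId}:${Math.floor(Date.now() / 60_000)}`;
  const count = await redis.incr(key);
  if (count === 1) await redis.expire(key, 60);
  return count;
}
```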

Data Flow

Indexing Flow

Query Flow

Scalability Patterns

Horizontal Scaling

API Server: Scale out by running multiple instances behind a load balancer (as in the Large tier below).

Database Layer:

  • MongoDB: Replica Set + Sharding

  • Qdrant: Distributed cluster

  • Memgraph: HA cluster (Enterprise)

  • Redis: Cluster mode

Vertical Scaling

Small (< 100K docs):

  • 4 CPU, 16GB RAM

  • Single server

  • Docker Compose

Medium (100K - 1M docs):

  • 8 CPU, 32GB RAM

  • Single server with more resources

  • Or 2-3 servers (API + Databases)

Large (1M - 10M docs):

  • 16 CPU, 64GB RAM per server

  • Multiple API servers (load balanced)

  • Distributed databases

  • Dedicated cache layer

Enterprise (> 10M docs):

  • Kubernetes cluster

  • Auto-scaling based on load

  • Multi-region deployment

  • Dedicated infrastructure per component

Performance Characteristics

Latency Breakdown

Typical Query (mix mode):

Fast Query (naive mode):

Throughput

Single Server (8 CPU, 32GB RAM):

  • Naive mode: ~200 queries/sec

  • Hybrid mode: ~50 queries/sec

  • Mix mode: ~20 queries/sec

Clustered (3 servers):

  • Naive mode: ~600 queries/sec

  • Hybrid mode: ~150 queries/sec

  • Mix mode: ~60 queries/sec

Indexing Speed

Vector Indexing:

  • 500-1000 docs/minute (single core)

  • 16,000-32,000 docs/minute (32 cores with CONCURRENCY=32)

Graph Indexing:

  • 200-400 docs/minute (LLM extraction bottleneck)

  • Can run 32 concurrent extractions

Concurrency Model

Almanac uses parallel processing for performance; a minimal sketch follows the benefits below.

Benefits:

  • Up to 32x faster than sequential processing (with CONCURRENCY=32)

  • Efficient CPU utilization

  • Configurable based on system resources
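
A minimal sketch of this concurrency model using p-limit; the environment variable and its default mirror the CONCURRENCY=32 figure used above, but the actual implementation may differ:

```typescript
import pLimit from 'p-limit';

// Cap concurrent work; 32 matches the figures quoted in this section.
const CONCURRENCY = Number(process.env.CONCURRENCY ?? 32);
const limit = pLimit(CONCURRENCY);

async function processAll<T>(items: T[], worker: (item: T) => Promise<void>): Promise<void> {
  // Every item is scheduled immediately, but at most CONCURRENCY run at once;
  // the rest wait in p-limit's internal queue.
  await Promise.all(items.map((item) => limit(() => worker(item))));
}
```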

Security Architecture

Authentication & Authorization

Encryption

At Rest:

  • MongoDB encryption-at-rest (optional)

  • Qdrant encrypted volumes

  • OAuth tokens encrypted in DB

In Transit:

  • HTTPS/TLS for API

  • TLS for database connections

  • Encrypted MCP connections

Sensitive Data

Encrypted Fields:

  • OAuth access tokens

  • OAuth refresh tokens

  • API keys

  • Environment variables with secrets

Encryption Method:

  • AES-256-GCM

  • Unique encryption key per deployment

  • Automatic via Mongoose hooks
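
A hedged sketch of encrypt-on-save with Node's crypto module and a Mongoose pre-save hook; the schema, field, and environment variable names are assumptions, not Almanac's actual code:

```typescript
import crypto from 'node:crypto';
import { Schema, model } from 'mongoose';

// 32-byte AES-256-GCM key, unique per deployment (env var name is an assumption).
const KEY = Buffer.from(process.env.ENCRYPTION_KEY ?? '', 'hex');

function encrypt(plain: string): string {
  const iv = crypto.randomBytes(12); // standard 96-bit GCM nonce
  const cipher = crypto.createCipheriv('aes-256-gcm', KEY, iv);
  const ciphertext = Buffer.concat([cipher.update(plain, 'utf8'), cipher.final()]);
  // Store the iv and auth tag with the ciphertext so it can be decrypted and verified later.
  return [iv.toString('hex'), cipher.getAuthTag().toString('hex'), ciphertext.toString('hex')].join(':');
}

// Illustrative schema: only the OAuth access token field is shown.
const McpServerSchema = new Schema({ name: String, accessToken: String });

McpServerSchema.pre('save', function (next) {
  const doc = this as any; // hydrated document, typed loosely for brevity
  if (doc.isModified('accessToken') && doc.accessToken) {
    doc.accessToken = encrypt(doc.accessToken);
  }
  next();
});

export const McpServer = model('McpServer', McpServerSchema);
```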

Monitoring & Observability

Metrics

API Metrics:

  • Request rate (requests/sec)

  • Response time (p50, p95, p99)

  • Error rate

  • Cache hit rate

Database Metrics:

  • Query latency

  • Connection pool usage

  • Storage size

  • Index performance

Indexing Metrics:

  • Documents indexed/minute

  • Indexing errors

  • Queue depth

  • Processing time per document

Logging

Log Levels:

  • DEBUG: Detailed execution logs

  • INFO: Important events (sync started, query executed)

  • WARN: Recoverable errors (rate limit hit, cache miss)

  • ERROR: Critical errors (database down, indexing failed)

Log Format:
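
Assuming a structured JSON format (the logging library and field names here are purely illustrative), log entries might look like this:

```typescript
import pino from 'pino';

const logger = pino({ level: process.env.LOG_LEVEL ?? 'info' });

// INFO: important event for a completed query.
logger.info(
  { event: 'query.executed', mode: 'mix', durationMs: 1240, cacheHit: false },
  'query executed'
);

// WARN: recoverable condition such as a rate limit being hit.
logger.warn({ event: 'rate_limit.hit', userId: 'user-123' }, 'rate limit hit');
```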

Deployment Architectures

Development

Production (Small)

Production (Large)

Technology Choices

Why Express.js?

  • Fast, minimal framework

  • Large ecosystem

  • TypeScript support

  • Battle-tested at scale

Why TypeScript?

  • Type safety catches bugs early

  • Better IDE support

  • Maintainability at scale

  • Gradual adoption path

Why Model Context Protocol?

  • Standard interface for data sources

  • Community-driven ecosystem

  • Easy to add new sources

  • Separation of concerns

Why LightRAG?

  • Better than pure vector search

  • Answers "who", "what", "how" questions

  • 8x token reduction vs traditional RAG

  • Multiple query modes for flexibility

Next Steps
