
ADR-002: Modular Monolith Architecture

Status: Accepted
Date: 2024-09-15
Deciders: Development team
Context: Initial architecture design

Context

When designing the Portfolio Management Toolkit, we needed to choose an architecture that would:

  1. Support rapid development - Enable quick feature iteration and experimentation
  2. Enable offline execution - Work with cached data without external API dependencies
  3. Provide clear boundaries - Separate concerns without over-engineering
  4. Allow independent testing - Test each component in isolation
  5. Scale appropriately - Handle personal/small team portfolios (not institutional scale)
  6. Minimize operational complexity - Avoid infrastructure overhead
  7. Support configuration-driven workflows - CLI tools with YAML-based orchestration
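
A configuration-driven run might be expressed roughly like the following fragment. The keys shown here are illustrative only, not the project's actual schema:

```yaml
# Hypothetical workflow config; key names are illustrative, not the real schema.
workflow: monthly-rebalance
data:
  source: cached          # offline-first: read from the local price cache
  universe: sp500.csv
portfolio:
  scheme: equal_weight
  constraints:
    max_weight: 0.10
backtest:
  start: 2015-01-01
  rebalance: monthly
reporting:
  format: html
```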

The primary alternatives considered were:

  • Microservices - Separate services for data, portfolio, backtesting, etc.
  • Modular Monolith - Single codebase with well-defined package boundaries
  • Layered Monolith - Traditional 3-tier architecture (data, business, presentation)
  • Plugin Architecture - Core engine with dynamically loaded strategy plugins

Key constraints:

  • Target audience: Individual developers and small teams, not large institutions
  • Execution model: Offline, batch-oriented (not real-time trading)
  • Deployment: Developer laptops, not cloud infrastructure
  • Team size: 1-5 developers
  • Performance requirement: Process 1000-5000 assets monthly (not HFT scale)

Decision

We will use a Modular Monolith architecture with 12 well-defined packages.

The system is organized into packages with clear responsibilities:

Core Packages (Foundations)

  1. core/ - Shared types, protocols, exceptions, logging utilities
  2. config/ - Configuration loading, validation, schema definitions
  3. utils/ - General-purpose utilities (path handling, file I/O, etc.)

Domain Packages (Business Logic)

  1. data/ - Price loading, validation, transformation
  2. assets/ - Asset selection, classification, eligibility rules
  3. analytics/ - Return calculation, statistics, performance metrics
  4. portfolio/ - Weight calculation, optimization, constraints
  5. backtesting/ - Simulation engine, rebalancing logic
  6. reporting/ - Visualization, metrics export, HTML reports
  7. macro/ - Macro signals (regime detection, sentiment integration)

Service Packages (Orchestration)

  1. services/ - High-level orchestration (end-to-end workflows)
  2. cli/ (in scripts/) - Command-line interface, argument parsing

Package Dependencies

Strict dependency rules:

         ┌─────────┐
         │   CLI   │  (scripts/)
         └────┬────┘
              │
         ┌────▼────┐
         │Services │  (orchestration)
         └────┬────┘
              │
      ┌───────┴────────┐
      │                │
┌─────▼─────┐    ┌─────▼─────┐
│  Domain   │    │   Data    │  (business logic)
│(portfolio,│    │ Analytics │
│backtesting│    │   Macro   │
│ reporting)│    │           │
└─────┬─────┘    └─────┬─────┘
      │                │
      └───────┬────────┘
              │
         ┌────▼────┐
         │  Core   │  (foundations)
         │ Config  │
         │ Utils   │
         └─────────┘

Rules:

  • ✅ Domain packages can depend on Core/Config/Utils
  • ✅ Services can depend on Domain packages
  • ✅ CLI can depend on Services
  • ❌ Core/Config/Utils cannot depend on Domain packages
  • ❌ Domain packages cannot have circular dependencies
  • ❌ No direct database or external API calls (offline-first)
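
A minimal sketch of how these rules play out in code, with illustrative names (not the actual modules): core defines a Protocol, a domain package implements it, and the service layer composes them. Core never imports from the domain.

```python
# Illustrative sketch of the layering rules; class and function names
# are hypothetical, not the project's actual API.
from typing import Protocol


# --- core/ : shared contracts only; imports nothing from domain packages ---
class WeightScheme(Protocol):
    def weights(self, assets: list[str]) -> dict[str, float]: ...


# --- portfolio/ (domain): depends only on core contracts ---
class EqualWeight:
    def weights(self, assets: list[str]) -> dict[str, float]:
        w = 1.0 / len(assets)
        return {a: w for a in assets}


# --- services/ (orchestration): wires domain packages together ---
def build_portfolio(scheme: WeightScheme, assets: list[str]) -> dict[str, float]:
    return scheme.weights(assets)


print(build_portfolio(EqualWeight(), ["AAPL", "MSFT"]))  # {'AAPL': 0.5, 'MSFT': 0.5}
```

Because the dependency arrow points only downward, swapping in a new weighting scheme touches the domain layer alone; core and services are unchanged.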

Testing Structure

Tests mirror package structure exactly:

tests/
  unit/
    analytics/
    assets/
    backtesting/
    core/
    data/
    portfolio/
    reporting/
  integration/
  e2e/

Each package has:

  • Unit tests for internal logic
  • Integration tests for cross-package interactions
  • End-to-end tests for complete workflows
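
A unit test under tests/unit/analytics/ might look like the following sketch; compute_returns is a hypothetical stand-in for the code under test, not an actual function in the package:

```python
# Hypothetical unit-test sketch for the analytics package; the
# compute_returns helper is illustrative, not the real API.
import math


def compute_returns(prices: list[float]) -> list[float]:
    """Simple returns from consecutive prices: r_t = p_t / p_{t-1} - 1."""
    return [b / a - 1.0 for a, b in zip(prices, prices[1:])]


def test_compute_returns():
    result = compute_returns([100.0, 110.0, 99.0])
    assert math.isclose(result[0], 0.10)
    assert math.isclose(result[1], -0.10)


test_compute_returns()  # runs silently when the assertions hold
```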

Consequences

Positive

  • Simple deployment - Single process, no service orchestration needed
  • Fast development - No network boundaries, easier debugging, faster tests
  • Clear boundaries - Explicit package structure enforces separation of concerns
  • Easy refactoring - Can extract packages to services later if needed
  • Type safety - Cross-package contracts checked statically by MyPy
  • Testability - Each package tested independently; no mocking services
  • Minimal infrastructure - No Docker, Kubernetes, API gateways, service mesh
  • Atomic deployments - Single artifact to version and deploy
  • Shared code reuse - Common utilities accessible to all packages
  • Performance - No serialization overhead, direct function calls

Negative

  • ⚠️ Scaling limits - Cannot scale packages independently (must scale entire app)
  • ⚠️ Team coordination - Changes to shared packages affect all consumers
  • ⚠️ Deployment coupling - Must deploy entire app even for single-package changes
  • ⚠️ Technology lock-in - All packages must use Python (can't mix languages)
  • ⚠️ Risk of coupling - Without discipline, packages can become tightly coupled

Neutral

  • 📋 Single language - Python everywhere (consistent tooling, easier onboarding)
  • 📋 Evolution path - Can extract hot-path packages to services if needed
  • 📋 Testing speed - Fast (no network), but full test suite runs for any change

Alternatives Considered

Option A: Microservices Architecture

Description: Separate services for data ingestion, portfolio calculation, backtesting, reporting

Pros:

  • Independent scaling of each service
  • Technology diversity (could use Rust for backtesting, Python for data)
  • Team autonomy (different teams own different services)
  • Fault isolation (one service failure doesn't crash entire system)

Cons:

  • Operational complexity - Docker, Kubernetes, service discovery, load balancing
  • Development overhead - API contracts, versioning, backward compatibility
  • Debugging difficulty - Distributed tracing, cross-service debugging
  • Network latency - Serialization overhead for inter-service communication
  • Testing complexity - Service mocking, contract testing, integration tests
  • Infrastructure cost - Need orchestration platform, monitoring, logging

Why rejected: Massive overkill for a personal/small-team tool; operational burden far exceeds benefits

Option B: Layered Monolith (3-Tier)

Description: Traditional data layer → business logic layer → presentation layer

Pros:

  • Well-understood pattern
  • Clear separation of concerns
  • Easy to reason about for junior developers

Cons:

  • Horizontal layers create artificial boundaries (data + logic for portfolio optimization are split)
  • Hard to test - Business logic tightly coupled to data layer
  • Poor cohesion - Related functionality spread across layers
  • Difficult extraction - Can't easily pull out a "portfolio" module

Why rejected: Violates domain-driven design principles; makes refactoring harder

Option C: Plugin Architecture

Description: Minimal core engine with dynamically loaded strategy plugins

Pros:

  • Strategy isolation (add new strategies without core changes)
  • Extensibility (users can add custom strategies)
  • Clean separation between framework and strategies

Cons:

  • Dynamic loading complexity - Plugin discovery, version compatibility, error handling
  • Type safety loss - Plugins loaded at runtime; harder to type-check
  • Distribution complexity - How to package and distribute plugins?
  • Debugging difficulty - Harder to trace errors across plugin boundaries

Why rejected: Over-engineered for current needs; can add later if extension is needed

Option D: Monorepo with Libraries

Description: Separate Python packages (portfolio-data, portfolio-backtesting, etc.) in monorepo

Pros:

  • Package independence (can version separately)
  • Reusability (users can import only what they need)
  • Clearer API contracts (published interfaces)

Cons:

  • Versioning complexity - Must manage compatibility matrix
  • Breaking changes - Hard to coordinate across packages
  • Dependency hell - Circular dependencies require careful design
  • Distribution overhead - Must publish multiple PyPI packages

Why rejected: Adds versioning complexity without clear benefits for single-team project

Evolution Strategy

The modular monolith is designed for evolution:

Phase 1: Current (Modular Monolith)

  • Single Python package
  • Clear package boundaries
  • Dependency rules enforced by tests

Phase 2: Future (If Needed)

If specific packages become bottlenecks, we can extract them:

  • Extract backtesting engine - Rust service for 100x faster simulation
  • Extract data ingestion - Scheduled Lambda for daily price updates
  • Keep portfolio/analytics - Remain in monolith for rapid iteration

Key insight: Start with monolith, extract services only when proven necessary.

Implementation Notes

Enforcing Boundaries:

  1. Import restrictions - Pre-commit hook checks for illegal imports (e.g., domain → CLI)
  2. Dependency graph - pydeps generates visual dependency graph; violations fail CI
  3. pytest-archon - Architecture tests enforce layering rules
  4. Public APIs - Each package exports clear __init__.py interface
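
The import-restriction check (item 1) can be sketched as a small AST walk over each source file. The FORBIDDEN map and package names below are illustrative; this is not the actual pre-commit hook:

```python
# Hedged sketch of an import-restriction check like the pre-commit hook
# described above; the FORBIDDEN map and package names are illustrative.
import ast

# Foundation packages may not import from domain packages.
FORBIDDEN = {"core": {"portfolio", "backtesting", "data", "analytics"}}


def illegal_imports(package: str, source: str) -> list[str]:
    """Return the imports in `source` that `package` is not allowed to make."""
    banned = FORBIDDEN.get(package, set())
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module:
            if node.module.split(".")[0] in banned:
                violations.append(node.module)
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in banned:
                    violations.append(alias.name)
    return violations


print(illegal_imports("core", "from portfolio.weights import solve"))
# ['portfolio.weights']
```

A hook of this shape runs the check per staged file and fails the commit on any violation; pytest-archon then enforces the same rules again in CI at the package level.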

Refactoring History:

  • Sep 2024: Initial monolith (all code in src/)
  • Oct 2024: Created package structure (8 packages)
  • Oct 2024: Refactored backtest.py (749 lines → backtesting package)
  • Oct 2024: Refactored visualization.py (400 lines → reporting package)
  • Nov 2024: Added macro package for regime detection

References