Portfolio Management Toolkit - Functionality Overview
Executive Summary
The toolkit delivers an offline-first, production-ready CLI for Polish retail investors and operators who need to ingest historical Stooq data, assemble rule-based universes, compare portfolio engines, and document every step with analytics and visualizations.
Core Capabilities
- Data pipeline: Raw Stooq CSVs are validated, matched, and cached with incremental resume and fast I/O options.
- Universe management: YAML-based universes orchestrate selection, classification, return calculations, and governance controls (preselection + membership policies).
- Portfolio construction: Equal-weight, risk-parity, and mean-variance engines wire into constraints (cardinality, transaction costs, risk overlays) before backtesting and reporting.
- Reporting & visualization: Backtests produce summary CSVs/JSON, a rich visualization module, and documentation-friendly narrative exports.
Advanced Features & Patterns
- Macro signals, regime gating, and technical indicator frameworks are designed as plug-ins so future overlays (e.g., sentiment, volatility targeting) can integrate without touching the core pipeline.
- Statistics caching, rolling analytics, and incremental computation patterns keep expensive operations bounded, even for 300+ asset universes.
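As an illustration of the statistics-caching pattern, the sketch below memoizes pairwise covariances per asset pair and window so that repeated requests do not recompute them. The `StatsCache` class, its method names, and the index-based window are hypothetical and are not taken from the toolkit's code.

```python
from statistics import fmean
from typing import Dict, List, Sequence, Tuple

Returns = Dict[str, List[float]]  # ticker -> aligned return series


def sample_covariance(x: Sequence[float], y: Sequence[float]) -> float:
    """Unbiased sample covariance of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two aligned series with at least 2 observations")
    mx, my = fmean(x), fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


class StatsCache:
    """Hypothetical cache: memoizes covariances per (pair, window)."""

    def __init__(self, returns: Returns) -> None:
        self._returns = returns
        self._store: Dict[Tuple[str, str, int, int], float] = {}
        self.hits = 0
        self.misses = 0

    def covariance(self, a: str, b: str, start: int, end: int) -> float:
        # Covariance is symmetric, so order the pair to share one cache entry.
        first, second = sorted((a, b))
        key = (first, second, start, end)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = sample_covariance(
            self._returns[first][start:end], self._returns[second][start:end]
        )
        self._store[key] = value
        return value

    def matrix(self, tickers: Sequence[str], start: int, end: int) -> List[List[float]]:
        """Covariance matrix for `tickers` over the index window [start, end)."""
        return [[self.covariance(a, b, start, end) for b in tickers] for a in tickers]
```

Because the key includes the window bounds, a rolling analysis that revisits the same window reuses earlier results, while a new window triggers fresh computation only for that window.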
Data Pipeline Walkthrough
- Prepare tradeable data (prepare_tradeable_data.py)
  - Scans 70,000+ Stooq CSVs and builds a master index with metadata.
  - Matches broker instruments (BOŚ/mDM) to Stooq tickers, exports price files, and emits match/unmatched diagnostics.
  - Supports incremental resume (--incremental) plus optional fast I/O backends (polars/pyarrow).
  - Command example:
- Select assets (select_assets.py)
  - Filters on data quality, markets, currencies, and manual allowlists/blocklists, with streaming or eager modes for large universes.
  - Accepts overrides for minimum history, gap tolerance, and optional PID preselection.
  - Outputs a canonical selected_assets.csv for downstream consumers.
- Classify assets (classify_assets.py)
  - Assigns hierarchical labels (asset class, sub-class, geography) plus confidence scores.
  - Supports manual overrides, classification summaries, and export-for-review workflows.
  - Works with streaming/in-place workflows so you can plug it into manage_universes.py or run it standalone.
- Compute returns (calculate_returns.py)
  - Generates daily/monthly/quarterly matrices with simple or log returns, alignment modes (inner/outer), and explicit handling for missing data.
  - Validates coverage and raises ConfigurationError if the dataset is insufficient for the chosen universe.
  - Supports missing-data handling options such as forward-fill, interpolation, and minimum coverage gating.
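The return-computation step can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the logic of calculate_returns.py: the function names, the `ffill_limit`, `min_coverage`, and `min_assets` parameters, and the locally defined `ConfigurationError` are all hypothetical stand-ins.

```python
import math
from typing import Dict, List, Optional

Prices = Dict[str, List[Optional[float]]]  # ticker -> prices on a shared date axis


class ConfigurationError(ValueError):
    """Local stand-in; the toolkit's own exception class may differ."""


def forward_fill(series: List[Optional[float]], limit: int) -> List[Optional[float]]:
    """Carry the last known price across at most `limit` consecutive gaps."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    run = 0  # length of the current run of missing values
    for value in series:
        if value is not None:
            last, run = value, 0
            filled.append(value)
        else:
            run += 1
            filled.append(last if last is not None and run <= limit else None)
    return filled


def to_returns(series: List[Optional[float]], method: str = "simple") -> List[Optional[float]]:
    """Period-over-period returns; missing or non-positive prices yield None."""
    if method not in ("simple", "log"):
        raise ConfigurationError(f"unknown return method: {method!r}")
    out: List[Optional[float]] = []
    for prev, curr in zip(series, series[1:]):
        if prev is None or curr is None or prev <= 0 or curr <= 0:
            out.append(None)
        elif method == "simple":
            out.append(curr / prev - 1.0)
        else:
            out.append(math.log(curr / prev))
    return out


def build_return_matrix(
    prices: Prices,
    method: str = "simple",
    ffill_limit: int = 1,
    min_coverage: float = 0.8,
    min_assets: int = 1,
) -> Dict[str, List[Optional[float]]]:
    """Keep assets whose share of valid returns meets `min_coverage`."""
    matrix: Dict[str, List[Optional[float]]] = {}
    for ticker, series in prices.items():
        rets = to_returns(forward_fill(series, ffill_limit), method)
        coverage = sum(r is not None for r in rets) / len(rets) if rets else 0.0
        if coverage >= min_coverage:
            matrix[ticker] = rets
    if len(matrix) < min_assets:
        raise ConfigurationError(
            f"only {len(matrix)} asset(s) meet coverage {min_coverage:.0%}; need {min_assets}"
        )
    return matrix
```

In this sketch, assets below the coverage threshold are dropped first and the error is raised only when too few assets survive, which is one plausible reading of "insufficient for the chosen universe".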
Universe Management Blueprint
- Universe YAMLs (e.g., config/universes.yaml) capture selection filters, classification requirements, return settings, and governance controls (cardinality, membership, turnover).
- Example snippet:
    core_global:
      description: "Global core sleeve of diversified ETFs"
      filter_criteria:
        data_status: ["ok"]
        min_history_days: 756
        markets: ["LSE", "GBR-LSE"]
        currencies: ["GBP"]
      classification_requirements:
        asset_class: ["equity", "commodity", "real_estate"]
      return_config:
        method: "simple"
        frequency: "monthly"
      constraints:
        min_assets: 30
        max_assets: 50
- manage_universes.py chains selection, classification, and return generation behind a single CLI entry:
  - Commands include load, validate, compare, list, and export, letting you reuse YAML blueprints across experiments.
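To make the idea of validating a blueprint concrete, the sketch below checks a universe definition that has already been parsed into a dictionary shaped like the core_global example. The `validate_universe` function and its rule set are assumptions for illustration; the allowed methods and frequencies are taken from the options listed for the return step, and the actual checks performed by manage_universes.py may differ.

```python
from typing import Any, Dict, List

ALLOWED_METHODS = {"simple", "log"}
ALLOWED_FREQUENCIES = {"daily", "monthly", "quarterly"}


def _section(spec: Dict[str, Any], key: str) -> Dict[str, Any]:
    """Return a sub-mapping, or an empty dict if it is absent or malformed."""
    value = spec.get(key)
    return value if isinstance(value, dict) else {}


def validate_universe(name: str, spec: Dict[str, Any]) -> List[str]:
    """Collect human-readable problems; an empty list means the spec passed."""
    problems: List[str] = []
    filters = _section(spec, "filter_criteria")
    returns = _section(spec, "return_config")
    limits = _section(spec, "constraints")

    history = filters.get("min_history_days")
    if not isinstance(history, int) or history <= 0:
        problems.append(f"{name}: min_history_days must be a positive integer")

    if returns.get("method") not in ALLOWED_METHODS:
        problems.append(f"{name}: return method must be one of {sorted(ALLOWED_METHODS)}")
    if returns.get("frequency") not in ALLOWED_FREQUENCIES:
        problems.append(f"{name}: frequency must be one of {sorted(ALLOWED_FREQUENCIES)}")

    lo, hi = limits.get("min_assets"), limits.get("max_assets")
    if not (isinstance(lo, int) and isinstance(hi, int) and 0 < lo <= hi):
        problems.append(f"{name}: need 0 < min_assets <= max_assets")
    return problems


# Dictionary equivalent of the core_global example.
core_global = {
    "description": "Global core sleeve of diversified ETFs",
    "filter_criteria": {
        "data_status": ["ok"],
        "min_history_days": 756,
        "markets": ["LSE", "GBR-LSE"],
        "currencies": ["GBP"],
    },
    "classification_requirements": {"asset_class": ["equity", "commodity", "real_estate"]},
    "return_config": {"method": "simple", "frequency": "monthly"},
    "constraints": {"min_assets": 30, "max_assets": 50},
}

if __name__ == "__main__":
    for problem in validate_universe("core_global", core_global):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets a single run report every issue in a blueprint.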
Portfolio Construction Strategies
- Equal weight: Benchmark baseline, ideal for large universes (>300 assets) and regulatory guardrails (max 25% per asset). Implements 1/N weighting with optional max/min constraints.
- Risk parity: Equalizes risk contribution using cached covariance/variance stats and respects the same transaction cost modeling as other engines.
- Mean-variance: Uses PyPortfolioOpt to maximize the Sharpe ratio or hit a target volatility, subject to max equity/min bond constraints.
- Each strategy integrates with construct_portfolio.py, honors turnover/transaction costs, and emits portfolio_weights.csv for reuse in run_backtest.py.
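The weighting ideas above can be illustrated with a small stdlib-only sketch. All function names here are hypothetical. The inverse-volatility scheme is a simplified stand-in for risk parity: it equalizes risk contributions only when correlations are ignored, whereas the toolkit's engine is described as using cached covariance statistics.

```python
from statistics import stdev
from typing import Dict, List


def equal_weight(tickers: List[str], max_weight: float = 0.25) -> Dict[str, float]:
    """1/N weights; fails if 1/N would breach the per-asset cap."""
    if not tickers:
        raise ValueError("universe is empty")
    weight = 1.0 / len(tickers)
    if weight > max_weight:
        raise ValueError(
            f"{len(tickers)} asset(s) cannot satisfy a {max_weight:.0%} cap with equal weights"
        )
    return {ticker: weight for ticker in tickers}


def inverse_volatility(returns: Dict[str, List[float]]) -> Dict[str, float]:
    """Weights proportional to 1/volatility, normalized to sum to 1."""
    inverse: Dict[str, float] = {}
    for ticker, series in returns.items():
        vol = stdev(series)  # sample standard deviation of the return series
        if vol <= 0:
            raise ValueError(f"{ticker}: volatility must be positive")
        inverse[ticker] = 1.0 / vol
    total = sum(inverse.values())
    return {ticker: value / total for ticker, value in inverse.items()}


def one_way_turnover(old: Dict[str, float], new: Dict[str, float]) -> float:
    """Half the sum of absolute weight changes between two portfolios."""
    names = set(old) | set(new)
    return 0.5 * sum(abs(new.get(n, 0.0) - old.get(n, 0.0)) for n in names)


def transaction_cost(old: Dict[str, float], new: Dict[str, float], cost_bps: float) -> float:
    """Cost as a fraction of portfolio value, charged per unit of traded weight."""
    traded = 2.0 * one_way_turnover(old, new)  # buys plus sells
    return traded * cost_bps / 10_000.0
```

For example, moving from a 50/50 split to weights of 2/3 and 1/3 gives a one-way turnover of 1/6; at a hypothetical 10 bps per unit traded, the cost is about 0.033% of portfolio value.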
See Also
- Comprehensive Example – Step-by-step basic, advanced, and production workflows that exercise every CLI.
- Quick Start – Fast 15-minute setup with sample data and strategy variations.