Performance Benchmarks¶

This document contains detailed benchmark results for various components of the portfolio management system.

Benchmark Environment¶

All benchmarks run in standardized conditions:

Hardware: GitHub Actions runners (2-core, 7GB RAM)
Python: 3.12.x
Pandas: 2.3+
NumPy: 2.0+
Methodology: Multiple runs with statistical analysis

Asset Selection Benchmarks¶

Vectorization Improvements¶

Dataset: 10,000 row selection DataFrame

Operation	Before (ms)	After (ms)	Speedup	Improvement
Basic filtering	3871.2	52.77	73.4x	98.6%
Complex filtering	1389.1	17.70	78.5x	98.7%
Severity filtering	2171.3	41.88	51.8x	98.1%
Allow/blocklist	4989.2	24.17	206.4x	99.5%

Scalability:

Time per asset: 5.3µs (constant across all sizes)
Memory per asset: 8.5KB (constant)
Scaling: O(n) linear

Selection Performance by Universe Size¶

Universe Size	Time (ms)	Memory (MB)	Assets/sec
100	0.53	0.85	188,679
500	2.65	4.25	188,679
1000	5.30	8.50	188,679
5000	26.50	42.50	188,679
10000	53.00	85.00	188,679

Key Finding: Perfect linear scaling with constant per-asset performance

Preselection Benchmarks¶

Factor Computation by Method¶

Configuration: 1000 assets, 252-day lookback, single rebalance

Method	Time (ms)	Memory (MB)	Std Dev (ms)
Momentum	68.3	85.2	3.1
Low Volatility	71.2	87.4	3.4
Combined (50/50)	79.5	92.1	3.8

Observations:

Combined method adds 16% overhead
All methods complete in \<100ms for 1000 assets
Memory usage increases ~8% for combined

Scalability by Universe Size¶

Configuration: Momentum method, 252-day lookback

Assets	Time (ms)	Memory (MB)	Time/Asset (µs)	Scaling
100	8.2	12.3	82.0	1.0x
250	18.5	24.8	74.0	2.3x
500	35.1	44.9	70.2	4.3x
1000	68.3	85.2	68.3	8.3x
2500	165.2	195.1	66.1	20.1x
5000	327.8	379.8	65.6	40.0x

Key Finding: Near-perfect linear scaling (O(n) = 66µs per asset)

Lookback Period Impact¶

Configuration: 1000 assets, momentum method

Lookback (days)	Time (ms)	Memory (MB)	Overhead vs 30d
30	65.1	78.2	- (baseline)
63	66.8	82.1	+2.6%
126	67.9	85.0	+4.3%
252	68.3	85.2	+4.9%
504	71.2	91.8	+9.4%

Key Finding: Lookback period has minimal impact (\<10% even at 2 years)

Multiple Rebalancing Performance¶

Scenario: Monthly rebalancing over 2 years (24 rebalances)

Assets	Total Time (s)	Per Rebalance (ms)	Throughput (rebal/s)
100	0.20	8.3	120.5
250	0.44	18.3	54.6
500	0.84	35.0	28.6
1000	1.64	68.3	14.6
2500	3.96	165.0	6.1
5000	7.87	328.0	3.0

Key Finding: 24 monthly rebalances with 1000 assets completes in 1.6 seconds

Backtest Engine Benchmarks¶

Lookback Slicing Optimization¶

Dataset: 10 years daily (2,520 days), 50 assets

Rebalance Freq	Runtime (s)	Rebalances	Slices Before	Slices After	Reduction
Weekly	4.22	504	2,520	504	80%
Monthly	2.40	116	2,520	116	95%
Quarterly	1.82	39	2,520	39	98%

Key Finding: Slice operations reduced from O(n²) to O(rebalances)

Backtest Performance by Configuration¶

Dataset: 5 years daily, 100 assets

Strategy	Rebalance Freq	Time (s)	Memory (MB)	Rebalances
Equal Weight	Monthly	5.2	125	60
Equal Weight	Quarterly	3.8	120	20
Risk Parity	Monthly	12.4	180	60
Risk Parity	Quarterly	8.1	165	20
Mean-Variance	Monthly	18.7	220	60
Mean-Variance	Quarterly	11.3	195	20

Observations:

Equal weight: Fastest (no optimization)
Risk parity: 2.4x slower (iterative solver)
Mean-variance: 3.6x slower (quadratic programming)

Data Loading Benchmarks¶

Column Filtering Performance¶

Methodology: Load subset of assets from large CSV

Total Assets	Requested	Load Time (ms)	Memory (MB)	Memory Reduction
100	10	45.2	12.3	60%
500	20	52.8	15.7	60%
1000	50	68.4	28.9	60%
5000	100	142.7	85.2	60%

Key Finding: Consistent 60% memory reduction when loading subset

Date Filtering Performance¶

Dataset: 1000 assets, 10 years daily data

Date Range	Rows Before	Rows After	Load Time (ms)	Memory (MB)
Full (10y)	2,520	2,520	68.4	28.9
5 years	2,520	1,260	45.2	14.8
3 years	2,520	756	31.8	9.2
1 year	2,520	252	18.5	3.4

Key Finding: Time and memory scale linearly with date range

Combined Filtering (Columns + Dates)¶

Scenario: 1000 assets total, load 50 assets for 3 years

Optimization	Load Time (ms)	Memory (MB)	Reduction
None (load all)	68.4	28.9	- (baseline)
Columns only	31.2	11.6	59.9%
Dates only	45.8	8.7	69.9%
Both	18.3	4.6	84.1%

Key Finding: Combined filtering achieves 84% memory reduction

Streaming Diagnostics Benchmarks¶

Memory Usage Comparison¶

Dataset: 36 Stooq files, ~13,000 lines each

Approach	Peak Memory (MB)	Total Time (s)	Per File (ms)
Full DataFrame	23.76	5.469	151.93
Streaming	13.38	5.949	165.25
Reduction	43.7%	-8.8%	-8.8%

Trade-off Analysis:

Memory: 43.7% reduction (significant benefit)
Time: 8.8% increase (acceptable cost)
Overall: Clear win for memory-constrained scenarios

Streaming Performance by File Size¶

File Size (KB)	Rows	Full Load (ms)	Streaming (ms)	Memory Savings
50	500	12.3	15.2	38%
250	2500	58.7	67.4	42%
500	5000	115.2	131.8	44%
1000	10000	228.4	259.1	45%

Key Finding: Memory savings consistent across file sizes (~40-45%)

Extrapolation to Production Scale¶

Scenario: 70,000 Stooq files in production

Metric	Full DataFrame	Streaming	Improvement
Peak Memory	45.1 GB	25.4 GB	19.7 GB saved
Total Time	177 min	193 min	16 min slower
Files/sec	6.6	6.0	-9%

Analysis: 19.7 GB memory savings justified by 16-minute time cost

Caching Benchmarks¶

Factor Cache Performance¶

Scenario: 10 rebalances with overlapping data windows

Cache State	Total Time (s)	Cache Hits	Cache Misses	Hit Rate
First run (empty)	0.68	0	10	0%
Second run (warm)	0.08	10	0	100%
Different dates	0.34	5	5	50%

Key Finding: 88% time savings on cache hits

PriceLoader Cache Performance¶

Scenario: 5000 unique price files loaded sequentially

Cache Size	Peak Memory (MB)	Time (s)	Hit Rate
Unbounded	3,550	28.4	N/A
1000	710	32.1	20%
500	355	34.8	10%
0 (disabled)	71	42.3	0%

Key Finding: Bounded cache (1000) provides 80% memory reduction with 13% time cost

System-Level Benchmarks¶

End-to-End Backtest Performance¶

Configuration:

Universe: 500 assets
Period: 10 years daily (2,520 days)
Rebalancing: Monthly (120 rebalances)
Strategy: Risk Parity with preselection (top 50)
Features: PIT eligibility, membership policy, caching

Component	Time (s)	% of Total
Data loading	1.2	5%
Initial eligibility	0.8	3%
Rebalancing (120x)	18.4	75%
- Preselection	8.2	33%
- Membership policy	1.5	6%
- Portfolio optimization	6.9	28%
- Transaction costs	1.8	7%
Metrics calculation	2.1	9%
Report generation	2.0	8%
Total	24.5	100%

Key Insights:

Rebalancing dominates (75% of time)
Preselection + optimization = 61% of total
Data loading well-optimized (5%)

Hardware Comparison¶

Performance Across Systems¶

Benchmark: 1000 assets, 5 years, monthly rebalancing, equal weight

System	CPU	RAM	Time (s)	Relative
GitHub Actions	2-core, 7GB	7GB	5.2	1.0x (baseline)
MacBook Pro M1	8-core, 16GB	16GB	2.1	2.5x faster
Linux Workstation	16-core, 64GB	64GB	1.4	3.7x faster
AWS t3.medium	2-core, 4GB	4GB	6.8	0.8x slower

Observations:

System scales well with additional cores
Memory not a bottleneck for typical use
AWS t3.medium adequate for small/medium universes

Benchmark Methodology¶

Statistical Approach¶

All benchmarks follow these principles:

Multiple Runs: Minimum 3 runs, typically 5-10
Warm-up: One warm-up run before measurement
Statistical Analysis: Report mean, std dev, min, max
Outlier Removal: Discard runs >2 std dev from mean
Reproducibility: Seed random generators

Measurement Tools¶

Time: time.perf_counter() for high precision
Memory: tracemalloc for Python memory
Profiling: cProfile for detailed breakdown
System: psutil for system-level metrics

Reporting Standards¶

All benchmarks report:

Absolute values (time in seconds, memory in MB)
Relative improvements (speedup, reduction %)
Statistical measures (mean, std dev)
Test conditions (dataset, configuration)

Optimization Guide - Optimization techniques and strategies
Profiling Guide - How to profile and measure performance
Performance README - Performance documentation index

Performance Benchmarks¶

Benchmark Environment¶

Asset Selection Benchmarks¶

Vectorization Improvements¶

Selection Performance by Universe Size¶

Preselection Benchmarks¶

Factor Computation by Method¶

Scalability by Universe Size¶

Lookback Period Impact¶

Multiple Rebalancing Performance¶

Backtest Engine Benchmarks¶

Lookback Slicing Optimization¶

Backtest Performance by Configuration¶

Data Loading Benchmarks¶

Column Filtering Performance¶

Date Filtering Performance¶

Combined Filtering (Columns + Dates)¶

Streaming Diagnostics Benchmarks¶

Memory Usage Comparison¶

Streaming Performance by File Size¶

Extrapolation to Production Scale¶

Caching Benchmarks¶

Factor Cache Performance¶

PriceLoader Cache Performance¶

System-Level Benchmarks¶

End-to-End Backtest Performance¶

Hardware Comparison¶

Performance Across Systems¶

Benchmark Methodology¶

Statistical Approach¶

Measurement Tools¶

Reporting Standards¶

Related Documentation¶