Analytics API Reference
The analytics package provides financial analytics including returns calculation and metrics.
Overview
The analytics package contains:
- Returns - Return calculation methods
- Metrics - Performance and risk metrics
- Indicators - Technical indicators
Analytics Package
portfolio_management.analytics
Analytics package for financial calculations.
This package provides tools for analyzing financial data:
- Return calculation from price data
- Technical indicator-based filtering (stub implementation)
- Performance and risk metrics (future)
FilterHook
Hook for filtering assets based on technical indicator signals.
This class bridges a chosen indicator provider and the asset selection process. It computes indicator signals for each asset's price series and filters the asset list on the most recent signal value. Assets whose latest signal is True (or at least 0.5 for floating-point signals) are retained; all others are excluded.
Attributes:
| Name | Type | Description |
|---|---|---|
| config | IndicatorConfig | The configuration object for the indicators. |
| provider | IndicatorProvider | The provider instance used to compute signals. |
Example
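>>> import pandas as pd
>>> from portfolio_management.analytics.indicators.config import IndicatorConfig
>>> from portfolio_management.analytics.indicators.filter_hook import FilterHook
>>> from portfolio_management.analytics.indicators.providers import NoOpIndicatorProvider
>>> # Using a NoOp provider which always returns True
>>> config = IndicatorConfig(enabled=True, provider='noop')
>>> provider = NoOpIndicatorProvider()
>>> hook = FilterHook(config, provider)
>>> prices = pd.DataFrame({'TICKER': [10, 11, 12]})
>>> assets = ['TICKER']
>>> result = hook.filter_assets(prices, assets)
>>> print(result)
['TICKER']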
Source code in src/portfolio_management/analytics/indicators/filter_hook.py
filter_assets(prices, assets)
Filter assets based on technical indicator signals.
For each asset in the input list, this method computes its technical indicator signal using the configured provider. It then includes the asset in the output list only if the most recent signal is True (or >= 0.5 for floating-point signals).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prices | DataFrame | A DataFrame of price data, with asset symbols as columns and dates as the index. | required |
| assets | list[str] | The list of asset symbols to be filtered. | required |
Returns:
| Type | Description |
|---|---|
| list[str] | list[str]: A new list containing only the asset symbols that passed the indicator filter. If indicators are disabled in the config, this method returns the original list of assets unmodified. |
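The selection rule described above can be sketched as follows. This is a minimal illustration, not the code in filter_hook.py: the function name `filter_by_last_signal`, the `compute` callable, and the choice to skip assets that have no price column are all assumptions made for the example.

```python
from typing import Callable

import pandas as pd


def filter_by_last_signal(
    prices: pd.DataFrame,
    assets: list[str],
    compute: Callable[[pd.Series], pd.Series],
) -> list[str]:
    """Keep assets whose most recent signal is True (or >= 0.5 if numeric)."""
    kept = []
    for symbol in assets:
        if symbol not in prices.columns:
            continue  # assumption: assets without price data are skipped
        signal = compute(prices[symbol])
        if signal.empty:
            continue
        # float(True) == 1.0, so one comparison covers bool and float signals
        if float(signal.iloc[-1]) >= 0.5:
            kept.append(symbol)
    return kept


def rising(series: pd.Series) -> pd.Series:
    """Hypothetical signal: True where the price rose versus the previous row."""
    return series.diff() > 0


prices = pd.DataFrame({"AAA": [10.0, 11.0, 12.0], "BBB": [12.0, 11.0, 10.0]})
print(filter_by_last_signal(prices, ["AAA", "BBB"], rising))  # ['AAA']
```

Only the last row of each signal matters here, so an asset that was flagged earlier but not on the latest date is dropped.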
Source code in src/portfolio_management/analytics/indicators/filter_hook.py
IndicatorConfig
dataclass
Configuration for technical indicator-based filtering.
This dataclass defines the parameters for technical indicator computation and filtering. It specifies whether the feature is enabled, which provider to use for calculations (e.g., 'talib'), and any indicator-specific parameters like window sizes or thresholds.
Attributes:
| Name | Type | Description |
|---|---|---|
| enabled | bool | If True, technical indicator filtering is active. Defaults to False. |
| provider | str | The provider to use for indicator calculations. Examples: 'noop', 'talib', 'ta'. Defaults to 'noop'. |
| params | dict[str, Any] | A dictionary of indicator-specific parameters. Common keys include 'window', 'threshold', 'indicator_type'. |
Example
>>> # Config for a 50-day RSI filter with a threshold of 0.5
>>> rsi_config = IndicatorConfig(
...     enabled=True,
...     provider='talib',  # Assuming 'talib' is a supported provider
...     params={'indicator_type': 'rsi', 'window': 50, 'threshold': 0.5}
... )
>>> rsi_config.validate()
>>> print(f"Provider: {rsi_config.provider}, Window: {rsi_config.params['window']}")
Provider: talib, Window: 50
Source code in src/portfolio_management/analytics/indicators/config.py
validate()
Validate indicator configuration parameters.
Checks if the provider is supported and validates common parameters like 'window' and 'threshold' if they are present.
Raises:
| Type | Description |
|---|---|
| ConfigurationError | If the configuration is invalid. |
Source code in src/portfolio_management/analytics/indicators/config.py
disabled()
classmethod
Create a disabled indicator configuration.
This is a convenience factory method for creating a configuration that explicitly disables indicator filtering.
Returns:
| Name | Type | Description |
|---|---|---|
| IndicatorConfig | IndicatorConfig | An instance with |
Source code in src/portfolio_management/analytics/indicators/config.py
noop(params=None)
classmethod
Create a no-op indicator configuration.
This factory creates a configuration that is enabled but uses the 'noop' provider, which performs no actual filtering. Useful for testing the pipeline's structure without applying indicator logic.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params | dict[str, Any] \| None | Optional parameters for the no-op provider, primarily for testing purposes. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| IndicatorConfig | IndicatorConfig | An instance with |
Source code in src/portfolio_management/analytics/indicators/config.py
IndicatorProvider
Bases: ABC
Abstract interface for technical indicator computation.
This abstract class defines the contract for computing technical indicators
from a time series. Concrete implementations should inherit from this class
and implement the compute method, typically by wrapping a technical
analysis library like TA-Lib or ta.
The provider pattern allows the system to remain agnostic to the specific backend used for indicator calculations.
Example (for creating a new provider):
>>> class MovingAverageCrossProvider(IndicatorProvider):
...     def compute(self, series: pd.Series, params: dict[str, Any]) -> pd.Series:
...         short_window = params.get("short", 20)
...         long_window = params.get("long", 50)
...         short_ma = series.rolling(window=short_window).mean()
...         long_ma = series.rolling(window=long_window).mean()
...         # Signal is True while the short MA is above the long MA
...         signal = (short_ma > long_ma)
...         return signal.fillna(False)
Source code in src/portfolio_management/analytics/indicators/providers.py
compute(series, params)
abstractmethod
Compute a technical indicator signal from a time series.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| series | Series | An input time series of data, typically prices, indexed by date. | required |
| params | dict[str, Any] | A dictionary of indicator-specific parameters, such as window sizes or thresholds (e.g., | required |
Returns:
| Type | Description |
|---|---|
| Series | pd.Series: A Series of indicator signals with the same index as the input. Values should be boolean (True/False) or float (0.0 to 1.0) to signify inclusion or exclusion. |
Source code in src/portfolio_management/analytics/indicators/providers.py
NoOpIndicatorProvider
Bases: IndicatorProvider
No-op stub implementation that returns pass-through signals.
This implementation of IndicatorProvider always returns a signal of True,
effectively including all assets without applying any filtering. It serves as a
default placeholder, useful for testing the indicator framework's structure
without requiring technical analysis dependencies or logic.
Use this provider for:
- Testing the overall asset selection pipeline.
- Disabling indicator filtering while keeping the configuration enabled.
- Serving as a base for future indicator implementations.
Example
>>> import pandas as pd
>>> provider = NoOpIndicatorProvider()
>>> prices = pd.Series(
...     [100, 101, 102],
...     index=pd.to_datetime(['2023-01-01', '2023-01-02', '2023-01-03'])
... )
>>> signal = provider.compute(prices, params={})
>>> print(signal)
2023-01-01    True
2023-01-02    True
2023-01-03    True
dtype: bool
Source code in src/portfolio_management/analytics/indicators/providers.py
compute(series, params)
Return a pass-through signal (all True).
This method ignores the input series and parameters and simply returns
a boolean Series of True values with the same index.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| series | Series | The input time series (ignored). | required |
| params | dict[str, Any] | Indicator-specific parameters (ignored). | required |
Returns:
| Type | Description |
|---|---|
| Series | pd.Series: A boolean Series of … assets being filtered out. |
Source code in src/portfolio_management/analytics/indicators/providers.py
PriceLoader
Utilities for reading price files into pandas objects.
This loader is designed to efficiently read and process numerous price files. It includes a bounded LRU cache to prevent unbounded memory growth during long-running workflows and uses a thread pool for parallel I/O operations to accelerate data loading.
The loader also performs data validation and cleaning, such as removing duplicate timestamps and non-positive price values.
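The combination of a bounded LRU cache and a thread pool can be sketched as below. This is not PriceLoader's implementation: the `BoundedLRU` class and the `fake_read` stand-in are hypothetical names introduced only to illustrate the pattern, using the standard library's `OrderedDict` and `ThreadPoolExecutor`.

```python
import threading
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


class BoundedLRU:
    """Tiny thread-safe LRU cache that evicts the least recently used entry."""

    def __init__(self, maxsize: int) -> None:
        self.maxsize = maxsize
        self._data = OrderedDict()
        self._lock = threading.Lock()

    def get_or_load(self, key: str, loader: Callable[[str], object]) -> object:
        with self._lock:
            if key in self._data:
                self._data.move_to_end(key)  # mark as most recently used
                return self._data[key]
        value = loader(key)  # slow I/O happens outside the lock
        if self.maxsize > 0:  # a size of 0 disables caching
            with self._lock:
                self._data[key] = value
                self._data.move_to_end(key)
                while len(self._data) > self.maxsize:
                    self._data.popitem(last=False)  # drop the oldest entry
        return value

    def __len__(self) -> int:
        return len(self._data)


def fake_read(name: str) -> list[float]:
    """Stand-in for reading a price file from disk."""
    return [float(len(name))]


cache = BoundedLRU(maxsize=2)
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda n: cache.get_or_load(n, fake_read), ["a", "bb", "ccc"]))

print(results)     # [[1.0], [2.0], [3.0]]
print(len(cache))  # 2
```

Loading outside the lock keeps threads from serializing on disk reads, at the cost of occasionally loading the same key twice when two threads miss at once.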
Attributes:
| Name | Type | Description |
|---|---|---|
| max_workers | int \| None | Maximum number of concurrent threads for parallel loading. If None, a default is calculated based on CPU cores. |
| cache_size | int | Maximum number of price series to hold in the LRU cache. Set to 0 to disable caching. |
| io_backend | Backend | The backend to use for reading CSV files. Options include 'pandas', 'polars', and 'pyarrow'. 'auto' selects the fastest available option. |
Example
>>> from pathlib import Path
>>> from portfolio_management.analytics.returns.loaders import PriceLoader
>>> from portfolio_management.assets.selection.models import SelectedAsset
>>> prices_dir = Path("tests/data/prices")  # Dummy path
>>> assets = [
...     SelectedAsset(symbol="AAPL"),
...     SelectedAsset(symbol="MSFT")
... ]
>>> loader = PriceLoader(max_workers=4, cache_size=500)
>>> # price_df = loader.load_multiple_prices(assets, prices_dir)
>>> # if price_df is not None:
>>> #     print(price_df.info())
>>> # Check cache status
>>> stats = loader.get_cache_stats()
>>> print(f"Cache entries: {stats['cache_entries']}, "
...       f"Cache size: {stats['cache_size']}")
Cache entries: 0, Cache size: 500
Source code in src/portfolio_management/analytics/returns/loaders.py
load_price_file(path)
Load a single price file into a Series indexed by date.
This method reads a CSV file, standardizes its columns, cleans the data (handles duplicates, non-positive values), and returns a sorted Series of close prices.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | The path to the price CSV file. | required |
Returns:
| Type | Description |
|---|---|
| Series | pd.Series: A Series of close prices, indexed by date. The Series will be empty if the file is empty or contains no valid data. |
Raises:
| Type | Description |
|---|---|
| DataLoadError | If the file cannot be found or parsed. |
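The cleaning steps named above (duplicate timestamps, non-positive values, sorting) might look roughly like this in pandas. The column names 'date' and 'close', the helper name `clean_close_prices`, and the choice to keep the last of any duplicated dates are assumptions for the sketch, not details confirmed by the loader.

```python
import pandas as pd


def clean_close_prices(frame: pd.DataFrame) -> pd.Series:
    """Return a sorted close-price Series with duplicates and bad values removed."""
    # Assumed column names; the real loader standardizes columns first.
    series = pd.Series(
        frame["close"].to_numpy(dtype=float),
        index=pd.DatetimeIndex(pd.to_datetime(frame["date"])),
        name="close",
    )
    series = series[~series.index.duplicated(keep="last")]  # one row per date
    series = series[series > 0]  # drops zero, negative, and NaN prices
    return series.sort_index()


raw = pd.DataFrame(
    {
        "date": ["2023-01-03", "2023-01-02", "2023-01-02", "2023-01-04"],
        "close": [102.0, 100.0, 101.0, -1.0],
    }
)
cleaned = clean_close_prices(raw)
print(cleaned.tolist())  # [101.0, 102.0]
```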
Source code in src/portfolio_management/analytics/returns/loaders.py
load_multiple_prices(assets, prices_dir)
Load price data for many assets and align on the union of dates.
This method orchestrates the loading of price files for a list of assets in parallel. It resolves file paths, submits loading tasks to a thread pool, and assembles the resulting Series into a single DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| assets | list[SelectedAsset] | The list of assets to load prices for. | required |
| prices_dir | Path | The base directory containing the price CSV files. | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | pd.DataFrame: A DataFrame containing the close prices for all successfully loaded assets. The index is the union of all dates, and columns are the asset symbols. Returns an empty DataFrame if no price files can be loaded. |
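Aligning on the union of dates is what an outer join of the per-asset Series produces. The snippet below is a small sketch of that alignment with pandas alone; the sample prices are made up and the real method also handles path resolution and parallel loading.

```python
import pandas as pd

aapl = pd.Series([150.0, 151.0], index=pd.to_datetime(["2023-01-02", "2023-01-03"]))
msft = pd.Series([250.0, 252.0], index=pd.to_datetime(["2023-01-03", "2023-01-04"]))

# axis=1 with the default join="outer" aligns on the union of both indexes;
# dates missing for an asset become NaN in that asset's column.
prices = pd.concat({"AAPL": aapl, "MSFT": msft}, axis=1).sort_index()
print(prices)
```

The result has three rows (2 to 4 January), with one NaN in each column where that asset had no observation.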
Source code in src/portfolio_management/analytics/returns/loaders.py
clear_cache()
Clear all cached price series.
This is useful after bulk operations where cached data is unlikely to be reused, helping to free memory immediately rather than waiting for LRU eviction.
Source code in src/portfolio_management/analytics/returns/loaders.py
get_cache_stats()
Get cache statistics.
Returns:
| Type | Description |
|---|---|
| dict[str, int] | Dictionary with 'size' (current entries) and 'maxsize' (capacity). |
Useful for testing and monitoring.
Source code in src/portfolio_management/analytics/returns/loaders.py
cache_info()
Return cache statistics for monitoring.
This method is an alias for get_cache_stats for backward compatibility.
ReturnCalculator
Prepare aligned return series ready for portfolio construction.
Source code in src/portfolio_management/analytics/returns/calculator.py
latest_summary
property
Return the summary produced by the most recent pipeline run.
load_and_prepare(assets, prices_dir, config)
Execute the full return-preparation pipeline for assets.
Source code in src/portfolio_management/analytics/returns/calculator.py
calculate_returns(prices, config)
Calculate returns for each column in prices according to config.
This function applies the return calculation method specified in the configuration (simple, log, or excess) to the price data. It also filters out assets with insufficient historical data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prices | DataFrame | DataFrame of prices with dates as index and assets as columns. | required |
| config | ReturnConfig | Configuration object specifying calculation parameters. | required |
Returns:
| Type | Description |
|---|---|
| ReturnFrame | pd.DataFrame: DataFrame of calculated returns. |
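For orientation, the three return types named above are conventionally computed as in this sketch. It shows the standard formulas only; how ReturnCalculator converts the annualized risk-free rate to a per-period rate is not stated here, so the division by 252 trading days is an assumption.

```python
import numpy as np
import pandas as pd

prices = pd.DataFrame({"A": [100.0, 110.0, 121.0]})

# Simple return: P_t / P_{t-1} - 1
simple = (prices / prices.shift(1) - 1).dropna()

# Log return: ln(P_t / P_{t-1})
log = np.log(prices / prices.shift(1)).dropna()

# Excess return: simple return minus the per-period risk-free rate.
# Assumption: the annual rate is spread evenly over 252 trading days.
risk_free_rate = 0.0252
excess = simple - risk_free_rate / 252

print(simple["A"].round(4).tolist())  # [0.1, 0.1]
print(excess["A"].round(4).tolist())  # [0.0999, 0.0999]
```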
Source code in src/portfolio_management/analytics/returns/calculator.py
handle_missing_data(prices, config)
Apply the configured missing-data strategy to prices.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prices | DataFrame | The input price data. | required |
| config | ReturnConfig | Configuration specifying the handling method. | required |
Returns:
| Type | Description |
|---|---|
| PriceFrame | pd.DataFrame: Price data with missing values handled. |
Raises:
| Type | Description |
|---|---|
| ConfigurationError | If an unknown missing data handling method is configured. |
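The three strategies listed for ReturnConfig.handle_missing ('forward_fill', 'drop', 'interpolate') map naturally onto pandas calls, as sketched below. The helper `fill_missing` is hypothetical, it raises ValueError as a stand-in for ConfigurationError, and applying the fill limit to interpolation as well is an assumption.

```python
import numpy as np
import pandas as pd


def fill_missing(prices: pd.DataFrame, method: str, max_fill: int = 5) -> pd.DataFrame:
    """Apply one of three missing-data strategies to a price frame."""
    if method == "forward_fill":
        return prices.ffill(limit=max_fill)  # carry the last price forward
    if method == "drop":
        return prices.dropna()  # keep only fully observed rows
    if method == "interpolate":
        return prices.interpolate(method="linear", limit=max_fill)
    raise ValueError(f"Unknown missing-data method: {method}")  # stand-in error


prices = pd.DataFrame({"A": [1.0, np.nan, np.nan, 4.0]})
print(fill_missing(prices, "forward_fill", max_fill=1)["A"].tolist())  # [1.0, 1.0, nan, 4.0]
print(fill_missing(prices, "interpolate")["A"].tolist())               # [1.0, 2.0, 3.0, 4.0]
print(fill_missing(prices, "drop")["A"].tolist())                      # [1.0, 4.0]
```

The `limit` argument is what caps a run of consecutive fills, which is the role max_forward_fill_days plays in the configuration.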
Source code in src/portfolio_management/analytics/returns/calculator.py
export_returns(returns, path)
staticmethod
Persist prepared returns as a CSV file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| returns | DataFrame | The returns data to save. | required |
| path | Path | The file path to save the CSV to. | required |
Source code in src/portfolio_management/analytics/returns/calculator.py
ReturnConfig
dataclass
Configuration for return preparation.
This dataclass holds all settings related to the calculation and cleaning
of asset returns. It provides a centralized place to define the behavior
of the ReturnCalculator.
Attributes:
| Name | Type | Description |
|---|---|---|
| method | str | The method for calculating returns. Options: 'simple', 'log', 'excess'. Defaults to 'simple'. |
| frequency | str | The target frequency for the returns. Options: 'daily', 'weekly', 'monthly'. Defaults to 'daily'. |
| risk_free_rate | float | The annualized risk-free rate to use when |
| handle_missing | str | The strategy for handling missing price data. Options: 'forward_fill', 'drop', 'interpolate'. Defaults to 'forward_fill'. |
| max_forward_fill_days | int | The maximum number of consecutive days to forward-fill missing data. Defaults to 5. |
| min_periods | int | The minimum number of price observations required for an asset to be included. Defaults to 2. |
| align_method | str | The method for aligning dates across assets. Options: 'outer' (union of dates), 'inner' (intersection of dates). Defaults to 'outer'. |
| reindex_to_business_days | bool | Whether to reindex the final returns to a standard business day calendar. Defaults to False. |
| min_coverage | float | The minimum proportion of non-NaN returns an asset must have to be kept after processing. Defaults to 0.8. |
Example
>>> # Create a config for weekly log returns, requiring at least 1 year of data
>>> config = ReturnConfig(
...     method="log",
...     frequency="weekly",
...     min_periods=52,
...     handle_missing="interpolate",
...     max_forward_fill_days=3,
...     min_coverage=0.95
... )
>>> config.validate()  # Raises ConfigurationError on invalid settings
Source code in src/portfolio_management/analytics/returns/config.py
validate()
Validate the configuration values and raise ConfigurationError on issues.
Source code in src/portfolio_management/analytics/returns/config.py
default()
classmethod
monthly_simple()
classmethod
ReturnSummary
dataclass
Summary statistics produced alongside prepared returns.
This dataclass acts as a container for key statistics derived from a
returns matrix, typically generated by the ReturnCalculator. It bundles
related data together for convenient access.
Attributes:
| Name | Type | Description |
|---|---|---|
| mean_returns | Series | A Series of annualized mean returns for each asset. |
| volatility | Series | A Series of annualized volatility (standard deviation) for each asset. |
| correlation | DataFrame | A DataFrame representing the correlation matrix between all assets' returns. |
| coverage | Series | A Series indicating the proportion of non-missing return data for each asset over the calculation period. |
Example
>>> import pandas as pd
>>> summary = ReturnSummary(
...     mean_returns=pd.Series({"A": 0.1, "B": 0.12}),
...     volatility=pd.Series({"A": 0.2, "B": 0.25}),
...     correlation=pd.DataFrame({"A": [1.0, 0.5], "B": [0.5, 1.0]}, index=["A", "B"]),
...     coverage=pd.Series({"A": 1.0, "B": 0.98})
... )
>>> print(f"Mean return for A: {summary.mean_returns['A']:.2f}")
Mean return for A: 0.10
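The four fields can be derived from a returns matrix along the following lines. This is a sketch under stated assumptions rather than the calculator's actual code: it assumes daily returns and an annualization factor of 252 periods per year, and the sample data is invented.

```python
import numpy as np
import pandas as pd

PERIODS_PER_YEAR = 252  # assumption: daily returns

returns = pd.DataFrame(
    {"A": [0.01, -0.01, 0.02, 0.00], "B": [0.00, 0.01, np.nan, 0.01]}
)

mean_returns = returns.mean() * PERIODS_PER_YEAR        # annualized mean
volatility = returns.std() * np.sqrt(PERIODS_PER_YEAR)  # annualized std dev
correlation = returns.corr()                            # pairwise correlation
coverage = returns.notna().mean()                       # share of non-missing rows

print(coverage.to_dict())  # {'A': 1.0, 'B': 0.75}
```

The mean scales linearly with the number of periods while the standard deviation scales with its square root, which is why the two annualization factors differ.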
Source code in src/portfolio_management/analytics/returns/models.py