Configuration Best Practices¶
This guide provides best practices, tips for performance optimization, and common mistakes to avoid when configuring your universes.
General Best Practices¶
- Start Simple: Begin with a simple configuration and gradually add complexity. Don't try to enable everything at once.
- Use a `description`: Always add a `description` to your universes. It makes it much easier to remember what each configuration is for.
- Separate Universes: Create separate universe definitions for different strategies. Don't try to make one universe that does everything.
- Version Control: Keep your `universes.yaml` file in version control (e.g., Git). This allows you to track changes to your strategies over time.
Performance Optimization Tips¶
If your backtests are running slowly, consider the following:
- Use Preselection: This is the most important optimization. Use the `preselection` section to reduce the number of assets before they are passed to the optimizer. Reducing a universe of 1000 assets to 100 can result in a 1000x speedup for optimizers whose cost grows with the cube of the number of assets.
- Use a Shorter `lookback`: A shorter `lookback` period in the `preselection` section means less data to process at each rebalance. A 6-month lookback (`126` days) can be significantly faster than a 12-month one (`252` days).
- Enable Caching: Use the `--use-cache` command-line flag when running your backtests. This will store intermediate results and speed up subsequent runs.
- Use Chunked I/O: If your initial master list of assets is very large, use the `--chunk-size` command-line flag to process it in chunks. This dramatically reduces memory usage.
- Rebalance Less Frequently: Changing the `frequency` in `return_config` from `monthly` to `quarterly` reduces the number of rebalancing events from 12 to 4 per year, and thus the total backtest time.
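The arithmetic behind the preselection and rebalancing tips can be sketched in a few lines. The cubic cost model is an assumption (it is typical of optimizers dominated by dense matrix operations, and it is the case in which a 10x reduction in assets gives a 1000x speedup). The function names are hypothetical and not part of the tool.

```python
def optimizer_speedup(n_before: int, n_after: int, exponent: float = 3.0) -> float:
    """Estimated speedup from shrinking the universe, assuming cost ~ n**exponent."""
    return (n_before / n_after) ** exponent


def rebalance_count(years: int, frequency: str) -> int:
    """Number of rebalancing events over a backtest of the given length."""
    per_year = {"monthly": 12, "quarterly": 4}
    return years * per_year[frequency]


# Assumed cubic cost model: 1000 -> 100 assets.
print(optimizer_speedup(1000, 100))
# Assumed quadratic cost model: the gain is smaller but still large.
print(optimizer_speedup(1000, 100, exponent=2.0))
# Rebalancing events over a 10-year backtest, monthly vs. quarterly.
print(rebalance_count(10, "monthly"), rebalance_count(10, "quarterly"))
```

If your optimizer scales more gently than cubically, expect a smaller gain from preselection; measuring one rebalance at two universe sizes is the most reliable way to find out.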
Common Mistakes and How to Avoid Them¶
- Mistake: Setting `max_assets` in `constraints` to be smaller than `top_k` in `preselection`.
  - Problem: This will cause an error, as the preselection will select `top_k` assets, but the constraint will then try to reduce the universe to a smaller number.
  - Solution: Ensure that `max_assets` is always greater than or equal to `top_k`.
- Mistake: Forgetting that `membership_policy` requires preselection.
  - Problem: The membership policy needs ranked assets to work, and this ranking is provided by the preselection step.
  - Solution: If you enable `membership_policy`, you must also have a `preselection` section in your configuration.
- Mistake: Setting a `buffer_rank` that is smaller than `top_k`.
  - Problem: The `buffer_rank` is the total rank an asset must be within, not an addition to `top_k`. If `top_k` is 50 and `buffer_rank` is 40, the buffer will have no effect.
  - Solution: Ensure that `buffer_rank` is always greater than `top_k`. A good rule of thumb is `buffer_rank = top_k + 10`.
- Mistake: Using a very aggressive membership policy.
  - Problem: Setting `max_turnover` or `max_new_assets` to very low values can starve your portfolio of new opportunities and force it to hold onto losing assets.
  - Solution: Use membership policy to gently guide the portfolio. Start with `min_holding_periods` and `buffer_rank`, and only use the `max_` constraints if necessary.
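The first three mistakes are mechanical enough to check before a backtest starts. The following is a minimal sketch, assuming each universe is available as the dictionary a YAML loader would produce and that `top_k`, `max_assets`, and `buffer_rank` sit directly under `preselection`, `constraints`, and `membership_policy` respectively. The function `check_universe` and the example values are hypothetical, not part of the tool.

```python
def check_universe(universe: dict) -> list[str]:
    """Return a list of human-readable problems found in one universe config."""
    problems = []
    preselection = universe.get("preselection")
    constraints = universe.get("constraints") or {}
    policy = universe.get("membership_policy")

    top_k = preselection.get("top_k") if preselection else None
    max_assets = constraints.get("max_assets")

    # Mistake 1: max_assets must be greater than or equal to top_k.
    if top_k is not None and max_assets is not None and max_assets < top_k:
        problems.append(f"max_assets ({max_assets}) is smaller than top_k ({top_k})")

    if policy is not None:
        # Mistake 2: the membership policy needs the ranking from preselection.
        if preselection is None:
            problems.append("membership_policy is set but preselection is missing")
        # Mistake 3: buffer_rank must be strictly greater than top_k.
        buffer_rank = policy.get("buffer_rank")
        if top_k is not None and buffer_rank is not None and buffer_rank <= top_k:
            problems.append(
                f"buffer_rank ({buffer_rank}) is not greater than top_k ({top_k})"
            )
    return problems


# Hypothetical universe that makes mistakes 1 and 3 at once.
example = {
    "description": "Momentum, top 50",
    "preselection": {"top_k": 50, "lookback": 126},
    "constraints": {"max_assets": 40},
    "membership_policy": {"buffer_rank": 40, "min_holding_periods": 3},
}

for problem in check_universe(example):
    print(problem)
```

The fourth mistake (an overly aggressive policy) is a judgment call rather than a hard rule, so it is deliberately left out of the check.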
Debugging Configuration Issues¶
If your universe isn't behaving as you expect:
- Check the Logs: Run the backtest script with the `--verbose` flag to get detailed logging output. This will often show you which assets are being filtered out at each stage.
- Use `manage_universes.py`: The `manage_universes.py` script has a `validate` command that can check your YAML file for syntax errors.
- Isolate the Problem: Create a copy of your universe and disable sections one by one. For example, disable the `membership_policy` to see if that is the source of the problem. Then disable `preselection`. This can help you pinpoint the issue.
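The isolation step can be scripted so that each variant is produced the same way every time. This is a sketch only: it assumes the universe is held as a plain dictionary, and `isolation_variants` is a hypothetical helper, not something the tool provides. Sections are removed cumulatively and in the order given, because `membership_policy` depends on `preselection` and so has to go first.

```python
import copy


def isolation_variants(universe: dict, sections=("membership_policy", "preselection")):
    """Yield (label, config) pairs, removing one more optional section each time."""
    variant = copy.deepcopy(universe)  # never modify the caller's config
    yield "baseline", copy.deepcopy(variant)
    removed = []
    for section in sections:
        if section in variant:
            del variant[section]
            removed.append(section)
            yield "without " + ", ".join(removed), copy.deepcopy(variant)


# Hypothetical universe used only to show the variants produced.
universe = {
    "description": "Momentum, top 50",
    "preselection": {"top_k": 50},
    "membership_policy": {"buffer_rank": 60},
}

for label, config in isolation_variants(universe):
    print(label, sorted(config))
```

Run the backtest once per variant and compare the results; the first variant that behaves as expected points at the section removed just before it.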
When to Enable Caching¶
- Always enable caching (`--use-cache`) if you are running the same backtest multiple times.
- The cache is smart enough to invalidate itself if you change the configuration of the universe.
- Caching is most effective for the return calculation step, which can be slow for large universes.
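One common way to get a cache that invalidates itself on configuration changes is to key it on a hash of the configuration. The sketch below illustrates that general technique; it is not a description of how `--use-cache` is actually implemented, and `config_fingerprint` and `ConfigKeyedCache` are hypothetical names.

```python
import hashlib
import json


def config_fingerprint(universe: dict) -> str:
    """Stable hash of a universe config; key order does not affect the result."""
    canonical = json.dumps(universe, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ConfigKeyedCache:
    """In-memory cache keyed by the config fingerprint (illustrative only)."""

    def __init__(self):
        self._store = {}
        self.misses = 0

    def get_or_compute(self, universe: dict, compute):
        key = config_fingerprint(universe)
        if key not in self._store:
            self.misses += 1  # new or changed config: recompute
            self._store[key] = compute(universe)
        return self._store[key]


def slow_step(cfg: dict) -> str:
    """Stand-in for an expensive return calculation."""
    return f"returns for top_k={cfg['preselection']['top_k']}"


cache = ConfigKeyedCache()
cache.get_or_compute({"preselection": {"top_k": 50}}, slow_step)  # computed
cache.get_or_compute({"preselection": {"top_k": 50}}, slow_step)  # served from cache
cache.get_or_compute({"preselection": {"top_k": 60}}, slow_step)  # config changed
print(cache.misses)
```

Because any edit to the configuration changes the fingerprint, a stale entry is simply never looked up again, which is why no manual invalidation is needed in this scheme.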
When to Enable Fast I/O¶
- Enable Fast I/O (`--chunk-size`) when your master list of all possible assets is very large (e.g., > 100,000 rows).
- This is a feature of the asset selection script, not the backtester, so you need to use it when you are running the `manage_universes.py load` command.
- A good starting value is `--chunk-size 5000`.
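To see why chunking keeps memory usage low, consider a generic chunked reader: only one chunk of rows is held in memory at a time, regardless of how long the file is. This is a sketch of the general technique, not the script's actual implementation; `iter_chunks`, the column names, and the tiny in-memory file are all hypothetical.

```python
import csv
import io
from itertools import islice


def iter_chunks(lines, chunk_size: int):
    """Yield lists of at most chunk_size rows from a CSV source with a header."""
    reader = csv.DictReader(lines)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# Small in-memory stand-in for a very large master list (12 rows).
master = io.StringIO(
    "symbol,exchange\n" + "".join(f"SYM{i},X\n" for i in range(12))
)

sizes = [len(chunk) for chunk in iter_chunks(master, chunk_size=5)]
print(sizes)  # [5, 5, 2]
```

A larger chunk size means fewer passes through the loop but more rows in memory at once, which is the trade-off to keep in mind when tuning `--chunk-size`.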