As financial services become increasingly automated, retirement-spending apps have emerged that enable users to enter their income needs and portfolio information and ostensibly get a reasonable prediction of whether or how long their nest eggs may last in retirement.
There are many of these apps on the market now, some developed by firms such as Betterment, Vanguard, T. Rowe Price, and Schwab and others sold as subscription services to financial advisors for use with their clients. The problem is that users are led to believe they should make important life decisions with the aid of these apps, even though the underlying probabilities are based on inherently unpredictable outcomes.
In truth, applying probability software to retirement planning analysis is folly. Even the most sophisticated retirement planning software used by financial professionals is a far cry from a crystal ball.
The problem with probabilities
The failings of probability-based retirement software, specifically those apps that apply Monte Carlo simulation techniques, are reasonably well known in professional circles. One of the first academic papers to raise the issue was a 2006 article entitled “Will the True Monte Carlo Number Please Stand Up?” In it, the author, renowned retirement researcher and York University (Toronto) professor Moshe Milevsky, notes:
“Of course, as most investment advisors have known for years, a retirement number – if it actually exists – is vague and imprecise, since it depends on many economic unknowns, especially future equity market returns. After all, this “number” must be invested somewhere in order to produce income – and the portfolio return process is inherently random.”
In addition to the unpredictability of future returns, Milevsky goes on to document how “probabilities” produced by popular retirement software applications vary from one app to the next, depending upon the applications’ internal assumptions and design parameters.
Another academic study published last month concluded that “the advice provided from a majority of these tools is extremely misleading to households.”
These publications have caused some to question whether retirement planning software offers any real value to consumers at all. So what’s the consumer to do?
Stress-testing vs. predicting
Like the weatherman, financial advisors who use Monte Carlo simulation software often express their clients’ results in terms of the likelihood of a positive outcome. Instead of attempting to predict “probabilities of success,” a better way to approach retirement planning is from a glass-half-empty perspective.
What consumers really need to know is not how they may fare if things go well, but what will happen to them if a 10% probability of rain turns into a 100% probability of a thunderstorm. Speaking less metaphorically, consumers desperately need to know, “If things go badly in the investment markets, will I still be okay?”
Traditionally, historical back-testing software has been used for this purpose. By entering one’s retirement profile into a back-testing app, a consumer can test how his portfolio would have fared had he retired just before previous bear markets. While such information is useful and interesting to consumers, back-testing also has significant limitations.
Specifically, past returns are unlikely to repeat in exactly the same sequence, and there is no guarantee that future returns will not be worse than the historical record. Further, suppose a person wanted to test how his portfolio might hold up over a 30-year retirement horizon if he had retired at the end of 1999 (i.e., just before the 2000-2002 and 2007-2009 bear markets). Since we are only in 2016, fewer than 17 years of that horizon have actually occurred, and it is obviously not possible to back-test the future.
‘Bootstrapping’ technique offers alternative view
One solution to the limitations of back-testing is to apply a simulation technique called bootstrapping. While the simulation engine under the hood of many retirement apps requires the program designer to make assumptions about expected mean rates of return and volatility for various asset classes, bootstrapping requires no such assumptions. Simulations are produced instead by randomly sampling historical returns.
If enough simulations are generated (typically a minimum of 5,000), the median result may be expected to be roughly in line with the historical averages. By considering the range of results below the median, bootstrapping programs may illustrate scenarios representing below-average investment returns, with the Value at Risk (VaR) statistics (bottom 1%, 5%, and 10% results) representing scenarios that may be as bad as or worse than the historical record.
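The resampling idea can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular app: the function names are hypothetical, and the return series is a made-up placeholder standing in for real historical data. Each simulated path is built by drawing annual returns with replacement, and the bottom 1%, 5%, and 10% results are then read off the sorted outcomes.

```python
import random

# Hypothetical placeholder returns for illustration only; these are NOT
# actual market data and are not the data set used in the article.
PLACEHOLDER_RETURNS = [0.12, -0.08, 0.21, 0.05, -0.22, 0.30, 0.09, 0.15, -0.03, 0.18]


def bootstrap_paths(historical_returns, years, n_sims, seed=0):
    """Build n_sims sequences by sampling annual returns with replacement."""
    rng = random.Random(seed)
    return [rng.choices(historical_returns, k=years) for _ in range(n_sims)]


def growth_of_one(path):
    """Compound a single unit of wealth through one sequence of returns."""
    value = 1.0
    for r in path:
        value *= 1.0 + r
    return value


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list; pct is an integer 1-100."""
    rank = (pct * len(sorted_values) + 99) // 100  # ceiling division
    return sorted_values[max(rank, 1) - 1]


paths = bootstrap_paths(PLACEHOLDER_RETURNS, years=25, n_sims=5000)
outcomes = sorted(growth_of_one(p) for p in paths)
for pct in (1, 5, 10, 50):
    print(f"{pct:>2}th percentile growth of $1: {percentile(outcomes, pct):.2f}")
```

Because the sketch needs no assumed mean or volatility, the only input that shapes the results is the historical series itself, which is why the choice of data period matters so much.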
To illustrate this concept by example, the following table presents the bootstrapping simulation results for a 65-year-old investor with a 25-year retirement horizon, a $1 million initial portfolio value and a 70:30 stock-bond retirement allocation. In this example, the investor requires an initial (i.e., first-year) withdrawal of $50,000 (5% of the portfolio) and a 3% annual cost-of-living increase thereafter. He estimates his annual investment expense at 1% and has stated that he expects to withdraw proportionately from each asset class each year and rebalance to maintain his 70:30 allocation.
Table: Remaining Balance After… (simulation results generated by Nest Egg Guru)
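The spending side of this example is simple arithmetic and can be checked independently of any simulation. The short Python sketch below (the function name is hypothetical) lays out the withdrawal schedule implied by a $50,000 first-year withdrawal growing 3% per year for 25 years. Under those inputs the final-year withdrawal comes to roughly $101,600, and cumulative withdrawals total roughly $1.82 million.

```python
def withdrawal_schedule(first_withdrawal, cola, years):
    """Annual withdrawals: the first-year amount grown by the cost-of-living rate."""
    return [first_withdrawal * (1 + cola) ** t for t in range(years)]


schedule = withdrawal_schedule(50_000, 0.03, 25)

print(f"Year 1 withdrawal:  ${schedule[0]:,.0f}")
print(f"Year 25 withdrawal: ${schedule[-1]:,.0f}")
print(f"Total withdrawn:    ${sum(schedule):,.0f}")
```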
By focusing on the bottom half of the results and displaying the simulation range in five-year increments over the illustrated time period, the consumer can gain a much more tangible sense of whether and how long his savings may last. What’s more, by presenting the data in this format, it is easy for the consumer to then test how changing factors that are within his control (spending amount, withdrawal strategy, asset allocation, investment expenses, etc.) may impact the outcomes.
To be clear, there is absolutely nothing predictive in these simulation results, and the simulation percentiles should not be viewed as probabilities. Instead, the bottom half of the results merely represents potential scenarios that may be used to give consumers a clearer picture of what may happen if things go badly.
While bootstrapping offers a neat way to illustrate this data, it is not without flaws and limitations of its own. In this example, bootstrapping was applied only to historical stock market data from 1970-2014. The bond portion of the portfolio was assumed to return a constant 2% per year, which reasonably reflects what an investor might realistically earn today on a five-year CD or 10-year Treasury. The decision not to apply bootstrapping to historical bond data points to a limitation of most retirement apps: bond yields today are near historic lows. As a result, any Monte Carlo application that generates randomized returns based on mean historical bond returns, or any bootstrapping simulation that randomly samples historical bond index returns, may produce overly optimistic results.
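To make the mechanics concrete, here is a minimal Python sketch of how one sequence of stock returns might be turned into a year-by-year balance under the example's inputs. It is an illustration under stated assumptions, not Nest Egg Guru's implementation: the function name is hypothetical, and the ordering (withdraw at the start of the year, then apply the blended return net of expenses) is an assumption. Because the investor withdraws proportionately and rebalances every year, each year's portfolio return reduces to a weighted blend of the stock return and the fixed 2% bond return.

```python
def simulate_balances(stock_returns, start=1_000_000, first_withdrawal=50_000,
                      cola=0.03, stock_weight=0.70, bond_return=0.02,
                      expense=0.01):
    """Year-end balances for one sequence of annual stock returns.

    Assumed ordering (not confirmed by the article): the withdrawal is taken
    at the start of each year, then the rebalanced portfolio earns the
    blended return less the annual expense.
    """
    balance = start
    withdrawal = first_withdrawal
    balances = []
    for stock_return in stock_returns:
        balance = max(balance - withdrawal, 0.0)  # floor at zero once depleted
        blended = (stock_weight * stock_return
                   + (1 - stock_weight) * bond_return
                   - expense)
        balance *= 1 + blended
        balances.append(balance)
        withdrawal *= 1 + cola  # cost-of-living increase for next year
    return balances


# A flat market (0% stock return every year) as a simple sanity check.
flat = simulate_balances([0.0] * 25)
for year in (5, 10, 15, 20, 25):
    print(f"Balance after {year:>2} years: ${flat[year - 1]:,.0f}")
```

Feeding each bootstrapped stock sequence through a function like this, and then sorting the balances at each five-year mark, is one way to arrive at a table of below-median outcomes.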
Consumers should also be aware that, regardless of a retirement calculator’s underlying methodology, the design of the app may also affect its applicability and reliability. For instance, applications that fail to properly account for the impact of investment expenses, inflation, or, as noted, the current low interest rates on bonds, may report overly optimistic results.
With any retirement planning app, the devil is in the details. Consumers and advisors alike would do well to take the time to understand the assumptions and limitations inherent in any retirement planning application.
John H. Robinson is the owner of Financial Planning Hawaii and a co-founder of Nest Egg Guru, a retirement planning software application for financial professionals.
A version of this article was originally published on NerdWallet and has also appeared on NASDAQ.com and NewsOK.