Pay Down Mortgage or Invest?¶

Questions home owners often ask themselves:

If I suddenly get $30,000, should I put it towards my mortgage payments, or should I invest it elsewhere?
After paying all my expenses, should I put all my extra money into the house or should I invest it?

People have strong beliefs about this. I decided to do a quantitative analysis, using the S&P 500 as the investment vehicle.

import pandas as pd
from scipy.stats.mstats import gmean
from numpy import roll
import numpy as np
import plotly.plotly as ply
import pylab as plt
import plotly.tools as tls
from scipy.optimize import fsolve
import seaborn as sn

%matplotlib inline

Data Sources¶

S & P 500¶

Ultimately, these are the Shiller data. http://www.econ.yale.edu/~shiller/data.htm

raw = pd.read_csv("data.csv", index_col=0)

raw.head()

The SP500 column has the value of the S&P 500.

snp500 = raw["SP500"]

I want to calculate the annualized rate of return, so I divide the following year's entry by today's entry.

aror_snp500 = (snp500.shift(-12)/snp500 - 1)*100
aror_snp500 = aror_snp500.dropna()

aror_snp500.tail(20)

Date
2012-05-01    22.260246
2012-06-01    22.311633
2012-07-01    22.716910
2012-08-01    18.998896
2012-09-01    16.886977
2012-10-01    19.627631
2012-11-01    27.897254
2012-12-01    27.103474
2013-01-01    23.099162
2013-02-01    20.149969
2013-03-01    20.162752
2013-04-01    18.689756
2013-05-01    15.241121
2013-06-01    20.282066
2013-07-01    18.243162
2013-08-01    17.450557
2013-09-01    18.140436
2013-10-01    12.630012
2013-11-01    14.635500
2013-12-01    13.549768
Name: SP500, dtype: float64

The number for 2013-01-01 is the rate of return for the year beginning in 2013-01-01! Likewise for 2013-10-01, etc. This is an annualized return, but rolling for each month.

So for 2013, starting Jan 1st, the annual return was 23.0992%

# We only want the January periods.
ror_snp500s = aror_snp500[0::12]

ror_snp500s.tail()

Date
2009-01-01    29.806604
2010-01-01    14.154755
2011-01-01     1.400259
2012-01-01    13.826139
2013-01-01    23.099162
Name: SP500, dtype: float64

Computing the Effective Annual Rate of Return Over Several Years¶

A common question is: Over a 10 year period, what was the average annual rate of return?

Many people make the mistake of taking each year's rate of return, and computing the mean. This is plain wrong. If you gain 10% one year and lose 10% the next year, the mean tells you your equivalent rate of return is 0, which means you have the same amount of money as when you started. If you do the math, you'll find you actually lost 1%.

The correct formula is:

$$r=\left(\prod_{i=1}^{n}\left(1+r_{i}\right)\right)^{1/n}-1$$

Over here we assume that $r_{i}$ is the annual rate of return for year $i$, and is expressed as a fraction. So 10% would be 0.1.

In words, this is just the geometric mean.

def avgrate(arr):
    """
    Given an array of annual returns, compute the equivalent annual 
    rate of return across all those years. The value it returns is 
    expressed as a percentage, not a decimal.
    """
    return (gmean(1 + arr/100) - 1) * 100

Computing the Rolling Annualized Rates of Returns¶

I want to add new columns in the data frame that have a rolling annualized average rate of return for a 5, 10, 20 and 30 year period.

def nwindow(df, years, src_label, new_label):
    """
    Compute the rolling rate of return over the last so many years.
    """
    ror = pd.rolling_apply(df[src_label], years, avgrate)
    # Uncomment the line below to make it show the rate at the START
    # of the investing period. 
    df[new_label] = roll(ror, -(years-1))
    # Uncomment the line below to make it show the rate at the END
    # of the investing period. 
    # df[new_label] = roll(ror, 1)

# Convert the series into a dataframe.
ror_snp500 = pd.DataFrame(ror_snp500s)

nwindow(ror_snp500, 5, "SP500", "05yrs")
nwindow(ror_snp500, 10, "SP500", "10yrs")
nwindow(ror_snp500, 20, "SP500", "20yrs")
nwindow(ror_snp500, 30, "SP500", "30yrs")

ror_snp500.tail(30)

So for 5 years beginning in 2009, the effective annual rate of return was 16.1%. For the 10 years beginning in 2004, it was only 4.87% per year.

ror_snp500.index = pd.to_datetime(ror_snp500.index)

SCALE = 3
def make_plot(df, title, ylabel, filename, new_fig=True, save_fig=True):
    # sn.axes_style("darkgrid")
    if new_fig:
        f = plt.figure()
    FONTSIZE = 15
    ax = df.plot(fontsize=FONTSIZE,
      sort_columns=True, title=title, figsize=(4*SCALE,3*SCALE))
    plt.ylabel(ylabel)  
    ax.set_title(title, fontsize=FONTSIZE)
    plt.legend(loc=0, prop={'size':FONTSIZE})
    ax.yaxis.label.set_size(FONTSIZE)
    plt.grid(True)
    if save_fig:
        fig = ax.get_figure()
        fig.savefig("plots/%s.png" % filename)

make_plot(ror_snp500[[x for x in ror_snp500.columns if x!="SP500"]], 
          "Historic annualized rate of return (rolling window)",
          "Annualized rate of return (%)", "ror_ls")

<matplotlib.figure.Figure at 0x7f2beefc4fd0>

for num_years in "30 20 10 05".split():
    make_plot(ror_snp500["%syrs" % num_years], 
              "%syr window" % num_years,
              "Annualized rate of return (%)",
             "ror_ls_%s" % num_years)

These charts look purely at the S&P 500 returns, and assume the dividends are not reinvested.

Reinvest Dividends¶

How should we account for the reinvestment of dividends? I used the approach in the link above where the dividends for each month are divided by 12, invested at the current S&P 500 "price", and then at the end of the year all these "shares" are added up and "sold" at the new price (Jan 1).

I'm assuming the Dividend column is the amount in Dollars one would receive if they owned 1 "share" of the S&P 500 fund?

nraw = raw[["SP500", "Dividend"]]
nraw["Dividend"] = nraw["Dividend"]/12.
nraw["Div Shares"] = nraw["Dividend"]/nraw["SP500"]
nraw["Div Sum"] = pd.rolling_sum(nraw["Div Shares"], 12).shift(1)
rord = (nraw.shift(-12)["SP500"]*(1+nraw.shift(-12)["Div Sum"])/nraw["SP500"]-1)*100

/usr/lib64/python2.7/site-packages/IPython/kernel/__main__.py:2: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

/usr/lib64/python2.7/site-packages/IPython/kernel/__main__.py:3: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

/usr/lib64/python2.7/site-packages/IPython/kernel/__main__.py:4: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

rord[1480:1500]

Date
1994-05-01    19.408109
1994-06-01    21.858827
1994-07-01    26.851684
1994-08-01    23.684119
1994-09-01    27.250015
1994-10-01    28.988608
1994-11-01    32.529444
1994-12-01    38.454788
1995-01-01    35.356233
1995-02-01    38.079163
1995-03-01    34.358346
1995-04-01    30.421610
1995-05-01    29.166120
1995-06-01    26.783260
1995-07-01    18.167716
1995-08-01    21.186385
1995-09-01    19.203382
1995-10-01    22.998648
1995-11-01    26.241075
1995-12-01    23.561270
dtype: float64

This means that in 1995, we got an effective rate of return of 35.4% if we include dividends. Look at the 1995-01-01 number.

# Take every 12 entries.
arord = rord[::12]
arord = pd.DataFrame(arord)
arord.columns = ["SP500D"]

nwindow(arord, 5, "SP500D", "05yrs")
nwindow(arord, 10, "SP500D", "10yrs")
nwindow(arord, 20, "SP500D", "20yrs")
nwindow(arord, 30, "SP500D", "30yrs")

arord[110:125]

arord.index = pd.to_datetime(arord.index)

make_plot(arord[[x for x in arord.columns if x!="SP500D"]], 
          "Historic annualized rate of return (rolling window) with reinvestment of dividends",
          "Annualized rate of return (%)", "ror_ls_div")

<matplotlib.figure.Figure at 0x7f2bee0f1210>

for num_years in "30 20 10 05".split():
    make_plot(arord["%syrs" % num_years], 
              "%syr window with dividends reinvested" % num_years,
              "Annualized rate of return (%)", "ror_ls_div_%s" % num_years)

Keep Dividends as Cash¶

Let's say I don't reinvest dividends and I keep them as cash. This avoids the loss in downtimes, but prevents large gains in the uptimes.

nnraw = raw[["SP500", "Dividend"]]
nnraw["Dividend"] = nnraw["Dividend"]/12.

/usr/lib64/python2.7/site-packages/IPython/kernel/__main__.py:2: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

def nwindow_c(df, years, new_label):
    """
    Compute the rolling rate of return over the last so many years.
    """
    # Add the dividend entries
    tmp = pd.rolling_sum(df["Dividend"], 12*years).shift(1)
    tmpdf = df.copy()
    tmpdf["Sum"] = tmpdf["SP500"]+tmp
    ror = tmpdf["Sum"]/tmpdf["SP500"].shift(12*years)
    ror = (np.power(ror, 1./years)-1) * 100
    df[new_label] = roll(ror, -(12*years))

nwindow_c(nnraw, 5, "05yrs")
nwindow_c(nnraw, 10,  "10yrs")
nwindow_c(nnraw, 20,  "20yrs")
nwindow_c(nnraw, 30,  "30yrs")

/usr/lib64/python2.7/site-packages/IPython/kernel/__main__.py:11: SettingWithCopyWarning:


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

nnraw.index = pd.to_datetime(nnraw.index)

make_plot(nnraw[[x for x in nnraw.columns if x not in ("SP500", "Dividend")]], 
          "Historic annualized rate of return (rolling window): Dividends kept as cash",
          "Annualized rate of return (%)", "ror_ls_cash")

<matplotlib.figure.Figure at 0x7f2bef22ba50>

for num_years in "30 20 10 05".split():
    make_plot(nnraw["%syrs" % num_years], 
              "%syr window: Includes dividends kept as cash" % num_years,
              "Annualized rate of return (%)", "ror_ls_cash_%s" % num_years)

Was it ever better to keep the money instead of reinvesting it? Below I plot the ratio of the rate of returns: Keeping vs Investing. If the number is below 1, then reinvesting was better.

columns = [x for x in arord.columns if x!="SP500D"]
ratio = (nnraw[columns]/arord[columns]).dropna()
for num_years in "30 20 10 05".split():
    tmpdf = nnraw[["%syrs" % num_years]][::12]
    tmpdf.columns = ["Cash: %syrs" % num_years]
    tmpdf["Dividend: %syrs" % num_years] = arord["%syrs" % num_years]
    make_plot(tmpdf, 
              "%syr window" % num_years,
              "Hold vs Reinvest", "ror_ls_cash_div_%s" % num_years)

<matplotlib.figure.Figure at 0x7f2be77afb50>

<matplotlib.figure.Figure at 0x7f2be7a55310>

<matplotlib.figure.Figure at 0x7f2be7452f50>

<matplotlib.figure.Figure at 0x7f2be736f110>

For a 20 or 30 year period, it seems one should always reinvest. Even in the other cases, there were very few times in recent history where keeping the cash was a better idea. And don't be fooled by the extremes. That's often merely because the denominator was near zero.

What Does This All Mean?¶

If you're suddenly endowed with a lot of money, these plots show the effective annual rate of return if you invest in an S&P 500 index fund (ignoring any fund fees, of course).

It is noteworthy that over a 30 year period, the effective annual rate of return never dips below 5%. In recent times it rarely dipped below 10%. If you have that many years to live, and if your mortgage interest rate is below 5%, then never once in the history of the S&P 500 was it a better idea to put money into the house. The catch is that you should keep the money in the fund and not withdraw when the economy goes south.

At the other extreme is the 5 year window. If you plan to put in money and withdraw after only 5 years, it often does dip below 5%.

The 10 year window had a negative rate of return only twice: During the Great Depression and during the recent recession. It occasionally drops below 5%, though.

Inflation¶

Let's bring inflation into the picture.

inflation = pd.read_csv('inflation.csv', parse_dates=True, index_col=0)

inflation.head()

def avgrate_inflation(arr):
    """
    Given an array of annual returns, compute the equivalent annual 
    rate of return across all those years. The value it returns is 
    expressed as a percentage, not a decimal.
    """
    return (gmean(arr) - 1) * 100

def nwindow_inflation(df, inflation, years, src_label, new_label):
    """
    Compute the rolling rate of return over the last so many years, 
    adjusted for inflation.
    """
    tmpdf = (1+df[src_label]/100)#/(1+inflation/100)
    tmpinf = (1+inflation["Inflation"]/100)
    tmpdf = tmpdf/tmpinf
    tmpdf = tmpdf.dropna()
    ror = pd.rolling_apply(tmpdf, years, avgrate_inflation)
    df[new_label] = pd.Series(roll(ror, -(years-1)), index=tmpdf.index)

arordi = rord[::12]
arordi = pd.DataFrame(arordi)
arordi.columns = ["SP500DI"]
arordi.index = pd.to_datetime(arordi.index)

nwindow_inflation(arordi, inflation, 5, "SP500DI", "05yrs")
nwindow_inflation(arordi, inflation, 10, "SP500DI", "10yrs")
nwindow_inflation(arordi, inflation, 20, "SP500DI", "20yrs")
nwindow_inflation(arordi, inflation, 30, "SP500DI", "30yrs")

arordi.tail(31)

inflation.plot()

<matplotlib.axes._subplots.AxesSubplot at 0x7f2bec2a2590>

The first 43 years are just NaN as I don't have inflation data for it.

arordi=arordi[43:]

make_plot(arordi[[x for x in arordi.columns if x not in ("SP500DI", "Dividend")]], 
          "Rate of Return with Dividends & Inflation Adjusted",
          "Annualized rate of return with dividends and inflation (%)",
         "ror_ls_div_inflation")
for num_years in "30 20 10 05".split():
    make_plot(arordi["%syrs" % num_years], 
              "%syr window" % num_years,
              "Annualized rate of return with dividends and inflation adjusted (%)",
             "ror_ls_div_inflation_%s" % num_years)

<matplotlib.figure.Figure at 0x7f2be70c2690>

Note that as I only have annual inflation data (as opposed to per month), the accuracy may be off a little. However, I suspect the true result will not be very far off.

Dollar Cost Averaging¶

Not many of us suddenly receive a large sum of money. More common is dollar cost averaging where we put in, say, an extra $100 towards our mortgage or into investments every month. Let's see what this method of investing gives us.

Given a set of annual rate of returns $r_{i}$, if I put in $1 each year, how much money will I have after all those years?

$$\sum_{k=1}^{k=n}\prod_{i=k}^{n}\left(1+r_{i}\right)$$

The index $k$ is the year.

Without Inflation¶

def periodicsum(arr):
    """
    Given an array of annual interest rates, assume you put in $1 
    each year. How much money will you have after all those years?
    """
    result = 0
    # Note: I start indexing from 0 as the way I defined my 
    # rates, Jan 1, 1990 has the rate for 1990!
    for x in xrange(0, len(arr/100)):
        result += np.prod(1+arr[x:]/100)
    return result

How do we define the effective annual rate of return in this case?

Basically, I want to know the value of $r$ that satisfies:

$\sum_{k=1}^{k=n}\left(1+r\right)^{k}=\sum_{k=1}^{k=n}\prod_{i=1}^{k}\left(1+r\right)=\sum_{k=1}^{k=n}\prod_{i=k}^{n}\left(1+r_{i}\right)$

In words: Assume the annual rate of return is $r$ every year. What value of $r$ will give me the same result?

def periodicrate(arr):
    """
    With dollar cost averaging, what is the effective annual rate of 
    return?
    """
    total = periodicsum(arr)
    def objective(r):
        tmp = 0
        for k in xrange(1, len(arr)+1):
            tmp += np.power(1+r, k)
        return tmp - total
    return fsolve(objective, 0.1) * 100

prord = rord[::12]
prord = pd.DataFrame(prord)
prord.columns = ["SP500PD"]
prord.index = pd.to_datetime(prord.index)

def nwindow_periodic(df, years, src_label, new_label):
    ror = pd.rolling_apply(df[src_label], years, periodicrate)
    df[new_label] = roll(ror, -(years-1))
nwindow_periodic(prord, 5, "SP500PD", "05yrs")
nwindow_periodic(prord, 10, "SP500PD", "10yrs")
nwindow_periodic(prord, 20, "SP500PD", "20yrs")
nwindow_periodic(prord, 30, "SP500PD", "30yrs")

make_plot(prord[[x for x in prord.columns if x!="SP500PD"]], 
          "Annualized rate of return (rolling window) (averaging)",
          "Annualized rate of return (%)", "ror_dca")

<matplotlib.figure.Figure at 0x7f2bed4e4e10>

for num_years in "30 20 10 05".split():
    make_plot(prord["%syrs" % num_years], 
              "%syr window: Dollar Cost Averaging with Reinvestment of Dividends" % num_years,
              "Annualized rate of return - Averaging (%)", "ror_dca_%s" % num_years)

Let's overlay this with lumped sum

for num_years in "30 20 10 05".split():
    make_plot(prord["%syrs" % num_years], 
              "%syr window: Dollar cost averaging vs lumped sum" % num_years,
              "Annualized rate of return - Averaging vs Lumped Sum(%)",
             "ror_dca_vs_ls_%s" % num_years, save_fig=False)
    make_plot(arord["%syrs" % num_years], 
              "%syr window" % num_years,
              "Annualized rate of return - Averaging vs Lumped Sum (%)", 
             "ror_dca_vs_ls_%s" % num_years, False, True)

With Inflation¶

def nwindow_periodic_inflation(df, inflation, years, src_label, new_label):
    """
    Compute the rolling rate of return over the last so many years, 
    adjusted for inflation.
    """
    tmpdf = (1+df[src_label]/100)#/(1+inflation/100)
    tmpinf = (1+inflation["Inflation"]/100)
    tmpdf = tmpdf/tmpinf
    tmpdf = (tmpdf - 1)*100 # Need this to convert 1.03 to 3%
    tmpdf = tmpdf.dropna()
    ror = pd.rolling_apply(tmpdf, years, periodicrate)
    df[new_label] = pd.Series(roll(ror, -(years-1)), index=tmpdf.index)

prordi = rord[::12]
prordi = pd.DataFrame(prordi)
prordi.columns = ["SP500PDI"]
prordi.index = pd.to_datetime(prordi.index)

nwindow_periodic_inflation(prordi, inflation, 5, "SP500PDI", "05yrs")
nwindow_periodic_inflation(prordi, inflation, 10, "SP500PDI", "10yrs")
nwindow_periodic_inflation(prordi, inflation, 20, "SP500PDI", "20yrs")
nwindow_periodic_inflation(prordi, inflation, 30, "SP500PDI", "30yrs")

prordi = prordi[43:]

make_plot(prordi[[x for x in prordi.columns if x!="SP500PDI"]], 
          "Annualized rate of return (rolling window) (averaging) (with inflation)",
          "Annualized rate of return (%)", "ror_dca_inf")

<matplotlib.figure.Figure at 0x7f2be63df150>

for num_years in "30 20 10 05".split():
    make_plot(prordi["%syrs" % num_years], 
              "%syr window: Dollar Cost Averaging with Reinvestment of Dividends and Inflation Adjusted" % num_years,
              "Annualized rate of return - Averaging (%)", "ror_dca_inf_%s" % num_years)

Monte Carlo Simulations¶

Arguably, the annual rates of return from year to year have poor correlation. So instead of looking at consecutive years, just treat the annual rates as a pool to draw from, and pick any 30 at random to calculate a 30 year effective rate of return.

Lump Sum with Inflation Adjustments¶

mcrordi = rord[::12]
mcrordi = pd.DataFrame(mcrordi)
mcrordi.columns = ["SP500MCI"]
mcrordi.index = pd.to_datetime(mcrordi.index)
mcrordi = mcrordi.dropna()

def adjust_for_inflation(df, inflation, src_label):
    """
    Take a dataframe with annual return values (not in decimal).
    Take a series of inflation rates for each year (not in decimal).
    Return the rates adjusted for inflation (not in decimal).
    """
    tmpdf = (1+df[src_label]/100)
    tmpinf = (1+inflation["Inflation"]/100)
    tmpdf = tmpdf/tmpinf
    tmpdf = (tmpdf - 1)*100 # Need this to convert 1.03 to 3%
    tmpdf = tmpdf.dropna()
    return tmpdf
mcrordi["SP500MCI"] = adjust_for_inflation(mcrordi, inflation, "SP500MCI")

mcrordi = mcrordi.dropna()

def mc_returns(df, num_years, num_trials, src_label):
    rates = []
    for x in xrange(num_trials):
        rates.append(avgrate(df[src_label].sample(num_years, replace=True)))
    return pd.Series(rates)

NUM_TRIALS = 20000
mcrordi_dist = pd.DataFrame(mc_returns(mcrordi, 5, NUM_TRIALS, "SP500MCI"))
mcrordi_dist.columns = ["05yrs"]
mcrordi_dist["10yrs"] = mc_returns(mcrordi, 10, NUM_TRIALS, "SP500MCI")
mcrordi_dist["20yrs"] = mc_returns(mcrordi, 20, NUM_TRIALS, "SP500MCI")
mcrordi_dist["30yrs"] = mc_returns(mcrordi, 30, NUM_TRIALS, "SP500MCI")

SCALE = 3
def make_hist(df, title, xlabel, filename, new_fig=True, xlim=None, ylim=None):
    if new_fig:
        f = plt.figure()
    FONTSIZE = 15
    ax = df.plot(kind='hist', bins=50, fontsize=FONTSIZE,
      sort_columns=True, title=title, figsize=(4*SCALE,3*SCALE),
                xlim=xlim, ylim=ylim)
    plt.xlabel(xlabel)  
    ax.set_title(title, fontsize=FONTSIZE)
    plt.legend(loc=0, prop={'size':FONTSIZE})
    ax.xaxis.label.set_size(FONTSIZE)
    ax.yaxis.label.set_size(FONTSIZE)
    plt.grid(True)
    fig = ax.get_figure()
    fig.savefig("plots/%s.png" % filename)

make_hist(mcrordi_dist,
         "Distribution of effective annual rate of returns (Dividends Invested) (Inflation Adjusted)",
         "Annualized rate of return (%)", "ror_mc_ls_inf")

<matplotlib.figure.Figure at 0x7f2bed46cb50>

for num_years in "30 20 10 05".split():
    make_hist(mcrordi_dist["%syrs" % num_years], 
              "%syr window: Dividends Invested and Inflation Adjusted" % num_years,
              "Annualized rate of return (%)", "ror_mc_inf_%s" % num_years, xlim=(-30,50))

mcrordi_dist.describe()

These plots, more than anything, show the volatility of short term investing. Although they all have the same expected rate of return, the spread is more the shorter the investing interval.

Dollar Cost Averaging, Inflation Adjusted¶

mcrordidca = rord[::12]
mcrordidca = pd.DataFrame(mcrordidca)
mcrordidca.columns = ["SP500MCIDCA"]
mcrordidca.index = pd.to_datetime(mcrordidca.index)
mcrordidca = mcrordidca.dropna()

mcrordidca["SP500MCIDCA"] = adjust_for_inflation(mcrordidca, inflation, "SP500MCIDCA")

mcrordidca = mcrordidca.dropna()

def mc_returns_dca(df, num_years, num_trials, src_label):
    rates = []
    for x in xrange(num_trials):
        rates.append(periodicrate(df[src_label].sample(num_years, replace=True))[0])
    return pd.Series(rates)

NUM_TRIALS = 20000
mcrordidca_dist = pd.DataFrame(mc_returns_dca(mcrordidca, 5, NUM_TRIALS, "SP500MCIDCA"))
mcrordidca_dist.columns = ["05yrs"]
mcrordidca_dist["10yrs"] = mc_returns_dca(mcrordidca, 10, NUM_TRIALS, "SP500MCIDCA")
mcrordidca_dist["20yrs"] = mc_returns_dca(mcrordidca, 20, NUM_TRIALS, "SP500MCIDCA")
mcrordidca_dist["30yrs"] = mc_returns_dca(mcrordidca, 30, NUM_TRIALS, "SP500MCIDCA")

make_hist(mcrordidca_dist,
         "Distribution of effective annual rate of returns (Dividends Invested) (Inflation Adjusted) (Dollar Cost Avg)",
         "Annualized rate of return (%)", "ror_mc_dca_inf")

<matplotlib.figure.Figure at 0x7f2be4a3ef90>

for num_years in "30 20 10 05".split():
    make_hist(mcrordidca_dist["%syrs" % num_years], 
              "%syr window: Dividends Invested, Inflation Adjusted, and Dollar Cost Avg" % num_years,
              "Annualized rate of return (%)", "ror_mc_dca_inf_%s" % num_years, xlim=(-30,50))

mcrordidca_dist.describe()

	SP500	Dividend	Earnings	Consumer Price Index	Long Interest Rate	Real Price	Real Dividend	Real Earnings	P/E10
Date
1871-01-01	4.44	0.26	0.4	12.46	5.32	83.78	4.91	7.55	NaN
1871-02-01	4.50	0.26	0.4	12.84	5.32	82.40	4.76	7.32	NaN
1871-03-01	4.61	0.26	0.4	13.03	5.33	83.18	4.69	7.22	NaN
1871-04-01	4.74	0.26	0.4	12.56	5.33	88.76	4.87	7.49	NaN
1871-05-01	4.86	0.26	0.4	12.27	5.33	93.13	4.98	7.66	NaN

	SP500	05yrs	10yrs	20yrs	30yrs
Date
1984-01-01	3.125000	11.393572	11.011984	10.063832	8.305213
1985-01-01	21.328671	14.652799	10.488457	10.127091	NaN
1986-01-01	27.041306	9.348136	11.429059	9.500198	NaN
1987-01-01	-5.293006	9.483893	11.222533	8.781904	NaN
1988-01-01	13.932136	11.681752	14.418984	8.901547	NaN
1989-01-01	19.120533	10.631703	15.905181	5.704302	NaN
1990-01-01	-4.259199	6.475370	15.413179	6.159330	NaN
1991-01-01	27.831884	13.549582	15.163499	7.097176	NaN
1992-01-01	4.602480	12.988782	10.606538	5.863930	NaN
1993-01-01	8.675873	17.223304	7.485817	6.312177	NaN
1994-01-01	-1.636398	21.430029	9.123778	6.976679	NaN
1995-01-01	32.062332	25.101251	9.766907	NaN	NaN
1996-01-01	24.706227	16.800355	7.604727	NaN	NaN
1997-01-01	25.728903	8.274521	6.394831	NaN	NaN
1998-01-01	29.626516	-1.442797	3.650167	NaN	NaN
1999-01-01	14.159533	-1.935304	-3.598792	NaN	NaN
2000-01-01	-6.310370	-3.687823	-2.352543	NaN	NaN
2001-01-01	-14.631298	-0.866934	-0.404163	NaN	NaN
2002-01-01	-21.432017	4.547773	1.324676	NaN	NaN
2003-01-01	26.419896	9.006312	5.151352	NaN	NaN
2004-01-01	4.316922	-5.234062	4.871826	NaN	NaN
2005-01-01	8.237614	-0.998752	NaN	NaN	NaN
2006-01-01	11.373003	0.060768	NaN	NaN	NaN
2007-01-01	-3.187844	-1.799057	NaN	NaN	NaN
2008-01-01	-37.220401	1.432722	NaN	NaN	NaN
2009-01-01	29.806604	16.055411	NaN	NaN	NaN
2010-01-01	14.154755	NaN	NaN	NaN	NaN
2011-01-01	1.400259	NaN	NaN	NaN	NaN
2012-01-01	13.826139	NaN	NaN	NaN	NaN
2013-01-01	23.099162	NaN	NaN	NaN	NaN

	SP500D	05yrs	10yrs	20yrs	30yrs
Date
1981-01-01	-7.374496	14.583149	13.783636	15.725125	10.760281
1982-01-01	30.045597	22.886735	17.884676	15.330400	11.166137
1983-01-01	20.334368	16.019957	15.683760	12.550949	10.749651
1984-01-01	7.829149	15.547594	14.824409	12.925126	10.908009
1985-01-01	26.372237	18.623213	14.088416	12.830541	NaN
1986-01-01	31.417088	12.989702	14.874652	12.056537	NaN
1987-01-01	-2.448462	13.086223	14.517339	11.232317	NaN
1988-01-01	17.904591	15.348538	17.664846	11.288423	NaN
1989-01-01	22.964587	14.105750	18.959270	7.964258	NaN
1990-01-01	-0.917357	9.726979	18.223921	8.402917	NaN
1991-01-01	31.979363	16.791048	17.699742	9.278864	NaN
1992-01-01	7.707534	15.966566	12.831469	7.951851	NaN
1993-01-01	11.688355	20.027669	9.502978	8.362084	NaN
1994-01-01	1.113588	24.019236	11.057259	9.000190	NaN
1995-01-01	35.356233	27.378842	11.586535	NaN	NaN

	Inflation
Year
1914-01-01	1.35
1915-01-01	0.92
1916-01-01	7.64
1917-01-01	17.80
1918-01-01	17.26

	SP500DI	05yrs	10yrs	20yrs	30yrs
1984-01-01	7.829149	11.643891	10.629747	9.510716	7.807358
1985-01-01	26.372237	14.499472	10.100354	9.504667	NaN
1986-01-01	31.417088	8.678293	10.938643	8.761887	NaN
1987-01-01	-2.448462	8.278391	10.483486	7.891930	NaN
1988-01-01	17.904591	10.579262	13.665700	7.988699	NaN
1989-01-01	22.964587	9.624814	15.199269	4.774685	NaN
1990-01-01	-0.917357	5.870251	14.779549	5.466750	NaN
1991-01-01	31.979363	13.246004	14.490895	6.511746	NaN
1992-01-01	7.707534	12.733488	9.905976	5.273646	NaN
1993-01-01	11.688355	16.838286	6.814022	5.723175	NaN
1994-01-01	1.113588	21.057187	8.403004	6.423281	NaN
1995-01-01	35.356233	24.438590	8.912202	NaN	NaN
1996-01-01	27.386153	15.749471	6.627842	NaN	NaN
1997-01-01	27.934878	7.149382	5.361162	NaN	NaN
1998-01-01	31.536502	-2.350200	2.595233	NaN	NaN
1999-01-01	15.571500	-2.928429	-4.706561	NaN	NaN
2000-01-01	-5.223845	-4.676935	-3.090442	NaN	NaN
2001-01-01	-13.492580	-1.774958	-0.911317	NaN	NaN
2002-01-01	-20.163230	3.602786	0.836559	NaN	NaN
2003-01-01	28.593233	7.791126	4.643469	NaN	NaN
2004-01-01	6.037115	-6.452121	4.479713	NaN	NaN
2005-01-01	10.118992	-1.477545	NaN	NaN	NaN
2006-01-01	13.377683	-0.040083	NaN	NaN	NaN
2007-01-01	-1.459636	-1.855808	NaN	NaN	NaN
2008-01-01	-35.711835	1.587727	NaN	NaN	NaN
2009-01-01	33.351939	16.689022	NaN	NaN	NaN
2010-01-01	16.386244	NaN	NaN	NaN	NaN
2011-01-01	3.367617	NaN	NaN	NaN	NaN
2012-01-01	16.196003	NaN	NaN	NaN	NaN
2013-01-01	25.604721	NaN	NaN	NaN	NaN
2014-01-01	NaN	NaN	NaN	NaN	NaN

	05yrs	10yrs	20yrs	30yrs
count	20000.000000	20000.000000	20000.000000	20000.000000
mean	6.853995	6.660871	6.589037	6.544952
std	9.175585	6.467511	4.586195	3.728357
min	-24.743745	-19.642585	-9.839428	-8.223338
25%	0.617879	2.274127	3.488647	4.000163
50%	6.987468	6.693657	6.656587	6.584169
75%	13.112201	11.083349	9.717398	9.049604
max	50.751993	29.903719	26.922536	21.116198

	05yrs	10yrs	20yrs	30yrs
count	20000.000000	20000.000000	20000.000000	20000.000000
mean	7.232822	6.891985	6.818483	6.824882
std	9.910812	7.197388	5.022567	4.048870
min	-31.020148	-24.041342	-15.301114	-11.703804
25%	0.666543	2.212749	3.548100	4.161366
50%	7.500898	7.092738	6.909313	6.883050
75%	14.004130	11.751754	10.256557	9.631896
max	45.169947	31.498749	24.925648	21.015151