6 Stochastic Discounting

Authors
Affiliations
New York University
Australian National University

In this chapter we describe how to extend the MDP model to handle time-varying discount factors, a specification now widely used in macroeconomics and finance.

6.1 Time-Varying Discount Factors

We introduce formulas for infinite-horizon lifetime valuations under stochastic discounting and provide necessary and sufficient conditions for existence of finite solutions.

6.1.1 Valuation

Our first step is to motivate and understand lifetime valuation when discount factors vary over time.

6.1.1.1 Motivation

In Section 3.2.2.2 we discussed firm valuation in a setting where the interest rate is constant. But data show that interest rates are time-varying, even for safe assets such as US Treasury bills. Figure 6.1 shows the nominal interest rate on one-year Treasury bills since the 1950s, whereas Figure 6.2 shows an estimate of the real interest rate for 10-year T-bills since 2012. Both nominal and real interest rates are evidently time-varying.

Figure 6.1: Nominal US interest rates (plot_interest_rates_nominal.jl)

Figure 6.2: Real US interest rates (plot_interest_rates_real.jl)

6.1.1.2 Theory

The aim of this section is to understand and evaluate expressions such as (6.2). Throughout,

$$\beta_t \coloneq b(X_{t-1}, X_t) \text{ for } t \in \NN \quad \text{with} \quad \beta_0 \coloneq 1.$$

The sequence $(\beta_t)_{t \geq 0}$ is called a discount factor process and $\prod_{i=0}^t \beta_i$ is the discount factor for period $t$ payoffs evaluated at time zero. We are interested in expected discounted sums of the form

$$v(x) \coloneq \EE_x \, \sum_{t=0}^\infty \left[ \prod_{i=0}^t \beta_i \right] h(X_t) \qquad (x \in \Xsf).$$

Theorem 6.1.1 generalizes Lemma 3.2.1. Indeed, if $b \equiv \beta \in (0, 1)$, then $L = \beta P$ and $\rho(L) = \beta \rho(P) = \beta < 1$, so the result in Theorem 6.1.1 reduces to Lemma 3.2.1.

In (6.11) we passed expectations through an infinite sum. This operation is valid under the assumption $\rho(L)<1$. A complete proof can be found in Section B.2.
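As a numerical sketch of the discussion above, assume (as the comparison with Lemma 3.2.1 suggests) that lifetime value solves $v = h + L v$ with $L(x, x') = b(x, x') P(x, x')$, so that $v = (I - L)^{-1} h$ whenever $\rho(L) < 1$. The matrices `P`, `b` and the vector `h` below are made-up illustrative values, and `spec_rad` is a helper introduced only for this example.

```julia
using LinearAlgebra

# Illustrative primitives (hypothetical values, not taken from the text)
P = [0.9 0.1;
     0.2 0.8]                 # Markov matrix for the state process
b = [0.95 0.97;
     0.99 1.01]               # b(x, x′): discount factor applied on the move x → x′
h = [1.0, 2.0]                # payoff function h(x)

spec_rad(M) = maximum(abs.(eigvals(M)))   # spectral radius helper

L = b .* P                    # L(x, x′) = b(x, x′) P(x, x′)
@assert spec_rad(L) < 1       # stability condition ρ(L) < 1

v = (I - L) \ h               # solves v = h + L v

# Constant discounting b ≡ β gives L = βP and ρ(L) = β
β = 0.96
L_const = β .* P
v_const = (I - L_const) \ h
```

Note that one entry of `b` exceeds one, yet the spectral radius condition still holds for these values.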

6.1.2 Testing the Spectral Radius Condition

In Theorem 6.1.1 the condition $\rho(L) < 1$ drives stability. In this section, we develop necessary and sufficient conditions for $\rho(L) < 1$ to hold.

6.1.2.1 Spectral Radii via Expectations

First we develop an alternative representation of the spectral radius based on expectations. The next result is proved via a local spectral radius argument. In the statement, $\beta_t$ is as defined in (6.3) and $L$ is the operator in (6.5).

The expression in (6.13) connects the spectral radius with the long-run properties of the discount factor process. The connection becomes even simpler when $P$ is irreducible, as the next exercise asks you to show.

Exercise 6.3 shows that the spectral radius is a long-run (geometric) average of the discount factor process. For the conclusions of Theorem 6.1.1 to hold, we need this long-run average to be less than unity.
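The long-run average interpretation can be checked numerically. The sketch below relies on two standard facts rather than on the exact statement of (6.13): for nonnegative $L$ with $L(x, x') = b(x, x') P(x, x')$ we have $(L^t \mathbb 1)(x) = \EE_x \prod_{i=1}^t \beta_i$, and Gelfand's formula gives $\rho(L) = \lim_t \| L^t \|_\infty^{1/t}$, where $\| L^t \|_\infty = \max_x (L^t \mathbb 1)(x)$. The matrices are hypothetical and `geometric_average` is a name introduced for illustration.

```julia
using LinearAlgebra

# Hypothetical primitives
P = [0.9 0.1;
     0.2 0.8]
b = [0.95 0.97;
     0.99 1.01]
L = b .* P
ρL = maximum(abs.(eigvals(L)))

# (L^t 1)(x) = E_x ∏_{i=1}^t β_i, so the t-th root of its maximum
# is a geometric-average approximation to ρ(L)
function geometric_average(L, t)
    w = ones(size(L, 1))
    for _ in 1:t
        w = L * w
    end
    return maximum(w)^(1 / t)
end

approx = [geometric_average(L, t) for t in (1, 10, 100, 1000)]
```

Comparing the entries of `approx` with `ρL` shows the approximation tightening as the horizon grows.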

Figure 6.3 illustrates the condition $\rho(L) < 1$ when $\beta_t = X_t$ and $P$ is a Markov matrix produced by discretization of the AR(1) process

$$X_{t+1} = \mu (1 - a) + a X_t + s (1 - a^2)^{1/2} \epsilon_{t+1} \qquad (\epsilon_t) \iidsim N(0,1).$$

The discussion in Section 3.1.3 tells us that the stationary distribution $\psi^*$ of (6.17) is normally distributed with mean $\mu$ and standard deviation $s$. The parameter $a$ controls autocorrelation. In the figure we set $\mu$ to 0.96, which, since $\beta_t = X_t$, is the stationary mean of the discount factor process. The parameters $a$ and $s$ are varied in the figure, and the contour plot shows the corresponding value of $\rho(L)$. The process (6.17) is discretized via the Tauchen method with the size of the state space set to 6 (which avoids negative values for $\beta(x)$).

Figure 6.3: $\rho(L)$ for different values of $(a,s)$ (discount_spec_rad.jl)

The figure shows that $\rho(L)$ tends to increase with both the volatility and the autocorrelation of the state process. This seems natural given the expression on the right-hand side of (6.15), since sequences of large values of $\beta_i$ compound in the product $\prod_{i=0}^{t} \beta_i$, pushing up the long-run average value, and such sequences occur more often when autocorrelation and volatility are large.
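The same pattern can be seen in a much smaller model. The sketch below replaces the six-state Tauchen grid with a hypothetical symmetric two-state chain (an assumption made purely for illustration): the state takes values $\mu - s$ and $\mu + s$ and stays put with probability $(1 + a)/2$, so its stationary mean is $\mu$, its standard deviation is $s$ and its autocorrelation is $a$. The function name `spec_rad_two_state` is introduced here and does not appear in discount_spec_rad.jl.

```julia
using LinearAlgebra

# Two-state stand-in for the discretized AR(1) process (illustrative assumption)
function spec_rad_two_state(a, s; μ=0.96)
    x = [μ - s, μ + s]          # state values
    p = (1 + a) / 2             # probability of remaining in the current state
    q = 1 - p
    P = [p q;
         q p]
    L = P .* x'                 # L(x, x′) = x′ P(x, x′), since β_t = X_t
    return maximum(abs.(eigvals(L)))
end

# ρ(L) on a small grid of (a, s) pairs
a_vals = [0.1, 0.5, 0.9]
s_vals = [0.0, 0.01, 0.02]
ρ_grid = [spec_rad_two_state(a, s) for a in a_vals, s in s_vals]
```

With zero volatility the spectral radius equals the mean $\mu$, and in this two-state case it rises with both $a$ and $s$.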

We finish this section with a lemma that simplifies computation of the spectral radius in settings where the process $(\beta_t)$ depends only on a subset of the state variables, a setting that is common in applications. In the statement of the lemma, the state space $\Xsf$ takes the form $\Xsf = \Ysf \times \Zsf$. We fix $Q \in \mopz$ and $R \in \mopy$. The discount operator $L$ is

$$L(x,x') = b(z, z') Q(z, z') R(y, y') \quad \text{with} \quad b \colon \Zsf \times \Zsf \to \RR_+.$$

Let $(Z_t)$ and $(Y_t)$ be $Q$-Markov and $R$-Markov respectively, so that, with $P$ as the pointwise product of $Q$ and $R$, the process $(X_t) \coloneq ((Z_t, Y_t))$ is $P$-Markov. We set $L_\Zsf(z,z') \coloneq b(z, z') Q(z, z')$.
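The statement of the lemma is not reproduced in this section, so the following sketch only checks the identity $\rho(L) = \rho(L_\Zsf)$ that the setup points to. It uses the fact that, once states are ordered as $x = (z, y)$, the operator $L$ is the Kronecker product of $L_\Zsf$ and $R$, whose eigenvalues are products of the eigenvalues of the two factors, together with $\rho(R) = 1$ for a stochastic matrix. All numerical values are hypothetical.

```julia
using LinearAlgebra

spec_rad(M) = maximum(abs.(eigvals(M)))   # spectral radius helper

Q = [0.9 0.1;
     0.3 0.7]                   # dynamics of Z
R = [0.5 0.5 0.0;
     0.2 0.6 0.2;
     0.0 0.4 0.6]               # dynamics of Y
b = [0.94 0.98;
     1.00 1.03]                 # b(z, z′)

L_Z = b .* Q                    # L_Z(z, z′) = b(z, z′) Q(z, z′)

# With states ordered as x = (z, y) and y varying fastest,
# L((z, y), (z′, y′)) = L_Z(z, z′) R(y, y′) is the Kronecker product
L = kron(L_Z, R)

ρ_full, ρ_small = spec_rad(L), spec_rad(L_Z)
```

The practical point is that the eigenvalue problem is solved on the smaller space $\Zsf$ rather than on $\Ysf \times \Zsf$.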

6.1.2.2 Necessary Conditions

In Section 6.1.1 we studied settings where lifetime value is a function $v$ on a state space $\Xsf$ that satisfies an equation of the form $v = h + L v$. The unknown is $v \in \RR^\Xsf$ where $\Xsf$ is a finite set, $h \in \RR^\Xsf$ is given and $L$ is a linear operator from $\RR^\Xsf$ to itself. We discussed the fact that $\rho(L) < 1$ is sufficient for $v = h + L v$ to have a unique solution.

In some settings the condition $\rho(L) < 1$ is also necessary. For example, let

In this setting we have the following result:

In Section 7.1.3 we will extend Lemma 6.1.4 to handle certain nonlinear equations.

6.1.3 Fixed-Point Results

State-dependent discounting breaks the contractivity properties that we exploited in Chapter 5, when we studied optimality of MDPs (see, e.g., the proof of Proposition 5.1.1). Here we introduce a generalization of Banach’s fixed-point theorem that can deliver global stability under weaker conditions. For the remainder of this section, $\Xsf$ is any finite set.

6.1.3.1 Long-Run Contractions

Fix $U \subset \RR^\Xsf$. We call a self-map $T$ on $U$ eventually contracting if there exists a $k \in \NN$ and a norm $\| \cdot \|$ on $\RR^\Xsf$ such that $T^k$ is a contraction on $U$ under $\| \cdot \|$.

The next example illustrates Theorem 6.1.5 by proving a result similar to Exercise 1.20.

Example 6.1.2 illustrates the connection between Theorem 6.1.5 and the Neumann series lemma. Theorem 6.1.5 is more general because it can be applied in nonlinear settings. But the Neumann series lemma remains important because, when applicable, it provides inverse and power series representations of the fixed point.
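To see eventual contractivity at work in a linear setting, consider the affine map $T v = h + L v$ with $L(x, x') = \beta(x) P(x, x')$. The sketch below uses hypothetical values in which the discount factor exceeds one in one state, so $T$ is not a contraction in the supremum norm, although some iterate $T^k$ is. The helper names `lipschitz` and `iterate_map` are introduced for illustration only.

```julia
using LinearAlgebra

β = [1.04, 0.90]                 # discount factor exceeds 1 in the first state
P = [0.5 0.5;
     0.5 0.5]
L = β .* P                       # L(x, x′) = β(x) P(x, x′)
h = [1.0, 1.0]
T(v) = h + L * v                 # affine self-map on ℝ²

# The sup-norm Lipschitz constant of T^k is the operator norm of L^k
lipschitz(k) = opnorm(L^k, Inf)
k_min = findfirst(k -> lipschitz(k) < 1, 1:50)   # smallest k with T^k contracting

# Successive approximation still converges to the unique fixed point
function iterate_map(T, v, n)
    for _ in 1:n
        v = T(v)
    end
    return v
end

v_star = (I - L) \ h             # fixed point via the Neumann series lemma
v_iter = iterate_map(T, zeros(2), 2_000)
```

Here `lipschitz(1)` is above one while `lipschitz(k_min)` is below one, and iteration from the zero vector agrees with the matrix-inverse solution.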

On one hand, if $T$ is a contraction map on $U \subset \RR^\Xsf$ with respect to a given norm $\| \cdot \|_a$, we cannot necessarily say that $T$ is a contraction with respect to some other norm $\| \cdot \|_b$ on $\RR^\Xsf$. On the other hand, if $T$ is an eventual contraction on $U$ with respect to some given norm on $\RR^\Xsf$, then $T$ is eventually contracting with respect to every norm on $\RR^\Xsf$. The next exercise asks you to verify this.

6.1.3.2 A Spectral Radius Condition

The following sufficient condition for eventual contractivity will be helpful when we study dynamic programs with state-dependent discounting.

6.1.3.3 A Generalized Blackwell Condition

In Section 2.2.3.3 we studied a sufficient condition for order-preserving self-maps to be contractions. The next proposition provides an analogous result for eventual contractions. In the statement of the proposition, $U$ is a subset of $\RR^\Xsf$ such that $v, c \in U$ and $c \geq 0$ implies $v+c \in U$.

6.2 Optimality with State-Dependent Discounting

We can now turn to dynamic programs in which the objective is to maximize a lifetime value in the presence of state-dependent discounting. First, we present an extension of the MDP model from Chapter 5 that admits state-dependent discounting. Then we provide weak conditions under which optimal policies exist and Bellman’s principle of optimality holds.

6.2.1 MDPs with State-Dependent Discounting

We are ready to extend the MDP model to include state-dependent discounting. We construct a framework and then provide weak conditions for optimality based on spectral radius methods.

6.2.1.1 Setup

To provide a framework for dynamic programs with state-dependent discounting, we begin with an MDP $(\Gamma, \beta, r, P)$ with state space $\Xsf$, action space $\Asf$ and feasible state-action pairs $\Gsf$. We then replace the constant discount factor $\beta$ with a function $\beta$ from $\Gsf \times \Xsf$ to $\RR_+$. We call the resulting model an MDP with state-dependent discounting. The Bellman equation takes the form

$$v(x) = \max_{a \in \Gamma(x)} \left\{ r(x, a) + \sum_{x'} v(x') \beta(x, a, x') P(x, a, x') \right\},$$

where $x \in \Xsf$ and $v \in \RR^\Xsf$. Notice that the discount factor depends on all relevant information: the current action, the current state and the stochastically determined next-period state.

For MDPs with state-dependent discounting, we can obtain standard optimality results by assuming that there exists a $b < 1$ such that $\beta(x, a, x') \leq b$ for all $(x, a, x') \in \Gsf \times \Xsf$. In this setting it is easy to show that lifetime values are finite, and to extend the optimality results for regular MDPs found in Proposition 5.1.1.

Unfortunately, the assumption discussed in the previous paragraph is too strict for many applications. (We return to this point in Section 6.2.1.6.) We will state an optimality result under weaker conditions.

6.2.1.2 Finite Lifetime Values

Let $\Sigma$ be the set of all feasible policies, defined as for regular MDPs. The policy operator $T_\sigma$ corresponding to $\sigma \in \Sigma$ is represented by

$$(T_\sigma \, v)(x) = r(x, \sigma(x)) + \sum_{x'} v(x') \beta(x, \sigma(x), x') P(x, \sigma(x), x').$$

Following Chapter 5, we set $r_\sigma(x) \coloneq r(x,\sigma(x))$. We define $L_\sigma \in \lopx$ via

$$L_\sigma(x,x') \coloneq \beta(x, \sigma(x), x') P(x, \sigma(x), x').$$

Notice that we can now write (6.29) as $T_\sigma \, v = r_\sigma + L_\sigma \, v$. In line with our discussion of MDPs in Chapter 5, when $T_\sigma$ has a unique fixed point we denote it by $v_\sigma$ and interpret it as lifetime value.
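The objects $r_\sigma$, $L_\sigma$ and $v_\sigma$ are easy to build numerically. The following is a minimal sketch for a hypothetical two-state, two-action model in which every action is feasible in every state; the array layout `r[x, a]`, `β[x, a, x′]`, `P[x, a, x′]` and the function name `policy_objects` are assumptions made for this example.

```julia
using LinearAlgebra

# Hypothetical primitives, indexed [x, a] and [x, a, x′]
r = [1.0 0.5;
     0.0 2.0]                              # r(x, a)
P = zeros(2, 2, 2)
P[1, 1, :] = [0.9, 0.1];  P[1, 2, :] = [0.4, 0.6]
P[2, 1, :] = [0.5, 0.5];  P[2, 2, :] = [0.1, 0.9]
β = fill(0.95, 2, 2, 2)
β[1, :, 2] .= 1.02                         # discount factor above 1 on the move 1 → 2

"Return (r_σ, L_σ) for a policy σ given as a vector of action indices."
function policy_objects(σ, r, β, P)
    n = size(r, 1)
    r_σ = [r[x, σ[x]] for x in 1:n]
    L_σ = [β[x, σ[x], x′] * P[x, σ[x], x′] for x in 1:n, x′ in 1:n]
    return r_σ, L_σ
end

σ = [2, 1]                                 # action 2 in state 1, action 1 in state 2
r_σ, L_σ = policy_objects(σ, r, β, P)
@assert maximum(abs.(eigvals(L_σ))) < 1    # ensures T_σ has a unique fixed point
v_σ = (I - L_σ) \ r_σ                      # lifetime value of σ
```

The resulting `v_σ` satisfies $v_\sigma = r_\sigma + L_\sigma v_\sigma$, which is the fixed-point property of $T_\sigma$.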

As discussed, the value $v_\sigma(x)$ has the interpretation of lifetime value of policy $\sigma$ conditional on initial state $x$. We can reinforce this interpretation by connecting Lemma 6.2.1 to Theorem 6.1.1. The next exercise asks you to work through all the steps.

6.2.1.3 Optimality

The Bellman operator takes the form

$$(Tv)(x) = \max_{a \in \Gamma(x)} \left\{ r(x, a) + \sum_{x'} v(x') \beta(x, a, x') P(x, a, x') \right\},$$

where $x \in \Xsf$ and $v \in \RR^\Xsf$.

Given $v \in \RR^\Xsf$, a policy $\sigma$ is called $v$-greedy if $\sigma(x)$ is a maximizer of the right-hand side of (6.35) for all $x$ in $\Xsf$. Equivalently, $\sigma$ is $v$-greedy whenever $T_\sigma \, v = Tv$.

When Assumption 6.2.1 holds and, as a result, $T_\sigma$ has a unique fixed point $v_\sigma$ for each $\sigma \in \Sigma$, we let $v^*$ denote the value function, which is defined as $v^* \coloneq \vee_{\sigma \in \Sigma} v_\sigma$. As for the regular MDP case, a policy $\sigma$ is called optimal if $v_\sigma = v^*$.

We can now state our main optimality result for MDPs with state-dependent discounting.

In Section 8.2.2 we prove a result that includes Proposition 6.2.2 as a special case.

6.2.1.4 Algorithms

Algorithms for solving an MDP with state-dependent discounting include value function iteration (VFI), Howard policy iteration (HPI), and optimistic policy iteration (OPI). The algorithms for VFI and OPI are identical to those given for regular MDPs (see Section 5.1.4), provided that the correct operators $T$ and $T_\sigma$ are used, and that the definition of a $v$-greedy policy is as given in Section 6.2.1.3. The algorithm for HPI is almost identical, with the only change being that computation of lifetime values involves $L_\sigma$. Details are given in Algorithm 6.1.

We prove in Chapter 8 that, under the conditions of Assumption 6.2.1, VFI, OPI and HPI are all convergent, and that HPI converges to an exact optimal policy in a finite number of steps.
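Algorithm 6.1 is not reproduced in this section, so the following is only a sketch of Howard policy iteration under state-dependent discounting, written for a hypothetical two-state, two-action model in which every action is feasible. The array layout and the function names (`B`, `bellman`, `greedy`, `lifetime_value`, `howard_policy_iteration`) are assumptions made for this example.

```julia
using LinearAlgebra

# Hypothetical primitives: r[x, a], β[x, a, x′], P[x, a, x′]
r = [1.0 0.5;
     0.0 2.0]
P = zeros(2, 2, 2)
P[1, 1, :] = [0.9, 0.1];  P[1, 2, :] = [0.4, 0.6]
P[2, 1, :] = [0.5, 0.5];  P[2, 2, :] = [0.1, 0.9]
β = fill(0.95, 2, 2, 2)
β[1, :, 2] .= 1.02

# Action values: B(x, a, v) = r(x, a) + Σ_x′ v(x′) β(x, a, x′) P(x, a, x′)
B(x, a, v) = r[x, a] + sum(v[x′] * β[x, a, x′] * P[x, a, x′] for x′ in eachindex(v))

# Bellman operator and greedy policy (ties broken by the first maximizer)
bellman(v) = [maximum(B(x, a, v) for a in 1:size(r, 2)) for x in eachindex(v)]
greedy(v)  = [argmax([B(x, a, v) for a in 1:size(r, 2)]) for x in eachindex(v)]

# Policy evaluation: v_σ = (I - L_σ)⁻¹ r_σ
function lifetime_value(σ)
    n = length(σ)
    r_σ = [r[x, σ[x]] for x in 1:n]
    L_σ = [β[x, σ[x], x′] * P[x, σ[x], x′] for x in 1:n, x′ in 1:n]
    return (I - L_σ) \ r_σ
end

"Howard policy iteration: alternate policy evaluation and greedy improvement."
function howard_policy_iteration(σ; max_iter=100)
    for _ in 1:max_iter
        v_σ = lifetime_value(σ)
        σ_new = greedy(v_σ)
        σ_new == σ && return σ, v_σ
        σ = σ_new
    end
    error("HPI did not converge")
end

σ_star, v_star = howard_policy_iteration([1, 1])
```

On exit, the returned value is a fixed point of the Bellman operator and weakly dominates the lifetime value of each of the four policies in this small model.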

6.2.1.5 Exogenous Discounting

Some applications use an exogenous state component to drive a discount factor process. In this section we set up such a model and obtain optimality conditions by applying Proposition 6.2.2.

The first step is to decompose the state $X_t$ into a pair $(Y_t, Z_t)$, where $(Y_t)_{t \geq 0}$ is endogenous (i.e., affected by the actions of the controller) and $(Z_t)_{t \geq 0}$ is purely exogenous. In particular, the primitives consist of

  1. a nonempty correspondence $\Gamma$ from $\Ysf \times \Zsf$ to $\Asf$,

  2. a function $\beta$ from $\Zsf$ to $\RR_+$,

  3. a function $r$ from $\Gsf \coloneq \setntn{(y, a) \in \Ysf \times \Asf}{a \in \Gamma(y)}$ to $\RR$,

  4. a stochastic matrix $Q$ on $\Zsf$ and

  5. a stochastic kernel $R$ from $\Gsf$ to $\Ysf$.

The corresponding Bellman equation is

$$v(y, z) = \max_{a \in \Gamma(y, z)} \left\{ r(y, a) + \beta(z) \sum_{z', \, y'} v(y', z') Q(z, z') R(y, a, y') \right\},$$

for all $(y, z) \in \Xsf$. Given $v \in \RR^\Xsf$, a policy $\sigma \in \Sigma$ is called $v$-greedy if

$$\sigma(y, z) \in \argmax_{a \in \Gamma(y, z)} \left\{ r(y, a) + \beta(z) \sum_{z', \, y'} v(y', z') Q(z, z') R(y, a, y') \right\},$$

for all $(y, z) \in \Xsf$.

This exogenous discount model is a special case of the general MDP with state-dependent discounting. Indeed, we can write (6.36) as (6.35) by setting $x \coloneq (y,z)$ and defining

$$P(x,a,x') \coloneq P((y, z),a, (y', z')) \coloneq Q(z, z') R(y, a, y').$$

The following proposition provides a relatively simple sufficient condition for the core optimality results in the setting of the exogenous discount model.

6.2.1.6 Comments on the Spectral Radius Condition

In Section 6.2.1.1 we mentioned that requiring $\sup \beta < 1$ is too strict for some applications. For example, the real interest rate $r_t$ shown in Figure 6.2 is sometimes negative. Using long historical records, Farmer et al. (2023) find that the discount rate is negative around 1/3 of the time. This means that the associated discount factor $\beta_t = 1/(1+r_t)$ is sometimes greater than 1 and $\sup \beta < 1$ fails.

In macroeconomics, empirically motivated time-varying discount factor specifications lead to models where $\beta_t > 1$ occurs with positive probability. For example, Hills et al. (2019) study a model that can be embedded in the MDP framework just described. Figure 6.4 shows a simulation of one of the discount factor processes used in their model, prior to discretization. The exogenous state and discount factor process takes the form $\beta_t = b Z_t$, where $(Z_t)$ is an exogenous state obeying $Z_{t+1} = 1 - \rho + \rho Z_t + \sigma \epsilon_{t+1}$ with $(\epsilon_t)$ iid and standard normal. Clearly $\sup \beta < 1$ fails for this model too.

Let’s now consider the weaker condition $\rho(L) < 1$ described in Proposition 6.2.3 and check whether it holds. Following Hills et al. (2019), we discretize the dynamics of $(Z_t)$ via a Tauchen approximation, producing a stochastic matrix $Q$ on a finite set $\Zsf$.[2] The set of values for $\beta_t$ ranges between 0.95 and 1.04, so that $\beta_t > 1$ remains possible. Nonetheless, with $L(z, z') = \beta(z) Q(z, z')$ we obtain $\rho(L)=0.9996$. Hence Proposition 6.2.3 applies.
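The gap between the two conditions can be reproduced in a toy setting. The sketch below uses a hypothetical three-state chain in place of the Tauchen discretization, so the numbers are not those of Hills et al. (2019); it only illustrates that $\rho(L) < 1$ can hold while $\sup \beta < 1$ fails.

```julia
using LinearAlgebra

# Hypothetical exogenous chain (the text uses a Tauchen discretization instead)
z_vals = [0.95, 0.99, 1.02]          # β(z) = z, so β exceeds 1 in the last state
Q = [0.90 0.10 0.00;
     0.05 0.90 0.05;
     0.00 0.10 0.90]

L = z_vals .* Q                      # L(z, z′) = β(z) Q(z, z′)
ρL = maximum(abs.(eigvals(L)))

sup_condition = maximum(z_vals) < 1  # the strict condition sup β < 1
spectral_condition = ρL < 1          # the weaker condition ρ(L) < 1
```

For these values `sup_condition` is false while `spectral_condition` is true, because the chain does not stay in the high-discount-factor state for long enough.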

Figure 6.4: Discount factor process $(\beta_t)_{t \geq 0}$ in Hills et al. (2019).

6.2.2 Inventory Management Revisited

In this section, we modify the inventory management model from Section 5.2.1 to include time-varying interest rates.

Recall that, in the model of Section 5.2.1, the Bellman equation takes the form

$$v(x) = \max_{a \in \Gamma(x)} \left\{ r(x, a) + \beta \sum_{d \geq 0} v(f(x, a, d)) \phi(d) \right\},$$

at each $x \in \Xsf$, where $\Xsf \coloneq \{0, \ldots, K\}$, $x$ is the current inventory level, $a$ is the current inventory order, $r(x, a)$ is current profits (defined in (5.12)), $f(x,a,d) \coloneq (x - d)\vee 0 + a$ and $d$ is an iid demand shock with distribution $\phi$. Let’s now add a time-varying discount rate and investigate its impact on optimal choices.

We add time-varying discounting by replacing the constant $\beta$ in (6.40) with a stochastic process $(\beta_t)$ where $\beta_t = 1/(1+r_t)$. We suppose that the dynamics can be expressed as $\beta_t = \beta(Z_t)$, where the exogenous process $(Z_t)_{t \geq 0}$ is $Q$-Markov on $\Zsf$. After relabeling the endogenous state $X_t$ as $Y_t$ and $x$ as $y$, in line with the notation in Section 6.2.1.5, the Bellman equation becomes $v(y, z) = \max_{a \in \Gamma(y, z)} B((y, z), a, v)$ where

$$B((y, z), a, v) = r(y, a) + \beta(z) \sum_{d, \, z'} v(f(y, a, d), z') \phi(d) Q(z, z').$$

If we set

$$R(y, a, y') \coloneq \PP\{f(y, a, D) = y'\} \quad \text{when} \quad D \sim \phi,$$

then $R(y, a, y')$ is the probability of realizing next period inventory level $y'$ when the current level is $y$ and the action is $a$. Hence we can rewrite (6.41) as

$$B((y, z), a, v) = r(y, a) + \beta(z) \sum_{y', z'} v(y', z') Q(z, z') R(y, a, y') .$$

We have now created a version of the MDP with exogenous state-dependent discounting described in Section 6.2.1.5. Letting $L(z, z') \coloneq \beta(z) Q(z, z')$ and applying Proposition 6.2.3, we see that all of the standard optimality results hold whenever $\rho(L)<1$.

Figure 6.5 shows how inventory evolves under an optimal program when the parameters of the problem are as given in Listing 1. (The code preallocates and computes arrays representing $r$, $R$, and $Q$ in (6.43) and includes a test for $\rho(L)<1$.) We set $\beta(z) = z$ and take $(Z_t)$ to be a discretization of an AR(1) process. Figure 6.5 was created by simulating $(Z_t)$ according to $Q$ and inventory $(Y_t)$ according to $Y_{t+1} = (Y_t - D_{t+1} ) \vee 0 + A_t$, where $A_t$ follows the optimal policy.

The outcome is similar to Figure 5.7, in the sense that inventory falls slowly and then jumps up. As before, fixed costs induce this lumpy behavior. However, a new phenomenon is now present: inventories trend up when interest rates fall and down when they rise. (The interest rate $r_t$ is calculated via $\beta_t = 1/(1+r_t)$ at each $t$.) High interest rates foreshadow high interest rates due to positive autocorrelation ($\rho > 0$), which in turn devalue future profits and hence encourage managers to economize on stock.

Figure 6.5: Inventory dynamics with time-varying interest rates

using LinearAlgebra, Random, Distributions, QuantEcon

f(y, a, d) = max(y - d, 0) + a  # Inventory update

function create_sdd_inventory_model(;
            ρ=0.98, ν=0.002, n_z=20, b=0.97,  # Z state parameters
            K=40, c=0.2, κ=0.8, p=0.6,        # firm and demand parameters
            d_max=100)                        # truncation of demand shock

    ϕ(d) = (1 - p)^d * p                      # demand distribution
    d_vals = collect(0:d_max)
    ϕ_vals = ϕ.(d_vals)
    y_vals = collect(0:K)                     # inventory levels
    n_y = length(y_vals)
    mc = tauchen(n_z, ρ, ν)                   # discretized AR(1) process for Z
    z_vals, Q = mc.state_values .+ b, mc.p    # shift the grid so that β(z) = z is centered at b
    ρL = maximum(abs.(eigvals(z_vals .* Q)))  # ρ(L) for L(z, z′) = β(z) Q(z, z′)
    @assert ρL < 1 "Error: ρ(L) ≥ 1."

    # R[i_y, i_a, i_y′] = probability of moving from y to y′ under order a
    R = zeros(n_y, n_y, n_y)
    for (i_y, y) in enumerate(y_vals)
        for (i_y′, y′) in enumerate(y_vals)
            for (i_a, a) in enumerate(0:(K - y))
                hits = f.(y, a, d_vals) .== y′
                R[i_y, i_a, i_y′] = dot(hits, ϕ_vals)
            end
        end
    end

    # r[i_y, i_a] = expected sales minus ordering costs; infeasible orders stay at -Inf
    r = fill(-Inf, n_y, n_y)
    for (i_y, y) in enumerate(y_vals)
        for (i_a, a) in enumerate(0:(K - y))
            cost = c * a + κ * (a > 0)
            r[i_y, i_a] = dot(min.(y, d_vals), ϕ_vals) - cost
        end
    end

    return (; K, c, κ, p, r, R, y_vals, z_vals, Q)
end

Program 1: Inventory model with time-varying discounting (inventory_sdd.jl)

Figure 6.6 shows execution time for VFI and OPI at different choices of $m$ (see Section 6.2.1.3 for the interpretation of $m$). As for the optimal savings problem we studied in Chapter 5, OPI is around one order of magnitude faster when $m$ is close to 50 (cf. Figure 5.8).

Figure 6.6: OPI vs VFI timings for the inventory problem

6.3 Asset Pricing

This section provides a brief introduction to asset pricing in a Markov environment. While the topic of asset pricing is fascinating in its own right, our main aim is to provide additional practice in handling linear valuation problems. (Readers who wish to push ahead with their study of dynamic programming can safely skip to Chapter 7.)

6.3.1 Introduction to Asset Pricing

We first discuss risk-neutral pricing and show why this assumption is typically implausible. Next, we introduce stochastic discount factors and stationary asset pricing.

6.3.1.1 Risk-Neutral Pricing?

Consider the problem of assigning a current price $\Pi_t$ to an asset that confers on its owner the right to payoff $G_{t+1}$. The payoff is stochastic and realized next period. One simple idea is to use risk-neutral pricing, which implies that

$$\Pi_t = \EE_t \, \beta \, G_{t+1},$$

for some constant discount factor $\beta \in (0,1)$. If the payoff is in $k$ periods, then we modify the price to $\EE_t \, \beta^k \, G_{t+k}$. In essence, risk-neutral pricing says that cost equals expected reward, discounted to present value by compounding a constant rate of discount. (A rate of discount, say $\rho$, is linked to a discount factor, say $\beta$, by $\beta = 1/(1+\rho) \approx \exp(-\rho)$.)

Although risk neutrality allows for simple pricing, assuming risk neutrality for all investors is not plausible.

To give one example, suppose that we take the asset that pays $G_{t+1}$ in (6.44) and replace it with another asset that pays $H_{t+1} = G_{t+1} + \epsilon_{t+1}$, where $\epsilon_{t+1}$ is independent of $G_{t+1}$, $\EE_t \, \epsilon_{t+1}=0$ and $\var \epsilon_{t+1} > 0$. In effect, we are adding risk to the original payoff without changing its mean.

Under risk neutrality, the price of this new asset is

$$\Pi_t^H = \EE_t \, \beta \, [G_{t+1} + \epsilon_{t+1}] = \Pi_t + \beta \, \EE_t \, \epsilon_{t+1} = \Pi_t.$$

Thus, $H_{t+1}$ and $G_{t+1}$ are priced identically: they share the same mean $\EE_t G_{t+1}$, yet their variances satisfy

$$\var H_{t+1} = \var G_{t+1} + \var \epsilon_{t+1} > \var G_{t+1}.$$

This outcome contradicts the idea that investors typically want compensation for bearing risk.

A helpful way to think about the same point is to consider the rate of return $r_{t+1} \coloneq (G_{t+1}-\Pi_t) / \Pi_t$ on holding an asset with payoff $G_{t+1}$. From (6.44) we have $\EE_t \, \beta (1 + r_{t+1}) = 1$, or

$$\EE_t \, r_{t+1} = \frac{1-\beta}{\beta}.$$

Since the right-hand side does not depend on $G_{t+1}$, risk neutrality implies that all assets have the same expected rate of return. But this contradicts the finding that, on average, riskier assets tend to have higher rates of return that compensate investors for bearing risk.

6.3.1.2 A Stochastic Discount Factor

To go beyond risk-neutral pricing, let’s start with a model containing one asset and one agent. It is straightforward to price the asset and compare it to the risk-neutral case.

A representative agent takes the price $\Pi_t$ of a risky asset as given and solves

$$\begin{aligned} & \max_{0 \leq \alpha \leq 1} \{ u(C_t) + \beta \EE_t u(C_{t+1}) \} \\ \text{subject to} \quad & C_t = E_t - \Pi_t \alpha \quad \text{and} \quad C_{t+1} = E_{t+1} + \alpha G_{t+1}. \end{aligned}$$

Here

Rewriting as $\max_\alpha \{ u(E_t - \Pi_t \alpha) + \beta \EE_t u(E_{t+1} + \alpha G_{t+1}) \}$ and differentiating with respect to $\alpha$ leads to the first-order condition

$$u'(E_t - \Pi_t \alpha) \Pi_t = \beta \EE_t u'(E_{t+1} + \alpha G_{t+1}) G_{t+1}.$$

Rearranging gives us

$$\Pi_t = \EE_t \left[ \beta \frac{u'(C_{t+1})}{u'(C_t)} G_{t+1} \right].$$

Comparing (6.51) with (6.44), we see that the payoff is now multiplied by a positive random variable rather than a constant. The random variable

$$M_{t+1} \coloneq \beta \frac{u'(C_{t+1})}{u'(C_t)}$$

is called the stochastic discount factor or pricing kernel. We call the particular form of the pricing kernel shown in (6.52) the Lucas stochastic discount factor (Lucas SDF) in honor of Lucas (1978).

In the CRRA case, the Lucas SDF applies heavier discounting to assets that concentrate payoffs in states of the world where the agent is already enjoying strong consumption growth. Conversely, the SDF attaches higher weights to future payoffs that occur when consumption growth is low because such payoffs hedge against the risk of drawing low consumption states.
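To make the CRRA case concrete, suppose (as an assumption, since the utility specification is not reproduced in this section) that $u(c) = c^{1-\gamma}/(1-\gamma)$ with $\gamma > 0$ and $\gamma \neq 1$. Then $u'(c) = c^{-\gamma}$ and the Lucas SDF becomes

$$
M_{t+1}
= \beta \frac{u'(C_{t+1})}{u'(C_t)}
= \beta \frac{C_{t+1}^{-\gamma}}{C_t^{-\gamma}}
= \beta \left( \frac{C_{t+1}}{C_t} \right)^{-\gamma}.
$$

Because $\gamma > 0$, this expression is decreasing in consumption growth $C_{t+1}/C_t$, which is the sense in which payoffs arriving in high-growth states are discounted more heavily.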

6.3.1.3 A General Specification

The standard neoclassical theory of asset pricing generalizes the Lucas discounting specification by assuming only that there exists a positive random variable $M_{t+1}$ such that the price of an asset with payoff $G_{t+1}$ is

$$\Pi_t = \EE_t \, M_{t+1} \, G_{t+1} \qquad (t \geq 0).$$

In line with the preceding discussion, $M_{t+1}$ is called a stochastic discount factor (SDF). Equation (6.54) generalizes (6.51) by refraining from restricting the SDF (apart from assuming positivity).

In fact, it can be shown that, under relatively weak assumptions, there always exists an SDF $M_{t+1}$ such that (6.54) is valid. In particular, a single SDF $M_{t+1}$ can be used to price any asset in the market, so if $H_{t+1}$ is another stochastic payoff then the current price of an asset with this payoff is $\EE_t \, M_{t+1} \, H_{t+1}$.

We do not prove these claims, since our interest is in understanding forward-looking equations in Markov environments. Some relevant references are listed in Section 6.4.

6.3.1.4 Markov Pricing

A common assumption in quantitative applications is that all underlying randomness is driven by a Markov model. In this spirit, we take $(X_t)$ to be $P$-Markov on finite-state $\Xsf$, where $P \in \mopx$, and suppose further that the SDF and payoff have the forms

$$M_{t+1} = m(X_t, X_{t+1}) \quad \text{and} \quad G_{t+1} = g(X_t, X_{t+1}),$$

for fixed functions $m, g$ mapping $\Xsf \times \Xsf$ to $\RR_+$. Since $m$ is arbitrary at this point, we don’t assume a particular specification for the SDF.

In this setting, conditioning on $X_t = x$, the standard asset pricing equation $\Pi_t = \EE_t \, M_{t+1} \, G_{t+1}$ becomes

$$\pi(x) = \sum_{x'} m(x, x') g(x, x') P(x, x') \qquad (x \in \Xsf),$$

where $\pi(x)$ is the price of the asset conditional on $X_t = x$ (i.e., $\Pi_t = \pi(X_t)$).
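In matrix terms, the one-period price is a row sum of the elementwise product of $m$, $g$ and $P$. A minimal sketch with hypothetical two-state values (the matrices below are not from the text, and `price` is a name chosen to avoid clashing with Julia's built-in constant `π`):

```julia
using LinearAlgebra

# Hypothetical two-state example
P = [0.8 0.2;
     0.3 0.7]            # transition probabilities P(x, x′)
m = [0.97 0.99;
     0.93 0.96]          # SDF m(x, x′)
g = [1.0 0.5;
     2.0 1.0]            # payoff g(x, x′)

# π(x) = Σ_x′ m(x, x′) g(x, x′) P(x, x′): elementwise product, then row sums
price = vec(sum(m .* g .* P, dims=2))
```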

6.3.1.5 Pricing a Stationary Dividend Stream

Now we are ready to look at pricing a stationary cash flow over an infinite horizon, a basic problem in asset pricing. We will apply the Markov structure assumed in Section 6.3.1.4. In all that follows, $(X_t)$ is $P$-Markov on $\Xsf$ and $M_{t+1}$ is defined as in Section 6.3.1.4.

We seek the time $t$ price, denoted by $\Pi_t$, for an ex dividend contract on the dividend stream $(D_t)_{t \geq 0}$. The contract provides the owner with the right to the dividend stream. The “ex dividend” component means that, should the dividend stream be traded at time $t$, the dividend paid at time $t$ goes to the seller rather than the buyer. As a result, purchasing at $t$ and selling at $t+1$ pays $\Pi_{t+1} + D_{t+1}$. Hence, applying the asset pricing rule (6.54), the time $t$ price $\Pi_t$ of the contract must satisfy

$$\Pi_t = \EE_t \, M_{t+1} (\Pi_{t+1} + D_{t+1}).$$

We assume the existence of a $d \in \RR_+^\Xsf$ such that $D_t = d(X_t)$ for all $t$. Using (6.56), we can write this as

$$\pi(x) = \sum_{x'} m(x, x') (\pi(x') + d(x')) P(x, x') \qquad (x \in \Xsf),$$

or, equivalently,

$$\pi = A \pi + A d \quad \text{when } A(x,x') \coloneq m(x, x') P(x, x').$$

By the Neumann series lemma, $\rho(A) < 1$ implies that (6.59) has the unique solution

$$\pi^* \coloneq (I - A )^{-1} A d = \sum_{k=1}^\infty A^k d.$$

The vector $\pi^*$ is called an equilibrium price function.
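The two representations of the equilibrium price function can be compared numerically. The sketch below uses hypothetical two-state values for $P$, $m$ and $d$; `neumann_partial_sum` is a helper introduced for this example.

```julia
using LinearAlgebra

# Hypothetical two-state example
P = [0.8 0.2;
     0.3 0.7]
m = [0.97 0.99;
     0.93 0.96]
d = [1.0, 2.0]                       # dividend function d(x)

A = m .* P                           # A(x, x′) = m(x, x′) P(x, x′)
@assert maximum(abs.(eigvals(A))) < 1

price = (I - A) \ (A * d)            # π* = (I − A)⁻¹ A d

# Partial sums of Σ_{k ≥ 1} Aᵏ d approximate the same vector
function neumann_partial_sum(A, d, n)
    term = A * d
    total = copy(term)
    for _ in 2:n
        term = A * term
        total += term
    end
    return total
end

price_series = neumann_partial_sum(A, d, 2_000)
```

Solving the linear system is the cheaper route in practice; the partial sums are included only to illustrate the power series representation.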

6.3.1.6 Forward Sum Representation

Asset prices can be expressed as infinite sums. Let’s show this for cum dividend contracts (although the case of ex dividend contracts is similar). In Exercise 6.13 you found that the state-contingent price vector $\pi$ for a cum dividend contract on the dividend stream $(D_t)_{t \geq 0}$ obeys

$$\pi = d + A \pi \quad \text{when } A(x,x') \coloneq m(x, x') P(x, x')$$

and $\rho(A) < 1$. As before, $D_t = d(X_t)$ and $(X_t)_{t\geq 0}$ is $P$-Markov on $\Xsf$. Applying the uniqueness component of the Neumann series lemma and Theorem 6.1.1, we see that the function $\pi$ also obeys

$$\pi(x) = \EE_x \, \sum_{t=0}^\infty \left[ \prod_{i=0}^t M_i \right] D_t \qquad (x \in \Xsf),$$

where $M_{t+1} \coloneq m(X_t, X_{t+1})$ for $t \geq 0$ and $M_0 \coloneq 1$. This expression agrees with our intuition: the price of the contract is the expected present value of the dividend stream, with the time $t$ dividend discounted by the composite factor $M_1 \cdots M_t$.
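The link between the forward sum and powers of $A$ can be checked term by term: the time $t$ term $\EE_x [M_1 \cdots M_t D_t]$ equals $(A^t d)(x)$. The sketch below verifies this for $t = 2$ by enumerating paths and then compares a truncated forward sum with the solution of $\pi = d + A \pi$. All values are hypothetical and the function names are introduced for illustration.

```julia
using LinearAlgebra

# Hypothetical two-state example
P = [0.8 0.2;
     0.3 0.7]
m = [0.97 0.99;
     0.93 0.96]
d = [1.0, 2.0]
A = m .* P

price_cum = (I - A) \ d              # solves π = d + A π

# E_x[M_1 M_2 D_2], computed by enumerating all paths x → x1 → x2
function discounted_dividend_t2(x)
    n = length(d)
    return sum(m[x, x1] * P[x, x1] * m[x1, x2] * P[x1, x2] * d[x2]
               for x1 in 1:n, x2 in 1:n)
end

# Truncated forward sum Σ_{t=0}^{T} Aᵗ d
forward_sum(T) = sum(A^t * d for t in 0:T)

term_by_paths = [discounted_dividend_t2(x) for x in 1:2]
price_forward = forward_sum(2_000)
```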

6.3.2 Nonstationary Dividends

Until now, our discussion of asset pricing has assumed that dividends are stationary. However, dividends typically grow over time, along with other economic measures such as GDP. In this section, we solve for the price of a dividend stream when dividends exhibit random growth.

6.3.2.1Price-Dividend Ratios

A standard model of dividend growth is

$$
\ln \frac{D_{t+1}}{D_t} = \kappa(X_t, \eta_{t+1}) \qquad t = 0, 1, \ldots,
$$

where $\kappa$ is a fixed function, $(X_t)$ is the state process and $(\eta_t)$ is iid. We let $\phi$ be the density of each $\eta_t$ and assume that $(X_t)$ is $P$-Markov on a finite set $\Xsf$. Let’s suppose as before that the SDF obeys $M_{t+1} = m(X_t, X_{t+1})$ for some positive function $m$.

Since dividends grow over time, so will the price of the asset. As such, we should no longer seek a fixed function $\pi$ such that $\Pi_t = \pi(X_t)$ for all $t$, since the resulting price process $(\Pi_t)$ will fail to grow. Instead, we try to solve for the price-dividend ratio $V_t \coloneq \Pi_t / D_t$, which we hope will be stationary.

After conditioning on $X_t = x$, (6.65) leads us to conjecture existence of a function $v$ such that

$$
v(x) = \sum_{x'} m(x, x') \int \exp(\kappa(x, \eta)) \phi(\diff \eta) \left[ 1 + v(x') \right] P(x, x'),
$$

for all $x \in \Xsf$. We understand (6.66) as an equation to be solved for the unknown object $v \in \RR^\Xsf$.

If we can find a solution $v^*$ to (6.66), then the price-dividend process $(V^*_t)$ defined by $V^*_t = v^*(X_t)$ solves (6.65). The price can be recovered via $\Pi_t = V^*_t D_t$.
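One way to see when such a solution exists is to write (6.66) in matrix form. The symbols $g$ and $A_\kappa$ below are shorthand introduced here for illustration, and $\mathbf{1}$ denotes the vector of ones. Set

$$
g(x) \coloneq \int \exp(\kappa(x, \eta)) \phi(\diff \eta)
\quad \text{and} \quad
A_\kappa(x, x') \coloneq m(x, x') \, g(x) \, P(x, x').
$$

Then (6.66) reads

$$
v = A_\kappa (\mathbf{1} + v) = A_\kappa \mathbf{1} + A_\kappa v.
$$

If $\rho(A_\kappa) < 1$, then the Neumann series lemma yields the unique solution

$$
v^* = (I - A_\kappa)^{-1} A_\kappa \mathbf{1} = \sum_{k=1}^\infty A_\kappa^k \mathbf{1}.
$$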

6.3.2.2 Application: Markov Growth with a Lucas SDF

As an example, suppose that dividend growth obeys

$$
\kappa(X_t, \eta_{d, t+1}) = \mu_d + X_t + \sigma_d \, \eta_{d, t+1},
$$

where $(\eta_{d,t})_{t \geq 0}$ is iid and standard normal. Consumption growth is given by

$$
\ln \frac{C_{t+1}}{C_t} = \mu_c + X_t + \sigma_c \, \eta_{c, t+1},
$$

where $(\eta_{c,t})_{t \geq 0}$ is also iid and standard normal. We use the Lucas SDF in (6.53), implying that

$$
M_{t+1} = \beta \left( \frac{C_{t+1}}{C_t} \right)^{-\gamma} = \beta \exp(-\gamma( \mu_c + X_t + \sigma_c \eta_{c, t+1} )).
$$

using QuantEcon, LinearAlgebra

"Creates an instance of the asset pricing model with Markov state."
function create_asset_pricing_model(;
        n=200,              # state grid size
        ρ=0.9, ν=0.2,       # state persistence and volatility
        β=0.99, γ=2.5,      # discount and preference parameter
        μ_c=0.01, σ_c=0.02, # consumption growth mean and volatility
        μ_d=0.02, σ_d=0.1)  # dividend growth mean and volatility
    mc = tauchen(n, ρ, ν)
    x_vals, P = exp.(mc.state_values), mc.p
    return (; x_vals, P, β, γ, μ_c, σ_c, μ_d, σ_d)
end

Program 2: Asset pricing model with Lucas SDF (pd_ratio.jl)

Figure 6.7 shows the price-dividend ratio function $v^*$ for the specification given in Program 2, as well as for an alternative mean dividend growth rate $\mu_d$. The state process is a Tauchen discretization of an AR(1) process with positive autocorrelation. An increase in the state predicts higher dividend growth, which tends to increase the price. At the same time, higher $x$ also predicts higher consumption growth, which acts negatively on the price. For values of $\gamma$ greater than 1, the second effect dominates and the price-dividend ratio slopes down.
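The following Julia sketch shows one way the price-dividend ratio could be computed for this specification. It assumes that $\eta_{c,t+1}$ and $\eta_{d,t+1}$ are independent of each other and of the state, so that $\EE \exp(-\gamma \sigma_c \eta_{c,t+1} + \sigma_d \eta_{d,t+1}) = \exp((\gamma^2 \sigma_c^2 + \sigma_d^2)/2)$. The function names and the three-state chain, which stands in for the Tauchen discretization so that the example needs no external packages, are hypothetical.

```julia
using LinearAlgebra

"""
Discount matrix A(x, x') = β exp(μ_d - γ μ_c + (1 - γ) x + s/2) P(x, x'),
where s = γ² σ_c² + σ_d² (assumes independent standard normal shocks).
"""
function build_discount_matrix(model)
    (; x_vals, P, β, γ, μ_c, σ_c, μ_d, σ_d) = model
    s = γ^2 * σ_c^2 + σ_d^2      # variance of -γ σ_c η_c + σ_d η_d
    e = exp.(μ_d - γ * μ_c + s / 2 .+ (1 - γ) .* x_vals)
    return β .* e .* P           # scales row x of P by β e(x)
end

"Price-dividend ratio (I - A)⁻¹ A 1, valid when ρ(A) < 1."
function pd_ratio(model)
    A = build_discount_matrix(model)
    maximum(abs.(eigvals(A))) < 1 || error("requires ρ(A) < 1")
    return (I - A) \ (A * ones(size(A, 1)))
end

# Hypothetical three-state chain with positive autocorrelation
x_vals = [0.0, 0.02, 0.04]
P = [0.80 0.15 0.05;
     0.10 0.80 0.10;
     0.05 0.15 0.80]
model = (; x_vals, P, β=0.99, γ=2.5, μ_c=0.01, σ_c=0.02, μ_d=0.02, σ_d=0.1)

v_star = pd_ratio(model)
```

With $\gamma = 2.5 > 1$, the vector `v_star` in this example is decreasing in the state, in line with the discussion above.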

Figure 6.7: Price-dividend ratio as a function of the state

6.3.3 Incomplete Markets

In Section 6.3.1.5 we used the Neumann series lemma to solve for the equilibrium price vector $\pi$. However, some modifications to the basic model introduce nonlinearities that render the Neumann series lemma inapplicable. For example, Harrison & Kreps (1978) analyze a setting with heterogeneous beliefs and incomplete markets, leading to failure of the standard asset pricing equation. This results in a nonlinear equation for prices.

We treat the Harrison & Kreps (1978) model only briefly. There are two types of agents. Type $i$ believes that the state updates according to stochastic matrix $P_i$ for $i=1,2$. Agents are risk-neutral, so $m(x,y) \equiv \beta \in (0, 1)$. Harrison & Kreps (1978) show that, for their model, the equilibrium condition (6.58) becomes

$$
\pi(x) = \max_i \beta \sum_{x'} [\pi(x') + d(x')] P_i(x, x')
$$

for $x \in \Xsf$ and $i \in \{1, 2\}$. Setting aside the details that lead to this equation, our objective is simply to obtain a vector of prices $\pi$ that solves (6.74).

As a first step, we introduce an operator $T \colon \RR^\Xsf_+ \to \RR^\Xsf_+$ that maps $\pi$ to $T \pi$ via

$$
(T \pi)(x) = \max_i \beta \sum_{x'} [\pi(x') + d(x')] P_i(x, x') \qquad (x \in \Xsf).
$$

We are assuming $d \geq 0$, so $T$ is indeed a self-map on $\RR^\Xsf_+$.

By construction, a vector $\pi \in \RR_+^\Xsf$ is a fixed point of $T$ if and only if it is a vector of prices that solves (6.74). Hence, we have successfully converted our equilibrium problem into a fixed point problem.

We aim to show that $T$ is a contraction. To this end, pick any $p, q \in \RR^\Xsf_+$. Applying the inequality from Lemma 2.2.2, we obtain

$$
| (Tp)(x) - (Tq)(x) | \leq \beta \max_i \left| \sum_{x'} [p(x') + d(x')] P_i(x, x') - \sum_{x'} [q(x') + d(x')] P_i(x, x') \right|.
$$

Using the triangle inequality and canceling terms leads to

$$
| (Tp)(x) - (Tq)(x) | \leq \beta \max_{i \in \{1, 2\}} \sum_{x'} |p(x') - q(x')| P_i(x, x') \leq \beta \| p - q \|_\infty.
$$

Since this bound holds for all $x$, we can take the maximum with respect to $x$ and obtain

$$
\| Tp - Tq \|_\infty \leq \beta \| p - q \|_\infty.
$$

Thus, on $\RR^\Xsf_+$, the map $T$ is a contraction of modulus $\beta$ with respect to the sup norm.

Since $\RR^\Xsf_+$ is a closed subset of $\RR^\Xsf$, it is complete under the sup norm, so Banach's contraction mapping theorem implies that $T$ has a unique fixed point in this set. Hence, the system (6.74) has a unique solution $\pi^*$ in $\RR^\Xsf_+$, representing equilibrium prices. This fixed point can be computed by successive approximation.
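A minimal Julia sketch of successive approximation for this problem is given below. The discount factor, dividend vector, belief matrices `P1` and `P2`, and the function name `successive_approx` are illustrative assumptions rather than values taken from the text. Prices are stored in `p` to avoid clashing with Julia's built-in constant `π`.

```julia
using LinearAlgebra

# Hypothetical two-state example (all numbers are illustrative assumptions)
β  = 0.75
d  = [0.0, 1.0]                # dividends d(x) ≥ 0
P1 = [1/2 1/2; 2/3 1/3]        # beliefs of type 1
P2 = [2/3 1/3; 1/4 3/4]        # beliefs of type 2

"Operator T: (Tp)(x) is the max over i of β Σ_x' [p(x') + d(x')] Pᵢ(x, x')."
T(p) = β .* max.(P1 * (p .+ d), P2 * (p .+ d))

"Iterate p ← f(p) from p0 until successive iterates differ by less than tol."
function successive_approx(f, p0; tol=1e-10, max_iter=10_000)
    p = p0
    for _ in 1:max_iter
        p_new = f(p)
        maximum(abs.(p_new .- p)) < tol && return p_new
        p = p_new
    end
    error("successive approximation failed to converge")
end

p_star = successive_approx(T, zeros(2))

# Prices that would prevail if all agents shared beliefs P1 (resp. P2)
p1 = (I - β * P1) \ (β * P1 * d)
p2 = (I - β * P2) \ (β * P2 * d)
```

Because $T$ is order preserving and dominates each single-belief pricing map pointwise, `p_star` should be at least as large as both `p1` and `p2` in every state, up to the tolerance of the iteration.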

6.4 Chapter Notes

Asset pricing is discussed in many sources, including Hansen & Renault (2010), Ross (2009), Cochrane (2009), Duffie (2010) and Campbell (2017). Asset pricing is part of many applications and extensions in macroeconomics, public finance, international economics, and other fields. Some of these are described in Ljungqvist & Sargent (2018).

Dynamic programming with state-dependent discounting is becoming more common in macroeconomics and finance. Representative examples include Krusell & Smith (1998), Woodford (2011), Christiano et al. (2014), Albuquerque et al. (2016), Saijo (2017), Basu & Bundick (2017), Groot et al. (2018), Schorfheide et al. (2018), Hills et al. (2019), Toda (2019), Fagereng et al. (2019), Hubmer et al. (2020) and Cao (2020). For more on the theory of state-dependent discounting, see Jasso-Fuentes et al. (2020), Toda (2021) or Stachurski & Zhang (2021). An analysis of sovereign default with time-varying interest rates is provided by Bloise & Vailakis (2022).

Another challenge to the standard model with constant discount rates comes from empirical and experimental studies that find evidence of “hyperbolic discounting,” where valuations across time fall rapidly at first and then more slowly. Provocative reviews of hyperbolic and quasi-hyperbolic discounting can be found in Frederick et al. (2002) and Rubinstein (2003). Cao & Werning (2018) provide conditions under which predictions from optimal savings models with quasi-hyperbolic discounting are robust. Balbus et al. (2018) analyze uniqueness of time-consistent stationary Markov policies for quasi-hyperbolic households under uncertainty. Balbus et al. (2022) study equilibria in dynamic models with recursive payoffs and generalized discounting. Noor & Takeoka (2022) address the topic of optimal discounting. Additional references include Diamond & Köszegi (2003), Dasgupta & Maskin (2005), Karp (2005), Amador et al. (2006), Balbus et al. (2018), Fedus et al. (2019), Hens & Schindler (2020), Jaśkiewicz & Nowak (2021), and Drugeon & Wigniolle (2021).

This chapter focused on time additive models with state-dependent discounting. More general preference specifications with this feature include Albuquerque et al. (2016), Schorfheide et al. (2018), Pohl et al. (2018), Gomez-Cram & Yaron (2020), and Groot et al. (2022). In Chapter 8 we consider state-dependent discounting in general settings that accommodate such nonlinearities.

Footnotes
  1. We are assuming that randomness in interest rates is a function of the same Markov state that influences profits. There is very little loss of generality in making this assumption. In fact, the two processes can still be statistically independent. For example, if we take $X_t$ to have the form $X_t = (Y_t, Z_t)$, where $(Y_t)$ and $(Z_t)$ are independent Markov chains, then we can take $\beta_t$ to be a function of $Y_t$ and $\pi_t$ to be a function of $Z_t$. The resulting interest and profit processes are statistically independent.

  2. The parameters are $\rho = 0.85$, $\sigma = 0.0062$, and $b = 0.99875$. In line with Hills et al. (2019), we discretize the model via `mc = tauchen(n, ρ, σ, 1 - ρ, m)` with $m = 4.5$ and $n = 15$.

References
  1. Krusell, P., & Smith, A. A., Jr. (1998). Income and wealth heterogeneity in the macroeconomy. Journal of Political Economy, 106(5), 867–896.
  2. Marimon, R. (1984). General Equilibrium and Growth under Uncertainty: the Turnpike Property.
  3. Marimon, R. (1989). Stochastic turnpike property and stationary equilibrium. Journal of Economic Theory, 47(2), 282–306.
  4. Stachurski, J., & Zhang, J. (2021). Dynamic programming with state-dependent discounting. Journal of Economic Theory, 192, 105190.
  5. Farmer, J. D., Geanakoplos, J., Richiardi, M. G., Montero, M., Perelló, J., & Masoliver, J. (2023). Discounting the distant future: What do historical bond prices imply about the long term discount rate? [Techreport]. arXiv, 2312.17157.
  6. Hills, T. S., Nakata, T., & Schmidt, S. (2019). Effective lower bound risk. European Economic Review, 120, 103321.
  7. Cochrane, J. H. (2009). Asset Pricing: Revised Edition. Princeton University Press.
  8. Lucas, R. E. (1978). Asset prices in an exchange economy. Econometrica, 46(6), 1429–1445.
  9. Harrison, J. M., & Kreps, D. M. (1978). Speculative investor behavior in a stock market with heterogeneous expectations. The Quarterly Journal of Economics, 92(2), 323–336.
  10. Hansen, L. P., & Renault, E. (2010). Pricing kernels. Encyclopedia of Quantitative Finance.
  11. Ross, S. A. (2009). Neoclassical Finance. Princeton University Press.
  12. Duffie, D. (2010). Dynamic Asset Pricing Theory. Princeton University Press.
  13. Campbell, J. Y. (2017). Financial Decisions and Markets: A Course in Asset Pricing. Princeton University Press.
  14. Ljungqvist, L., & Sargent, T. (2018). Recursive Macroeconomic Theory (4th ed.). MIT Press.
  15. Woodford, M. (2011). Simple analytics of the government expenditure multiplier. American Economic Journal: Macroeconomics, 3(1), 1–35.