Markov Dynamics - Dynamic Programming Volume I: Finite States

To prepare to analyze dynamic programs, we now study stochastic processes generated by Markov chains. These processes are widely used to construct economic and financial models.

At the end of this chapter we return to the job search problem from Chapter 1 and allow wage draws to be correlated over time (rather than IID). We use a Markov chain to generate serially correlated wage draws.

Throughout this chapter, the symbol $\Xsf$ represents a finite set.

3.1 Foundations

This section describes elementary properties of Markov models.

3.1.1 Markov Chains

Let’s start with a definition and some simple examples.

3.1.1.1 Defining Markov Chains

Fix $\Xsf = \{x_1, \ldots, x_n\}$ and $P \in \mopx$ . We interpret $P(x, x')$ as the probability that a random process moves from $x$ to $x'$ over one unit of time. For this interpretation to make sense we need $P(x, x')$ to be nonnegative and $\sum_{x' \in \Xsf} P(x, x')$ to equal one for every $x \in \Xsf$ , since we want the chain to stay somewhere in the state space after each update. These are exactly the properties guaranteed by the assumption $P \in \mopx$ (see Exercise 2.3.12).

To formalize ideas, let $(X_t) \coloneq (X_t)_{t \geq 0}$ be a sequence of random variables taking values in $\Xsf$ and call $(X_t)$ a Markov chain on state space $\Xsf$ if there exists a $P \in \mopx$ such that

\PP \{ X_{t+1} = x' \mid X_0, X_1, \ldots, X_t \} = P(X_t, x') \quad \text{for all} \quad t \geq 0, \; x' \in \Xsf.

(3.1)

To simplify terminology, we also call $(X_t)$ $P$ -Markov when (3.1) holds. We call either $X_0$ or its distribution $\psi_0$ the initial condition of $(X_t)$ , depending on context. $P$ is also called the transition matrix of the Markov chain.

The definition of a Markov chain says two things:

(i) When updating to $X_{t+1}$ from $X_t$ , earlier states are not required.

(ii) $P$ encodes all of the information required to perform the update, given the current state $X_t$ .

One way to think about Markov chains is algorithmically: Fix $P \in \mopx$ and let $\psi_0$ be an element of $\dD(\Xsf)$ . Now generate $(X_t)$ via Algorithm 3.1. The resulting sequence is $P$ -Markov with initial condition $\psi_0$ .
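The procedure just described (draw $X_0$ from $\psi_0$, then repeatedly draw $X_{t+1}$ from $P(X_t, \cdot)$) can be sketched as follows. This is a minimal illustration, assuming that states are identified with the indices $1, \ldots, n$ and that draws are made by inverse transform sampling. The names `draw_index` and `simulate_chain` are ours and do not come from the text or from QuantEcon.jl.

```julia
using Random

"Draw an index from the probability vector `probs` by inverse transform sampling."
function draw_index(rng, probs)
    u = rand(rng)              # uniform draw on [0, 1)
    c = 0.0
    for (i, p) in enumerate(probs)
        c += p                 # running cumulative probability
        u < c && return i
    end
    return length(probs)       # guard against rounding when probs sums to just under 1
end

"Generate a path of length `ts_length` (as state indices) for a P-Markov chain."
function simulate_chain(P, ψ_0, ts_length; rng=Random.default_rng())
    X = Vector{Int}(undef, ts_length)
    X[1] = draw_index(rng, ψ_0)                     # initial condition drawn from ψ_0
    for t in 1:(ts_length - 1)
        X[t+1] = draw_index(rng, @view P[X[t], :])  # next state drawn from row X[t] of P
    end
    return X
end
```

For instance, with `P = [0.0 1.0; 1.0 0.0]` and `ψ_0 = [1.0, 0.0]` the chain is deterministic, and `simulate_chain(P, ψ_0, 4)` alternates between the two states starting from state 1.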

3.1.1.2 Application: S–s Dynamics

As an example, consider a firm whose inventory of some product follows S–s dynamics, meaning that the firm waits until its inventory falls to or below some level $s > 0$ and then immediately replenishes by ordering $S$ units. This pattern of decisions can be rationalized if ordering requires paying a fixed cost. In Section 5.2.1, we will show that S–s behavior is optimal in a setting where fixed costs exist and the firm’s aim is to maximize its present value.

To represent S–s dynamics, we suppose that a firm’s inventory $(X_t)_{t \geq 0}$ of a given product obeys

X_{t+1} = \max\{ X_t - D_{t+1}, 0\} + S \1\{X_t \leq s\},

where

$(D_t)_{t \geq 1}$ is an exogenous IID demand process with $D_t \eqdist \phi \in \dD(\ZZ_+)$ for all $t$ and
$S$ is the quantity ordered when $X_t \leq s$ .

For the distribution $\phi$ of demand we take the geometric distribution, so that $\phi(d) = \PP\{D_t = d\} = p (1 - p)^d$ for $d \in \ZZ_+$ .

If we define $h(x, d) \coloneq \max\{ x - d, 0\} + S \1\{x \leq s\}$ , so that $X_{t+1} = h(X_t, D_{t+1})$ for all $t$ , then the transition matrix can be expressed as

P(x, x') = \PP\{h(x, D_{t+1}) = x'\} = \sum_{d \geq 0} \1\{h(x, d) = x'\} \phi(d) \qquad ((x, x') \in \Xsf \times \Xsf).

Listing 1 provides code that simulates inventory paths and computes other objects of interest. Since the state space $\Xsf = \{x_1, \ldots, x_n\}$ corresponds to $\{0, \ldots, S+s\}$ and Julia indexing starts at 1, we set $x_i = i-1$. This convention is used when computing `P[i, j]`, which corresponds to $P(x_i, x_j)$. The code in the listing is used to produce the simulation of inventories in Figure 3.1.

The function `compute_mc` returns an instance of a `MarkovChain` object that can store both the state space $\Xsf$ and the transition probabilities. The `QuantEcon.jl` library defines this data type and provides functions that simulate a Markov chain, compute a stationary distribution, and perform related tasks.

```julia
using Distributions, QuantEcon, IterTools

function create_inventory_model(; S=100,  # Order size
                                  s=10,   # Order threshold
                                  p=0.4)  # Demand parameter
    ϕ = Geometric(p)
    h(x, d) = max(x - d, 0) + S * (x <= s)
    return (; S, s, ϕ, h)
end

"Simulate the inventory process."
function sim_inventories(model; ts_length=200)
    (; S, s, ϕ, h) = model
    X = Vector{Int32}(undef, ts_length)
    X[1] = S  # Initial condition
    for t in 1:(ts_length-1)
        X[t+1] = h(X[t], rand(ϕ))
    end
    return X
end

"Compute the transition probabilities and state."
function compute_mc(model; d_max=100)
    (; S, s, ϕ, h) = model
    n = S + s + 1  # Size of state space
    state_vals = collect(0:(S + s))
    P = Matrix{Float64}(undef, n, n)
    for (i, j) in product(1:n, 1:n)
        P[i, j] = sum((h(i-1, d) == j-1) * pdf(ϕ, d) for d in 0:d_max)
    end
    return MarkovChain(P, state_vals)
end

"Compute the stationary distribution of the model."
function compute_stationary_dist(model)
    mc = compute_mc(model)
    return mc.state_values, stationary_distributions(mc)[1]
end
```

Listing 1: An implementation of S–s inventory dynamics (`inventory_sim.jl`)

Figure 3.1: Inventory simulation (`inventory_sim.jl`)

3.1.1.3 Higher Order Transition Matrices

Given a finite state space $\Xsf$ , $k \geq 0$ and $P \in \mopx$ , let $P^k$ be the $k$ -th power of $P$ . (If $k = 0$ , then $P^k$ is the identity matrix.) Since $\mopx$ is closed under multiplication (Exercise 2.3.2), $P^k$ is in $\mopx$ for all $k \geq 0$ . In this context, $P^k$ is sometimes called the $k$ -step transition matrix corresponding to $P$ . In what follows, $P^k(x,x')$ denotes the $(x,x')$ -th element of the matrix representation of $P^k$ .

The $k$ -step transition matrix has the following interpretation: If $(X_t)$ is $P$ -Markov, then for any $t, k \in \ZZ_+$ and $x, x' \in \Xsf$ ,

P^k(x, x') = \PP\{X_{t + k} = x' \given X_t = x\}.

(3.2)

Thus, $P^k$ provides the $k$ -step transition probabilities for the $P$ -Markov chain $(X_t)$ .
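As a quick numerical illustration, the sketch below computes $k$-step transition matrices for a hypothetical two-state transition matrix (the entries are made up for this example) and checks one entry of $P^2$ by summing over the two possible intermediate states.

```julia
using LinearAlgebra

P = [0.7 0.3;
     0.2 0.8]          # hypothetical transition matrix, not taken from the text

P2 = P^2               # two-step transition matrix
P5 = P^5               # five-step transition matrix

# P2[1, 2] is the probability of moving from state 1 to state 2 in two steps,
# which can happen via the paths 1 → 1 → 2 or 1 → 2 → 2.
two_step = P[1, 1] * P[1, 2] + P[1, 2] * P[2, 2]

row_sums = sum(P5, dims=2)   # rows of a k-step transition matrix sum to one
```

Here `two_step` agrees with `P2[1, 2]`, and each entry of `row_sums` equals one up to rounding error.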

Solution to Exercise 3.1.2

Fixing $t \geq 0$ and $P \in \mopx$, this claim can be verified by induction over $k$. The claim is obviously true when $k=0, 1$. Suppose the claim is also true at $k$ and now consider the case $k+1$. By the law of total probability and the Markov property, for given $x, x' \in \Xsf$, we have

\PP\{X_{t+k+1} = x' \given X_t = x \} = \sum_z \PP\{X_{t+k+1} = x' \given X_{t+k} = z\} \PP\{X_{t+k} = z \given X_t = x\}.

The induction hypothesis allows us to use (3.2) at $k$ , so the last equation becomes

\PP\{X_{t+k+1} = x' \given X_t = x \} = \sum_z P^k(x, z) P(z, x') = P^{k+1} (x, x').

The law (3.2) is now verified at $k+1$ , completing our proof by induction.

We can now give the following useful characterization of irreducibility:

Thus, irreducibility of $P$ means that the $P$ -Markov chain eventually visits any state from any other state with positive probability.

Several libraries have code for testing irreducibility, including QuantEcon.jl. See Listing 2 for an example of a call to this functionality. In this case, irreducibility fails because state 2 is an absorbing state. Once entered, the probability of ever leaving that state is zero. (A subset $\Ysf$ of $\Xsf$ with this property is called an absorbing set.)

```julia
using QuantEcon
P = [0.1 0.9;
     0.0 1.0]
mc = MarkovChain(P)
print(is_irreducible(mc))
```

Listing 2: Testing irreducibility (`is_irreducible.jl`)
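For small matrices, the same check can be sketched without QuantEcon.jl by using the standard fact that a stochastic matrix $P$ on $n$ states is irreducible exactly when every entry of $(I + P)^{n-1}$ is strictly positive. The function name `is_irreducible_sketch` is ours. The approach is meant for illustration only, since very small positive entries can underflow to zero when $n$ is large.

```julia
using LinearAlgebra

"Return true if every state can be reached from every other state under P."
function is_irreducible_sketch(P)
    n = size(P, 1)
    # Entry (i, j) of (I + P)^(n-1) is positive iff j is reachable from i
    # in at most n - 1 steps.
    return all((I + P)^(n - 1) .> 0)
end

P = [0.1 0.9;
     0.0 1.0]                              # the matrix used in the listing above
absorbing_case = is_irreducible_sketch(P)  # false: state 2 cannot reach state 1
```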

3.1.2 Stationarity and Ergodicity

Next we review aspects of Markov dynamics, including stationarity and ergodicity.

Fix $P \in \mopx$ and let $(X_t)$ be $P$ -Markov. Let $\psi_t$ be the distribution of $X_t$ . Marginal distributions $\psi_t$ evolve according to

\psi_{t+1}(x') = \sum_x P(x, x') \psi_t(x) \quad \text{for all } x' \in \Xsf \text{ and } t \geq 0.

(3.3)

To verify (3.3), rewrite it as $\PP \{X_{t+1} = x'\} = \sum_x \PP\{X_{t+1}=x' \,|\, X_t=x\} \PP\{X_t=x\}$ , which is true by the law of total probability. With each $\psi_t$ regarded as a row vector, (3.3) can also be written as

\psi_{t+1} = \psi_t P.

(3.4)

Equation (3.4) tells us that dynamics of marginal distributions for Markov chains are generated by deterministic linear difference equations in distribution space. This is remarkable because the dynamics that drive $(X_t)$ are stochastic and can be arbitrarily nonlinear.

Iterating on (3.4), we get $\psi_t = \psi_0 P^t$ for all $t$ . In summary,

(X_t)_{t \geq 0} \text{ is } P \text{-Markov with } X_0 \eqdist \psi_0 \; \implies \; X_t \eqdist \psi_0 P^t \text{ for all } t \geq 0.

(3.5)

For (3.5) and $\psi_{t+1} = \psi_t P$ to hold, each $\psi_t$ must be a row vector. In what follows, we always treat the distributions $(\psi_t)_{t \geq 0}$ of $(X_t)_{t \geq 0}$ as row vectors.

Consistent with our definition of stationary distributions in Section 2.3.1.3, a marginal distribution $\psi^* \in \dD(\Xsf)$ is called stationary for $P$ if

\sum_x P(x, x') \psi^*(x) = \psi^*(x') \quad \text{for all } x' \in \Xsf.

In vector form, this is $\psi^* P = \psi^*$ . By this definition and (3.3), if $\psi^*$ is stationary and $X_t$ has distribution $\psi^*$ , then so does $X_{t+k}$ for all $k \geq 1$ .
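When the stationary distribution is unique, one way to compute it numerically is to solve the linear system $\psi (I - P + \1 \1^\top) = \1^\top$, which combines $\psi P = \psi$ with the normalization $\psi \1 = 1$. The sketch below does this for a hypothetical two-state matrix. The function name `stationary_sketch` and the example values are assumptions made for illustration, and the approach is not necessarily the one used inside QuantEcon.jl.

```julia
using LinearAlgebra

"Solve ψ (I - P + ones(n, n)) = ones(1, n) for ψ, assuming the solution is unique."
function stationary_sketch(P)
    n = size(P, 1)
    A = I - P + ones(n, n)
    return (A' \ ones(n))'      # transpose the system, solve, return a row vector
end

P = [0.7 0.3;
     0.2 0.8]                   # hypothetical transition matrix
ψ_star = stationary_sketch(P)
residual = maximum(abs.(ψ_star * P - ψ_star))   # should be close to zero
```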

We saw in Exercise 2.3.2 that every irreducible $P \in \mopx$ has exactly one stationary distribution in $\dD(\Xsf)$ . The following ergodic property holds under the same assumptions.

A proof of (3.7) can be found in Brémaud (2020).

Property (3.7) tells us that, with probability one (i.e., for almost every $P$ -Markov chain that we generate), the fraction of time that the chain spends in any given state is, in the limit, equal to the probability assigned to that state by the stationary distribution. Markov chains with this property are sometimes said to be ergodic.

Since the S–s inventory model from Section 3.1.1.2 is irreducible, the ergodicity result from Theorem 3.1.2 applies. In particular, the process has only one stationary distribution $\psi^*$ in $\dD(\Xsf)$ , where $\Xsf = \{0, \ldots, S+s\}$ , and (3.7) is valid. Figure 3.2 illustrates this by plotting both the stationary distribution $\psi^*$ (which is computed using the code in Listing 1), and the value $m(y) \coloneq \frac{1}{k} \sum_{t=0}^{k-1} \1\{X_t = y\}$ at each $y \in \Xsf$ for $k$ set to $1,000,000$ . As predicted by the theorem, the fraction of time spent by the chain in each state is close to the probability assigned by $\psi^*$ .

Figure 3.2: Ergodicity (`inventory_sim.jl`)

3.1.2.1 Application: Day Laborer

Suppose that a day laborer is either unemployed ( $X_t = 1$ ) or employed ( $X_t = 2$ ) in each period. In state 1 he is hired with probability $\alpha \in (0, 1)$ . In state 2 he is fired with probability $\beta \in (0, 1)$ . The corresponding state space and transition matrix are

\Xsf = \{1, 2\} \quad \text{and} \quad P = \begin{pmatrix} 1-\alpha & \alpha \\ \beta & 1-\beta \end{pmatrix}.

(3.8)

Listing 3 provides a function to update from $X_t$ to $X_{t+1}$, using the fact that `rand()` generates a draw from the uniform distribution on $[0,1)$.

```julia
function create_laborer_model(; α=0.3, β=0.2)
    return (; α, β)
end

function laborer_update(x, model)  # update X from t to t+1
    (; α, β) = model
    if x == 1
        x′ = rand() < α ? 2 : 1
    else
        x′ = rand() < β ? 1 : 2
    end
    return x′
end
```

Listing 3: Updating the state of the day laborer (`laborer_sim.jl`)
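To connect this update rule with the ergodicity property discussed above, the following sketch simulates a long path and records the fraction of time spent unemployed. It restates the update rule from Listing 3 in compact form so that the block is self-contained. The function names, the seed and the path length are arbitrary choices made for this illustration. For this chain, solving $\psi P = \psi$ by hand gives $\psi^*(1) = \beta / (\alpha + \beta)$.

```julia
using Random

# Same update rule as in Listing 3, written compactly (1 = unemployed, 2 = employed).
update(x, α, β) = x == 1 ? (rand() < α ? 2 : 1) : (rand() < β ? 1 : 2)

"Fraction of periods spent unemployed along a simulated path of length k."
function unemployed_fraction(; α=0.3, β=0.2, k=1_000_000, seed=1234)
    Random.seed!(seed)               # fix the seed so the run is reproducible
    x, n_unemployed = 1, 0
    for _ in 1:k
        n_unemployed += (x == 1)     # record the current state, then update
        x = update(x, α, β)
    end
    return n_unemployed / k
end

frac = unemployed_fraction()
ψ_star_1 = 0.2 / (0.3 + 0.2)         # stationary probability of unemployment
```

With a path this long, `frac` should lie close to `ψ_star_1`, in line with the ergodicity property.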

It is also true that $\psi P^t \to \psi^*$ as $t \to \infty$ for any $\psi \in \dD(\Xsf)$. Thus, the operator $P$, when understood as the mapping $\psi \mapsto \psi P$, is globally stable on $\dD(\Xsf)$.

Solution to Exercise 3.1.7

Assume $P$ is everywhere positive with unique stationary distribution $\psi^*$ . Since $\rho(P)=1$ , the last part of the Perron–Frobenius theorem tells us that $P^t \to e \, \epsilon$ as $t \to \infty$ , where $e$ and $\epsilon$ are the dominant right and left eigenvectors, normalized such that $\inner{e, \epsilon} = 1$ . In this case, we know $\psi^*$ is the dominant left eigenvector and $\1$ is the dominant right eigenvector. Moreover, $\psi^* \in \dD(\Xsf)$ yields $\inner{\psi^*, \1}=1$ . Hence, for any $\psi \in \dD(\Xsf)$ , we have

\psi P^t \to \psi \1 \, \psi^* = \psi^* \quad \text{as} \quad t \to \infty.

Hence global stability holds, as claimed.
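The convergence $\psi P^t \to \psi^*$ is easy to observe numerically. The sketch below uses the day laborer matrix from (3.8) with the default values $\alpha = 0.3$ and $\beta = 0.2$ from Listing 3, together with the closed form $\psi^* = (\beta, \alpha) / (\alpha + \beta)$, which follows from solving $\psi P = \psi$ by hand. The helper name `iterate_distribution` and the choice of 100 iterations are ours.

```julia
α, β = 0.3, 0.2
P = [1-α α;
     β 1-β]                           # day laborer transition matrix
ψ_star = [β α] ./ (α + β)             # stationary distribution, as a row vector

"Apply the map ψ ↦ ψ P to the row vector ψ a total of t times."
function iterate_distribution(ψ, P, t)
    for _ in 1:t
        ψ = ψ * P
    end
    return ψ
end

ψ_0 = [1.0 0.0]                       # start from certain unemployment
ψ_100 = iterate_distribution(ψ_0, P, 100)
gap = maximum(abs.(ψ_100 - ψ_star))   # distance to the stationary distribution
```

Starting instead from `[0.0 1.0]` leads to the same limit, as global stability requires.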

3.1.3 Approximation

To simplify numerical calculations, we sometimes approximate a continuous-state Markov process with a Markov chain. For example, consider a linear Gaussian AR(1) model, where $(X_t)_{t \geq 0}$ evolves in $\RR$ according to

X_{t+1} = \rho X_t + b + \nu \epsilon_{t+1}, \quad |\rho|<1, \quad ( \epsilon_t ) \iidsim N(0, 1).

(3.10)

The model (3.10) has a unique stationary distribution $\psi^*$ given by

\psi^* = N(\mu_x, \sigma_x^2) \quad \text{with} \quad \mu_x \coloneq \frac{b}{1-\rho} \quad \text{and} \quad \sigma_x^2 \coloneq \frac{\nu^2}{1-\rho^2}.

This means that

\text{ } X_t \eqdist \psi^* \text{ and } X_{t+1} = \rho X_t + b + \nu \epsilon_{t+1} \text{ implies } X_{t+1} \eqdist \psi^* \text{. }

Process (3.10) is also ergodic in a similar sense to (3.7): On average, realizations of the process spend most of their time in regions of the state where the stationary distribution puts high probability mass. (You can check this via simulations if you wish.) Hence, in the discretization that follows, we shall put the discrete state space in this area.

To discretize (3.10) we use Tauchen’s method, starting with the case $b=0$ .^[1] As a first step, we choose $n$ as the number of states for the discrete approximation and $m$ as an integer that sets the width of the state space. Then we create a state space $\Xsf \coloneq \{x_1, \ldots, x_n\} \subset \mathbb R$ as an equispaced grid that brackets the stationary mean on both sides by $m$ standard deviations:

set $x_1 = -m \, \sigma_x$ ,
set $x_n = m \, \sigma_x$ and
set $x_{i+1} = x_i + s$ where $s = (x_n - x_1) / (n - 1)$ and $i$ in $\natset{n-1}$ .

The next step is to create an $n \times n$ matrix $P$ that approximates the dynamics in (3.10). Let $F$ be the CDF of $N(0, \nu^2)$, the distribution of the shock $\nu \epsilon_{t+1}$. For $i, j \in \natset{n}$,

(i) if $j = 1$, then set $P(x_i, x_j) = F(x_1-\rho x_i + s/2)$;

(ii) if $j = n$, then set $P(x_i, x_j) = 1 - F(x_n - \rho x_i - s/2)$;

(iii) otherwise, set $P(x_i, x_j) = F(x_j - \rho x_i + s/2) - F(x_j - \rho x_i - s/2)$.

The first two are boundary rules and the third applies Exercise 3.1.11.

Finally, if $b \neq 0$ , then we shift the state space to center it on the mean $\mu_x$ of the stationary distribution $N(\mu_x, \sigma_x^2)$ . This is done by replacing $x_i$ with $x_i + \mu_x$ for each $i$ .
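The steps above can be sketched in a few lines of Julia. This is an illustrative implementation under stated assumptions, not the QuantEcon.jl routine: it takes $F$ to be the CDF of $N(0, \nu^2)$, uses the `Distributions` package for the normal CDF, and the function name `tauchen_sketch` and its keyword arguments are ours.

```julia
using Distributions

"Discretize X' = ρ X + b + ν ε on an n-point grid spanning m stationary standard deviations."
function tauchen_sketch(n, ρ, ν; b=0.0, m=3)
    σ_x = ν / sqrt(1 - ρ^2)                     # stationary standard deviation
    x = collect(range(-m * σ_x, m * σ_x, length=n))
    s = x[2] - x[1]                             # grid step
    F(z) = cdf(Normal(0.0, ν), z)               # CDF of the shock ν ε
    P = Matrix{Float64}(undef, n, n)
    for i in 1:n
        P[i, 1] = F(x[1] - ρ * x[i] + s / 2)        # lower boundary rule
        P[i, n] = 1 - F(x[n] - ρ * x[i] - s / 2)    # upper boundary rule
        for j in 2:(n - 1)                          # interior rule
            P[i, j] = F(x[j] - ρ * x[i] + s / 2) - F(x[j] - ρ * x[i] - s / 2)
        end
    end
    return x .+ b / (1 - ρ), P                  # shift the grid to center it on μ_x
end

x, P = tauchen_sketch(15, 0.9, 1.0)
row_sums = sum(P, dims=2)
```

Because consecutive intervals share endpoints, the entries in each row telescope and `row_sums` equals one up to rounding error. Shifting the grid by $\mu_x$ leaves $P$ unchanged, since $x_j + \mu_x - \rho (x_i + \mu_x) - b = x_j - \rho x_i$.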

Julia routines that compute $\Xsf$ and $P$ can be found in the library QuantEcon.jl.

Figure 3.3 compares the continuous stationary distribution $\psi^*$ with the unique stationary distribution of the discrete approximation, where $\Xsf$ and $P$ are constructed using Tauchen’s method with $\rho=0.9$, $b=0.0$, $\nu=1.0$ and discretization parameters $n=15$ and $m=3$.

Figure 3.3: Comparison of $\psi^* = N(\mu_x, \sigma_x^2)$ and its discrete approximant

3.2 Conditional Expectations

In this section, we discuss how to compute conditional expectations for Markov chains. The theory will be essential for the study of finite Markov decision processes, since, in these models, lifetime rewards are mathematical expectations of flow reward functions of Markov states.

3.2.1 Mathematical Expectations

We begin with mathematical expectations of functions of Markov states.

3.2.1.1 Conditional Expectations

Fix $P \in \mopx$ . For each $h \in \RR^\Xsf$ , we define

(P h)(x) = \sum_{x' \in \Xsf} h(x') P(x,x') \qquad (x \in \Xsf).

(3.12)

Noting that $P(x, \cdot)$ is the distribution of $X_{t+1}$ given $X_t = x$ , we can write

(P h)(x) = \EE [h(X_{t+1}) \given X_t = x],

(3.13)

where $(X_t)$ is any $P$-Markov chain on $\Xsf$. In terms of matrix algebra, viewing $h$ as an $n \times 1$ column vector, the expression $(Ph)(x)$ is one element of the vector $Ph$ obtained by premultiplying $h$ by $P$.

The interpretation in (3.13) extends to powers of $P$ . In particular, we have

(P^k h)(x) = \sum_{x'} h(x') P^k(x,x') = \EE [h(X_{t+k}) \given X_t = x].

(3.14)
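In code, (3.13) and (3.14) reduce to matrix-vector products. The following sketch uses a hypothetical two-state transition matrix and a hypothetical function $h$, both chosen only for illustration.

```julia
using LinearAlgebra

P = [0.7 0.3;
     0.2 0.8]          # hypothetical transition matrix
h = [1.0, 2.0]         # hypothetical h, stored as a column vector

Ph  = P * h            # Ph[i]  = E[h(X_{t+1}) | X_t = x_i]
P3h = P^3 * h          # P3h[i] = E[h(X_{t+3}) | X_t = x_i]

# The first entry of Ph written out as the sum in (3.12):
Ph_1 = h[1] * P[1, 1] + h[2] * P[1, 2]
```

Here `Ph_1` coincides with `Ph[1]`, and `P3h` can equally be obtained by applying `P` to `h` three times in succession.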

3.2.1.2 The Law of Iterated Expectations

The law of iterated expectations is a workhorse in economics and finance. One version of the law states that if $X$ and $Y$ are two random variables, then $\EE[ \EE[Y \given X] ] = \EE[Y]$ . Let’s show how this law operates for Markov chains.

Let $(X_t)$ be $P$ -Markov with $X_0 \eqdist \psi_0$ . Fix $t, k \in \NN$ . Set $\EE_t \coloneq \EE [ \cdot \given X_t]$ . We claim that

\EE [ \EE_t [h(X_{t+k})] ] = \EE [ h(X_{t+k}) ] \quad \text{for any } h \in \RR^\Xsf.

(3.15)

To see that this holds, recall that $\EE [h(X_{t+k}) \given X_t = x] = (P^k h)(x)$ . Hence $\EE [h(X_{t+k}) \given X_t] = (P^k h)(X_t)$ . Therefore,

\EE [ \EE_t [h(X_{t+k})] ] = \EE [ (P^k h)(X_t) ] = \sum_{x'} (P^k h)(x') \psi_t (x') = \sum_{x'} (P^k h)(x') (\psi_0 P^t) (x').

Since $\psi_0 P^t$ is a row vector, we can write the last expression as

\psi_0 P^t P^k h = \psi_0 P^{t+k} h = \psi_{t+k} h = \EE h(X_{t+k}).

Hence (3.15) holds.

3.2.1.3 Monotone Markov Chains

Next, we connect Markov chains to order theory via stochastic dominance. These connections will have applications later in the book.

Let $\Xsf$ be a finite set partially ordered by $\preceq$ . A Markov operator $P \in \mopx$ is called monotone increasing if

x, y \in \Xsf \text{ and } x \preceq y \quad \implies \quad P(x, \cdot) \lefsd P(y, \cdot).

Thus, $P$ is monotone increasing if shifting up the current state shifts up the next-period state, in the sense that its distribution increases in the stochastic dominance ordering (see Section 2.2.4) on $\dD(\Xsf)$ . Below, we will see that monotonicity of Markov operators is closely related to monotonicity of value functions in dynamic programming.

Monotonicity of Markov operators is related to positive autocorrelation. To illustrate the idea, consider the AR(1) model $X_{t+1} = \rho X_t + \sigma \epsilon_{t+1}$ from Section 3.1.3 and suppose we apply Tauchen discretization, mapping the parameters $\rho , \sigma$ and a discretization size $n$ into a Markov operator $P$ on state space $\Xsf = \{x_1, \ldots, x_n\} \subset \RR$ , totally ordered by $\leq$ . If $\rho \geq 0$ , so that positive autocorrelation holds, then $P$ is monotone increasing.

Solution to Exercise 3.2.2

Using Exercise 3.1.11 and the definition of $P$ , it can be shown that

G(x, x_k) \coloneq \sum_{j=k}^n P(x, x_j) = \PP\{x_k - s/2 < X_{t+1} \given X_t = x\}.

Rewriting the probability in terms of $\epsilon_{t+1}$ , we get

G(x, x_k) = \PP\{\epsilon_{t+1} > (x_k - s/2 - \rho x) / \sigma \}.

Since $\rho \geq 0$ , we can now see that $x \leq y$ implies $G(x, x_k) \leq G(y, x_k)$ for all $k$ , or, equivalently, $G(x, \cdot) \leq G(y, \cdot)$ pointwise on $\Xsf$ . By Lemma 2.2.5, this is equivalent to the statement that $P(x, \cdot) \lefsd P(y, \cdot)$ , which confirms that $P$ is monotone increasing.
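When the state space is totally ordered and the states are listed in increasing order, the tail-sum criterion used in this argument gives a simple numerical test for monotonicity: $P$ is monotone increasing when each tail sum $\sum_{j \geq k} P(x_i, x_j)$ is nondecreasing in $i$. The sketch below implements that test. The function name `is_monotone_increasing`, the tolerance and the example matrices are ours.

```julia
"Check whether the rows of P are increasing in first-order stochastic dominance."
function is_monotone_increasing(P; tol=1e-12)
    n = size(P, 1)
    # G[i, k] = Σ_{j ≥ k} P[i, j], computed as a reversed cumulative sum along each row.
    G = reverse(cumsum(reverse(P, dims=2), dims=2), dims=2)
    # Each tail sum must be nondecreasing as we move down the rows.
    return all(G[i, k] <= G[i+1, k] + tol for i in 1:(n - 1), k in 1:n)
end

P_up   = [0.7 0.3;
          0.2 0.8]     # a higher current state shifts mass toward the higher state
P_down = [0.2 0.8;
          0.7 0.3]     # a higher current state shifts mass toward the lower state

up_result   = is_monotone_increasing(P_up)
down_result = is_monotone_increasing(P_down)
```

The first matrix passes the test and the second fails it, matching the intuition that monotonicity is tied to positive persistence.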

Solution to Exercise 3.2.4

Suppose that $P$ is monotone increasing and fix $h \in i\RR^\Xsf$ . We claim that $Ph \in i\RR^\Xsf$ . To see this, pick any $x, y \in \Xsf$ with $x \preceq y$ . Since $x \preceq y$ we have $P(x, \cdot) \lefsd P(y, \cdot)$ . Hence $\sum_{x'} h(x') P(x, x') \leq \sum_{x'} h(x')P(y, x')$ . This shows that $Ph \in i\RR^\Xsf$ .

To see the converse, suppose that $P$ is invariant on $i\RR^\Xsf$ . Fix $x, y \in \Xsf$ with $x \preceq y$ . We claim that $P(x, \cdot) \lefsd P(y, \cdot)$ . To see this, fix $u \in i\RR^\Xsf$ . $Pu \in i\RR^\Xsf$ by invariance, so $(Pu)(x) \leq (Pu)(y)$ and hence $\sum_{x'} u(x') P(x, x') \leq \sum_{x'} u(x')P(y, x')$ . Since $u$ was chosen arbitrarily from $i\RR^\Xsf$ , we have $P(x, \cdot) \lefsd P(y, \cdot)$ . Hence $P$ is monotone increasing, as was to be shown.

3.2.2 Geometric Sums

Dynamic programs often form a lifetime value $V_0$ as a geometric sum of a reward sequence $(R_t)_{t \geq 0}$ with constant discount factor, so that $V_0 = \EE \sum_{t = 0}^\infty \beta^t R_t$ for some $\beta > 0$ . We saw this in (1.1), where we aggregated a profit stream $(\pi_t)_{t \geq 0}$ into an expected present value of the firm, and again in (1.6), where a worker evaluates lifetime earnings. In this section, we study expectations of geometric sums.

3.2.2.1 Theory

Consider a conditional mathematical expectation of a discounted sum of future measurements:

v(x) \coloneq \EE_x \, \sum_{t=0}^\infty \beta^t h(X_t) \coloneq \EE \left[ \, \sum_{t=0}^\infty \beta^t h(X_t) \given X_0 = x \right],

(3.16)

for some constant $\beta \in \RR_+$ and $h \in \RR^\Xsf$ . Here

$(X_t)$ is $P$ -Markov on some finite set $\Xsf$ ,
$v(x)$ is a lifetime reward starting from state $x$ , and
$\EE_x$ indicates that we are conditioning on $X_0 =x$ .

With $I$ as the identity matrix, the next result describes $v$ as a function of $\beta$, $P$ and $h$.

Proof

Under the stated conditions

\EE_x \, \sum_{t=0}^\infty \beta^t h(X_t) = \sum_{t=0}^\infty \beta^t \EE_x h(X_t) = \sum_{t=0}^\infty \beta^t (P^t h)(x),

(3.18)

where the first equality in (3.18) uses linearity of expectations and the second follows from (3.14) and the assumption that $(X_t)$ is $P$ -Markov starting at $x$ .^[2] Applying the Neumann series lemma to the matrix $\beta P$ , we see that $\sum_{t=0}^\infty (\beta P)^t = (I - \beta P)^{-1}$ . The lemma applies because $\rho(\beta P) = \beta \rho(P) = \beta < 1$ , as follows from Exercise 2.3.2. ◻
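The identity $\sum_{t \geq 0} (\beta P)^t h = (I - \beta P)^{-1} h$ can be checked numerically. The sketch below compares a linear solve with a long truncated sum, using a hypothetical transition matrix, function $h$ and discount factor. The helper name `truncated_sum` and the truncation point are arbitrary choices.

```julia
using LinearAlgebra

P = [0.7 0.3;
     0.2 0.8]          # hypothetical transition matrix
h = [1.0, 2.0]         # hypothetical reward function
β = 0.95

v = (I - β * P) \ h    # solve (I - βP) v = h rather than forming the inverse

"Compute Σ_{t=0}^{T} β^t P^t h by accumulating one term at a time."
function truncated_sum(P, h, β, T)
    v, term = zero(h), copy(h)
    for _ in 0:T
        v += term                  # add the current term β^t P^t h
        term = β * (P * term)      # advance to the next term
    end
    return v
end

v_approx = truncated_sum(P, h, β, 2_000)
```

Solving the linear system is usually preferable to computing the inverse explicitly, and the two results agree up to a truncation error of order $\beta^{T}$.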

3.2.2.2 Application: Valuation of Firms

Consider a firm that receives random profit stream $(\pi_t)_{t \geq 0}$ . Suppose that the value of the firm equals the expected present value of its profit stream. Suppose for now that the interest rate is constant at $r > 0$ . With $\beta \coloneq 1/(1+r)$ , total valuation is

V_0 = \EE \sum_{t=0}^\infty \beta^t \pi_t.

(3.19)

To compute this value, we need to know how profits evolve. A common strategy is to set $\pi_t = \pi(X_t)$ for some fixed $\pi \in \RR^\Xsf$ , where $(X_t)_{t \geq 0}$ is a state process. For known dynamics of $(X_t)$ and function $\pi$ , the value $V_0$ in (3.19) can be computed.

Here we assume that $(X_t)$ is $P$ -Markov for $P \in \mopx$ with finite $\Xsf$ . Then conditioning on $X_0 = x$ , we can write the value as

v(x) \coloneq \EE_x \sum_{t=0}^\infty \beta^t \pi_t \coloneq \EE \left[ \sum_{t=0}^\infty \beta^t \pi_t \given X_0 = x \right].

By Lemma 3.2.1, the value $v(x)$ is finite and the function $v \in \RR^\Xsf$ can be obtained by

v = \sum_{t = 0}^\infty \beta^t P^t \pi = (I - \beta P)^{-1} \pi.

It is plausible that the value of the firm will be higher for a return process in which higher states generate higher profits and predict higher future states. The next exercise confirms this.

3.2.2.3 Application: Valuing Consumption Streams

To model consumption-saving choices we want to evaluate different consumption paths, where a consumption path is a nonnegative random sequence $(C_t)_{t \geq 0}$ . In what follows we consider consumption paths such that $C_t = c(X_t)$ for all $t \geq 0$ , where $c \in \RR_+^\Xsf$ and $(X_t)_{t \geq 0}$ is $P$ -Markov on finite set $\Xsf$ . Thus, consumption streams are time-invariant functions of a finite-state Markov chain.

In a standard “time additive” model of consumer preferences with constant geometric discounting, the time zero value of a consumption stream $(C_t)_{t \geq 0}$ , given current state $X_0 = x \in \Xsf$ , is

v(x) = \EE_x \sum_{t=0}^\infty \beta^t u(C_t),

(3.20)

where $\beta \in (0,1)$ is a discount factor and $u \colon \RR_+ \to \RR$ is called the flow utility function. Dependence of $v(x)$ on $x$ comes from the initial condition $X_0 = x$ influencing the Markov state process and, therefore, the consumption path.

Using $C_t = c(X_t)$ and defining $r \coloneq u \circ c$ we can write $v(x) = \EE_x \, \sum_{t \geq 0} \beta^t r(X_t)$ . By Lemma 3.2.1, this sum is finite and $v$ can be expressed as

v = (I - \beta P)^{-1} r.

(3.21)

Figure 3.4 shows an example when $u$ has the constant relative risk aversion (CRRA) specification

u(c)=\frac{c^{1-\gamma}}{1-\gamma} \qquad (c \geq 0, \; \gamma > 0, \; \gamma \neq 1),

(3.22)

while $c(x) = \exp(x)$ , so that consumption takes the form $C_t = \exp(X_t)$ , and $(X_t)_{t \geq 0}$ is a Tauchen discretization (see Section 3.1.3) of $X_{t+1} = \rho X_t + \nu W_{t+1}$ where $(W_t)_{t \geq 1}$ is IID and standard normal. Parameters are $n=25$ , $\beta=0.98$ , $\rho=0.96$ , $\nu=0.05$ and $\gamma = 2$ . We set $r = u \circ c$ and solved for $v$ via (3.21).

Figure 3.4: The value of $(C_t)_{t\geq 0}$ given $X_0 = x$

3.3 Job Search Revisited

In this section, we extend the job search problem studied in Section 1.3 to a setting with Markov wage offers. We discuss additional structure when the Markov operator for wage offers is monotone increasing. We will also allow job separations to occur.

3.3.1 Job Search with Markov State

We adopt the job search setting of Section 1.3 but assume now that the wage process $(W_t)$ is $P$ -Markov on $\Wsf \subset \RR_+$ , where $P \in \mopw$ and $\Wsf$ is finite.

3.3.1.1 Value Function Iteration

The value function $v^*$ for the Markov job search model is now defined as follows: $v^*(w)$ is the maximum lifetime value that can be obtained when the worker is unemployed with current wage offer $w$ in hand. The value function $v^*$ satisfies the Bellman equation

v^*(w) = \max \left\{ \frac{w}{1-\beta} ,\, c + \beta \, \sum_{w' \in \Wsf} \, v^*(w') P(w, w') \right\} \qquad (w \in \Wsf).

(3.23)

We continue to assume that $c > 0$ and $\beta \in (0,1)$ .

Bellman equation (3.23) extends a corresponding Bellman equation for the IID case (cf. (1.25)). (A full proof is given in Chapter 4.) The Bellman operator corresponding to (3.23) is

(Tv)(w) = \max \left\{ \frac{w}{1-\beta} ,\, c + \beta \, \sum_{w'} \, v(w') P(w, w') \right\} \qquad (w \in \Wsf).

As before, $T$ is constructed so that $v^*$ is a fixed point (since (3.23) holds). Exercise 3.3.1 will show that $v^*$ is the only fixed point of $T$ in $\RR^\Wsf_+$ .

Extending the IID definition (cf. (1.29)), a policy $\sigma \colon \Wsf \to \{0,1\}$ is called $v$ -greedy if

\sigma(w) = \1\left\{ \frac{w}{1-\beta} \geq c + \beta \, \sum_{w'} \, v(w') P(w, w') \right\},

for all $w \in \Wsf$ .

Let $\vV \coloneq \RR^\Wsf_+$ and endow $\vV$ with the pointwise partial order $\leq$ and the supremum norm, so that $\| f - g\|_\infty = \max_{w \in \Wsf}|f(w) - g(w)|$ .

Solution to Exercise 3.3.1

We start with part (i). To show that $T$ is a self-map on $\vV \coloneq \RR^\Wsf_+$ , we just need to verify that $v \in \vV$ implies $Tv \in \vV$ , which only requires us to verify that $T$ maps nonnegative functions into nonnegative functions. This is clear from the definition. Regarding the order-preserving property, fix $f, g \in \vV$ with $f \leq g$ . We claim that $Tf \leq Tg$ . Indeed, if $w \in \Wsf$ , then $\sum_{w' \in \Wsf} \, f(w') P(w, w') \leq \sum_{w' \in \Wsf} \, g(w') P(w, w')$ , which in turn implies that $(Tf)(w) \leq (Tg)(w)$ . Since $w$ was an arbitrary wage value, we have $Tf \leq Tg$ , so $T$ is order-preserving.

Regarding part (ii), let $e(w) \coloneq w/(1-\beta)$ and fix $f, g$ in $\vV$. Writing the operators pointwise and applying the last result in Lemma 2.2.1 gives

\begin{aligned} |Tf - Tg| & = | e \vee (c + \beta Pf) - e \vee (c + \beta Pg)| \\ & \leq \left| \beta Pf - \beta Pg \right| \\ & = \beta \left| P(f-g) \right| \\ & \leq \beta P \left| f-g \right|. \end{aligned}

(Here the last inequality uses the result in Exercise 2.2.7.) Since $P \geq 0$ we have $P | f-g | \leq P \|f-g\|_\infty \1 = \|f-g\|_\infty \1$ , so

|Tf - Tg | \leq \beta \| f - g \|_\infty \1.

Taking the maximum on both sides gives $\|Tf-Tg\|_\infty \leq \beta \|f-g\|_\infty$. Since $f, g$ were arbitrary elements of $\vV$, the contraction claim is verified.

We recommend that you study the proof of the next lemma, since the same style of argument occurs often in the book.

In view of the contraction property established in Exercise 3.3.1, we can use value function iteration (i) to compute an approximation $v$ to the value function and (ii) to calculate the $v$ -greedy policy that approximates the optimal policy. Code for implementing this procedure is in Listing 4. The definition of a $v$ -greedy policy resembles that of the IID case (see (1.29)).

```julia
using QuantEcon, LinearAlgebra
include("s_approx.jl")

"Creates an instance of the job search model with Markov wages."
function create_markov_js_model(;
        n=200,       # wage grid size
        ρ=0.9,       # wage persistence
        ν=0.2,       # wage volatility
        β=0.98,      # discount factor
        c=1.0        # unemployment compensation
    )
    mc = tauchen(n, ρ, ν)
    w_vals, P = exp.(mc.state_values), mc.p
    return (; n, w_vals, P, β, c)
end

"The Bellman operator Tv = max{e, c + β P v} with e(w) = w / (1-β)."
function T(v, model)
    (; n, w_vals, P, β, c) = model
    h = c .+ β * P * v
    e = w_vals ./ (1 - β)
    return max.(e, h)
end

"Get a v-greedy policy."
function get_greedy(v, model)
    (; n, w_vals, P, β, c) = model
    σ = w_vals / (1 - β) .>= c .+ β * P * v
    return σ
end

"Solve the infinite-horizon Markov job search model by VFI."
function vfi(model)
    v_init = zero(model.w_vals)
    v_star = successive_approx(v -> T(v, model), v_init)
    σ_star = get_greedy(v_star, model)
    return v_star, σ_star
end
```

Listing 4: Job search with Markov state (`markov_js.jl`)
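Listing 4 calls `successive_approx`, which is loaded from `s_approx.jl` and not shown here. The following is a hypothetical sketch of what a routine with that call pattern might look like. The name `successive_approx_sketch`, the keyword arguments and their default values are our assumptions, not the contents of `s_approx.jl`.

```julia
"Iterate x ← T(x) from x_0 until successive iterates are within `tolerance` in the sup norm."
function successive_approx_sketch(T, x_0; tolerance=1e-6, max_iter=10_000)
    x = x_0
    for k in 1:max_iter
        x_new = T(x)
        err = maximum(abs.(x_new - x))   # sup-norm distance between iterates
        x = x_new
        err < tolerance && return x      # stop once the iterates have settled
    end
    @warn "successive approximation hit max_iter before reaching the tolerance"
    return x
end

# Usage on a simple contraction with modulus 1/2 and fixed point (2, 2):
x_star = successive_approx_sketch(v -> 0.5 .* v .+ 1.0, [0.0, 10.0])
```

When $T$ is a contraction of modulus $\beta$, stopping at distance `tolerance` between iterates bounds the error in the returned point by `tolerance` times $\beta / (1 - \beta)$, so a tighter tolerance is advisable when $\beta$ is close to one.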

3.3.1.2 Continuation Values

The continuation value $h^*$ from the IID case is now replaced by a continuation value function

h^*(w) \coloneq c + \beta \, \sum_{w'} \, v^*(w') P(w, w') \qquad (w \in \Wsf).

The continuation value depends on $w$ because the current offer helps predict the offer next period, which in turn affects the value of continuing. The functions $w \mapsto w / (1-\beta)$ , $h^*$ and $v^*$ corresponding to the default model in Listing 4 are shown in Figure 3.5.

Figure 3.5:Value, stopping, and continuation for Markov job search

Exercise 3.3.4 suggests an alternative way to solve the job search problem: iterate with $Q$ to obtain the continuation value function $h^*$ and then use the policy

\sigma^*(w) = \1\left\{ \frac{w}{1-\beta} \geq h^*(w) \right\} \qquad (w \in \Wsf)

that tells the worker to accept when the current stopping value is at least as large as the current continuation value.
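The operator $Q$ is defined in Exercise 3.3.4, which is not reproduced here. One natural candidate follows from the equations above: since (3.23) says $v^*(w) = \max\{w/(1-\beta), h^*(w)\}$, substituting into the definition of $h^*$ gives $h^*(w) = c + \beta \sum_{w'} \max\{w'/(1-\beta), h^*(w')\} P(w, w')$, so $h^*$ is a fixed point of $(Qh)(w) \coloneq c + \beta \sum_{w'} \max\{w'/(1-\beta), h(w')\} P(w, w')$. The sketch below iterates this candidate $Q$ on a hypothetical three-state model and checks the result against value function iteration. All names and numbers are assumptions made for illustration.

```julia
# Hypothetical three-state wage chain; the numbers are illustrative only.
w_vals = [1.0, 2.0, 3.0]
P = [0.6 0.3 0.1;
     0.2 0.6 0.2;
     0.1 0.3 0.6]
β, c = 0.95, 1.0
e = w_vals ./ (1 - β)                      # stopping values w / (1 - β)

Q(h) = c .+ β .* (P * max.(e, h))          # candidate continuation value operator
T(v) = max.(e, c .+ β .* (P * v))          # Bellman operator from the text

"Iterate the map F from x until successive iterates differ by less than tol."
function fixed_point(F, x; tol=1e-10, max_iter=100_000)
    for _ in 1:max_iter
        x_new = F(x)
        maximum(abs.(x_new - x)) < tol && return x_new
        x = x_new
    end
    return x
end

h_star = fixed_point(Q, zeros(3))          # continuation value function
v_star = fixed_point(T, zeros(3))          # value function
σ_star = e .>= h_star                      # accept when stopping value ≥ continuation value
```

Under these assumptions, `max.(e, h_star)` reproduces `v_star` up to the iteration tolerance, so the two routes lead to the same policy.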

We saw that, in the IID case, a computational strategy based on continuation values is far more efficient than value function iteration (see Section 1.3.2.2). Since continuation values are functions rather than scalars, here the two approaches (iterating with $T$ vs iterating with $Q$ ) are more similar. In Chapter 4 we discuss alternative computational strategies in more detail, seeking conditions under which one approach will be more efficient than the other.

3.3.2 Job Search with Separation

We now modify the job search problem discussed in Section 3.3.1 by adding separations. In particular, an existing match between worker and firm terminates with probability $\alpha$ every period. (This is an extension because setting $\alpha=0$ recovers the permanent job scenario from Section 3.3.1.)

The worker now views the loss of a job as a capital loss and a spell of unemployment as an investment. In what follows, the wage process and discount factor are unchanged from Section 3.3.1. As before, $\vV \coloneq \RR^\Wsf_+$ is endowed with the supremum norm.

The value function $v^*_u$ for an unemployed worker satisfies the recursion

v^*_u(w) = \max \left\{ v^*_e(w) ,\, c + \beta \, \sum_{w' \in \Wsf} \, v^*_u(w') P(w, w') \right\} \qquad (w \in \Wsf),

(3.25)

where $v^*_e$ is the value function for an employed worker, that is, the lifetime value of a worker who starts the period employed at wage $w$ . Value function $v^*_e$ satisfies

v^*_e(w) = w + \beta \left[ \alpha \sum_{w'} v^*_u(w') P(w, w') + (1-\alpha) v^*_e(w) \right] \qquad (w \in \Wsf).

(3.26)

This equation states that the value accruing to an employed worker is the current wage plus the discounted expected value of being either employed or unemployed next period.

We claim that, when $0 < \alpha, \beta < 1$, the system (3.25)–(3.26) has a unique solution $(v_e^*, v_u^*)$ in $\vV \times \vV$. To show this we first solve (3.26) for $v^*_e(w)$ to obtain

v^*_e(w) = \frac{1}{1 - \beta(1-\alpha)} \left(w + \alpha \beta (P v^*_u)(w) \right).

(3.27)
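The rearrangement behind (3.27) can be spelled out as follows. Collecting the terms in $v^*_e(w)$ on the left of (3.26) and dividing by $1 - \beta(1-\alpha)$, which is strictly positive when $0 < \alpha, \beta < 1$, gives

```latex
v^*_e(w) - \beta (1-\alpha) \, v^*_e(w)
    = w + \alpha \beta \sum_{w'} v^*_u(w') P(w, w')
\quad \Longrightarrow \quad
v^*_e(w)
    = \frac{1}{1 - \beta(1-\alpha)}
      \left( w + \alpha \beta \sum_{w'} v^*_u(w') P(w, w') \right).
```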

(Recall $(Ph)(w) \coloneq \sum_{w'} h(w') P(w, w')$ for $h \in \RR^\Wsf$ .) Substituting into (3.25) yields

v^*_u(w) = \max \left\{ \frac{1}{1 - \beta(1-\alpha)} \left(w + \alpha \beta (P v^*_u)(w) \right) ,\, c + \beta \, (P v^*_u)(w) \right\}.

(3.28)

Solution to Exercise 3.3.5

Let $T$ be the operator on $\vV$ such that $(Tv_u)(w)$ is the right-hand side of (3.28). To solve the exercise, it suffices to prove that $T$ is a contraction map on $\vV$ . (Then $v_u$ can be obtained, in the limit, by applying successive approximation to $T$ and, once the approximate fixed point is computed, $v_e$ can be obtained via (3.27).) To show that $T$ is a contraction, we let $T_1$ and $T_2$ be the operators on $\vV$ defined by

(T_1 v)(w) = \frac{1}{1 - \beta(1-\alpha)} \left(w + \alpha \beta (P v)(w) \right) \quad \text{and} \quad (T_2 v)(w) = c + \beta \, (P v)(w) .

Since $Tv = (T_1 v) \vee (T_2 v)$ , Lemma 2.2.3 tells us that $T$ will be a contraction provided that $T_1$ and $T_2$ are both contraction maps. For the case of $T_2$ , we have

\| T_2 f - T_2 g\|_\infty = \max_w |c + \beta \, (P f)(w) - c - \beta \, (P g)(w)| \leq \max_w \beta \sum_{w'} | f(w') - g(w')| P(w, w').

The last term is dominated by $\beta \| f - g \|_\infty$ , so $T_2$ is a contraction. The proof for $T_1$ is similar in spirit and hence left to the reader.
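For readers who want to check their work on $T_1$, one way the argument might go is sketched here. It mirrors the steps used for $T_2$ and uses only the definition of $T_1$ given above.

```latex
\| T_1 f - T_1 g \|_\infty
    = \frac{\alpha \beta}{1 - \beta(1-\alpha)} \max_w | (P f)(w) - (P g)(w) |
    \leq \frac{\alpha \beta}{1 - \beta(1-\alpha)} \max_w \sum_{w'} | f(w') - g(w') | P(w, w')
    \leq \frac{\alpha \beta}{1 - \beta(1-\alpha)} \, \| f - g \|_\infty .
```

The constant is strictly less than one, since $\alpha \beta < 1 - \beta + \alpha \beta$ is equivalent to $\beta < 1$. Hence $T_1$ is a contraction of modulus $\alpha \beta / (1 - \beta(1-\alpha))$.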

Figure 3.6 shows the value function $v_u^*$ for an unemployed worker, which is the fixed point of (3.28), as well as the stopping and continuation values, which are given by

s^*(w) \coloneq \frac{1}{1 - \beta(1-\alpha)} \left(w + \alpha \beta (P v^*_u)(w) \right) \quad \text{and} \quad h^*(w) \coloneq c + \beta \, (P v^*_u)(w)

respectively, for each $w \in \Wsf$ . Parameters are as in Listing 5. The value function $v^*_u$ is the pointwise maximum (i.e., $v^*_u = s^* \vee h^*$ ). The worker’s optimal policy while unemployed is

\sigma^*(w) \coloneq \1\{s^*(w) \geq h^*(w)\}.

As before, the smallest $w$ such that $\sigma^*(w) = 1$ is called the reservation wage.

using QuantEcon, LinearAlgebra

"Creates an instance of the job search model with separation."
function create_js_with_sep_model(;
        n=200,          # wage grid size
        ρ=0.9, ν=0.2,   # wage persistence and volatility
        β=0.98, α=0.1,  # discount factor and separation rate
        c=1.0)          # unemployment compensation
    mc = tauchen(n, ρ, ν)
    w_vals, P = exp.(mc.state_values), mc.p
    return (; n, w_vals, P, β, c, α)
end

Program 5: Job search with separation model (markov_js_with_sep.jl)
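Given a model with these fields, one way to compute $v^*_u$, $s^*$, $h^*$, $\sigma^*$ and the reservation wage is successive approximation on the right-hand side of (3.28). The code below is a minimal sketch rather than the solver used for the figures. The function names `T` and `solve_js_with_sep` are hypothetical, and the small hand-built `model` stands in for the output of `create_js_with_sep_model` so that the example runs without QuantEcon. It assumes Julia 1.7 or later (for named tuple destructuring) and a wage grid sorted in increasing order.

```julia
using LinearAlgebra

"Right-hand side of (3.28), applied to a candidate v_u."
function T(v, model)
    (; w_vals, P, β, c, α) = model
    s = (w_vals .+ α * β .* (P * v)) ./ (1 - β * (1 - α))   # stopping values
    h = c .+ β .* (P * v)                                    # continuation values
    return max.(s, h)
end

"Iterate with T, then recover s, h, the policy σ and the reservation wage."
function solve_js_with_sep(model; tol=1e-8, max_iter=50_000)
    (; w_vals, P, β, c, α) = model
    v = zeros(length(w_vals))
    for _ in 1:max_iter
        v_new = T(v, model)
        converged = norm(v_new - v, Inf) < tol
        v = v_new
        converged && break
    end
    s = (w_vals .+ α * β .* (P * v)) ./ (1 - β * (1 - α))
    h = c .+ β .* (P * v)
    σ = s .>= h
    i = findfirst(σ)                       # smallest accepted wage, if any
    w_bar = i === nothing ? nothing : w_vals[i]
    return (; v_u=v, s, h, σ, w_bar)
end

# Hypothetical stand-in with the same fields as the model tuple above
model = (; n=3, w_vals=[1.0, 2.0, 3.0],
         P=[0.6 0.3 0.1; 0.2 0.6 0.2; 0.1 0.3 0.6],
         β=0.9, c=1.5, α=0.1)

sol = solve_js_with_sep(model)
```

Once `sol.v_u` is available, $v^*_e$ follows from (3.27), which coincides with `sol.s`.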

Figure 3.6: Value function with job separation

Figure 3.7 shows how the reservation wage changes with $\alpha$. To produce this figure we solved the model for the reservation wage at 10 values of $\alpha$ in an evenly spaced grid ranging from 0 to 1. The reservation wage falls with $\alpha$, since time spent unemployed is a capital investment in better wages, and the value of this investment declines as the separation rate rises.
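The kind of computation behind such a figure can be sketched as a loop over separation rates. The code below is an illustration under stated assumptions, not the code used for Figure 3.7: the function name `reservation_wage`, the 26-point wage grid, the matrix `P` (a mixture of the identity and a uniform draw) and the parameter values are all hypothetical. It uses a compact version of successive approximation on (3.28).

```julia
using LinearAlgebra

"Reservation wage at separation rate α, via successive approximation on (3.28)."
function reservation_wage(α; w_vals, P, β, c, tol=1e-8, max_iter=50_000)
    s(v) = (w_vals .+ α * β .* (P * v)) ./ (1 - β * (1 - α))   # stopping values
    h(v) = c .+ β .* (P * v)                                    # continuation values
    v = zeros(length(w_vals))
    for _ in 1:max_iter
        v_new = max.(s(v), h(v))
        done = norm(v_new - v, Inf) < tol
        v = v_new
        done && break
    end
    i = findfirst(s(v) .>= h(v))       # assumes w_vals is sorted increasingly
    return i === nothing ? NaN : w_vals[i]
end

# Hypothetical primitives: with probability 1/2 the wage stays put,
# otherwise it is redrawn uniformly from the grid.
n = 26
w_vals = collect(range(0.5, 3.0, length=n))
P = 0.5 * I(n) + fill(0.5 / n, n, n)
β, c = 0.9, 1.5

α_grid = range(0, 1, length=10)        # 10 evenly spaced values from 0 to 1
res_wages = [reservation_wage(α; w_vals, P, β, c) for α in α_grid]
```

Plotting `res_wages` against `α_grid` gives the analogue of Figure 3.7 for this toy model.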

Figure 3.7: Reservation wage versus separation rate

3.4 Chapter Notes

Many good textbooks on Markov chains exist, including Norris (1998), Häggström (2002), and Privault (2013). Sargent & Stachurski (2023) provides a relatively comprehensive treatment from a network perspective, which is a natural one for Markov chains. Other economic applications are discussed in Stokey & Lucas (1989) and Ljungqvist & Sargent (2018). Meyer (2000) gives a detailed account of the theory of nonnegative matrices. Another useful reference is Horn & Johnson (2012).

A systematic study of monotone Markov chains was initiated by Daley (1968). Monotone Markov methods have many important applications in economics. See, for example, Hopenhayn & Prescott (1992), Kamihigashi & Stachurski (2014), Jaśkiewicz & Nowak (2014), Balbus et al. (2014), Foss et al. (2018) and Hu & Shmaya (2019).

Footnotes

Tauchen’s method (Tauchen, 1986) is simple but sub-optimal in some cases. For a more general discretization method and a survey of the literature, see Farmer & Toda (2017).
To justify the first equality, care must be taken when pushing expectations through infinite sums. In the present setting, justification can be provided via the dominated convergence theorem (see, e.g., Dudley (2002), Theorem 4.3.5). A proof of a more general result can be found in Section B.2.

References

Brémaud, P. (2020). Markov Chains: Gibbs Fields, Monte Carlo Simulation and Queues (Vol. 31). Springer Nature.
Norris, J. R. (1998). Markov Chains. Cambridge University Press.
Häggström, O. (2002). Finite Markov Chains and Algorithmic Applications. Cambridge University Press.
Privault, N. (2013). Understanding Markov Chains: Examples and Applications. Springer-Verlag Singapore.
Sargent, T., & Stachurski, J. (2023). Economic Networks: Theory and Computation. Cambridge University Press.
Stokey, N., & Lucas, R. (1989). Recursive Methods in Dynamic Economics. Harvard University Press.
Ljungqvist, L., & Sargent, T. (2018). Recursive Macroeconomic Theory (4th ed.). MIT Press.
Meyer, C. D. (2000). Matrix Analysis and Applied Linear Algebra (Vol. 71). SIAM.
Horn, R. A., & Johnson, C. R. (2012). Matrix Analysis. Cambridge University Press.
Daley, D. (1968). Stochastically monotone Markov Chains. Probability Theory and Related Fields, 10(4), 305–317.
Hopenhayn, H. A., & Prescott, E. C. (1992). Stochastic monotonicity and stationary distributions for dynamic economies. Econometrica, 60(6), 1387–1406.
Kamihigashi, T., & Stachurski, J. (2014). Stochastic stability in monotone economies. Theoretical Economics, 9(2), 383–407.
Jaśkiewicz, A., & Nowak, A. S. (2014). Stationary Markov perfect equilibria in risk sensitive stochastic overlapping generations models. Journal of Economic Theory, 151, 411–447.
Balbus, Ł., Reffett, K., & Woźny, Ł. (2014). A constructive study of Markov equilibria in stochastic games with strategic complementarities. Journal of Economic Theory, 150, 815–840.
Foss, S., Shneer, V., Thomas, J. P., & Worrall, T. (2018). Stochastic stability of monotone economies in regenerative environments. Journal of Economic Theory, 173, 334–360.