Earlier chapters treated dynamics in discrete time. Now we switch to continuous time. We restrict ourselves to finite state spaces, where continuous-time processes are pure jump processes. This allows us to provide a rigorous and self-contained treatment, while laying foundations for a treatment of general state problems.
In this section, we introduce continuous-time Markov models. In Section 10.2, we will use them as components of continuous-time Markov decision processes.
In Section 3.1.1 we learned that if (Xt)=(X0,X1,…) is P-Markov, then the distributions (ψt) of the state process obey ψt+1=ψtP for all t. This update rule is a linear difference equation in distribution space, which in turn suggests that, once we switch to continuous-time, distributions will evolve according to linear differential equations in distribution space.
This idea turns out to be correct. As such, we begin this chapter with some facts about linear differential equations.
The continuous-time system in Example 10.1.1 is closely related to the discrete time difference equation ut+1=erut. Indeed, if we start at u0, then the t-th iterate is ertu0, so solutions agree at integer times. We can think of the continuous-time system as one that interpolates between points in time of a corresponding discrete time system.
The exponential eλ of λ=a+ib∈C can also be defined via (10.1). From the identity eib=cos(b)+isin(b), we obtain
Continuous-time Markov chains have a close relationship with the exponential distribution, a fact that stems from its being the only distribution having the memoryless property
The memoryless property is special. For example, the probability that an individual human being lives 70 years from birth is not equal to the probability that he or she lives another 70 years conditional on having reached age 70. In fact, the exponential distribution is the only memoryless distribution supported on the nonnegative reals:
where A is any square matrix. As we will see, the matrix exponential plays a key role in the solution of vector-valued linear differential equations.
In Lemma 10.1.2 and what follows, integration or differentiation of a vector- or matrix-valued function is carried out element by element. For example, to differentiate a matrix B(t)=(bij(t)) that depends on t, we form a new matrix by differentiating each element bij(t) with respect to t. The integral ∫abB(t)dt is the matrix of integrals ∫abbij(t)dt.
The proof of part (ii) of Lemma 10.1.2 uses the definition of the exponential and the binomial formula. See, for example, Hirsh & Smale (1974). Part (iii) follows directly from part (ii). Part (iv) follows easily from part (i) when A is diagonalizable (and can be proved more generally via the Jordan canonical form).
As for (vii), we are drawing an analogy with the fundamental theorem of calculus for scalar-valued functions, which states that f(t)−f(s)=∫stf′(τ)dτ for all s⩽t, where f′ is the derivative of f.
Next, we study solutions of multivariate differential equations, with a focus on linear systems. These results lay foundations for our study of continuous-time Markov dynamics in Section 10.1.3.
Recall from Section 2.1.1 that a discrete dynamical system is a pair (U,S), where U is a set and S is a self-map on U. Trajectories are sequences (Stu)t⩾0=(u,Su,S2u,…), where u∈U is the initial condition. These ideas can be extended to continuous-time by considering a pair (U,(St)t⩾0) where U is any set and St is a self-map on U for each t∈R+. The interpretation is that if u∈U is the current state of the system, then Stu will be the state after t units of time.
In general, to understand (U,(St)t⩾0) as a continuous-time dynamical system, we require that (a) S0 is the identity map, so that the state after zero units of time is just the initial condition, and (b) if we start at u, move forward to us:=Ssu, and then move again to Stus after another t units of time, the outcome should be the same as moving from u to Ss+tu directly. That is,
One way that continuous-time dynamical systems arise is via initial value problems. An initial value problem (IVP) in Rn consists of a differential equation u˙t=f(ut) paired with an initial condition u0∈Rn, where ut∈Rn and f:Rn→Rn. Under suitable conditions on f, the solution ut:=F(t,u0) is uniquely defined for all t⩾0, and, moreover,
F(0,u0)=u0andF(s+t,u0)=F(t,F(s,u0))for all s,t⩾0
(see, e.g., Hirsh & Smale (1974), Section 8.7). Hence (St)t⩾0 defined by Stu=F(t,u) satisfies the semigroup property and (Rn,(St)t⩾0) is a continuous-time dynamical system. The function f is called the vector field of (Rn,(St)t⩾0).
Given our interest in continuous-time Markov chains and their connection to linear systems (see the comments at the start of Section 10.1.1), we focus primarily on linear differential equations. The next result discusses linear IVPs, illustrating the key role of the matrix exponential. In the statement, A is n×n and both u˙t and ut are column vectors in Rn.
(Here u˙t:=dut/dt is defined by differentiating the vector ut element-by-element, as discussed after Lemma 10.1.2.)
where A is n×n, u0 is a vector in Rn indicating the initial condition, and ut is the “state” of the system at time t. Figure Figure 10.1 shows an example when
For an exponential flow such as (10.24), a key question is whether or not ut→0 as t→∞. (This will matter when we try to evaluate lifetime rewards over an infinite horizon in continuous time.) Rather than analyze these issues at every possible u0, we directly consider the matrix-valued flow t↦etA and study whether etA→0.
The case where A is diagonalizable provides a good starting point. Suppose A=P−1DP with D=diagj(λj) containing the eigenvalues of A. Then, by Lemma 10.1.2, for any t⩾0, we have
Exercise 10.8 and equation (10.26) tell us that the long run dynamics of etA are determined by the scalar flows t↦etλj. How does etλ evolve over time when λ∈C?
To answer this question we write λ=a+ib and apply (10.4) to obtain
where Reλ is the real part of λ (i.e., if λ=a+ib, then Reλ=a).
From this analysis, we conclude that, when A is diagonalizable, we have etA→0 if and only if Reλj<0 for all λj∈σ(A), where σ(A) denotes the set of all eigenvalues (the spectrum) of A. Another way to put this is that etA→0 if and only if s(A)<0, where
As the preceding analysis suggests, the spectral bound plays a key role in the asymptotics of exponential flows, just as a spectral radius governs asymptotics of trajectories of linear maps (see, e.g., Exercise 1.14). Section Section 10.1.2.4 expands on this analysis, while dropping the assumption that A is diagonalizable.
Let A be any square matrix. In the following statement about a spectral bound, ∥⋅∥ is the operator norm defined in Section 1.2.1.4.
(The second equality in (10.30) also holds when the limit is taken over t∈R+. See, for example, Engel & Nagel (2006).)
The next theorem is a key stability result for exponential flows. Among other things, it extends to arbitrary A the finding that s(A)<0 is necessary and sufficient for stability.
A full proof of Theorem 10.1.5 in a general setting can be found in §V.II of Engel & Nagel (2006).
Theorem 10.1.5 tells us that the flow t↦etAu0 converges to the origin at an exponential rate if and only if s(A)<0. The equivalence of (i) and (ii) was proved for the diagonalizable case in Section 10.1.2.3. It can be viewed as the continuous-time analog of ∥Bk∥→0 if and only if ρ(B)<1 (see Exercise 1.14).
Advanced treatments of continuous-time systems often begin with semigroups. Let’s briefly describe these and connect them to things we have studied earlier. (If you prefer to skip this section on first reading, you can move to the next one after noting that, given an n×n matrix A, the family (St)t⩾0=(etA)t⩾0 is called an exponential semigroup and that A is called the infinitesimal generator of the semigroup.)
Let X be a finite set and let (St)t⩾0 be a subset of L(RX) indexed by t∈R+. The family (St)t⩾0 is called a strongly continuous semigroup or C0-semigroup on RX if
S0=I, where I is the identity,
St+t′=St∘St′, and
t↦Stu is a continuous map from R+ to RX for every u∈RX.
In essence, a C0-semigroup on RX is a continuous-time dynamical system (RX,(St)t⩾0) where each St maps an initial state into a time t state.
The semigroup perspective is important because it extends naturally to settings where X is not finite, in which case we replace the finite-dimensional set RX with some (typically infinite-dimensional) class of functions G⊂RX, and each St becomes a linear operator mapping G into itself. At this level of generality, Stu can be the solution to a partial differential equation, or a stochastic differential equation (see, e.g., Engel & Nagel (2006) or Applebaum (2019)). Operator semigroup theory offers an elegant and powerful framework for handling such systems.
For semigroups in general settings we often have no analytical expressions for St. This situation is like the one we encountered in the continuous-time system in Section 10.1.2.1, where u˙t=f(ut) and f is potentially nonlinear. When no analytical solution ut exists, analyzing the dynamics requires us to try to infer its properties from the vector field f, so that f becomes the primary focus of analysis.
A natural question, then, is, given a semigroup (St)t⩾0 on L(RX), does there always exist a “vector field” type object that “generates” (St)t⩾0? When X is finite, the answer is affirmative. This object, to be denoted by A, is called the infinitesimal generator of the semigroup and is defined by
At u∈U, the vector Au indicates the instantaneous change in the state.
More precisely, when X is finite, we have:
Semigroups of the form described in Proposition 10.1.6 are called exponential semigroups (or “uniformly continuous” semigroups).
A full proof of Proposition 10.1.6 can be found in the discussion of Theorem 2.12 of Engel & Nagel (2006). The results are not surprising, since the main claim is that, in finite dimensions, solutions to linear differential equations have exponential form. The fact that A is the infinitesimal generator of the semigroup (St)t⩾0=(etA)t⩾0 follows from Lemma 10.1.2, which gives
The preceding discussion places our analysis in a wider context. To practice our new terminology, we can restate (i) ⟺ (ii) from Theorem 10.1.5 by saying that the exponential semigroup (St)t⩾0=(etA)t⩾0 converges to zero if and only if the spectral bound of its infinitesimal generator is negative.
Having studied multivariate linear dynamics, we are now ready to concentrate on the Markov case, where dynamics evolve in distribution space. For the most part we now switch to operator-theoretic notation, where X is a finite set with n elements, and an n×n matrix is identified with a linear operator on L(RX). As emphasized in Section 2.3.3.1, this is merely a change in terminology, and all preceding results for matrices extend directly to linear operators.
If (Xt)t⩾0 is P-Markov on X for some P∈M(RX), then the marginal distributions of (Xt)t⩾0 evolve according to the linear difference system ψt+1=ψtP (see Section 3.1.1). We now seek a continuous-time analog in the form of a linear differential equation.
To this end we call Q∈L(RX) an intensity operator or intensity matrix[1] when
Q(x,x′)⩾0 whenever x=x′andx′∑Q(x,x′)=0 for all x∈X.
when ψt and ψ˙t are understood to be row vectors. We say that D(X) is invariant for the IVP (10.40) if the solution (ψt)t⩾0 remains in D(X) for all t⩾0.
In view of Proposition 10.1.3, we can rephrase this by stating that D(X) is invariant for (10.40) whenever
Our key result for this section shows the central role of intensity matrices:
Proposition 10.1.7 tells us that the set I(RX) coincides with the set of continuous-time (and time-homogeneous) Markov models on X. Any specification outside this class fails to generate flows in distribution space. The proof is completed in several steps.
For the proof of Proposition 10.1.7, we have now shown that (i) implies (ii). Evidently (ii) implies (iii), because if ψ0∈D and ψt=ψ0Pt where Pt is stochastic, then ψt∈D(X). Hence it remains only to show that (iii) implies (i).
Returning to Proposition 10.1.7, the last two exercises confirm that (iii) implies (i). The proof is now complete.
Section Section 10.1.3.1 covered the mathematical relationship between intensity matrices and Markov operators. Let’s now discuss the connection more informally, in order to build intuition.
To this end, let (Xt)t⩾0 be Ph-Markov in discrete time. Here h>0 is the length of the time step. We write the corresponding distribution sequence ψt+h=ψtPh in terms of change per unit of time, as in
Equation (10.50) tells us that Q(x,x′) represents the instantaneous rate of flow out of state x and into state x′. The on-diagonal value Ph(x,x) just balances the off-diagonal probabilities.
Fix Q∈I(RX). In the terminology of Section 10.1.2.5, the family of operators (Pt)t⩾0=(etQ)t⩾0⊂M(RX) that solves ψ˙t=ψtQ (see (10.41)) is an exponential semigroup. Since each Pt is in M(RX), it is also called the Markov semigroup generated by Q. It satisfies the semigroup property Ps+t=PsPt for all s,t⩾0, which can be written more explicitly as
In the present setting, (10.52) is called the (continuous-time) Chapman--Kolmogorov equation. It states that the probability of moving from x to x′ over s+t units of time equals the probability of moving from x to z over s units of time, and then z to x′ over t units of time, summed over all z.
Again following the terminology in Section 10.1.2.5, the intensity matrix Q that defines (Pt)t⩾0=(etQ)t⩾0 is also called the infinitesimal generator of (Pt)t⩾0.
From Lemma 10.1.2, the derivative of etQ is QetQ=etQQ. We can write this as
P˙t=QPt, which is called the Kolmogorov backward equation, and
P˙t=PtQ, which is called the Kolmogorov forward equation.
We can work in the other direction as well: If we can establish that a function t↦Pt from R+ to L(RX) satisfies either one of these equations, then (Pt)t⩾0 is a Markov semigroup with infinitesimal generator Q. The next proposition gives details.
Proposition 10.1.8 is a version of our result for linear IVPs in Proposition 10.1.3, except that the IVP is now defined in operator space, rather than vector space.
We have discussed the connection between intensity matrices, Markov semigroups, and distribution flows. Let’s now connect these objects to continuous-time Markov chains. In this section, we will (a) provide a formal definition of a continuous-time Markov chain associated with a given initial condition ψ and intensity matrix Q, and (b) show how to construct such a chain algorithmically. We’ll accomplish (b) in two steps: first showing how to construct a jump chain from certain primitives (Section 10.1.4.2--Section 10.1.4.3) and then showing how to construct those primitives from a given initial condition ψ and intensity matrix Q (Section 10.1.4.4).
Let C(R+,X) be the set of right-continuous functions from R+ to X and let (Pt)t⩾0 be a Markov semigroup generated by some Q∈I(RX). A continuous-time Markov chain generated by (Pt)t⩾0 is a random function (Xt)t⩾0 that takes values in C(R+,X) and satisfies
P{Xs+t=x′∣Fs}=Pt(Xs,x′)for all s,t⩾0 and x′∈X,
where Fs:=(Xτ)0⩽τ⩽s is the history of the process up to time s. To update from time s to time t given this history, we simply take the last value Xs and update using Pt. Conditioning on Xs=x, we get
Mirroring terminology for discrete chains from Section 3.1.1.1, we will call a continuous-time Markov chain (Xt)t⩾0Q-Markov when (10.53) holds and Q is the infinitesimal generator of (Pt)t⩾0.
In what follows, Px and Ex denote probabilities and expectations conditional on X0=x. Given h∈RX, we have
In Section 10.1.4.1 we defined a continuous-time Markov chain. In this section, we describe a standard method for constructing one by using three components:
an initial condition ψ∈D(X),
a jump matrixΠ∈M(RX), and
a rate functionλ mapping X to (0,∞).
The process (Xt) starts at state x, which is drawn from ψ, waits there for an exponential time W with rate λ(x), and then updates to a new state x′ drawn from Π(x,⋅). We take x′ as the new state for the process and repeat.
These ideas are restated in Algorithm 10.1. In the algorithm, (Wk) and (Yk) are drawn independently. The process (Wk) is called the sequence of holding times or wait times, the sums Jk=∑i=1kWi are called the jump times and (Yk) is called the embedded jump chain. The jumps and the process (Xt)t⩾0 are illustrated in Figure Figure 10.2.
It is easy to verify that Q is an intensity matrix. In fact, Q is the intensity matrix for the Markov semigroup associated with the process generated by Algorithm 10.1. For x=x′, it tells us that probability flows from x to x′ at rate λ(x)Π(x,x′), which is the rate of leaving x times the rate of moving from x to x′. The next result formalizes these ideas.
To prove Proposition 10.1.9 we take (Xt)t⩾0 to be as in the statement of the proposition and define (Pt)t⩾0 by Pt(x,x′)=Px{Xt=x′} for all x,x′∈X. The proof uses the following steps:
Obtain an integral equation that (Pt)t⩾0 must satisfy.
Differentiate to obtain the Kolmogorov backward equation P˙t=QPt.
Solve this differential equation to obtain Pt=etQ for all t.
Here is the first step. In the statement, ΠPt−τ is the matrix product of Π and Pt−τ, while the equation in (10.57) is sometimes called the integrated Kolmogorov backward equation.
Differentiating the integrated Kolmogorov backward equation produces the Kolmogorov backward equation:
Let Xt be a firm’s inventory at time t. When current stock is x>0, customers arrive at rate λ(x), so the wait time for the next customer is an independent draw from the Exp(λ(x)) distribution; λ maps X to (0,∞).
The k-th customer demands Uk units, where each Uk is an independent draw from a fixed distribution ϕ on N. Purchases are constrained by inventory, however, so inventory falls by Uk∧Xt. When inventory hits zero the firm orders b units of new stock. The wait time for new stock is also exponential, being an independent draw from Exp(λ(0)).
Let Y represent the inventory size after the next jump (induced by either a customer purchase or ordering new stock), given current stock x. If x>0, then Y is a draw from the distribution of x−U∧x where U∼ϕ. If x=0, then Y≡b. Hence Y is a draw from Π(x,⋅), where Π(0,y)=1{y=b} and, for 0<x⩽b,
Π(x,y)=⎩⎨⎧0P{x−U=y}P{U⩾x} if x⩽y if 0<y<x if y=0
We can simulate the inventory process (Xt)t⩾0 via the jump chain algorithm. In this case, the wait time sequence (Wk) is the wait time for customers (and for inventory when Xt=0) and the jump sequence (Yk) is the level of inventory immediately after each jump. By Proposition 10.1.9, the inventory process is Q-Markov with Q given by Q(x,x′)=λ(x)(Π(x,x′)−I(x,x′)).
Figure Figure 10.3 shows a simulation when orders are geometric, so that
In the simulation we set α=0.7, b=10 and λ(x)≡0.5. The figure plots Xt for t∈[0,50]. Since each wait time Wi is a draw from Exp(0.5) the mean wait time is 2.0. The function that produces the map t↦Xt is shown in Listing 1.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
using Random, Distributions
"""
Generate a path for inventory starting at b, up to time T.
Return the path as a function X(t) constructed from (J_k) and (Y_k).
"""
function sim_path(; T=10, seed=123, λ=0.5, α=0.7, b=10)
J, Y = 0.0, b
J_vals, Y_vals = [J], [Y]
Random.seed!(seed)
φ = Exponential(1/λ) # Wait times are exponential
G = Geometric(α) # Orders are geometric
while true
W = rand(φ)
J += W
push!(J_vals, J)
if Y == 0
Y = b
else
U = rand(G) + 1 # Geometric on 1, 2,...
Y = Y - min(Y, U)
end
push!(Y_vals, Y)
if J > T
break
end
end
function X(t)
k = searchsortedlast(J_vals, t)
return Y_vals[k+1]
end
return X
end
Program 1:Continuous-time inventory dynamics (inventory_cont_time.jl)
If Q∈L(RX) is a given intensity matrix, how should we produce a continuous-time Q-Markov chain? If we can construct a jump chain that is Q-Markov, then not only do we obtain existence of a Q-Markov chain but we also provide a way to simulate one (via Algorithm 10.1).
To construct such a jump chain we first fix an intensity matrix Q∈L(RX) and, to simply matters, assume that all rows of Q are nonzero. This means that the process has no absorbing states (since nonzero rows are equivalent to Q(x,x)<0 for all x, which in turn states that there is a nonzero outflow from each state).
It is straightforward to confirm that Π∈M(RX) and that Q satisfies (10.56). Hence, by Proposition 10.1.9, the process (Xt)t⩾0 generated by Algorithm 10.1 is Q-Markov.
We are ready to turn to dynamic programming in continuous-time. As for the discrete time case, continuous-time dynamic programs aim to maximize a measure of lifetime value. In Section 10.2.1 we study lifetime valuations. In Section 10.2.2 we learn how to maximize them.
In this section, we consider lifetime valuations associated with continuous reward flows, starting from a general semigroup perspective and then progressing to specific cases (such as expected lifetime value under constant discounting). Throughout, X is a finite set.
For the discrete time problems with state-dependent discounting that we studied in Chapter 6, lifetime valuations take the form v=∑t⩾0Kth for some h∈RX and a positive linear operator K on RX. (See Theorem 6.1.1 and (6.31).) For a continuous-time version we fix h∈RX, take (Kt)t⩾0 to be a positive exponential semigroup in L(RX), where positive means Kt⩾0 for all t, and set
Let A∈L(RX) be the infinitesimal generator of (Kt)t⩾0. The next result provides a condition for finiteness of v and several characterizations.
A way to understand (10.69) is to view the valuation v as a price that reflects prospective benefits from holding an asset. The asset yields a flow of benefits, where h(x) is the instantaneous reward in state x. Rewards t periods in the future are discounted by the pricing operator Kt. Thus, (Kth)(x) is the anticipated payoff t periods ahead, discounted for the wait time and possibly also for risk as in (6.54). The value v(x) is then lifetime value, which equals the current price.
In this asset valuation setting, (10.69) is a natural consistency condition. It says that the price of purchasing the asset today is equal to the payouts obtained from holding the asset from now until time t and then selling it for current discounted value Ktv. (This is the continuous-time analog of (6.57).)
The preceding discussion matches the semigroup perspective on asset pricing introduced in Garman (1985) and Duffie & Garman (1986). In addition to shedding light on (10.69), it also leads to the assertion that v=−A−1h in (ii), which is obtained by differentiating (10.69). Details are in the proof:
In applications, the expression v=∫0∞Kthdt from (10.68) typically arises as a discounted expectation over a flow of rewards. When analyzing v we wish to deploy Proposition 10.2.1, so we need to check that any expectation we propose results in (Kt) being a semigroup. The next proposition provides one result along these lines.
In the proof of Proposition 10.2.2, we will use the fact that (Xt)t⩾0 satisfies the Markov property. In particular, if H is a real-valued function on the path space C(R+,X), then
Many studies of continuous-time dynamic programming with discounting use a constant discount rate. In this setting, the lifetime value in (10.68) becomes
for some δ∈R and h∈RX. Here (Xt)t⩾0 is a continuous-time Markov chain on finite state X generated by Markov semigroup (Pt)t⩾0 with intensity operator Q. The idea is that h(Xt) is an instantaneous reward at each time t, while δ is a fixed discount rate.
Equation (10.82) is the continuous-time version of (3.33).
Given two finite sets A and X, called the state and action spaces respectively, we define a continuous-time Markov decision process (or continuous-time MDP) to be a tuple C=(Γ,δ,r,Q) consisting of
a nonempty correspondence Γ from X to A, referred to as the feasible correspondence, which in turn defines the feasible state-action pairs
Informally, at state x with action a over the short interval from t to t+h, the controller receives instantaneous reward r(x,a)h and the state transitions to state x′ with probability Q(x,a,x′)h+o(h).
Paralleling our discussion of the discrete time case in Chapter 5, the set of feasible policies is
Choosing policy σ from Σ means that we respond to state Xt with action At:=σ(Xt) at every t⩾0. The state then evolves according to the intensity operator
Like the discrete time case, a v-greedy policy chooses actions optimally to trade off high current rewards versus high rate of flow into future states with high values. Unlike the discrete time case, the discount factor does not appear in (10.95) because the trade-off is instantaneous.
We introduce a continuous-time policy iteration algorithm that parallels discrete time HPI for Markov decision processes, as described in Section 5.1.4.2.
The continuous-time HPI routine is given in Algorithm 10.2, with the intuition being similar to that for the discrete time MDP version given. We provide convergence results in Section 10.2.3.
Here we study a continuous-time version of the job search problem with separation considered in Section 3.3.2. As before, a worker can be either unemployed (state 0) or employed (state 1). When the worker is employed, she can be fired at any time. Firing occurs at rate α>0, meaning that the probability of being fired over the short interval from t to t+h is approximately αh. When unemployed, the worker receives flow unemployment compensation c and job offers at rate κ. She can choose either to accept or to reject an offer; she discounts the future at rate δ>0.
We assume that job offers are associated with wage offers that take values in finite set W. Let P∈M(RW) give probabilities for new wage draws, so that, conditional on previous draw w, the next offer is drawn from P(w,⋅).
For the state space we set X={0,1}×W, with typical state x=(s,w). Here s is binary and indicates current employment status, while w is the current wage. Let
denote the state-dependent jump rate, which switches between κ and α depending on employment status.
Let a∈A:={0,1} indicate the action, where 0 means reject and 1 means accept. Let Π(x,a,x′) represent the jump probabilities, with
Π((0,w),a,(0,w′))Π((0,w),a,(1,w′))Π((1,w),a,(0,w′))Π((1,w),a,(1,w′))=P(w,w′)(1−a)(unemployed to unemployed)=P(w,w′)a(unemployed to employed)=P(w,w′)(employed to unemployed)=0.(employed to employed)
The first two lines consider jump probabilities for the state (s,w) when unemployed and the action is a. The second two consider jump probabilities when employed. The reason that the probability assigned to the last line is zero is that a jump from s=1 occurs because the worker is fired, so the value of s after the jump is zero.
Motivated by the jump chain construction of intensity matrices in (10.56), we set
then lifetime value is given by (10.93), where (Xt)t⩾0 is Qσ-Markov and X0=x.
With Γ defined by Γ(x)=A for all x∈X, the tuple C=(Γ,δ,r,Q) is a continuous-time MDP and Theorem 10.2.4 applies. In particular, an optimal policy exists and can be computed with HPI in a finite number of iterations.
Figure Figure 10.4 shows an optimal policy computed in this way. (Code and parameter values can be found in cont_time_js.jl.) The policy is of threshold type, with a reservation wage of around 12. Figure Figure 10.5 shows how this reservation wage changes with parameters. The reservation wage increases as the separation rate falls, as the offer rate increases, as the discount rate falls, and as unemployment compensation increases.
Applebaum (2019) and Engel & Nagel (2006) provide elegant introductions to semigroup theory and its applications in studying partial and stochastic differential equations. The beautiful book by Lasota & Mackey (1994) and covers connections among semigroups, Markov processes, and stochastic differential equations. Norris (1998) provides a good introduction to continuous-time Markov chains, while Liggett (2010) is more advanced.
A rigorous treatment of continuous-time MDPs can be found in Hernández-Lerma & Lasserre (2012), which also handles the case where X is countably infinite. Our approach is somewhat different, since our main optimality results rest on the ADP theory in Chapter 9.
In recent years, continuous-time dynamic programming has become more common in macroeconomic analysis. Influential references include Nuño & Moll (2018), Kaplan et al. (2018), Achdou et al. (2022), and Fernández-Villaverde et al. (2023). For computational aspects, see Duarte (2018), Ráfales & Vázquez (2021), Rendahl (2022), and Eslami & Phelan (2023).
Other names for intensity matrices include “Q-matrices” (which is fine until you need to use another symbol), “Kolmogorov matrices,” and “infinitesimal stochastic matrices.”
Hirsh, M., & Smale, S. (1974). Differential Equations, Dynamical Systems and Linear Algebra. Academic Press.
Engel, K.-J., & Nagel, R. (2006). A short course on operator semigroups. Springer Science & Business Media.
Applebaum, D. (2019). Semigroups of Linear Operators (Vol. 93). Cambridge University Press.
Garman, M. B. (1985). Towards a semigroup pricing theory. The Journal of Finance, 40(3), 847–861.
Duffie, D., & Garman, M. B. (1986). Intertemporal Arbitrage and the Markov Valuation of Securities. Citeseer.
Liggett, T. M. (2010). Continuous Time Markov Processes: An Introduction (Vol. 113). American Mathematical Society.
Lasota, A., & Mackey, M. C. (1994). Chaos, Fractals, and Noise: Stochastic Aspects of Dynamics (2nd ed., Vol. 97). Springer Science & Business Media.
Norris, J. R. (1998). Markov Chains. Cambridge University Press.
Hernández-Lerma, O., & Lasserre, J. B. (2012). Further Topics on Discrete-Time Markov Control Processes (Vol. 42). Springer Science & Business Media.
Nuño, G., & Moll, B. (2018). Social optima in economies with heterogeneous agents. Review of Economic Dynamics, 28, 150–180.
Kaplan, G., Moll, B., & Violante, G. L. (2018). Monetary policy according to HANK. American Economic Review, 108(3), 697–743.
Achdou, Y., Han, J., Lasry, J.-M., Lions, P.-L., & Moll, B. (2022). Income and wealth distribution in macroeconomics: A continuous-time approach. The Review of Economic Studies, 89(1), 45–86.
Fernández-Villaverde, J., Hurtado, S., & Nuno, G. (2023). Financial frictions and the wealth distribution. Econometrica, 91(3), 869–901.
Duarte, V. (2018). Machine learning for continuous-time economics [Techreport]. SSRN, 3012602.
Ráfales, J., & Vázquez, C. (2021). Equilibrium models with heterogeneous agents under rational expectations and its numerical solution. Communications in Nonlinear Science and Numerical Simulation, 96, 105673.