A recurring task in mathematics is establishing when two apparently different objects are, in a precise sense, the same. Such equivalences are formalized by invertible, structure-preserving maps. For example, a group isomorphism is a bijection that preserves the group operation, revealing that two groups share identical algebraic structure despite different representations. A topological conjugacy between two dynamical systems is a homeomorphism intertwining their transition maps, establishing that the systems have the same dynamic behavior up to a relabeling of the state space. In the setting of dynamic programming, the relevant structure is order, and the appropriate notion of equivalence is that of an order isomorphism between posets.
We now investigate transformations of dynamic programs that preserve optimality structure. First, in Section 5.1, we investigate order isomorphisms, under which connections are exact. Two dynamic programs linked by isomorphisms are “the same” in terms of their optimality properties. In elementary terms, this is analogous to the way that g=ϕ∘f has the same maximizer as f when ϕ is strictly increasing; and that a maximizer of g is a minimizer of f when ϕ is strictly decreasing.
Isomorphisms are useful but there is also a sense in which they preserve too much structure. Often we want to transform a dynamic program into another one that differs along at least some dimensions -- for example, in terms of dimensionality, or smoothness -- and try to understand the original problem by studying the second more tractable one. We investigate such transformations in Section 5.2.2, under the title of “factored” dynamic programs. In Section 5.2, we introduce factored dynamic programs (FDPs), which involve non-bijective transformations that link a primary ADP to a subordinate ADP of potentially lower dimension. We show how optimality properties transfer between the two. Section 5.3 applies both isomorphisms and FDPs to concrete settings, including Q-factor models, structural estimation, and Epstein--Zin preferences.
In this section we introduce a concept of “isomorphic” dynamic programs. In particular, we describe an isomorphic relationship that leads to essentially equivalent optimality properties. The basic idea can be explained with a simple example. Consider the savings problem from Section 1.3.2.3, with Bellman equation
If u has heavy curvature near zero, one might consider taking an exponential transformation, aiming to work with functions that are easier to approximate numerically. Applying exp to both sides, writing u^ for exp∘u and v^ for exp∘v, we get
Not surprisingly, these two dynamic programs turn out to have exactly the same optimal policies, giving us two viable angles of attack. The theory in this section clarifies and generalizes this idea, with the aim of allowing us to apply it effectively to both simple and complex problems.
Along the way we will meet useful concepts such as topological conjugacy for dynamic systems and order isomorphisms. We will exploit these ideas to study additional optimization problems and algorithms, including time iteration methods and decision problems with ambiguity.
We begin with conjugate dynamical systems in Section 5.1.1.1, which preserve fixed point structure. In Section 5.1.1.2, we strengthen conjugacy to topological conjugacy, which additionally preserves stability. In Section 5.1.1.3, we develop order conjugacy, an order-theoretic alternative that is better suited to passing optimality properties between ADPs.
We recall that a (discrete time) dynamical system is a pair (V,S), where V is any set and S is a self-map on V. Two dynamical systems (V,S) and (V^,S^) are said to be conjugate under F (or just conjugate) if
F is a bijection from V into V^ and F∘S=S^∘F on V.
We can also write the last equality as S=F−1∘S^∘F. This helps us understand the conjugacy relationship: shifting a point v∈V to Sv via S is equivalent to
moving v into V^ via v^=Fv,
applying S^ to produce S^v^, and then
moving the result S^v^ back to the original space V using F−1.
The next result lists some consequences of conjugacy.
The proofs of these claims are straightforward. For example, regarding (i), suppose that Sn=F−1S^nF at some fixed n. Then, using conjugacy,
For us, perhaps the most important consequence of Proposition 5.1.1 is that, if two dynamical systems (V,S) and (V^,S^) have this property, then the system (V,S) has a unique fixed point if and only if (V^,S^) has a unique fixed point. At the same time, conjugacy is not enough to pass on stability properties. For this we need topological conjugacy.
To state this property, let V and V^ be Hausdorff topological spaces. A map F:V→V^ is called a homeomorphism if F is a bijection from V to V^ and, in addition, both F and its inverse are continuous. The dynamical systems (V,S) and (V^,S^) are called topologically conjugate when these two systems are conjugate under a bijection F and, in addition, F is a homeomorphism.
In this setting, we have the following result:
The following example gives an elementary but nonetheless important illustration of the value of Proposition 5.1.2.
In the previous two sections we introduced conjugacy and then strengthened it to topological conjugacy, a fairly standard approach. Now, however, we will step back to conjugacy and strengthen it using order rather than topology. The order-theoretic notion we develop will be analogous to topological conjugacy, while at the same time being better suited to passing optimality properties from one ADP to another.
To begin, we take V and V^ to be posets and consider two dynamical systems (V,S) and (V^,S^). We call these systems order conjugate under F if
In this section, we use order conjugacy to connect dynamic programs. We are interested in whether or not ADPs can be connected by order isomorphisms (or anti-isomorphisms), and what implications this has for optimality. In Section 5.1.2.1, we define isomorphic ADPs and show that isomorphism is an equivalence relation. In Section 5.1.2.2, we prove that isomorphic ADPs share the same optimality and convergence properties. Section 5.1.2.3 extends the analysis to anti-isomorphic ADPs, where maximization in one ADP corresponds to minimization in the other. Finally, in Section 5.1.3, we apply both isomorphic and anti-isomorphic relationships to establish optimality for an Epstein--Zin preference model.
In other words, if A is the set of all ADPs and, for (V,T),(V^,T^)∈A, the symbol (V,T)∼(V^,T^) means (V,T) and (V^,T^) are isomorphic, then ∼ is reflexive, symmetric and transitive.
We seek relationships between optimality properties of isomorphic ADPs. For all of this section, we take (V,T) and (V^,T^) to be two ADPs with T={Tσ:σ∈Σ} and T^={T^σ:σ∈Σ}. When they exist, we let
vσ (resp., v^σ) be the unique fixed point of Tσ (resp., T^σ),
T (resp., T^) be the Bellman operator of (V,T) (resp., (V^,T^)), and
v∗ (resp., v^∗) be the value function of (V,T) (resp., (V^,T^)).
The next theorem shows that isomorphic ADPs share the same regularity and optimality properties:
The next theorem studies the case when (V,T) and (V^,T^) are regular and well-posed. It tells us that isomorphic ADPs have the same optimality properties.
The next theorem considers convergence of algorithms.
The proof is long but straightforward and presented as a solved exercise.
In this section, we switch to studying anti-isomorphic ADPs. In doing so, we will consider minimization as well as maximization. We follow the notational conventions and terminology introduced in Section 2.2.3. Most readers will find it helpful to review that section before reading this one.
Let (V,T) and (V^,T^) be ADPs with the same policy set. In line with the notation in Section 5.1.2.2, we let
T▽ (resp., T^▽) be the Bellman min-operator of (V,T) (resp., (V^,T^)),
v▽∗ (resp., v^▽∗) be the min-value function of (V,T) (resp., (V^,T^)),
H▽ (resp., H^▽) be the Howard policy min-operator of (V,T) (resp., (V^,T^)), and
W▽ (resp., W^▽) be the optimistic policy min-operator of (V,T) (resp., (V^,T^)).
As was the case in Section 2.2.3, we enhance clarity by adding a “max-” prefix to previously introduced definitions that pertain to maximization. For example,
“optimal policies” will be referred to as “max-optimal policies”,
the “Bellman equation” will be referred to as the “Bellman max-equation”,
the “Bellman operator” will be referred to as the “Bellman max-operator”,
and so on.
We call (V,T) and (V^,T^)anti-isomorphic under F if these two ADPs have the same policy set Σ and, in addition, F is an anti-isomorphism from V to V^ such that (5.7) holds.
We can also express this relationship in terms of isomorphisms and duality of ADPs, as defined in Section 2.2.3.2:
Here is an optimality result for anti-isomorphic ADPs that parallels Theorem 5.1.5.
Now we consider convergence of algorithms in the anti-isomorphic case.
In this section, we use the theory of isomorphic and anti-isomorphic ADPs to study optimality properties of a modified MDP model that incorporates Epstein--Zin preferences (motivation for which was provided in Section 1.3.3). We will show how isomorphic and anti-isomorphic relationships can be used to simplify analysis.
To begin, consider a variation of the finite state MDP from Section 1.2 where the Bellman equation is modified to
Here X, A, r, Γ, β, and P are all as in Section 1.2, while G is the feasible state-action pairs. The parameters α and ν are connected to the notation of Section 1.3.3 via α=1−1/ψ and ν=1−γ, where ψ is the EIS and γ is the coefficient of relative risk aversion. The change of notation is made to simplify the presentation. If α=ν=1, then this Epstein--Zin model reduces to an ordinary finite state MDP. In this section, however, we allow α and ν to take any nonzero values.
We assume that r is strictly positive, so that Tσ maps (0,∞)X into itself. Sargent & Stachurski (2025) establish optimality properties for this specification when Pσ is irreducible for all σ∈Σ. Here we drop this assumption and establish the same optimality properties.
To this end, let θ=ν/α. Fix ϵ>0 with minrα−ϵ>0. Let
We are interested in optimality properties of (V,TEZ). While we can try to tackle this ADP directly, the arguments become significantly easier after a transformation. To pursue this path, we introduce the auxiliary ADP (V^,T^EZ) with V^ as defined above and
In the next exercise, f≪g means f(x)<g(x) for all x, as in Section 4.1.3.
The results from Exercise 5.9 and Exercise 5.10 allow us to establish optimality properties for the auxiliary ADP (V^,T^EZ).
The next result follows easily from the conclusion of Exercise 5.11.
We are now ready to state and prove the main result of this section.
The relationship between (V,TEZ) and (V^,T^EZ) allows us to use either one to solve for an optimal policy. For example, if ν<0, then, by Theorem 5.1.8, any min-optimal policy for (V^,T^EZ) will be max-optimal for (V,TEZ). Hence we can solve for the Epstein--Zin max-optimal policy either by directly solving (V,TEZ) or by solving (V^,T^EZ) for a min-optimal policy. The best choice depends on computational simplicity and numerical stability.
In this section we introduce an asymmetric relationship between ADPs that involves a form of factorization. This factorization produces two versions of an ADP: a primary one and a subordinate one. Typically, the primary ADP will be relatively standard, while the subordinate ADP can be thought of as a variation that might have some analytical or computational advantages. Under certain conditions, studying the “simpler” subordinate ADP will shed light on outcomes and solutions for the primary ADP.
In the applications we consider, the associated transformations are not bijective, unlike the isomorphic relationships considered in Section 5.1. Sometimes the lack of bijective transformations occurs because one dynamic program (the primary ADP) evolves in a higher dimensional space than the other (the subordinate ADP). Although the transformations in question are not bijections, we nonetheless show that the primary and subordinate ADPs have tight connections in terms of optimality.
This section is related to the dynamic programs associated with the modified Bellman equations we introduced in Chapter 5 of Sargent & Stachurski (2025). Relative to that theory, the exposition below is more concise and more general. As a result, we can easily cover additional variations on the MDP Bellman equation, of which there are many, as well as studying relationships between ADPs beyond the traditional MDP setting.
Let (V,S) and (V^,S^) be dynamical systems, where V and V^ are posets. In this setting, we call (V,S) and (V^,S^)strongly semiconjugate under F,G when there exist maps F:V→V^ and G:V^→V such that
An immediate implication of the definitions is: if either F or G is an order isomorphism, then the systems (V,S) and (V^,S^) are order conjugate. However, in the applications we consider, neither F nor G will be bijective.
Figure 5.1 helps to illustrate the difference between conjugacy and strong semiconjugacy.
Figure 5.1:Comparison of conjugacy and strong semiconjugacy
Like order conjugacy, strong semiconjugacy can be used to derive useful relationships between dynamical systems. The next lemma lists relationships that will be helpful when we turn to dynamic programming.
The next lemma extends Lemma 5.2.1 to cover order stability.
When we get to applications, our main aim will be to convert a given system (V,S) into a “nicer” system (V^,S^) and then learn about (V,S) by studying (V^,S^). In particular, we wish to (a) deduce the existence of a unique fixed point of (V,S) only by studying (V^,S^), and (b) compute this unique fixed point, working only with (V^,S^). The next theorem shows the way. It draws on Lemma 5.2.1 and Lemma 5.2.2 for translation of fixed points and order stability, and then adds a convergence result.
Notice that, in (5.35), we iterate only in the “nice” system (V^,S^), and finally transfer back to V using the mapping G.
Lemma 5.2.2 already tells us that order stability transfers between strongly semiconjugate systems when F and G are both order reversing. The next result parallels Theorem 5.2.3 for this case.
Note that the initial condition changes from w⪯S^w in Theorem 5.2.3 to S^w⪯w in Theorem 5.2.4. Starting above the fixed point in V^ and iterating down generates a sequence that, after applying the order-reversing map G, converges up to vˉ.
To illustrate strong semiconjugacy and its implications, we now study a firm entry problem from Fajgelbaum et al. (2017), slightly extended to allow discount rates to change over time. The basic structure of the model is similar to the firm problem we studied in Section 1.1. We show that the functional equation that describes firm value can be solved in lower-dimensional space and then mapped back to the original higher-dimensional space.
Analogous to (1.12) on page , we take the lifetime value of a firm to be a function v that solves
Here z is an exogenous state, taking values in Rk, the real number f represents an iid fixed cost, with distribution ϕ, and Q is a stochastic kernel for the exogenous state. The value s(z) represents the present value of profits for the firm if it chooses to enter in the current period. Discounting is implemented via a state-dependent factor β(z)>0.
Let E⊆R+ be the set where f takes values and let Z⊆Rk be the set where z takes values. Let X be the cartesian product Z×E. Let bX and bZ be real-valued bounded and Borel measurable functions on X and Z respectively. We endow both function spaces with the pointwise partial order ⩽ and the supremum norm.
In (iii), the process (Zt) is Q-Markov with initial condition z. Note that, in the traditional constant discount rate setting, the term in (iii) is just βn for some constant β∈(0,1), so the condition is automatically satisfied.
To solve (5.37) we introduce an operator S, defined at each v∈bX by
Evidently, v is a fixed point of S if and only if it solves the firm valuation equation (5.37).
Although we can study S directly, this fixed point problem can be solved in a lower-dimensional space by working with an alternative operator T:bZ→bZ defined at each w∈bZ by
Now we switch from studying dynamical systems to studying dynamic programs. This is straightforward because we view dynamic programs (ADPs) as families of dynamical systems.
A factored dynamic program (FDP) is a tuple (V,F,V^,G) where
V and V^ are nonempty posets,
F is a map from V to V^, and
G:={Gσ}σ∈Σ is a family of maps from V^ to V, and
the set {Gσv^}σ∈Σ has a greatest element for every v^∈V^.
If F and all Gσ are all order preserving, then we call (V,F,V^,G) an order preserving FDP. (In Section 5.2.3, we introduce order-reversing FDPs. This case is rarer, typically involving some form of nonstandard preferences.)
Given an FDP (V,F,V^,G), we introduce an operator family T:={Tσ:σ∈Σ} by setting
We call (V^,T^) the subordinate ADP generated by (V,F,V^,G). The figure below illustrates, with the numbers indicating the order in which mappings are applied. (The ordering of the tuple (V,F,V^,G) traces the primary cycle on the left.)
which is well-defined by (iv) in the definition of FDPs.
We now state some preliminary results concerning the ADPs generated by (V,F,V^,G).
Let (V,F,V^,G) be an order-preserving FDP with generated ADP (V,T) and subordinate ADP (V^,T^). We have already observed that the policy operators of (V,T) and (V^,T^) obey
for all σ∈Σ. It follows that each pair of policy systems (V,Tσ) and (V^,T^σ) is strongly semiconjugate under F,Gσ. As shown in Lemma 5.2.10, we also have
The strong semiconjugacy results in Lemma 5.2.11 imply helpful similarity properties for the two ADPs. The next lemma helps to illustrate.
Note that (5.52) implies that (V,T) and (V^,T^) are strongly semiconjugate under order-preserving F,G. This fact allows us to connect optima for the two ADPs and we use it repeatedly below.
Now we are ready to study the extent to which optimality properties are transferred under subordination. As before, the context is that (V,F,V^,G) is an order-preserving FDP with primary ADP (V,T) and subordinate ADP (V^,T^). The symbols T and T^ denote their respective Bellman operators. When they exist,
We begin with our main optimality result for order-preserving FDPs and the two ADPs they generate.
In the proof of Theorem 5.2.13, we repeatedly use the strong semiconjugacy results in Lemma 5.2.1 to transfer fixed points from one value space to another.
We continue to assume that (V,F,V^,G) is an order-preserving FDP with primary ADP (V,T) and subordinate ADP (V^,T^). We saw in Theorem 5.2.13 that the following implication holds: if σ is optimal for (V,T), then σ is optimal for (V^,T^). The converse is not in general true. Often this is because, at an intuitive level, the subordinate ADP is blind to policy behavior at states that are unreachable under the transition dynamics, while the primary ADP cares about every state. (An example of this scenario is given in Section 8.3.2.3.)
To obtain such a converse, we use a strict monotonicity condition on F. In what follows, ≺ refers to the strict inequality defined in Section A.1.2.8. In particular, for elements u,v, the statement u≺v means that u⪯v and not u=v. Also, F is strictly order preserving when F is order preserving and u≺v implies Fu≺Fv.
An order-reversing FDP is an FDP (V,F,V^,G) where F is order-reversing and each Gσ is order-reversing. As in the order-preserving case, Tσ:=Gσ∘F and T^σ:=F∘Gσ define the policy operators for the primary ADP (V,T) and subordinate ADP (V^,T^), respectively. Since F and Gσ are both order reversing, each Tσ and T^σ is order preserving, so these are valid ADPs.
Throughout this section,
(V,F,V^,G) is a given FDP,
(V,T) is the primary ADP, and
(V^,T^) is the subordinate ADP.
The primary ADP behaves just as in the order-preserving case:
The subordinate ADP is where the order-reversing case diverges. Because F reverses order, it converts the supremum G into an infimum, so F∘G gives the Bellman min-operator rather than the max-operator.
The policy operator identities Tσ=Gσ∘F and T^σ=F∘Gσ hold for any FDP, regardless of order properties, so each pair of policy systems (V,Tσ) and (V^,T^σ) is strongly semiconjugate under F,Gσ. Combining these with Lemma 5.2.15 and Lemma 5.2.16, the Bellman operators obey
The proof is immediate from (5.58). Since F and each Gσ are order reversing, the next exercise gives a parallel of Lemma 5.2.12.
We now study optimality for order-reversing FDPs. The primary ADP (V,T) is a standard maximization problem, while the subordinate ADP (V^,T^) is a minimization problem, with Bellman min-operator T^▽=F∘G. When they exist,
will denote the max-value function of (V,T) and the min-value function of (V^,T^), respectively.
The next result is the order-reversing analog of Theorem 5.2.13, extended to include a converse implication under a strict monotonicity condition. In the statement, F is called strictly order reversing when F is order reversing and u≺v implies Fv≺Fu.
In Section 5.3.1, we show that the standard MDP and its Q-factor variant are the primary and subordinate ADPs of a single order-preserving FDP, unifying results previously proved separately. In Section 5.3.2, we apply the same framework to connect discrete choice Bellman equations with the post-action value function formulation used in structural estimation. In Section 5.3.3, we revisit the Epstein--Zin model and use factored dynamic programs to obtain a lower-dimensional subordinate ADP that is more efficient to solve.
In Section 3.2.1 we discussed both MDPs and Q-factor MDPs, proving separately that they satisfy the fundamental optimality properties. Of course, these two models are connected. Here we unify the models by representing them as the primary and subordinate pairs of an FDP.
We work with the finite-state MDP environment described in Section 3.2.1. Using the primitives described there, we form an order-preserving FDP (V,F,V^,G) by setting
with V:=RX and V^:=RG. As required, F maps V to V^, and Gσ maps V^ to V, and both are order preserving. Also, fixing f∈V^ and choosing σ such that σ(x)∈argmaxa∈Γ(x)f(x,a), we verify the existence of a policy σ with Gτf⩽Gσf for all τ∈Σ. Hence (V,F,V^,G) is an order-preserving FDP, as claimed.
The primary ADP (V,T) generated by (V,F,V^,G) is produced by setting Tσ=Gσ∘F for each σ, which gives
This map T^σ is identical to the Q-factor policy operator Sσ we constructed in (3.14), on page . Thus, (V^,T^) is just the Q-factor ADP we examined in Section 3.2.1.2 (where the ADP was written as (RG,S)).
We already know that the fundamental optimality properties hold for both of these models, and that VFI, OPI, and HPI all converge. We took the time to prove these facts separately, in Section 3.2.1.1 and Section 3.2.1.2. Theorem 5.2.13 now tells us that one of these steps was unnecessary. Since these two ADPs are the primary and subordinate elements of the FDP (V,F,V^,G), establishing these facts for either one of the pairs is enough to establish them for the other.
In addition, we can use Theorem 5.2.13 to formally connect the value functions and optimal policies. In the next proposition, v∗ is the MDP value function and q∗ is the Q-factor value function.
In Section 5.2.2.4, we discussed the fact that the converse to (ii) fails to hold without additional conditions. In particular, to get the converse to (ii), we require that F is strictly order preserving. Here’s how that result looks in the present case. In the result, the statement that X has no isolated point under P means that there is no x′∈X such that P(x,a,x′)=0 for all (x,a)∈G.
In Section 4.2.3 we considered post-action value functions for discrete choice models in the context of structural estimation. Post-action value functions are a transformation of a more standard Bellman equation. Here we unify these two models through the lens of FDPs.
The setting we consider is the same as in Section 4.2.3.1. The state space X is a metric space. The action set A is finite. The set of policies Σ is all measurable maps from X to A and, correspondingly, G:=X×A. The reward function r∈RG is assumed to be bounded and Borel measurable, while P is a stochastic kernel from G to X.
We claim that (bX,F,bG,G) is an order-preserving FDP. Clearly F is an order-preserving map from bX to bG, while each Gσ is an order preserving map from bG to bX. Moreover, we proved in Exercise 4.8 that there exists a measurable map σ:X→A obeying
For any such σ we have Gτg⩽Gσg for all τ∈Σ. These facts confirm our claim.
The primary ADP (bX,TSE) for this model is the “natural” discrete choice version of the dynamic program. To describe it, we use Tσ=Gσ∘F to determine the policy operators, yielding
which is exactly the post-action value function Bellman equation we examined in (4.45).
We proved in Proposition 4.2.4 that, for this ADP (bG,T^SE), the fundamental optimality properties hold. Theorem 5.2.13 now implies that the same properties hold for (bX,TSE). It also tells us that any policy optimal for (bX,TSE) is also optimal for (bG,T^SE), and that a policy σ is optimal for (bX,TSE) whenever Gσg∗=Gg∗. (Here G=⋁σGσ and g∗ is the value function for (bG,T^SE), which is the unique solution to the post-action value function Bellman equation.)
In the present setting, this means we can compute an optimal policy for the primary ADP as follows:
Compute the unique (in bG) fixed point g∗ of the Bellman operator T^ corresponding to the subordinate (post-action value) ADP (bG,T^SE) and
Also, if the function F in (5.67) is strictly order preserving, then by Proposition 5.2.14, we can compute an optimal policy for (bX,TSE) by solving for an optimal policy for (bG,T^SE). The strictly order preserving property does not hold for F in general. For a sufficient condition when X is finite, see Example A.1.12.
We consider a special case of the Epstein--Zin model from Section 5.1.3 involving optimal savings with iid endowments. Using the FDP framework, we construct a subordinate ADP that operates on functions of wealth alone, rather than wealth and endowment jointly. This yields significant computational savings, which we quantify numerically.
In this section we consider a special case of the Epstein--Zin ADP (V,T) analyzed in Section 5.1.3. The special case concerns optimal savings in the presence of an iid endowment process. We will produce a subordinate ADP via a transformation reminiscent of the expected value transformation of an ordinary MDP in Section 5.3.2. We will see that this subordinate ADP is easier to analyze and solve.
We begin with a finite set W of possible wealth values and a finite set E of possible values for the endowment process. (Finiteness helps simplify the exposition and can be replaced by continuity and compactness conditions.) The Bellman equation takes the form
Here Γ(w,e)⊂W is the set of all feasible choices for next period wealth w′ given current wealth w and current endowment e. The new endowment e′ is drawn independently from distribution ϕ, which maps E into [0,1].
This model is a special case of the Epstein--Zin ADP from Section 5.1.3. To see this we set X:=W×E, with typical element x=(w,e). Let V be the order interval [v11/ν,v21/ν]⊂(0,∞)X defined in Section 5.1.3 (see (5.25)). With a=w′ and A=W, the Bellman equation (5.75) is a special case of (5.19).
To improve analysis we produce an order-preserving FDP where the primary ADP is the model just discussed and the subordinate ADP operates in a lower-dimensional state space. To do so we set V as above,
Both F and Gσ are order-preserving. Assuming that Γ(w,e) is nonempty at each (w,e)∈X, one easily verifies the existence, for each h∈V^, of a policy σ such that
σ(w,e)∈w′∈Γ(w,e)argmax{(1−β)r(w,w′,e)α+βh(w′)α}1/αfor all (w,e)∈X.
For this σ we have Gτh⩽Gσh for all τ∈Σ. Hence, with G={Gσ}σ∈Σ, the tuple (V,F,V^,G) is an order-preserving FDP.
Inspecting (5.79), we see that the corresponding Bellman equation is (5.75). Thus, the primary ADP (V,T) corresponds to the original problem we considered at the start of this section. Let’s now look at the subordinate problem.
The benefit of working with (V^,T^) is that T^σ acts on functions that depend only on w rather than on both w and e (as is the case for Tσ). These lower dimensional operations are significantly more efficient, even when E is relatively small.
Since (V,T) is a special case of the ADP discussed in Section 5.1.3, Proposition 5.1.13 implies that the fundamental optimality properties hold for (V,T). As a result, Theorem 5.2.13 implies that they also hold for (V^,T^). It also tells us that we can obtain an optimal policy for (V,T) by finding the value function v^∗ for (V^,T^) and then calculating a policy σ obeying Gσv^∗=Gv^∗. By the definition of Gσ in (5.77), this means that we solve for σ satisfying (5.78) after setting h=v^∗.
To compute v^∗, we can use Theorem 2.2.6, which tells us that Howard policy iteration applied to (V^,T^) converges to v^∗ in finitely many steps. Summarizing this analysis, an optimal policy for (V,T) can be computed via Algorithm 5.3.1.
Figure 5.2 shows w↦σ∗(w,e) for two values of e (smallest and largest) when σ∗ is the optimal policy, calculated using Algorithm 5.3.1. In the computation we set Γ(w,e)=[0,w] and r(w,s,e)=w−s+e. We chose α and ν to match the values used in Schorfheide et al. (2018).
In Figure 5.3 we display the relative speed gain from using the lower-dimensional model (V^,T^) instead of (V,T) across multiple choices of ∣W∣ and ∣E∣. The speed gain is the time required to solve an optimal policy for (V,T) using HPI applied to (V,T) (as in Theorem 2.2.6), divided by the time required to solve for the same optimal policy via Algorithm 5.3.1. The speed gain increases linearly in the size of E.
Figure 5.2:Optimal savings policy with Epstein--Zin preference
Figure 5.3:Speed gain from replacing (V,T) with subordinate model (V^,T^)
A recurring theme in applied dynamic programming has been the rearrangement of the Bellman equation into alternative functional forms in order to obtain analytical, computational, or statistical advantages. Familiar instances include reservation-wage and continuation-value formulations of optimal stopping problems, used as early as the job-search analysis of McCall (1970) and developed extensively in the real-options literature (see, e.g., Dixit & Pindyck (2012)); the expected and integrated value-function transformations used in structural estimation of discrete choice models, pioneered by Rust (1987); the Q-factor representations introduced for reinforcement learning by Watkins (1989); and lower-dimensional reformulations that integrate out independent or uncontrollable states. Systematic studies of such transformations can be found in Ma & Stachurski (2019) and Ma & Stachurski (2021). The order-theoretic framework adopted in this chapter, building on the abstract foundations of Sargent & Stachurski (2025), unifies and extends those analyses by treating the underlying maps in their own right rather than as recipes for individual Bellman equations. Closely related ideas in the more concrete setting of finite-state MDPs can be found in Chapter 5 of Sargent & Stachurski (2025).
The topological conjugacy results in Section 5.1.1.2 are applied in Section 8.3.4 of Chapter 8 to establish the equivalence of value function iteration and time iteration.
The applications in Section 5.3 unify topics treated in their own right elsewhere in this volume. References to the underlying literatures can be found in the corresponding chapter notes: see Section 3.3 for Q-factor representations and reinforcement learning, Section 4.3 for structural estimation of dynamic discrete choice models, and Section 1.6 for Epstein--Zin preferences. The optimal harvest model in Section 8.3.2 of Chapter 8 is another example of a factored dynamic program in the sense of Section 5.2.2.
The firm-entry application in Section 5.2 is adapted from Fajgelbaum et al. (2017), slightly extended to permit time-varying discounting; for related industry-dynamics models see Hopenhayn (1992). The parameter values used in Figure 5.2 follow Schorfheide et al. (2018).
Sargent, T. J., & Stachurski, J. (2025). Dynamic Programming: Finite States. Cambridge University Press.
Fajgelbaum, P. D., Schaal, E., & Taschereau-Dumouchel, M. (2017). Uncertainty traps. The Quarterly Journal of Economics, 132(4), 1641–1692.
Stachurski, J., & Zhang, J. (2021). Dynamic programming with state-dependent discounting. Journal of Economic Theory, 192, 105190.
Schorfheide, F., Song, D., & Yaron, A. (2018). Identifying long-run risks: A Bayesian mixed-frequency approach. Econometrica, 86(2), 617–654.
McCall, J. J. (1970). Economics of Information and Job Search. The Quarterly Journal of Economics, 84(1), 113–126.
Dixit, R. K., & Pindyck, R. S. (2012). Investment under uncertainty. In Investment Under Uncertainty. Princeton University Press.
Rust, J. (1987). Optimal replacement of GMC bus engines: An empirical model of Harold Zurcher. Econometrica, 999–1033.
Watkins, C. J. C. H. (1989). Learning from delayed rewards [Techreport]. PhD Thesis, King’s College, Cambridge United Kingdom.
Ma, Q., & Stachurski, J. (2019). Optimal timing of decisions: A general theory based on continuation values. Journal of Economic Dynamics and Control, 101, 62–81.
Ma, Q., & Stachurski, J. (2021). Dynamic programming deconstructed: Transformations of the Bellman equation and computational efficiency. Operations Research, 69(5), 1591–1607.
Sargent, T. J., & Stachurski, J. (2025). Dynamic Programs on Partially Ordered Sets. SIAM Journal on Optimization and Control, in press.
Hopenhayn, H. A. (1992). Entry, exit, and firm dynamics in long run equilibrium. Econometrica: Journal of the Econometric Society, 1127–1150.