Notation

Mathematical Notation

$\mathbb{1}\{P\}$

indicator function (1 if statement $P$ is true, 0 otherwise)

$\alpha := 1$

$\alpha$ is defined as equal to 1

$f \equiv 1$

function $f$ is everywhere equal to 1

$\bigvee$ and $\bigwedge$

supremum and infimum (see Section A.1.2.4)

$\wp(A)$

the power set of $A$; that is, the set of all subsets of a given set $A$

$[n]$

$\{1, \ldots, n\}$

$\mathbb{C}$

the complex numbers

$\mathbb{N}$, $\mathbb{Z}$ and $\mathbb{R}$

the natural numbers, integers and real numbers respectively

$\mathbb{Z}_+$, $\mathbb{R}_+$, etc.

the nonnegative elements of $\mathbb{Z}$, $\mathbb{R}$, etc.

$|x|$

absolute value of scalar or vector $x$ (modulus if $x \in \mathbb{C}$)

$|B|$ for set $B$

the cardinality of $B$

$\mathbb{R}^n$

all $n$-tuples of real numbers

$\mathbb{R}^{m \times n}$

all $m \times n$ real matrices

$x \leq y \;\; (x, y \in \mathbb{R}^n)$

$x_i \leq y_i$ for $i = 1, \ldots, n$ (pointwise partial order)

$\mathbb{R}^{\mathsf{X}}$

all functions from $\mathsf{X}$ to $\mathbb{R}$

$b\mathsf{X}$

all bounded (or bounded measurable) functions in $\mathbb{R}^{\mathsf{X}}$ (see Example A.1.14)

$bc\mathsf{X}$

all continuous functions in $b\mathsf{X}$ (see Example A.1.17)

$f(n) = O(\beta^n)$

there exists $C < \infty$ with $f(n) \leq C \beta^n$ for all $n \in \mathbb{N}$

$\mathcal{D}(\mathsf{X})$

the set of Borel probability measures on $\mathsf{X}$ (see Section A.5.4)

$\mathcal{B}(E, F)$

the set of bounded linear operators from $E$ to $F$ (see Section A.4.3)

$\langle a, b \rangle$

the inner product of $a$ and $b$

$v_n \uparrow v$

$(v_n)$ is increasing and $\bigvee_n v_n = v$ (see Section A.1.2.6)

IID

independent and identically distributed

$X \overset{d}{=} Y$

$X$ and $Y$ have the same distribution

$X \sim F$

$X$ has distribution $F$

$F \preceq_{\mathrm{F}} G$

$F$ first order stochastically dominates $G$ (see Section A.5.5)
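
A few of the symbols above can be made concrete with a short Python sketch. The helper names (`indicator`, `natset`, `leq`, `pointwise_sup`, `pointwise_inf`) are hypothetical and chosen only for illustration. The sketch assumes that vectors in $\mathbb{R}^n$ are stored as tuples and that $\bigvee$ and $\bigwedge$ are taken with respect to the pointwise partial order, so that the supremum and infimum of finitely many vectors are their componentwise maximum and minimum.

```python
def indicator(statement):
    """1{P}: return 1 if the statement P is true, 0 otherwise."""
    return 1 if statement else 0


def natset(n):
    """[n] = {1, ..., n}."""
    return set(range(1, n + 1))


def leq(x, y):
    """Pointwise partial order on R^n: x <= y iff x_i <= y_i for every i."""
    return len(x) == len(y) and all(a <= b for a, b in zip(x, y))


def pointwise_sup(vectors):
    """Supremum of a finite, nonempty family of vectors (componentwise max)."""
    return tuple(max(column) for column in zip(*vectors))


def pointwise_inf(vectors):
    """Infimum of a finite, nonempty family of vectors (componentwise min)."""
    return tuple(min(column) for column in zip(*vectors))


x, y = (1.0, 2.0), (2.0, 1.0)
print(leq(x, y), leq(y, x))      # neither holds: the order is only partial
print(pointwise_sup([x, y]))     # componentwise maximum of x and y
print(pointwise_inf([x, y]))     # componentwise minimum of x and y
```

The pair `x`, `y` is not comparable under `leq`, which is why the order is called partial, yet both vectors lie between `pointwise_inf([x, y])` and `pointwise_sup([x, y])`.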

Dynamic Programming Notation and Terminology

$(V, \mathbb{T})$

an ADP with value space $V$ and policy operators $T_\sigma \in \mathbb{T}$ (see Section 2.1.1.1)

$v_\sigma$

a $\sigma$-value function; fixed point of $T_\sigma$ (see Section 2.1.1.1)

$T$

the Bellman operator, defined by $Tv = \bigvee_\sigma T_\sigma \, v$ (see (2.4))

$H$

the Howard operator, defined by $Hv = v_\sigma$ where $\sigma$ is $v$-greedy (see Section 2.2.1.1)

$W$

the optimistic policy operator (see (2.10))

$V_G$

all $v \in V$ with at least one $v$-greedy policy (see Section 2.1.1.4)

$V_U$

all $v \in V_G$ such that $v \preceq Tv$ (see Section 2.1.1.4)

$V_\Sigma$

the set of fixed points of the policy operators (see Section 2.1.1.4)

$\vmax$

the value function; greatest element of $V_\Sigma$ (see Section 2.1.2.1)

VFI

value function iteration (see Section 1.2.1.3 and Section 2.2.1)

OPI

optimistic policy iteration (see Section 1.2.1.3 and Section 2.2.1)

HPI

Howard policy iteration (see Section 1.2.1.3 and Section 2.2.1)
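
To see how the operators $T_\sigma$, $T$ and $H$ fit together, the following Python sketch works through a toy example. Everything model-specific here is an assumption made for illustration and is not taken from the text: the two-state, two-action model, the numbers in `r` and `P`, the discount factor `beta`, and the form $T_\sigma v = r_\sigma + \beta P_\sigma v$ of the policy operators (a standard finite Markov decision process, which is only one instance of an ADP). The function names are likewise hypothetical. The supremum over policies is computed by brute-force enumeration, and $v_\sigma$ is approximated by iterating $T_\sigma$ to a tolerance.

```python
from itertools import product

# Hypothetical two-state, two-action model; all numbers are illustrative.
beta = 0.9
states = (0, 1)
actions = (0, 1)
r = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}    # r[x, a]
P = {(0, 0): (1.0, 0.0), (0, 1): (0.0, 1.0),
     (1, 0): (1.0, 0.0), (1, 1): (0.0, 1.0)}                  # P[x, a][y]

# A policy sigma is a tuple with sigma[x] = action chosen in state x.
policies = list(product(actions, repeat=len(states)))


def T_sigma(sigma, v):
    """Assumed policy operator: (T_sigma v)(x) = r(x, a) + beta * sum_y P(x, a, y) v(y)."""
    return tuple(
        r[x, sigma[x]]
        + beta * sum(p * v[y] for y, p in enumerate(P[x, sigma[x]]))
        for x in states
    )


def T(v):
    """Bellman operator: pointwise supremum of T_sigma v over all policies."""
    return tuple(max(col) for col in zip(*(T_sigma(s, v) for s in policies)))


def greedy(v):
    """Return a v-greedy policy, i.e. some sigma with T_sigma v = T v."""
    Tv = T(v)
    return next(s for s in policies if T_sigma(s, v) == Tv)


def v_sigma(sigma, tol=1e-12):
    """sigma-value function: fixed point of T_sigma, approximated by iteration."""
    v = (0.0,) * len(states)
    while True:
        new_v = T_sigma(sigma, v)
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def H(v):
    """Howard operator: Hv = v_sigma for a v-greedy policy sigma."""
    return v_sigma(greedy(v))


v = (0.0, 0.0)
for _ in range(3):       # HPI: iterate the Howard operator
    v = H(v)

w = (0.0, 0.0)
for _ in range(500):     # VFI: iterate the Bellman operator
    w = T(w)

print(v, w, greedy(v))
```

In this toy model both iterations approach the same function, with HPI needing far fewer steps than VFI. Enumerating every policy is only feasible for very small models; it is used here because it mirrors the definition $Tv = \bigvee_\sigma T_\sigma \, v$ literally.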