This article provides a concise overview of optimal control theory and offers intuition behind the Pontryagin maximum principle.

Introduction

Optimal control theory is a branch of mathematical optimization that deals with finding control functions that can drive a dynamical system from one state to another while minimizing a cost function. This blog post aims to provide an intuitive understanding of the fundamental concepts.

Optimal Control Theory

Dynamics of the System

In continuous time, the dynamics of the system can be described by:

\[\dot{x}(t) = f(x(t),\alpha(t))\] \[x(0) = x^{0}\]

where \(x^{0}\) is the initial state and \(x(t)\) is the state of the system at time \(t\). Here \(x : [0, \infty) \to \mathbb{R}^{n}\). Given that \(A\) is the set of control parameters, the control function is \(\alpha : [0, \infty) \to A\).

The dynamics of the system are controlled by:

  • The initial state \(x^{0}\)
  • The sequence of control actions \(\alpha(t)\)
  • The laws governing the state transition
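To make these three ingredients concrete, here is a minimal numerical sketch that rolls out the dynamics with a forward-Euler step. The point-mass system and the constant control below are illustrative choices, not something fixed by the theory:

```python
import numpy as np

def simulate(f, x0, alpha, T=1.0, dt=1e-3):
    """Forward-Euler rollout of x'(t) = f(x(t), alpha(t)), x(0) = x0."""
    x = np.asarray(x0, dtype=float)
    trajectory = [x.copy()]
    for k in range(int(T / dt)):
        t = k * dt
        x = x + dt * f(x, alpha(t))   # one Euler step
        trajectory.append(x.copy())
    return np.array(trajectory)

# Illustrative example: a point mass on a line, state x = (position, velocity),
# with the control alpha(t) acting as the applied acceleration.
f = lambda x, a: np.array([x[1], a])
traj = simulate(f, x0=[0.0, 0.0], alpha=lambda t: 1.0, T=1.0)
# Under constant unit acceleration, the final position is roughly t^2/2 = 0.5.
```

Shrinking `dt` trades compute for accuracy; any off-the-shelf ODE integrator could replace the Euler step.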

Payoff

A control \(\alpha^{*}(\cdot)\) is considered optimal when it maximizes the payoff over all admissible controls:

\[P[\alpha^{*}(\cdot)] \geq P[\alpha(\cdot)] \quad \text{for all admissible } \alpha(\cdot)\]

where \(P[\alpha(\cdot)]\) is the payoff functional, defined by:

\[P[\alpha(·)] := \int_{0}^{T} r(\mathbf{x}(t), \alpha(t)) \, dt + g(\mathbf{x}(T))\]

Here \(r(\mathbf{x}(t), \alpha(t))\) represents the running payoff and \(g(\mathbf{x}(T))\) represents the terminal payoff.
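The payoff functional can be approximated numerically as a Riemann sum over the running payoff plus the terminal term. The particular \(r\), \(g\), and the idle trajectory below are toy choices made up for illustration:

```python
import numpy as np

def payoff(x_traj, alpha_traj, r, g, dt):
    """Approximate P[alpha] = integral_0^T r(x, alpha) dt + g(x(T))."""
    running = sum(r(x, a) for x, a in zip(x_traj[:-1], alpha_traj)) * dt
    return running + g(x_traj[-1])

# Toy setup: reward staying near the origin, penalize control effort,
# and add a terminal payoff at x(T).
dt, T = 1e-3, 1.0
n = int(T / dt)
x_traj = np.zeros((n + 1, 1))            # system parked at the origin
alpha_traj = np.zeros(n)                 # zero control throughout
r = lambda x, a: 1.0 - x[0]**2 - a**2    # running payoff
g = lambda x: -x[0]**2                   # terminal payoff
P = payoff(x_traj, alpha_traj, r, g, dt) # ~ integral_0^1 1 dt + 0 = 1
```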

Controllability

A system is controllable when there exists a sequence of control inputs that, under the state dynamics, steers it from the initial state \(x^{0}\) to the target.

Unique Solution

For the linear non-homogeneous system of ordinary differential equations (ODE) \(\dot{x}(t) = Mx(t) + N\alpha(t)\), \(x(0) = x^{0}\), the unique solution is given by variation of parameters:

\[x(t) = \mathbf{X}(t) {x}^0 + \mathbf{X}(t) \int_{0}^{t} \mathbf{X}^{-1}(s) N \alpha(s) \, ds\]

Here \(\mathbf{X}(t)\) is the fundamental solution:

\[\mathbf{X}(t) = e^{tM} := \sum_{k=0}^{\infty} \frac{t^k M^k}{k!}\]
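Both pieces can be sketched in code: the truncated power series for \(\mathbf{X}(t) = e^{tM}\), and a Riemann-sum approximation of the variation-of-parameters integral. The double-integrator matrices \(M\) and \(N\) below are an illustrative example, not part of the derivation:

```python
import numpy as np

def expm(A, terms=30):
    """Matrix exponential e^A via the truncated power series sum_k A^k / k!."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

def solve_linear(M, N, x0, alpha, T=1.0, dt=1e-3):
    """x(T) = X(T) x0 + X(T) * integral_0^T X(s)^{-1} N alpha(s) ds,
    with X(t) = e^{tM} and the integral taken as a Riemann sum."""
    integral = np.zeros(M.shape[0])
    for k in range(int(T / dt)):
        s = k * dt
        integral += np.linalg.inv(expm(s * M)) @ N @ alpha(s) * dt
    XT = expm(T * M)
    return XT @ x0 + XT @ integral

# Illustrative double integrator: x = (position, velocity).
M = np.array([[0.0, 1.0], [0.0, 0.0]])
N = np.array([[0.0], [1.0]])
x = solve_linear(M, N, x0=np.zeros(2), alpha=lambda s: np.array([1.0]), T=1.0)
# Constant unit control gives x(1) close to (0.5, 1.0).
```

In practice one would use a library routine (e.g. a Padé-based matrix exponential) rather than the raw series, but the series matches the definition above.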

Controllability Matrix

For any given optimal control problem, we define controllability matrix \(G\) as:

\[G = G(M,N) := [N, MN, M^{2}N, \ldots, M^{n-1}N]\]

Theorem: The system is controllable when the rank of \(G\) equals \(n\) and \(\operatorname{Re}\,\lambda \leq 0\) for each eigenvalue \(\lambda\) of \(M\).
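Building \(G\) block by block and checking the theorem's two conditions is straightforward; the double-integrator matrices below are an illustrative example:

```python
import numpy as np

def controllability_matrix(M, N):
    """G(M, N) = [N, MN, M^2 N, ..., M^{n-1} N], stacked column-wise."""
    blocks = [N]
    for _ in range(M.shape[0] - 1):
        blocks.append(M @ blocks[-1])
    return np.hstack(blocks)

def rank_condition(M, N):
    """Check rank G = n and Re(lambda) <= 0 for every eigenvalue of M."""
    G = controllability_matrix(M, N)
    full_rank = np.linalg.matrix_rank(G) == M.shape[0]
    stable = np.all(np.linalg.eigvals(M).real <= 0)
    return full_rank and bool(stable)

# Double integrator, controlled through the velocity channel.
M = np.array([[0.0, 1.0], [0.0, 0.0]])
N = np.array([[0.0], [1.0]])
print(rank_condition(M, N))   # True: rank G = 2 and both eigenvalues are 0
```

Swapping in \(N = (1, 0)^{T}\) (actuating only the position) drops the rank of \(G\) to 1 and the check fails, matching the intuition that such a system cannot steer the velocity.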

Observability

For linear systems, observability and controllability are dual concepts. A system is observable if knowledge of the observed output \(y(\cdot)\) on any time interval \([0, t]\) allows us to compute the initial state \(x^{0}\).
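By duality, observability can be checked with the transposed construction: stack \(C, CM, \ldots, CM^{n-1}\) and test the rank. Here \(C\) is a hypothetical observation matrix mapping the state to the measured output \(y\), introduced only for this sketch:

```python
import numpy as np

def observability_matrix(M, C):
    """O = [C; CM; ...; C M^{n-1}], stacked row-wise.
    Dual to controllability: O(M, C) has rank n iff G(M^T, C^T) does."""
    blocks = [C]
    for _ in range(M.shape[0] - 1):
        blocks.append(blocks[-1] @ M)
    return np.vstack(blocks)

# Hypothetical example: double integrator where only position is measured.
M = np.array([[0.0, 1.0], [0.0, 0.0]])
C = np.array([[1.0, 0.0]])              # y(t) = C x(t): position only
O = observability_matrix(M, C)
print(np.linalg.matrix_rank(O))         # 2: x^0 is determined by y(.)
```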
