Introduction
Let’s try to understand the gradient descent algorithm and its use cases in optimization.
The Problem Setup
First, we want to formulate an optimization problem, which typically involves a sufficiently “regular” function \(f: \mathbb R^{n} \rightarrow \mathbb R\) that we want to maximize or minimize over a “good” subset \(S\) of \(\mathbb R^{n}\). The exact notions of “regular” and “good” will be made precise later on.
\[ \textrm{Maximize} \ \ f(\boldsymbol{x}) \ \ , \ \ \boldsymbol{x}\in S \subseteq \mathbb R^{n} \]
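To make the setup concrete, here is a minimal Python sketch of one such problem in two dimensions. The objective `f`, the feasible set test `in_S`, the radius, and the grid are all hypothetical choices made for illustration; none of them come from the text above. The search is a plain brute-force scan over a grid, not gradient descent. It only shows what "maximize \(f\) over \(S\)" means.

```python
import math

def f(x):
    """Hypothetical objective: a concave quadratic whose peak is at (1, -2)."""
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)

def in_S(x, radius=3.0):
    """Hypothetical feasible set S: the closed disc of the given radius around the origin."""
    return math.hypot(x[0], x[1]) <= radius

# Coarse grid on [-3, 3] x [-3, 3] with step 0.1 (assumed resolution).
grid = [-3.0 + 0.1 * i for i in range(61)]

# Keep only the grid points that lie in S, then pick the one with the largest f.
candidates = [(a, b) for a in grid for b in grid if in_S((a, b))]
best = max(candidates, key=f)

print("approximate maximizer:", best)
print("objective value:", f(best))
```

A scan like this evaluates \(f\) at every candidate point, so its cost grows very quickly with the dimension. That cost is one reason to look for methods that use local information about \(f\) instead.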