General methods for MDO
A unified theory for computing derivatives
Reference: [Martins and Hwang, AIAA Journal, 2013]
As explained in the Background page, derivatives play an important role in enabling efficient large-scale optimization. Specifically, we must be able to efficiently and accurately compute the derivatives of the objective and constraint functions with respect to all the design variables.
Existing derivative computation methods fall into four categories:
Monolithic methods: these treat the entire model monolithically, as a black box, and approximate the derivatives by running the model at perturbed points and doing simple algebra on the results. We interpret the model as a single vector function and apply either the finite-difference method or the complex-step method (a sketch of both appears after this list).
Algorithmic differentiation: here, we use software that parses our model's source code and differentiates it line-by-line, treating the variable assignment in each line as the definition of a new variable.
Semi-analytic methods: here, we assume that the model internally contains a system of equations. We linearize the residual function for this system of equations, and solve a linear system with the Jacobian matrix (direct method) or its transpose (adjoint method). For the direct and adjoint methods, the derivative computation time is independent of the number of output and input variables, respectively.
Coupled methods: here, we assume that the model contains multiple disciplines that are coupled. If each discipline takes the form of an explicit function, we call this the functional form; if each is an implicit function, we call this the residual form.
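To make the monolithic category concrete, the sketch below approximates the Jacobian of an illustrative black-box model with both the finite-difference and complex-step methods; the model function, step sizes, and names are assumptions made for this example rather than anything taken from the references.

```python
import numpy as np

# Illustrative "black-box" model: a vector function mapping inputs x to outputs f(x).
def model(x):
    return np.array([x[0]**2 + np.sin(x[1]), x[0] * x[1]])

def fd_jacobian(f, x, h=1e-7):
    """Forward finite differences: one extra model run per input variable."""
    f0 = f(x)
    J = np.zeros((f0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += h
        J[:, j] = (f(xp) - f0) / h
    return J

def cs_jacobian(f, x, h=1e-30):
    """Complex step: no subtractive cancellation, so the step can be tiny."""
    J = np.zeros((f(x).size, x.size))
    for j in range(x.size):
        xp = x.astype(complex)
        xp[j] += 1j * h
        J[:, j] = f(xp).imag / h
    return J

x0 = np.array([1.0, 2.0])
print(fd_jacobian(model, x0))
print(cs_jacobian(model, x0))
```

Both approximations cost one model evaluation per input variable, which is why monolithic methods scale poorly with the number of design variables; the complex-step method avoids the subtractive cancellation that limits the accuracy of finite differences.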
We can unify the methods from all four categories by formulating the model as a nonlinear system of equations R(u)=0. Each category corresponds to a different choice of unknowns and residuals in this system: a different level of decomposition of the model. Applying the inverse function theorem to this nonlinear system shows that the inverse of the Jacobian of R is equal to the Jacobian of the inverse of R. The Jacobian of R is the matrix of the partial derivatives of each discipline or component of the model; the Jacobian of the inverse of R is the matrix of total derivatives. From this, we derive the unifying derivative equation (UDE), which is given below.
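One way to write the UDE, following the notation of [Martins and Hwang, AIAA Journal, 2013], is

```latex
\frac{\partial R}{\partial u} \, \frac{\mathrm{d}u}{\mathrm{d}r}
  \;=\; \mathcal{I}
  \;=\; \frac{\partial R}{\partial u}^{T} \, \frac{\mathrm{d}u}{\mathrm{d}r}^{T} ,
```

where r denotes the value of the residual vector, so that u(r) is defined implicitly by R(u) = r, and du/dr is the matrix of total derivatives.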
Solving the linear system given by the left-hand equality yields the derivatives of all variables with respect to a single model input, and solving the linear system given by the right-hand equality yields the derivatives of a single model output with respect to all model inputs. Since a large-scale optimization problem typically has many more design variables (inputs) than objective and constraint functions (outputs), the right-hand equality is usually the better choice.
As shown below, all of the existing derivative computation methods can be derived from this equation using the right choice of R and u. Each choice corresponds to a different level of decomposition of the model: at one extreme, every line of code inside the model defines an unknown; at the other extreme, only the model inputs and model outputs are included. Note that below, we use slightly different notation: C(v) in place of R(u). In all cases, x refers to model inputs and f refers to model outputs.
Derivation of the monolithic methods, AD, semi-analytic methods, coupled functional method, and coupled residual method from the UDE [Martins and Hwang, AIAA Journal, 2013]
A computational architecture for large-scale MDO
Reference: [Hwang and Martins, ACM TOMS, 2018]
The UDE can be used as the basis of a software framework for large-scale MDO. In general, software frameworks for computational modeling decompose a large, multidisciplinary model into smaller, more manageable components so that the implementation effort is reduced. For large-scale MDO, derivatives of the multidisciplinary model are needed, and the framework can use the UDE to centrally compute total derivatives of the model given the partial derivatives of the constituent components.
We can design a software framework in the following way. All the outputs of all the components in the model comprise the vector u, and the equations that define how to compute those outputs comprise the residual function R. In other words, each component or discipline 'owns' a subset of the equations in the nonlinear system, R(u)=0. It then follows that each component or discipline contributes a subset of the rows of the Jacobian of R, which appear as columns when we solve the UDE in the transpose form (right-hand equality). In the context of computing total derivatives of the model, this means that when we build a component within a model, we contribute part of the Jacobian of R. The software framework then assembles the parts from the various components and solves the linear system from the UDE to compute the objective gradient and constraint Jacobian, which we need for gradient-based optimization. This approach is called the modular analysis and unified derivatives (MAUD) architecture.
MAUD is the idea of building a software framework for MDO using the UDE, where each component contributes a subset of R and u. [Hwang and Martins, ACM TOMS, 2018]
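As a toy illustration of this assembly-and-solve idea (a sketch of the concept, not the actual implementation of any framework), the snippet below builds the Jacobian of R from the partial-derivative blocks of three simple 'components' and solves both the forward and the transposed forms of the UDE for the total derivative df/dx; the model equations, the matrix A, and all names are assumptions made for the example.

```python
import numpy as np

# Toy model with unknowns u = (x, y, f) and residuals
#   R_x = x - x0          (design inputs held at x0)
#   R_y = y - A @ x       (component 1: explicit analysis)
#   R_f = f - y @ y       (component 2: scalar output)
# Each "component" supplies its block of partial derivatives; the
# "framework" assembles dR/du and solves the UDE for total derivatives.
nx, ny = 3, 4
rng = np.random.default_rng(0)
A = rng.standard_normal((ny, nx))
x0 = rng.standard_normal(nx)
y0 = A @ x0

n = nx + ny + 1
dRdu = np.zeros((n, n))
dRdu[:nx, :nx] = np.eye(nx)                # dR_x/dx
dRdu[nx:nx + ny, :nx] = -A                 # dR_y/dx
dRdu[nx:nx + ny, nx:nx + ny] = np.eye(ny)  # dR_y/dy
dRdu[-1, nx:nx + ny] = -2.0 * y0           # dR_f/dy
dRdu[-1, -1] = 1.0                         # dR_f/df

# Forward (left-hand) form: one solve per input gives d(all unknowns)/dx_j.
du_dx = np.linalg.solve(dRdu, np.eye(n)[:, :nx])
df_dx_forward = du_dx[-1, :]

# Transposed (right-hand) form: one solve per output gives df/d(all inputs).
psi = np.linalg.solve(dRdu.T, np.eye(n)[:, -1])
df_dx_adjoint = psi[:nx]

# Both agree with the analytic result df/dx = 2 * y0^T A.
print(np.allclose(df_dx_forward, 2.0 * y0 @ A))
print(np.allclose(df_dx_adjoint, 2.0 * y0 @ A))
```

The forward form requires one linear solve per model input, whereas the transposed (adjoint) form requires one solve per model output, mirroring the discussion of the two equalities above.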
In the MAUD architecture, we use a parallel, hierarchical solution strategy to solve the nonlinear system R(u)=0 and the linear system given by the UDE. Since both R(u)=0 and the UDE are large systems of equations composed of parts originating from separate components, we take advantage of the hierarchical structure in the dependencies between the variables and block-solve the systems accordingly. If the groups of variables can be evaluated sequentially, we apply a single iteration of the nonlinear and linear block Gauss-Seidel solvers. If we wish to parallelize across variables that are fully decoupled and assign each group of variables to its own group of processors, we apply a single iteration of the nonlinear and linear block Jacobi solvers. If the variables are coupled and no simplification is available, we can use a monolithic solver such as Newton's method in the nonlinear case and a Krylov subspace solver or a direct solver in the linear case. In this manner, MAUD casts the tasks of running the multidisciplinary simulation and computing the model derivatives as the solution of a nonlinear system and a linear system, respectively.
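For instance, a nonlinear block Gauss-Seidel sweep over a toy two-discipline coupling might look like the sketch below (the discipline equations and names are invented for illustration); a block Jacobi variant would evaluate both disciplines from the previous iterate so that they can run in parallel on separate processors.

```python
import numpy as np

# Two coupled "disciplines" (made up for illustration):
#   discipline 1:  y1 = x + 0.3 * sin(y2)
#   discipline 2:  y2 = 2 * x - 0.5 * cos(y1)
def solve_gauss_seidel(x, tol=1e-12, max_iter=100):
    """Nonlinear block Gauss-Seidel: sweep the disciplines in sequence,
    always using the most recently updated values, until the coupling converges."""
    y1, y2 = 0.0, 0.0
    for _ in range(max_iter):
        y1_new = x + 0.3 * np.sin(y2)
        y2_new = 2.0 * x - 0.5 * np.cos(y1_new)  # uses the updated y1
        if abs(y1_new - y1) + abs(y2_new - y2) < tol:
            return y1_new, y2_new
        y1, y2 = y1_new, y2_new
    return y1, y2

print(solve_gauss_seidel(1.0))
```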
The MAUD architecture is used by NASA's OpenMDAO software framework for MDO, which is Python-based, open-source, and freely available. OpenMDAO can be viewed as an implementation of MAUD, since it computes total derivatives using the UDE. OpenMDAO is well documented and well supported, as it is continuously developed and maintained by a team of engineers and computer scientists at NASA Glenn Research Center.
Visualization of a multidisciplinary OpenMDAO model of an electric aircraft for on-demand mobility. [Hwang and Ning, AIAA 2018-1384]
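As a brief illustration of how a user-written component supplies its partial derivatives and lets the framework compute totals, here is a minimal sketch using OpenMDAO's public API (assuming a recent OpenMDAO release; the paraboloid function is a standard introductory example, not a model from the references above):

```python
import openmdao.api as om

class Paraboloid(om.ExplicitComponent):
    """f = (x - 3)^2 + x*y + (y + 4)^2 - 3, with analytic partials."""

    def setup(self):
        self.add_input('x', val=0.0)
        self.add_input('y', val=0.0)
        self.add_output('f', val=0.0)
        self.declare_partials('f', ['x', 'y'])

    def compute(self, inputs, outputs):
        x, y = inputs['x'], inputs['y']
        outputs['f'] = (x - 3.0)**2 + x * y + (y + 4.0)**2 - 3.0

    def compute_partials(self, inputs, partials):
        x, y = inputs['x'], inputs['y']
        partials['f', 'x'] = 2.0 * (x - 3.0) + y
        partials['f', 'y'] = x + 2.0 * (y + 4.0)

prob = om.Problem()
prob.model.add_subsystem('parab', Paraboloid(), promotes=['*'])
prob.setup()
prob.set_val('x', 3.0)
prob.set_val('y', -4.0)
prob.run_model()
# The framework assembles the component partials and solves the UDE-based
# linear system to return the totals needed by a gradient-based optimizer.
print(prob.compute_totals(of=['f'], wrt=['x', 'y']))
```

Here, compute_partials provides the component's block of the Jacobian of R, and compute_totals performs the total-derivative linear solve centrally, as described above.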