2. Nonlinear Solvers
Thus far, our examples have been focused on defining explicit mathematical relationships of the form \(y = f(x)\). CSDL also provides an interface to define implicit relationships between variables of the form \(f(x,y) = 0\) where \(x\) represents parameters (inputs) and \(y\) represents states (outputs). Iterative numerical methods (nonlinear solvers) are often used to solve for \(y\) given \(x\) by perturbing the values of \(y\) to drive \(r = f(x,y)\) to zero.
In this tutorial, we will show you how to specify this using CSDL’s ImplicitVariable and nonlinear solver library.
We will use a simple nonlinear system of equations as our example (from https://engcourses-uofa.ca/books/numericalanalysis/nonlinear-systems-of-equations/fixed-point-iteration-method/):
\(2a = {x_1}^2 + {x_1}{x_2}\)
\(57 = {x_2}+3{x_1}{x_2}^2\)
where \(a\) is a parameter and \(x_1\), \(x_2\) are the states. We can rewrite this in the form \(r = f(x, y)\):
\(r_1 = {x_1}-\sqrt{2*a-{x_1}{x_2}}\)
\(r_2 = {x_2}-\sqrt{\frac{57-{x_2}}{3{x_1}}}\)
where \(r_1\) and \(r_2\) represents the residuals to drive to zero. In order to define this implicit relationship in CSDL, we start by initializing our state variables using CSDL’s ImplicitVariable class and the parameter ‘a’ can be initialized like any standard CSDL variable. We then compute the residuals explicitly from the state variables and parameters using CSDL operations.
import csdl_alpha as csdl
recorder = csdl.Recorder(inline=True)
recorder.start()
x1 = csdl.ImplicitVariable(name='x1', value=1.5) # States must be ImplicitVariable objects
x2 = csdl.ImplicitVariable(name='x2', value=3.5)
a = csdl.Variable(name='a', value=5.0) # Parameters can be standard variable objects
# compute the residuals
x1x2 = x1*x2
residual_1 = x1-csdl.sqrt(2*a-x1x2)
residual_2 = x2-csdl.sqrt((57-x2)/(3*x1))
# add names to the variables for later
x1x2.add_name('x1x2')
residual_1.add_name('residual_1')
residual_2.add_name('residual_2')
print('residual_1:', residual_1.value) # The computed values of the residuals are not zero yet.
print('residual_2:', residual_2.value)
residual_1: [-0.67944947]
residual_2: [0.05197319]
At this point in the code, we have computed the states and residuals but haven’t specified any coupling between them. We can do this by using CSDL’s nonlinear solvers. We currently support the following nonlinear solver:
csdl.nonlinear_solvers.Newton()csdl.nonlinear_solvers.GaussSeidel()csdl.nonlinear_solvers.Jacobi()csdl.nonlinear_solvers.BracketedSearch()
In this example, we initialize a GaussSeidel solver and give the solver a name ‘nlsolver_x1_x2’. Finally, we can assign states to residuals using the the nonlinear solver’s add_state method.
solver = csdl.nonlinear_solvers.GaussSeidel('nlsolver_x1_x2')
solver.add_state(x1, residual_1) # Specify that the residual of the state x1 is residual_1
solver.add_state(x2, residual_2) # Specify that the residual of the state x2 is residual_2
Once the states and residual pairs have been assigned to a solver, call the run method to finalize the implicit operation. This step builds an implicit operation node in the graph by analyzing the computational graph built thus far (this is important for later). If inline = True was set when building the recorder, this is the point when the nonlinear solver is ran.
solver.run()
# The solved states
print('state x1: ', x1.value)
print('state x2: ', x2.value)
# The solved residuals
print('residual x1: ', residual_1.value)
print('residual x2: ', residual_2.value)
# intermediate values also get updated after the solver.run() call
print('x1x2: ', x1x2.value)
recorder.stop()
nonlinear solver: nlsolver_x1_x2 converged in 21 iterations.
state x1: [2.]
state x2: [3.]
residual x1: [-3.80129261e-11]
residual x2: [2.52153853e-12]
x1x2: [6.]
It can be assumed that any implicit variables (x1 and x2) represent the solved state at any point of the code and can be used like any other variable. If running inline, the state variables and any variables that depend on them are updated, such as x1x2 = x1*x2 defined earlier. We can see the nesting of graphs using recorder.display_graphs().
recorder.print_graph_structure()
graph
|----nlsolver_x1_x2
| |----sub
| | `----(+2 ops)
| |----sub
| | `----(+2 ops)
| |----sub
| | `----(+2 ops)
| |----sub
| | `----(+2 ops)
| `----(+5 ops)
`----(+1 ops)
This shows that within the top-level graph, there exists another nested graph called ‘nl_solver_x1_x2’ which is contained within the implicit operation node.
2.1. Under the Hood
Because CSDL keeps track of all relationships between variables under the hood, it automatically edits the graph and creates an implicit operation node when the .run() method is called. The updated graph is then executed to update all the variables.

The figure above shows the computational graph right before .run() is called and right after. On the left, the graph represents only the explicit calculation of the residuals because CSDL the coupling has not yet been set. On the right is the graph that represents the proper coupling.
2.2. Nonlinear Solver Hierarchies
More complicated models often require multiple nonlinear solvers. The structure of the nonlinear solvers and the interfacing between each other can have a significant impact on the performance of the numerical method. Lets see how we solve the same problem but with two separate nonlinear solvers instead of one.
recorder = csdl.Recorder(inline=True)
recorder.start()
x1 = csdl.ImplicitVariable(name='x1', value=1.5) # States must be ImplicitVariable objects
x2 = csdl.ImplicitVariable(name='x2', value=3.5)
a = csdl.Variable(name='a', value=5.0) # Parameters can be standard variable objects
# compute the residuals
x1x2 = x1*x2
residual_1 = x1-csdl.sqrt(2*a-x1x2)
residual_2 = x2-csdl.sqrt((57-x2)/(3*x1))
# add names to the variables for later
x1x2.add_name('x1x2')
residual_1.add_name('residual_1')
residual_2.add_name('residual_2')
print('residual_1:', residual_1.value) # The computed values of the residuals are not zero yet.
print('residual_2:', residual_2.value)
# Move the states to two different solvers
solver = csdl.nonlinear_solvers.GaussSeidel('nlsolver_x1_inner')
solver.add_state(x1, residual_1) # Specify that the residual of the state x1 is residual_1
solver.run()
solver = csdl.nonlinear_solvers.GaussSeidel('nlsolver_x2_outer')
solver.add_state(x2, residual_2) # Specify that the residual of the state x2 is residual_2
solver.run()
# The solved states
print('state x1: ', x1.value)
print('state x2: ', x2.value)
# The solved residuals
print('residual x1: ', residual_1.value)
print('residual x2: ', residual_2.value)
# intermediate values also get updated after the solver.run() call
print('x1x2: ', x1x2.value)
residual_1: [-0.67944947]
residual_2: [0.05197319]
nonlinear solver: nlsolver_x1_inner did not converge in 100 iterations.
state 0
name: x1
value: [1.8635385]
residual: [-0.00129802]
nonlinear solver: nlsolver_x1_inner did not converge in 100 iterations.
state 0
name: x1
value: [1.8635385]
residual: [-0.00129802]
nonlinear solver: nlsolver_x1_inner converged in 93 iterations.
nonlinear solver: nlsolver_x1_inner converged in 93 iterations.
nonlinear solver: nlsolver_x1_inner converged in 81 iterations.
nonlinear solver: nlsolver_x1_inner converged in 81 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x1_inner converged in 79 iterations.
nonlinear solver: nlsolver_x2_outer converged in 13 iterations.
state x1: [2.]
state x2: [3.]
residual x1: [-8.55122639e-11]
residual x2: [2.51096921e-11]
x1x2: [6.]
We can see that the nonlinear solvers converged to the same values as our previous example. However, we used two nonlinear solvers instead of one.
Here are what the computational graphs looked like before the first run call, right after the second run call and right after the third run call.

Because we ran the solver ‘nlsolver_x2_outer’ after the solver ‘nlsolver_x1_inner’, the solver ‘nlsolver_x1_inner’ ended up being nested in the solver ‘nlsolver_x2_outer’. We can see this in the graph hierarchy.
recorder.print_graph_structure()
print()
csdl.save_all_variables()
csdl.inline_export('nonlinear_solver_example', summary_csv = True, do_print=True)
recorder.stop()
graph
|----nlsolver_x2_outer
| |----nlsolver_x1_inner
| | |----sub
| | | `----(+2 ops)
| | |----sub
| | | `----(+2 ops)
| | `----(+2 ops)
| |----sub
| | `----(+2 ops)
| |----sub
| | `----(+2 ops)
| `----(+3 ops)
`----(+1 ops)
Variable Min Max Mean Shape Graphs
x1 1.999999999929445 1.999999999929445 1.999999999929445 (1,) graph,nlsolver_x2_outer,nlsolver_x1_inner,sub
x2 3.000000000075917 3.000000000075917 3.000000000075917 (1,) graph,nlsolver_x2_outer,nlsolver_x1_inner,sub,sub
a 5.0 5.0 5.0 (1,) graph
x1x2 5.99999999994017 5.99999999994017 5.99999999994017 (1,) graph,nlsolver_x2_outer,nlsolver_x1_inner,sub
variable_0 2.0 2.0 2.0 (1,) graph
variable_1 10.0 10.0 10.0 (1,) graph,nlsolver_x2_outer,nlsolver_x1_inner,sub
variable_2 4.00000000005983 4.00000000005983 4.00000000005983 (1,) graph,nlsolver_x2_outer,nlsolver_x1_inner,sub
variable_3 2.0000000000149574 2.0000000000149574 2.0000000000149574 (1,) graph,nlsolver_x2_outer,nlsolver_x1_inner,sub
residual_1 -8.551226393649358e-11 -8.551226393649358e-11 -8.551226393649358e-11 (1,) graph,nlsolver_x2_outer,nlsolver_x1_inner,sub
variable_4 57.0 57.0 57.0 (1,) graph,nlsolver_x2_outer,sub
variable_5 53.999999999924086 53.999999999924086 53.999999999924086 (1,) graph,nlsolver_x2_outer,sub
variable_6 3.0 3.0 3.0 (1,) graph,nlsolver_x2_outer
variable_7 5.9999999997883355 5.9999999997883355 5.9999999997883355 (1,) graph,nlsolver_x2_outer
variable_8 9.000000000304844 9.000000000304844 9.000000000304844 (1,) graph,nlsolver_x2_outer
variable_9 3.0000000000508074 3.0000000000508074 3.0000000000508074 (1,) graph,nlsolver_x2_outer,sub
residual_2 2.510969210334224e-11 2.510969210334224e-11 2.510969210334224e-11 (1,) graph,nlsolver_x2_outer,sub
variable_10 sub
variable_11 sub
variable_12 sub
variable_13 sub