10.7. Nonlinear Optimization

This video is a brief introduction to the broad field of optimization-based approaches for robot motion planning.

In this last video of Chapter 10, we consider a very different approach to motion planning, based on nonlinear optimization. The goal is to design a control history u of t, a trajectory q of t, and a trajectory duration capital T minimizing some cost functional J, such as the total energy consumed or the duration of the motion, such that the dynamic equations are satisfied at all times, the controls are feasible, the motion is collision free, and the trajectory takes the start state to the goal state.

We refer to J as the cost, and the last 5 lines are the constraints.

Nonlinear optimization requires an initial guess at the solution. Let's say that this control history is our initial guess. To turn the motion planning problem into a nonlinear optimization, we need a finite parameter representation of the control. There are many ways to do this; we could use a set of knot points interpolated by polynomials or splines, or piecewise constant controls, or piecewise linear controls. For this example, let's assume piecewise linear controls. Integrating the equations of motion, we get the robot's trajectory. This trajectory intersects obstacles and does not end at the goal configuration, so our initial guess is not a solution to the problem.

To evaluate the constraints on the motion due to obstacles, we can choose a finite set of test points along the trajectory and evaluate whether those points are collision free. The collision constraints can be expressed as constraints on the distances between the test points and the obstacles, where a positive distance means that there is no collision and a negative distance implies a collision.

To update our guess at the control history, we need to calculate the gradients of the constraints with respect to the trajectory. These gradients provide information on how the test points should move to satisfy the constraints. These gradients map through the gradient of the trajectory with respect to the controls to suggest a direction in which to modify our guess at the control history. We also use information on the gradient of the cost with respect to the controls in calculating the direction to update our control history. We then update the control history and integrate the new control to get an updated trajectory. Since our new trajectory still does not satisfy the constraints, we repeat the process of calculating a deformation direction for the trajectory then map this through the sensitivity of the trajectory with respect to the controls to get a direction to deform the control history. After taking a step in this direction in the control space, we integrate the equations of motion again to get the new trajectory. This process repeats until we find a trajectory that satisfies the constraints and locally minimizes the cost.

Returning to the problem formulation, and focusing on the first three lines, the method I just described is called shooting. With shooting, the design variables are the total time duration capital T and the parameters describing the control history. The trajectory is found by simulation of the equations of motion, ensuring that the dynamic constraints are satisfied. This method is called shooting, because designing the controls is like aiming a cannon. You see what happens when you fire and update your aim so that the goal is more closely achieved on the next try.

Another popular approach is called collocation, in which you simultaneously design the control history and the trajectory. Since you design both, you have to enforce that the controls and trajectory are consistent according to the dynamics. This is commonly done by ensuring that the equations of motion are satisfied at a finite set of test points.

The process of turning the problem statement into a standard finite parameter nonlinear optimization, which can be solved by techniques such as sequential quadratic programming, is called transcription. There are many ways you could choose to represent the controls, trajectory, cost, and constraints, and your choice will affect the performance of the optimization. One thing that is critical, though, is that you are able to calculate the gradients of the cost and the constraints with respect to your design variables, as these gradients guide the search through the design variable space. Ideally you would be able to calculate these gradients analytically, but failing that, you should be able to numerically evaluate them both efficiently and accurately.

Even with good gradient calculation, gradient-based nonlinear optimization is inherently a local method, and the optimization is prone to getting stuck in local minima, where the solution either does not globally minimize the cost or does not satisfy the constraints. Nonlinear optimization is a good choice when other methods can be used to provide a reasonable initial guess. For example, nonlinear optimization could be used to smooth a jerky motion found by an RRT. The RRT would handle the global search among clutter, while the nonlinear optimization would deform the RRT's solution to locally minimize the cost.

So this concludes Chapter 10. Motion planning is one of the most active subfields of robotics, but you should now have an understanding of the key concepts of some of the most popular approaches. In Chapter 11, on robot control, we study the problem of designing feedback controllers to drive robots along the trajectories produced by motion planners.