
The line search

Although inspired by the algorithms in [23,73], the line search implemented in JETNET 3.0 is somewhat unorthodox. In contrast to traditional line searches, which start out at one point, evaluate the function at a trial point, and then move back to the original point, the line search in JETNET 3.0 does not return to its original position after evaluating the error function. This can result in somewhat confusing behaviour, since JETNET 3.0 outputs the error value at its current position in weight space. Hence there is no cause for alarm if the error value fluctuates during learning; it is the minimum error achieved during the learning that matters. However, this also means that JETNET 3.0 must be told when the user wants to stop training, so that it can move back to the position with the minimum error found so far. This is done by setting MSTJN(5) to 8 or 14, depending on whether CG or SCG is used, and continuing training until the value of MSTJN(5) changes to 9, which signals that JETNET 3.0 has moved to the best configuration found so far and stopped.
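
This stopping protocol can be sketched in a few lines of code. The following is Python-style pseudocode illustrating only the control flow; the dictionary mstjn and the function train_one_epoch are hypothetical stand-ins for the Fortran MSTJN array and one training pass, and are not part of the actual JETNET interface.

    import random

    mstjn = {5: 0}   # stand-in for the Fortran MSTJN array (1-based keys)

    def train_one_epoch():
        # Hypothetical stand-in for one JETNET training pass; the real
        # package sets MSTJN(5) to 9 once it has moved back to the best
        # weight configuration found so far.
        if random.random() < 0.2:   # pretend JETNET finished winding down
            mstjn[5] = 9

    using_scg = True
    mstjn[5] = 14 if using_scg else 8   # request a stop: 8 for CG, 14 for SCG

    while mstjn[5] != 9:   # 9 signals "at best configuration, stopped"
        train_one_epoch()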

In coarse outline, the line search works as follows: First it computes the error at the initial position and the gradient along the search direction. It then computes the error at two subsequent positions along the search direction, with the first step equal to the learning rate PARJN(1) and the second step computed from a parabolic fit using the gradient information. From these three error values it makes a new parabolic fit and moves to the predicted minimum position. Such parabolic steps are then repeated until the line search finds a satisfactory minimum or until it has used up the prespecified number of trial steps. In the latter case the line search is restarted with a new value for PARJN(1). If the line search does not find a satisfactory minimum even within the prespecified number of restarts, the whole CG process is restarted from the current minimum position, using a rescaled PARJN(1).
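
To make the parabolic steps concrete, here is a minimal Python sketch of the technique; it is an illustration of the general idea, not JETNET's actual Fortran implementation. Here f(r) is assumed to return the error at distance r along the search direction, g0 is the directional derivative at the starting point, and eta plays the role of PARJN(1); the convergence tests described below and the restart logic are omitted.

    def parabola_min_from_gradient(e0, g0, r1, e1):
        # Minimum of the parabola through (0, e0) with slope g0 at r = 0
        # and passing through (r1, e1).
        c = (e1 - e0 - g0 * r1) / r1**2
        if c <= 0.0:              # no upward curvature: just double the step
            return 2.0 * r1
        return -g0 / (2.0 * c)

    def parabola_min_three_points(ra, ea, rb, eb, rc, ec):
        # Minimum of the parabola through three (step, error) pairs.
        num = (rb - ra)**2 * (eb - ec) - (rb - rc)**2 * (eb - ea)
        den = (rb - ra) * (eb - ec) - (rb - rc) * (eb - ea)
        if den == 0.0:            # degenerate fit: stay at the middle point
            return rb
        return rb - 0.5 * num / den

    def line_search(f, g0, eta, max_steps=10):
        e0 = f(0.0)
        r1, e1 = eta, f(eta)                        # first trial step
        r2 = parabola_min_from_gradient(e0, g0, r1, e1)
        e2 = f(r2)                                  # second trial step
        best = min((0.0, e0), (r1, e1), (r2, e2), key=lambda p: p[1])
        for _ in range(max_steps):                  # repeated parabolic steps
            r3 = parabola_min_three_points(0.0, e0, r1, e1, r2, e2)
            e3 = f(r3)
            best = min(best, (r3, e3), key=lambda p: p[1])
            r1, e1, r2, e2 = r2, e2, r3, e3         # keep the two newest points
        return best                                 # (step, error) at the minimum

Since a parabola fits a quadratic exactly, a test such as line_search(lambda r: (r - 1.0)**2, g0=-2.0, eta=0.1) lands on the minimum at r = 1 already at the first gradient-based fit.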

The most important control parameters for the line search are MSTJN(35), MSTJN(36), PARJN(24), and PARJN(25). The first two control the number of iterations and the number of restarts allowed in searching for the minimum, while the latter two set the convergence criteria. The default value for both MSTJN(35) and MSTJN(36) is 10, which works fine for toy problems but needs to be increased for real problems. The convergence criteria are

\[
E(r) \le E_0 + \epsilon_1\, r\, g_0 , \qquad
|r - r_{\min}| \le \epsilon_2\, r_{\min} ,
\]

where $\epsilon_1$ is the first control parameter PARJN(24), $E_0$ is the error at the initial position (where the line search was started), $g_0$ is the initial gradient along the line search direction $d$, $r$ is the current distance from the initial position, $r_{\min}$ is the predicted position of the minimum, and $\epsilon_2$ is the second control parameter (PARJN(25)). These convergence criteria are only checked once the minimum has been bracketed.
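
In code, these two tests amount to the following check (again a hedged Python sketch using the same symbols; the Fortran implementation, and the exact way the two tests are combined, may differ in detail):

    def line_search_converged(e_r, e0, g0, r, r_min, eps1, eps2):
        # e_r:   error at the current distance r along the search direction
        # e0:    error where the line search was started
        # g0:    initial gradient along the search direction
        #        (negative for a descent direction)
        # r_min: predicted position of the minimum from the parabolic fit
        # eps1, eps2: analogues of PARJN(24) and PARJN(25)
        sufficient_decrease = e_r <= e0 + eps1 * r * g0
        close_to_prediction = abs(r - r_min) <= eps2 * r_min
        return sufficient_decrease and close_to_prediction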


