Although inspired by the algorithms in [23,73], the line search
algorithm implemented in `JETNET 3.0` is
somewhat unorthodox. In contrast to traditional line searches, which start
at one point, evaluate the function at another point, and then move back
to the original point, the line search in `JETNET 3.0` does not move back
to its original position after evaluating the error function. This can produce
somewhat confusing behaviour, since `JETNET 3.0` outputs the error value at
its current position in weight space. Hence there is no cause for alarm if
the error value fluctuates during learning; what matters is the minimum error
achieved during the learning.
However, this also means that `JETNET 3.0` must be told
when the user wants to stop training, so that it can move back to the position
with the minimum error found so far. This is done by setting `MSTJN(5)` to
8 or 14, depending on whether CG or SCG is used, and continuing to
train until the value of `MSTJN(5)` changes to 9, which signals that
`JETNET 3.0` has moved to the best configuration found so far and stopped.
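The "remember the best point seen" bookkeeping that this behaviour relies on can be illustrated with a small sketch. The function and its names below are purely illustrative assumptions, not part of `JETNET 3.0`'s Fortran interface: the reported error is allowed to fluctuate while the best weights seen so far are kept, and those weights are restored when training stops.

```python
# Illustrative sketch (not JETNET's actual API): the error reported after
# each trial step may fluctuate, so we remember the lowest error seen and
# restore the corresponding weights when training stops.
import copy

def train_with_best_restore(weights, error_fn, propose_step, n_steps):
    """Run n_steps trial steps; return the weights with the lowest error."""
    best_w = copy.deepcopy(weights)
    best_err = error_fn(weights)
    w = weights
    for _ in range(n_steps):
        w = propose_step(w)           # line-search trial step (may go uphill)
        err = error_fn(w)             # reported error can fluctuate
        if err < best_err:            # remember the minimum seen so far
            best_err, best_w = err, copy.deepcopy(w)
    return best_w, best_err           # "move to the best configuration"
```

For example, stepping a one-dimensional weight through the quadratic error `(w - 3)**2` overshoots the minimum, yet the weights returned are those with the lowest error encountered.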

In coarse outline, the line search works as follows: first, it computes the error
at the initial position and the gradient along the search direction. It then
computes the error at two subsequent positions along the search direction, with
the first step equal to the learning rate `PARJN(1)` and the second step
computed from a parabolic fit using the gradient information. From
these three error values it makes a new parabolic fit and moves to the predicted
minimum position. Such parabolic steps are then repeated until
the line search finds a satisfactory minimum or until it has used up the
prespecified number of trial steps. In the
latter case the line search is restarted with a new value for `PARJN(1)`.
If the line search does not find a satisfactory minimum even
within the prespecified number of restarts, the whole CG process is restarted
from the current minimum position, using a rescaled `PARJN(1)`.
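The parabolic-step loop described above can be sketched as follows. This is an illustrative reconstruction, not `JETNET 3.0`'s actual implementation: the function names, the three-sample bookkeeping, and the fallback step are assumptions, with `eta` playing the role of `PARJN(1)` and `max_iter` that of the iteration limit `MSTJN(35)`.

```python
# Sketch of a parabolic line search: fit a parabola through three
# (step, error) samples along the search direction and jump to its
# predicted minimum, repeating up to max_iter times.

def parabola_vertex(r, e):
    """Step value minimising the parabola through three (r, e) samples."""
    (r0, r1, r2), (e0, e1, e2) = r, e
    num = (r1 - r0) ** 2 * (e1 - e2) - (r1 - r2) ** 2 * (e1 - e0)
    den = (r1 - r0) * (e1 - e2) - (r1 - r2) * (e1 - e0)
    if den == 0.0:                    # degenerate fit: stay put
        return r1
    return r1 - 0.5 * num / den      # standard 3-point parabolic minimum

def line_search(f, g0, eta, max_iter=10):
    """Minimise f(r) for r >= 0, given the initial slope g0 = f'(0)."""
    e0 = f(0.0)
    e1 = f(eta)                      # first trial: the "learning rate" step
    # second trial from a parabolic fit using the gradient information:
    a = (e1 - e0 - g0 * eta) / eta ** 2
    r2 = -g0 / (2.0 * a) if a > 0 else 2.0 * eta
    rs, es = [0.0, eta, r2], [e0, e1, f(r2)]
    for _ in range(max_iter):        # repeated parabolic steps
        r_new = parabola_vertex(rs, es)
        rs, es = rs[1:] + [r_new], es[1:] + [f(r_new)]  # keep last three
    j = min(range(3), key=lambda i: es[i])
    return rs[j], es[j]              # best (step, error) found
```

On a quadratic error the very first parabolic fit already lands on the exact minimum, which is why so few trial steps suffice for toy problems.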

The most important control parameters for the line search are
`MSTJN(35)`, `MSTJN(36)`, `PARJN(24)`, and `PARJN(25)`.
The first two control the number of iterations and the number of restarts that
are allowed in searching for the minimum, and the latter two set the
convergence criteria. The default value for both `MSTJN(35)` and
`MSTJN(36)` is 10, which works fine for toy problems but needs to be
increased for real problems. The convergence
criteria are

\[
E(\mathbf{r}) - E_0 \leq \alpha\, g_0 |\mathbf{r}|
\qquad \mathrm{and} \qquad
|\mathbf{r} - \mathbf{r}_{\min}| \leq \beta\, |\mathbf{r}_{\min}| ,
\]

where $\alpha$ is the first control parameter `PARJN(24)`,
$E_0$ is the error at the initial position (where the line search was
started), $g_0$ is the initial gradient along the line
search direction **d**, **r** is the current distance from the initial
position, $\mathbf{r}_{\min}$ is the predicted position of the minimum, and
$\beta$ is the second control parameter (`PARJN(25)`). These
convergence criteria are only checked if the minimum has been
bracketed.
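For concreteness, the two tests can be written out as a short sketch. It assumes a sufficient-decrease criterion scaled by `PARJN(24)` and a proximity-to-predicted-minimum criterion scaled by `PARJN(25)`, using the quantities defined above; the function and argument names are illustrative, not part of JETNET.

```python
# Sketch of the two line-search convergence tests, assuming a
# sufficient-decrease form E(r) - E0 <= alpha*g0*|r| (g0 < 0 along a
# descent direction) and a proximity form |r - r_min| <= beta*|r_min|.
# `alpha` stands in for PARJN(24) and `beta` for PARJN(25).

def line_search_converged(e_r, e0, g0, r, r_min, alpha, beta):
    """True if the current step r satisfies both convergence criteria."""
    sufficient_decrease = (e_r - e0) <= alpha * g0 * abs(r)
    near_predicted_min = abs(r - r_min) <= beta * abs(r_min)
    return sufficient_decrease and near_predicted_min
```

Both tests must pass at once: a step that lowers the error sufficiently but lands far from the fitted parabola's minimum is not yet accepted.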

Fri Feb 24 11:28:59 MET 1995