In theory, an MLP with a single hidden layer is sufficient to model any continuous function. In practice, two hidden layers can be more efficient [41,43,49], but they are also more difficult to train. In our experience, MLP networks with one hidden layer suffice for most classification tasks, whereas two hidden layers are preferable for function-fitting problems. We emphasize, though, that many HEP classification problems appear to have simple discrimination surfaces, which would explain why one hidden layer is often enough. Networks with many hidden layers are not justified unless the decision surface is complicated. Indeed, an ANN is entirely unnecessary if the decision surface is very simple, such as a hyperplane or a hypersphere.
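As an illustration of the one-hidden-layer claim, the following sketch (not from the original text; the data, layer width, and learning rate are arbitrary choices for demonstration) trains a single-hidden-layer MLP by full-batch gradient descent on a toy problem with a hyperspherical decision surface, which such a network learns easily:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D classification data: label 1 inside the unit circle, 0 outside.
X = rng.uniform(-2, 2, size=(400, 2))
y = (np.sum(X**2, axis=1) < 1.0).astype(float).reshape(-1, 1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of H tanh units, one sigmoid output unit.
H = 16
W1 = rng.normal(0.0, 1.0, (2, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 1.0, (H, 1)); b2 = np.zeros(1)

lr = 1.0
for step in range(5000):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # Backward pass for the mean cross-entropy loss.
    d2 = (p - y) / len(X)
    dW2 = h.T @ d2; db2 = d2.sum(axis=0)
    dh = (d2 @ W2.T) * (1.0 - h**2)
    dW1 = X.T @ dh; db1 = dh.sum(axis=0)
    # Gradient-descent update.
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

acc = float(np.mean((p > 0.5) == (y > 0.5)))
print(f"training accuracy: {acc:.2f}")
```

Of course, the point of the text stands: for a surface this simple (a hypersphere), a direct cut on the radius would do the same job without any network at all.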