Self-Adaptive ReLU Neural Network Method in Least-Squares Data Fitting


Zhiqiang Cai, Min Liu
Copyright © 2024 | Pages: 21
DOI: 10.4018/979-8-3693-0230-9.ch011

Abstract

This chapter provides a comprehensive introduction to a self-adaptive ReLU neural network method. The purpose is to design a nearly minimal neural network architecture that achieves a prescribed accuracy for a given task in scientific machine learning, such as approximating a function or the solution of a partial differential equation. Starting with a small one-hidden-layer neural network, the method enhances the network adaptively by adding neurons to the current or to a new hidden layer, based on the accuracy of the current approximation. In addition, the method provides a natural process for obtaining a good initialization when training the current network. Moreover, the initialization of newly added neurons at each adaptive step is discussed in detail.

1. Introduction

Given a data set $\{(x_i, f_i)\}_{i=1}^{N}$ with $x_i \in \Omega = [-1,1]^d$ and positive weights $\{w_i\}_{i=1}^{N}$, consider the discrete least-squares problem: find $f_{nn} \in \mathcal{M}(l)$ such that

$$f_{nn} = \operatorname*{arg\,min}_{v \in \mathcal{M}(l)} \mathcal{L}(v), \tag{1}$$

where $\mathcal{M}(l)$ is the set of ReLU neural network functions with $l$ hidden layers defined in Section 2, and $\mathcal{L}(\cdot)$ is the least-squares loss functional given by

$$\mathcal{L}(v) = \sum_{i=1}^{N} w_i \big( v(x_i) - f_i \big)^2 .$$
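To make the loss concrete, the following sketch evaluates a one-hidden-layer ReLU network and the weighted least-squares loss above in NumPy. The function names, the toy target $|x|$, and the uniform weights are illustrative choices, not the chapter's code.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def shallow_relu_net(x, W, b, c, c0):
    """One-hidden-layer ReLU network: v(x) = c0 + sum_k c_k * relu(w_k . x + b_k).

    x : (N, d) input points, W : (n, d) hidden weights, b : (n,) biases,
    c : (n,) output weights, c0 : scalar output bias.
    """
    return c0 + relu(x @ W.T + b) @ c

def ls_loss(v_vals, f, w):
    """Weighted least-squares loss L(v) = sum_i w_i * (v(x_i) - f_i)^2."""
    return np.sum(w * (v_vals - f) ** 2)

# Toy usage: data sampled from |x| on [-1, 1], a random n = 4 neuron network.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(200, 1))
f = np.abs(x).ravel()
w = np.full(200, 1.0 / 200)            # uniform weights
W, b = rng.normal(size=(4, 1)), rng.normal(size=4)
c, c0 = rng.normal(size=4), 0.0
print(ls_loss(shallow_relu_net(x, W, b, c, c0), f, w))
```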

For a prescribed tolerance $\varepsilon > 0$, this chapter presents a self-adaptive algorithm, the adaptive neuron enhancement method (ANE), to adaptively construct a nearly optimal network $\mathcal{M}(l)$ such that the neural network approximation $f_{nn}(x)$ satisfies

$$\mathcal{L}(f_{nn}) \le \varepsilon \, \mathcal{L}(0), \tag{2}$$

where $\mathcal{L}(0) = \sum_{i=1}^{N} w_i f_i^2$ is the square of the weighted $\ell^2$ norm of the output data $\{f_i\}_{i=1}^{N}$.
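The stopping test (2) only needs the current loss and the weighted norm of the data. A minimal check, continuing the NumPy sketch above (the helper name is an assumption):

```python
import numpy as np

def accuracy_reached(loss_fnn, f, w, eps):
    """Stopping criterion (2): L(f_nn) <= eps * L(0), where
    L(0) = sum_i w_i * f_i^2 is the squared weighted l2 norm of the data."""
    loss_zero = np.sum(w * f ** 2)
    return loss_fnn <= eps * loss_zero
```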

A multi-layer ReLU neural network is described in this chapter as a set of continuous piecewise linear functions; hence each network function is piecewise linear with respect to a partition of the domain. This partition, referred to as the (domain) physical partition (see Section 3), encodes the geometric features of the function and hence plays a critical role in the design of the self-adaptive neural network method. Determining this physical partition for a network function is in general computationally expensive, especially when the input dimension $d$ is high. To circumvent this difficulty, we introduce a network indicator function that determines such a partition easily.
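The chapter's precise definition of the network indicator function appears in Section 3. One plausible data-driven reading for a one-hidden-layer network is sketched below: each sample point is tagged with its ReLU activation pattern (the sign vector of the hidden-layer pre-activations), which is constant on each linear piece of the network, so grouping points by pattern recovers the physical partition restricted to the data without any geometric computation. The helper names are hypothetical.

```python
import numpy as np

def activation_pattern(x, W, b):
    """Sign pattern of the hidden-layer pre-activations at each sample point.
    Points sharing a pattern lie on the same linear piece of the network."""
    return (x @ W.T + b > 0).astype(np.int8)

def group_by_piece(x, W, b):
    """Group sample indices by activation pattern, approximating the
    physical partition from the data alone."""
    pieces = {}
    for i, p in enumerate(map(tuple, activation_pattern(x, W, b))):
        pieces.setdefault(p, []).append(i)
    return pieces
```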

The idea of the ANE is similar to that of standard adaptive mesh-based numerical methods, and may be written as loops of the form

train → estimate → mark → enhance. (3)

Starting with a small one-hidden-layer network, the step train iteratively solves the optimization problem for the current network; the step estimate computes the error of the current approximation; the step mark identifies local regions that need refinement; and the step enhance adds new neurons to the current network with a good initialization. This adaptive algorithm learns not only from the given information (data, function, partial differential equation) but also from the current computer simulation.
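A skeleton of loop (3) might look as follows; the four callables and their signatures are placeholders for the problem-specific steps described above, not the chapter's API.

```python
import numpy as np

def ane_loop(net, x, f, w, eps, train, estimate, mark, enhance, max_steps=20):
    """Skeleton of the train -> estimate -> mark -> enhance loop (3)."""
    loss_zero = np.sum(w * f ** 2)          # L(0), for the stopping test (2)
    for _ in range(max_steps):
        net = train(net, x, f, w)           # solve the current optimization problem
        loss, local_err = estimate(net, x, f, w)
        if loss <= eps * loss_zero:         # stopping criterion (2)
            break
        marked = mark(local_err)            # local regions needing refinement
        net = enhance(net, marked)          # add neurons; old parameters give
                                            # a good initialization for retraining
    return net
```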

When the current error does not satisfy (2), an efficient ANE method relies on strategies that address the following questions at each adaptive step:

  • (a) How many new neurons should be added to the last hidden layer?

  • (b) When should a new hidden layer be added?

By exploiting the geometric features of the current approximation, the enhancement strategy (see Section 4) determines the number of new neurons to add to the last hidden layer. A new layer is added if a computable quantity, measuring the improvement rate of two consecutive networks relative to the increase in the number of parameters, is small.
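As a rough illustration, such a layer-addition test could compare the loss reduction between two consecutive networks against the relative growth in parameter count. The specific ratio and threshold below are assumptions for the sketch, not the chapter's computable quantity.

```python
def should_add_layer(loss_prev, loss_cur, n_params_prev, n_params_cur, delta=0.1):
    """Heuristic layer-addition test: add a new hidden layer when the
    improvement rate per relative increase in parameters is small."""
    improvement = (loss_prev - loss_cur) / loss_prev          # relative loss reduction
    param_growth = (n_params_cur - n_params_prev) / n_params_prev
    return improvement / param_growth < delta
```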
