Least mean squares (LMS) applications in geophysics
Brian H. Russell
In geophysical analysis, we are familiar with the least-squares or Wiener-Levinson algorithm, since it is the basis for solving many of our most common processing problems. The least-squares method is applied to problems in which a set of unknown weights is extracted that transforms a measured signal into a desired signal. To apply the least-squares algorithm, the signal must first be fully acquired, which is standard practice in seismic acquisition. The least-mean-square (LMS) algorithm, on the other hand, is an adaptive filter that does not require knowledge of the full signal, but determines the weights adaptively on a sample-by-sample basis. The LMS algorithm was developed in the electrical engineering community and has found great success in such applications as the adaptive equalization of telephone channels and blood pressure regulation. It also led to the development of the first feedforward neural networks, which were only able to solve linearly separable problems. The LMS algorithm was then extended to feedforward networks that could solve problems that are not linearly separable. Thus, an understanding of the LMS algorithm is the first step in understanding neural networks and machine learning. In this study I will not specifically look at neural network algorithms, but rather will focus on the least-squares technique and on gradient search methods such as steepest descent and conjugate gradient, and then see how LMS compares to these methods. To do this, I will use two straightforward examples, one from electrical engineering and one from geophysics.
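The contrast between the batch least-squares solution and the sample-by-sample LMS update can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this study: the three-point system `true_w`, the white-noise input, the step size `mu`, and the function name `lms_filter` are all assumptions chosen for the example, and the factor of 2 in the update follows one common convention for the Widrow-Hoff rule.

```python
import numpy as np

def lms_filter(x, d, n_weights, mu):
    """Sample-by-sample LMS (Widrow-Hoff) update; illustrative sketch only."""
    w = np.zeros(n_weights)
    errors = np.zeros(len(x))
    for n in range(n_weights - 1, len(x)):
        # Most recent n_weights input samples, newest first
        x_n = x[n - n_weights + 1:n + 1][::-1]
        e = d[n] - w @ x_n            # error between desired and filtered sample
        w = w + 2.0 * mu * e * x_n    # move the weights along the instantaneous gradient
        errors[n] = e
    return w, errors

# Hypothetical system to be identified (assumed values, not from the study)
rng = np.random.default_rng(0)
true_w = np.array([0.5, -0.3, 0.1])
x = rng.standard_normal(5000)              # measured input signal
d = np.convolve(x, true_w)[:len(x)]        # desired signal

# Adaptive estimate: only the current window of samples is used at each step
w_lms, errors = lms_filter(x, d, n_weights=3, mu=0.01)

# Batch least-squares estimate: requires the fully acquired signal
X = np.column_stack(
    [np.concatenate([np.zeros(k), x[:len(x) - k]]) for k in range(3)]
)
w_ls = np.linalg.lstsq(X, d, rcond=None)[0]
```

With a long, noise-free input such as this one, both estimates should agree with `true_w`; the LMS estimate gets there gradually, which is why its accuracy depends on how many samples are available.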
The geophysical example will be the wavelet deconvolution problem. We will first solve this problem using the least-squares, steepest-descent and conjugate-gradient algorithms, all of which converge to the correct result. We will then look at how the LMS algorithm works on the deconvolution problem. Although LMS does not converge completely in our example because the example contains too few samples, it works much better on longer datasets and also potentially solves the problem of time-varying wavelets.
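As a rough sketch of how the three batch methods relate on the deconvolution problem, the Python example below designs an inverse filter for a short wavelet by solving the normal equations directly, by steepest descent, and by conjugate gradient. The two-point wavelet, the three-point filter length, the spike as desired output, and all function names are assumptions made for illustration; they are not taken from the examples in this study.

```python
import numpy as np

def convolution_matrix(wavelet, n_filter):
    """Matrix W such that W @ f equals np.convolve(wavelet, f)."""
    n_out = len(wavelet) + n_filter - 1
    W = np.zeros((n_out, n_filter))
    for j in range(n_filter):
        W[j:j + len(wavelet), j] = wavelet
    return W

def steepest_descent(A, b, n_iter):
    """Minimize the quadratic error by stepping along the residual (negative gradient)."""
    f = np.zeros_like(b)
    for _ in range(n_iter):
        r = b - A @ f
        rr = r @ r
        if rr < 1e-30:                 # already converged
            break
        alpha = rr / (r @ A @ r)       # exact line search for a quadratic
        f = f + alpha * r
    return f

def conjugate_gradient(A, b, n_iter):
    """Standard conjugate-gradient iteration for a symmetric positive-definite A."""
    f = np.zeros_like(b)
    r = b - A @ f
    p = r.copy()
    for _ in range(n_iter):
        rr = r @ r
        if rr < 1e-30:                 # already converged
            break
        Ap = A @ p
        alpha = rr / (p @ Ap)
        f = f + alpha * p
        r = r - alpha * Ap
        p = r + (r @ r / rr) * p       # new direction, conjugate to the previous ones
    return f

# Assumed illustrative values: a two-point wavelet and a three-point inverse filter
wavelet = np.array([1.0, -0.5])
n_filter = 3
W = convolution_matrix(wavelet, n_filter)
desired = np.zeros(W.shape[0])
desired[0] = 1.0                       # desired output is a spike at zero lag

A = W.T @ W                            # autocorrelation matrix of the wavelet
b = W.T @ desired                      # cross-correlation with the desired output

f_ls = np.linalg.solve(A, b)                   # direct least-squares solution
f_sd = steepest_descent(A, b, n_iter=100)      # many small gradient steps
f_cg = conjugate_gradient(A, b, n_iter=n_filter)  # at most n_filter steps in exact arithmetic
```

All three filters should agree to within rounding error on this small problem. The difference lies in cost: steepest descent needs many iterations, while conjugate gradient needs no more iterations than there are filter coefficients.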