Unsupervised seismic facies classification using K-means and Gaussian Mixture Modeling

Brian H. Russell

In this study, I apply K-means clustering and Gaussian Mixture Modeling (GMM) to the task of multidimensional unsupervised seismic facies classification. K-means is one of the most popular and best understood clustering algorithms. For N M-dimensional points, K-means is implemented by dividing the input data points randomly into K clusters, computing the M-dimensional mean of each cluster, and then assigning each point to the cluster whose mean is closest to it. The means are then recomputed from the new cluster assignments, and this process is iterated until convergence, that is, until the assignments no longer change. Traditionally, the distance metric used in K-means is Euclidean, but it can be replaced by the Mahalanobis, or statistical, distance. A Gaussian Mixture Model (GMM) describes the N M-dimensional feature vectors with a mixture probability density function (pdf) made up of K Gaussian components, one per class. GMM starts with an initial guess of the mean and covariance matrix of each class and refines these estimates by iterating to a solution. Because each class is described by its own covariance matrix, GMM always uses statistical distance as its metric. Also, unlike in the K-means algorithm, the data are never physically re-ordered during the process. I start by illustrating the K-means and GMM methods with several two-dimensional synthetic clustering examples and a two-dimensional real data example that uses inverted elastic parameters (Vp/Vs ratio versus acoustic impedance) extracted from a Gulf of Mexico dataset. I then move to a higher-dimensional example of K-means clustering and GMM that performs unsupervised facies classification of a Cretaceous channel sand play in the Blackfoot area of Alberta.
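
The two procedures described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the study. The function names `kmeans` and `gmm_em`, their parameters, and the synthetic two-cluster data are hypothetical. The abstract does not name the iteration scheme used for GMM, so the sketch assumes the standard expectation-maximization (EM) updates, adds mixing weights and a small regularization term `reg` on the covariance diagonals, and takes its initial guess from the K-means result.

```python
import numpy as np


def kmeans(X, K, n_iter=100, seed=0):
    """Euclidean K-means starting from a random partition (assumes N >= K)."""
    rng = np.random.default_rng(seed)
    N = X.shape[0]
    # Randomly divide the N points into K clusters of near-equal size.
    labels = rng.permutation(N) % K
    means = np.stack([X[labels == k].mean(axis=0) for k in range(K)])
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every mean: shape (N, K).
        d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # converged: no point changed cluster
        labels = new_labels
        for k in range(K):
            if np.any(labels == k):  # an emptied cluster keeps its old mean
                means[k] = X[labels == k].mean(axis=0)
    return labels, means


def gmm_em(X, init_labels, n_iter=50, reg=1e-6):
    """EM for a Gaussian mixture, initialized from hard labels.

    Assumes M >= 2 and at least two points in every initial cluster.
    """
    N, M = X.shape
    K = int(init_labels.max()) + 1
    eye = reg * np.eye(M)
    # Initial guess of the mean, covariance matrix and weight of each class.
    means = np.stack([X[init_labels == k].mean(axis=0) for k in range(K)])
    covs = np.stack([np.cov(X[init_labels == k], rowvar=False) + eye
                     for k in range(K)])
    weights = np.bincount(init_labels, minlength=K) / N
    for _ in range(n_iter):
        # E-step: unnormalized log responsibility of class k for each point.
        # The constant M*log(2*pi) is the same for all classes and is omitted.
        log_r = np.empty((N, K))
        for k in range(K):
            diff = X - means[k]
            # Squared Mahalanobis (statistical) distance to class k.
            maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(covs[k]), diff)
            _, logdet = np.linalg.slogdet(covs[k])
            log_r[:, k] = np.log(weights[k]) - 0.5 * (maha + logdet)
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted weights, means and covariances.
        Nk = resp.sum(axis=0)
        weights = Nk / N
        means = (resp.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - means[k]
            covs[k] = (resp[:, k, None] * diff).T @ diff / Nk[k] + eye
    return resp.argmax(axis=1), means, covs, weights


# Hypothetical two-dimensional synthetic data: two well-separated clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(10.0, 0.5, size=(50, 2))])
km_labels, km_means = kmeans(X, K=2)
gmm_labels, gmm_means, gmm_covs, gmm_weights = gmm_em(X, km_labels)
```

In this sketch K-means moves each point wholesale from one cluster to another, whereas the GMM loop only updates a table of per-class responsibilities and takes the final class as the most probable one.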