Machine learning as a tool to predict the mass of oil from well logs
Marcelo Guarido, Daniel O. Trad
Oil saturation measures the amount of oil in the pore space of a reservoir rock. Usually estimated from core analysis, it is an important quantity that helps to characterize the reservoir. In this work, we do not predict the actual oil saturation, because the wells gathered lack the necessary information; instead, we predict the mass fraction of oil in the core. Most of this report focuses on the data preparation prior to modeling, as our features and targets came from two different measurement sources (well logs and core analysis), and on how to create a valid workflow that makes features and targets compatible with each other. In the end, we show how to select an appropriate machine learning model, which needs to be non-linear, to predict the target, and how to interpret the feature importance. To predict the mass fraction of oil, the induction log (ILD) carries most of the information, but it needs to be combined with other logs for the prediction to make sense. The metric used to evaluate the models was the coefficient of determination (R²), and the best model had a score of 0.82.
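Two of the steps named above, making core targets compatible with well-log features and scoring a model with R², can be illustrated with a small sketch. This is not the workflow used in the report: the function names, the nearest-depth matching rule, and the depth tolerance are assumptions made for illustration, and all numbers are synthetic.

```python
from bisect import bisect_left


def match_core_to_logs(log_depths, log_values, core_depths, tolerance=0.25):
    """Assign to each core sample the log reading nearest in depth.

    Hypothetical helper: assumes log_depths is sorted in ascending order.
    Core samples with no log reading within `tolerance` get None.
    """
    matched = []
    for depth in core_depths:
        i = bisect_left(log_depths, depth)
        # The nearest log sample is one of the two neighbours of the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(log_depths)]
        j = min(candidates, key=lambda k: abs(log_depths[k] - depth))
        if abs(log_depths[j] - depth) <= tolerance:
            matched.append(log_values[j])
        else:
            matched.append(None)
    return matched


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic example: four log readings and three core samples (depths in metres).
log_depths = [100.0, 100.5, 101.0, 101.5]
ild_values = [10.0, 12.0, 15.0, 11.0]
core_depths = [100.4, 101.1, 105.0]

features = match_core_to_logs(log_depths, ild_values, core_depths)
print(features)  # the last core sample has no nearby log reading
```

Matching on nearest depth within a tolerance is only one possible choice; interpolating the logs to the core depths or averaging the logs over each core interval are alternatives, and which one is appropriate depends on the sampling of the two data sources. An R² of 1 means perfect prediction, while 0 means the model does no better than predicting the mean of the target.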