AI assistant using retrieval-augmented generation to extract data from mud-logging reports

Marcelo Guarido, David J. Emery, Kristopher A. Innanen

An interpreter needs to gather and work with all the available data during reservoir characterization. However, some of this data, such as mud-logging descriptions, is effectively lost: these logs tend to be stored in PDF files and are used as reference reports rather than as direct data. Large language models (LLMs) are trained on extensive text data and can be used to build applications for a wide range of tasks. We present an application powered by GPT-4o-mini that extracts mud-logging descriptions from PDF files and converts them into formatted tables. The app was tested on the Poseidon-2 mud-logging report, returning a table that is approximately 95% precise at a cost of less than US$0.01 for a single file conversion.
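
To make the text-to-table step concrete, the following is a minimal sketch of how a model reply might be turned into a formatted table. It is not the authors' implementation. The column names, the prompt wording, and the `ask_model` callable are assumptions introduced for illustration, and the PDF text extraction, the retrieval step, and the actual GPT-4o-mini call are deliberately left out. A stand-in function with made-up sample values replaces the model so the sketch runs offline.

```python
import csv
import io
import json
from typing import Callable, Dict, List

# Hypothetical column names; the table schema used by the app is not stated here.
COLUMNS = ["depth_from_m", "depth_to_m", "lithology", "description"]


def build_prompt(page_text: str) -> str:
    """Ask the model for a JSON array of rows with a fixed set of keys."""
    return (
        "Extract every mud-logging interval from the report text below. "
        "Return only a JSON array of objects with the keys "
        f"{', '.join(COLUMNS)}.\n\n{page_text}"
    )


def extract_rows(page_text: str, ask_model: Callable[[str], str]) -> List[Dict[str, str]]:
    """Send the prompt to the model and parse its JSON reply into table rows."""
    reply = ask_model(build_prompt(page_text))
    rows = json.loads(reply)
    # Keep only the expected keys so every row has the same layout.
    return [{col: str(row.get(col, "")) for col in COLUMNS} for row in rows]


def rows_to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call; the returned values are illustrative only."""
    return json.dumps([
        {
            "depth_from_m": 4900,
            "depth_to_m": 4905,
            "lithology": "sandstone",
            "description": "fine grained, well sorted",
        }
    ])


rows = extract_rows("(text extracted from one PDF page)", fake_model)
print(rows_to_csv(rows))
```

Passing the model call in as a function keeps the parsing and table formatting testable without network access; in practice `fake_model` would be replaced by a wrapper around whichever LLM client is used.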