ECB-ART-51327
Data Brief
2023 Jun 27;48:109188. doi: 10.1016/j.dib.2023.109188.
Geo Fossils-I: A synthetic dataset of 2D fossil images for computer vision applications on geology.
Abstract
Geo Fossils-I is a synthetic image dataset that addresses the limited availability of geological datasets for image classification and object detection on 2D images of geological outcrops. It was created to train a custom image classification model for geological fossil identification and to inspire further work on generating synthetic geological data with Stable Diffusion models. The dataset was generated by fine-tuning a pre-trained Stable Diffusion model through a custom training process. Stable Diffusion is a text-to-image model that can create highly realistic images from textual input. Dreambooth, a specialized form of fine-tuning, is an effective technique for teaching Stable Diffusion novel concepts; it was used here to generate new images of fossils or to modify existing ones according to the provided textual description. The dataset covers six fossil types found in geological outcrops, each characteristic of a particular depositional environment. It contains a total of 1200 fossil images, equally spread among the six types: ammonites, belemnites, corals, crinoids, leaf fossils, and trilobites. This dataset is the first in a planned series intended to enrich the available resources of 2D outcrop images, allowing geoscientists to progress in the automated interpretation of depositional environments.
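The class balance described in the abstract can be made concrete with a short sketch. It derives the per-class image count from the stated total of 1200 images spread equally over six fossil types, then computes a per-class train/validation split. The 80/20 split fraction and the function names are illustrative assumptions, not part of the published dataset or its documentation.

```python
# Class names and total image count as stated in the abstract.
FOSSIL_CLASSES = [
    "ammonites",
    "belemnites",
    "corals",
    "crinoids",
    "leaf fossils",
    "trilobites",
]
TOTAL_IMAGES = 1200


def images_per_class(total, classes):
    """Return the image count per class for an equally balanced dataset."""
    if total % len(classes) != 0:
        raise ValueError("total is not evenly divisible by the number of classes")
    per_class = total // len(classes)
    return {name: per_class for name in classes}


def stratified_split(counts, train_fraction=0.8):
    """Split each class separately so both subsets stay balanced.

    train_fraction is a hypothetical choice; the abstract does not
    specify how the data were partitioned for training.
    """
    split = {}
    for name, n in counts.items():
        n_train = round(n * train_fraction)
        split[name] = (n_train, n - n_train)
    return split


counts = images_per_class(TOTAL_IMAGES, FOSSIL_CLASSES)
split = stratified_split(counts)
for name in FOSSIL_CLASSES:
    n_train, n_val = split[name]
    print(f"{name}: {counts[name]} images ({n_train} train, {n_val} validation)")
```

Splitting within each class, rather than over the pooled images, keeps the six fossil types equally represented in both subsets.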
PubMed ID: 37383796