Overview

fmu-dataio is a library for exporting data out of FMU workflows. In addition to making data export on the current (filename-based) standard easier, it also creates and attaches metadata to the exported data.

In a Python script running somewhere in the FMU workflow, usage of fmu-dataio looks roughly like this (more detailed working examples are provided elsewhere in this documentation):

from fmu.dataio import ExportData

df = make_my_data() # how you create a data object is same as before
cfg = get_global_config() # how you read global config is same as before

exp = ExportData( # ExportData takes a number of arguments, these are examples.
    config=cfg,
    content="volumes",
)
exp.export(df) # this is the export.

Although long-term intention is to discontinue this, exported data will still be stored to the underlying disk structure on which FMU is running (i.e. /scratch). When storing data to disk, fmu-dataio will store the data file according to the current filename-oriented FMU data standard. Next to it, with a corresponding file name, it will store the metadata according to the metadata-based FMU data standard.

Example:

share/results/tables/mytable.csv
share/results/tables/.mytable.csv.yml <-- metadata

Context on multiple levels are added to these metadata. First of all, references to Equinor master data is required, but also FMU-specific context such as which model was used to produce the data, which realization and iteration a specific file is produced in, and much more.

Static and object-specific metadata

Metadata are in general “data about the data”. For instance a pure surface export file (e.g. irap binary) will have information about data itself (like values in the map, and bounding box) but the file has no idea of which FMU run it belongs to, or which field it comes from or what data contents it is representing.

Some metadata contents are static for the model, meaning that they will be the same for all runs of a specific model setup. Some metadata contents are static to the case, meaning that they will be the same for all data belonging to a specific case. Other metadata are object-specific, i.e. they will vary per item which is exported.

Given the amount of data produced by a single FMU run, it is impossible to contextualize the data manually on this granularity. Therefore, fmu-dataio automates this.

The data model used for FMU results is a partly denormalized data model, meaning that some static data will be repeated across many data objects. Example: Each exported data object contains basic information about the FMU case it belongs to, such as a unique ID for this case, its name, the user that made it, which model template was used, etc. This information is stored in every exported .yml file. This may seem counter-intuitive and differs from a relational database (where this information would typically be stored once, and referred to when needed).

The FMU results data model is further documented here.