The qcproperties class

The qcproperties class provides a set of methods for extracting property statistics from 3D grids, raw wells and blocked wells.

Statistics can be extracted for both continuous and discrete properties. Different statistics are calculated depending on the property type, which is auto-detected.

If several statistics extraction methods have been run within the instance, a merged dataframe is available through the ‘dataframe’ property.

The methods for statistics extraction can be run individually, or a yaml-configuration file can be used to enable an automatic run of the methods.

All methods can be run from either RMS python, or from files (e.g. from an ERT job).

XTGeo is used to get a dataframe from the input parameter data. XTGeo data is reused within the instance to improve performance.

Methods for extracting property statistics

Three methods exist for extracting property statistics. The method to select depends on the input data source (3D grid properties, wells or blocked wells). The arguments for the methods are similar and are described in the Arguments section below. A fourth method, from_yaml, runs the other three from a configuration file.

  • get_grid_statistics: This method extracts property statistics from 3D grid data.

  • get_well_statistics: This method extracts property statistics from well logs.

  • get_bwell_statistics: This method extracts property statistics from blocked well logs.

  • from_yaml: Use a yaml-configuration file to enable an automatic run of the methods above.

All methods return a Pandas DataFrame for the run in question. If several statistics extraction methods have been run within the instance, a merged dataframe is available through the ‘dataframe’ property.

See also

The Using yaml input for auto execution section for a description of how to use a yaml-configuration file to run the different methods automatically.

Other methods

Note: The methods below are only applicable if at least one method for extracting statistics has been run within the QCProperties instance.

dataframe

A merged dataframe with statistical data for continuous properties from all runs of statistics extractions within the instance.

to_csv

Used to write the dataframe with statistics to a csv-file. Takes one argument: csvfile: String with the desired filename (required).

Arguments

The input data is given in a python dictionary (or a YAML file) and will be somewhat different for the three methods, and for the two run environments (inside/outside RMS).

Input arguments:

  • data: The input data as a Python dictionary (required). See valid keys below.

  • project: Required for usage inside RMS

Valid fields in the ‘data’ argument:

Method specific fields:
grid

Name of grid icon in RMS, or name of grid file if run outside RMS. Required with the get_grid_statistics method.

wells

Required with the get_well_statistics and the get_bwell_statistics methods. Outside RMS, “wells” is a list of files in RMS ASCII well format.

Inside RMS, “wells” is a dictionary with three fields that depend on the method:

get_well_statistics:

names: List of wellnames (optional). Default is all wells.

logrun: Name of logrun.

trajectory: Name of trajectory.

get_bwell_statistics:

names: List of wellnames (optional). Default is all wells.

bwname: Name of BW object in RMS.

grid: Name of grid that contains the BW object.

Note

Wildcards are supported when running from files, and python valid regular expressions are supported in “names”, see examples.
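As a rough illustration of how regular expressions in “names” can select wells, the sketch below filters a list of well names with Python's re module. The well names and the select_wells helper are hypothetical, and full-string matching is an assumption; the matching rule used internally by QCProperties may differ.

```python
import re


def select_wells(all_wells, patterns):
    """Return the wells whose full name matches at least one pattern."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [
        well
        for well in all_wells
        if any(regex.fullmatch(well) for regex in compiled)
    ]


# Hypothetical well names, for illustration only.
wells = ["33_10-1", "34_11-2A", "34_11-3B", "35_1-1"]

# Same patterns as in the get_well_statistics RMS example.
selected = select_wells(wells, ["33_10.*", "34_11-.*A.*"])
print(selected)
```

Note that “.*” is the regular expression for “any characters”, whereas a file wildcard would use “*” alone.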

Common fields:
properties

Properties to compute statistics for. Both continuous and discrete properties are supported. Standard statistics, e.g. “avg” and “stddev”, are computed for continuous properties, while percentages are calculated for discrete properties.

Can be given as a list or as a dictionary. If given as a dictionary, the key will be the column name in the output dataframe, and the value will be a dictionary with valid options:

name: The actual name (or path) of the property / log.

weight: A weight parameter (name or path if outside RMS) (optional)

pfile: Name (or path) to file containing the parameter e.g. INIT file (optional)
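To make the effect of “weight” concrete, the following sketch computes a weighted mean for a continuous property and weighted percentages for a discrete property in plain Python. The cell values and helper names are made up for illustration, and the formulas are the textbook definitions; they are assumed here and not taken from the QCProperties source.

```python
from collections import defaultdict


def weighted_mean(values, weights):
    """Weighted average: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def weighted_percentages(codes, weights):
    """Share of the total weight held by each discrete code, in percent."""
    totals = defaultdict(float)
    for code, weight in zip(codes, weights):
        totals[code] += weight
    grand_total = sum(totals.values())
    return {code: 100.0 * total / grand_total for code, total in totals.items()}


# Two hypothetical cells: the second has three times the bulk volume.
poro = [0.10, 0.30]
facies = ["Sand", "Shale"]
bulk = [1.0, 3.0]

avg_poro = weighted_mean(poro, bulk)
facies_pct = weighted_percentages(facies, bulk)
print(avg_poro, facies_pct)
```

With these numbers the larger cell dominates: the weighted porosity is pulled towards 0.30, and Shale accounts for three quarters of the weighted facies distribution.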

selectors

Selectors are discrete properties/logs, e.g. Zone, that are used to extract statistics for groups of the data (optional).

Can be given as a list or as a dictionary. If given as a dictionary, the key will be the column name in the output dataframe, and the value will be a dictionary with valid options:

name: The actual name (or path) of the property / log.

include: List of values to include (optional)

exclude: List of values to exclude (optional)

codes: A dictionary of codenames to update some/all existing codenames (optional).

pfile: Name (or path) to file containing the parameter e.g. INIT file (optional)

Note

The “codes” field can be used to merge code values that the user wants to extract combined statistics from. This is done by setting the same name on several code values, as it is the name that is used to group the data.
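The merging behaviour described in this note can be sketched with plain dictionaries. The existing code names and the cell values below are hypothetical; the point is only that two code values sharing one name end up in the same group.

```python
from collections import defaultdict

# Hypothetical existing code names for a region parameter.
codenames = {1: "Surroundings", 2: "North", 3: "South", 4: "West"}

# The "codes" field assigns the same name to code values 2 and 3.
codenames.update({2: "NS", 3: "NS"})

# Made-up (region code, porosity) pairs standing in for grid cells.
cells = [(2, 0.20), (3, 0.30), (4, 0.10), (2, 0.40)]

# Group by code name, not by code value.
values_by_name = defaultdict(list)
for code, poro in cells:
    values_by_name[codenames[code]].append(poro)

mean_by_name = {
    name: sum(values) / len(values) for name, values in values_by_name.items()
}
print(mean_by_name)
```

Cells with code 2 and code 3 now contribute to a single “NS” average, while code 4 keeps its own group.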

filters

Dictionary with additional filters (optional).

The key is the name (or path) to the filter parameter / log, and the value is a dictionary with options:

include: List of values to include for discrete parameters

exclude: List of values to exclude for discrete parameters

range: List with two entries, defining minimum and maximum values to use for continuous parameters

pfile: Name (or path) to file containing the parameter e.g. INIT file

Note

If a selector or property is input as a filter, this will override any existing filters specified directly on the selector/property.
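The filter options can be pictured as a simple per-record test. The sketch below is a stand-alone illustration with made-up records; the passes_filters helper is hypothetical, and treating the “range” bounds as inclusive is an assumption.

```python
def passes_filters(record, filters):
    """Return True if the record satisfies every filter."""
    for key, options in filters.items():
        value = record[key]
        if "include" in options and value not in options["include"]:
            return False
        if "exclude" in options and value in options["exclude"]:
            return False
        if "range" in options:
            low, high = options["range"]
            # Inclusive bounds are an assumption made for this sketch.
            if not low <= value <= high:
                return False
    return True


# Same layout as the "filters" field: parameter name -> options.
FILTERS = {
    "Fluid": {"include": ["oil", "gas"]},
    "Facies": {"exclude": ["Shale"]},
    "PORO": {"range": [0.05, 0.35]},
}

# Made-up records standing in for grid cells or log samples.
records = [
    {"Fluid": "oil", "Facies": "Sand", "PORO": 0.25},
    {"Fluid": "water", "Facies": "Sand", "PORO": 0.20},
    {"Fluid": "gas", "Facies": "Shale", "PORO": 0.10},
    {"Fluid": "gas", "Facies": "Sand", "PORO": 0.40},
]

kept = [record for record in records if passes_filters(record, FILTERS)]
print(kept)
```

Only the first record survives: the others fail the include, exclude and range filters respectively.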

See also

Option "multiple_filters" below which can be used to extract statistics multiple times with different filters.

multiple_filters

Option that can be used to extract statistics multiple times with different filters (optional).

The input is a dictionary where the keys are the “name” (ID string) for the dataset, and the value is the dictionary of filters (same format as filters above).

See examples.
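Conceptually, “multiple_filters” amounts to one extraction per named filter set. A minimal sketch of that loop, using made-up cells and only the “include” option, could look like this (the keep helper and the data are hypothetical):

```python
# Same layout as "multiple_filters": dataset name -> filters dictionary.
MULTIPLE_FILTERS = {
    "HC_zone": {"Fluid": {"include": ["oil", "gas"]}},
    "Water_zone": {"Fluid": {"include": ["water"]}},
}

# Made-up cells, for illustration only.
cells = [
    {"Fluid": "oil", "PORO": 0.25},
    {"Fluid": "gas", "PORO": 0.15},
    {"Fluid": "water", "PORO": 0.10},
]


def keep(cell, filters):
    """Keep a cell if it matches the include list of every filter."""
    return all(cell[key] in options["include"] for key, options in filters.items())


# One pass over the data per named filter set.
results = {}
for name, filters in MULTIPLE_FILTERS.items():
    values = [cell["PORO"] for cell in cells if keep(cell, filters)]
    results[name] = sum(values) / len(values)

print(results)
```

Each key of the input dictionary becomes the ID of its own dataset in the result.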

path

Path to where files are located (optional)

selector_combos

Bool to turn on/off calculation of statistics for every combination of selectors (optional). Default is True. For example, if True and both a ZONE and a REGION parameter are given as selectors, statistics for three groups will be calculated: ["ZONE", "REGION"], ["ZONE"] and ["REGION"]. If False, the data will only be extracted for one group: ["ZONE", "REGION"]. Hence no data is available if the user wants to evaluate statistics per ZONE (or REGION) for the global grid.

Depending on the number of selectors and the size of the grid, this process may be time consuming.
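The groups produced by “selector_combos” can be enumerated with itertools.combinations. The selector_groups helper below is a hypothetical sketch of that enumeration and not the library's implementation; it mainly shows why the number of groups grows quickly with the number of selectors.

```python
from itertools import combinations


def selector_groups(selectors, selector_combos=True):
    """List the selector groups that statistics would be computed for."""
    if not selector_combos:
        # Only the full combination of all selectors.
        return [list(selectors)]
    groups = []
    # From the full combination down to each selector on its own.
    for size in range(len(selectors), 0, -1):
        groups.extend(list(combo) for combo in combinations(selectors, size))
    return groups


two = selector_groups(["ZONE", "REGION"])
three = selector_groups(["ZONE", "REGION", "FACIES"])
single = selector_groups(["ZONE", "REGION"], selector_combos=False)
print(two)
print(len(three))
```

For n selectors this gives 2**n - 1 groups, so three selectors already lead to seven groupings of the data.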

source

Source string (optional). Default values depend on the method being executed:

  • For grid statistics default is the gridname

  • For blocked wells statistics default is the name of the blocked wells object if inside RMS and bwells if outside

  • For well statistics default is wells

name

ID string for the dataset (optional, but recommended). If not given, it will be set equal to the source string.
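The default rules for “source” and “name” can be summarised in a few lines. The resolve_ids helper and its method labels (“grid”, “wells”, “bwells”) are hypothetical and only restate the defaults listed above; the actual resolution logic in QCProperties is not shown here.

```python
def default_source(method, data, inside_rms):
    """Default source string per method, as described in the text."""
    if method == "grid":
        return data["grid"]
    if method == "bwells":
        return data["wells"]["bwname"] if inside_rms else "bwells"
    return "wells"


def resolve_ids(method, data, inside_rms):
    """Return (source, name); name falls back to the source string."""
    if "source" in data:
        source = data["source"]
    else:
        source = default_source(method, data, inside_rms)
    name = data.get("name", source)
    return source, name


grid_ids = resolve_ids("grid", {"grid": "GeoGrid"}, inside_rms=True)
bw_ids = resolve_ids(
    "bwells", {"wells": {"bwname": "BW", "grid": "GeoGrid"}}, inside_rms=True
)
well_ids = resolve_ids(
    "wells", {"wells": ["34_10-A.*"], "name": "A-wells"}, inside_rms=False
)
print(grid_ids, bw_ids, well_ids)
```

Setting “name” explicitly, as in the last call, keeps datasets apart when the same source is extracted more than once.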

verbosity

Level of output while running: None, “info” or “debug” (optional). Default is None.

Examples

get_grid_statistics examples

Example in RMS (continuous properties - basic):

Example extracting statistics for porosity and permeability for each zone and facies. Result is written to csv.

from fmu.tools import QCProperties

GRID = "GeoGrid"
PROPERTIES = ["Poro", "Perm"]
SELECTORS = ["Zone", "Facies"]
REPORT = "../output/qc/somefile.csv"

def extract_statistics():

    qcp = QCProperties()

    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "grid": GRID,
        "verbosity": 1,
    }
    qcp.get_grid_statistics(data=usedata, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
    print("Done")

Example in RMS (continuous properties - more settings):

Example extracting statistics for porosity per region. Filters are used to extract statistics for HC zone and Water zone separately. Statistics will be combined for regions with code values 2 and 3. The property is weighted on a Total_bulk parameter. Result is written to csv.

from fmu.tools import QCProperties

GRID = "GeoGrid"
PROPERTIES = {
    "PORO": {"name": "PHIT", "weight": "Total_bulk"},
}
SELECTORS = {
    "REGION": {
        "name": "Regions",
        "exclude": ["Surroundings"],
        "codes": {2: "NS", 3: "NS",},
    }
}
REPORT = "../output/qc/continous_stats.csv"

FLUID_FILTERS = {
    "HC_zone": {"Fluid": {"include": ["oil", "gas"]}},
    "Water_zone": {"Fluid": {"include": ["water"]}},
}

def extract_statistics():

    qcp = QCProperties()

    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "grid": GRID,
        "multiple_filters": FLUID_FILTERS,
        "verbosity": 1,
    }

    qcp.get_grid_statistics(data=usedata, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
    print("Done")

Note

The code is executed twice, filtering on the HC-zone first then the water-zone in a second run. Alternatively the fluid parameter could have been used as a selector, for extracting statistics in one run.

Example in RMS (discrete properties):

Example extracting statistics for a discrete facies parameter for each region. The facies parameter is weighted on a Total_bulk parameter.

The result is written out to csv.

from fmu.tools import QCProperties

GRID = "GeoGrid"
PROPERTIES = {
    "FACIES": {"name": "Facies", "weight": "Total_bulk"},
}
SELECTORS = ["Regions"]

REPORT = "../output/qc/discrete_stats.csv"

def extract_statistics():

    qcp = QCProperties()

    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "grid": GRID,
        "verbosity": 1,
    }

    qcp.get_grid_statistics(data=usedata, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
    print("Done")

Example when executed from files:

from fmu.tools import QCProperties

PATH = "../input/qc/"
GRID = "grid.roff"
PROPERTIES = {"PORO": {"name": "poro.roff"}}
SELECTORS = {
    "ZONE": {
        "name": "zone.roff",
    },
    "FACIES": {
        "name": "facies.roff",
        "exclude": ["Carbonate"],
    },
}
REPORT = "../output/qc/somefile.csv"

def extract_statistics():

    qcp = QCProperties()

    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "path": PATH,
        "grid": GRID,
        "name": "MYDATA",
    }

    qcp.get_grid_statistics(data=usedata)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()

Example when executed from file using Eclipse INIT-file as input:

from fmu.tools import QCProperties

PATH = "../input/qc/"
GRID = "ECLIPSE.EGRID"
PROPERTIES = {"PERMX": {"name": "PERMX", "pfile": "ECLIPSE.INIT"}}
SELECTORS = {
    "FIPNUM": {
        "name": "FIPNUM",
        "pfile": "ECLIPSE.INIT"
    },
}
REPORT = "../output/qc/somefile.csv"

def extract_statistics():

    qcp = QCProperties()

    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "path": PATH,
        "grid": GRID,
        "name": "from_eclipse",
    }

    qcp.get_grid_statistics(data=usedata)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()

get_well_statistics examples

Example in RMS:

Example extracting statistics for permeability for each zone and facies. All wells starting with 33_10 and all 34_11 wells containing “A” will be included in statistics. Note the use of python regular expressions! Result is written to csv.

from fmu.tools import QCProperties

WELLS = {
  "names": ["33_10.*", "34_11-.*A.*"],
  "logrun": "log",
  "trajectory": "Drilled trajectory",
}
PROPERTIES = {"PERM": {"name": "Klogh"}}
SELECTORS = ["Zonelog", "Facies_log"]
REPORT = "../output/qc/somefile.csv"

def extract_statistics():

    qcp = QCProperties()

    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "wells": WELLS,
    }

    qcp.get_well_statistics(data=usedata, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
    print("Done")

Example when executed from files:

Example extracting statistics for permeability for each zone and facies. First extracting statistics for wells starting with “34_10-A”, then wells starting with “34_10-B” in a subsequent run. Result is written to csv.

from fmu.tools import QCProperties

WELLS = ["34_10-A.*"]
PATH = "../input/qc/"
PROPERTIES = ["Phit", "Klogh"]
SELECTORS = ["Zonelog", "Facies_log"]
REPORT = "../output/qc/somefile.csv"

def extract_statistics():

    qcp = QCProperties()

    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "wells": WELLS,
        "path": PATH,
        "name": "A-wells",
    }

    qcp.get_well_statistics(data=usedata)

    usedata2 = usedata.copy()
    usedata2["wells"] = ["34_10-B.*"]
    usedata2["name"] = "B-wells"

    qcp.get_well_statistics(data=usedata2)

    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()

get_bwell_statistics examples

Example in RMS:

Example extracting statistics for permeability for each zone and facies. All blocked wells will be included in statistics. Result is written to csv.

from fmu.tools import QCProperties

WELLS = {
  "bwname": "BW",
  "grid": "GeoGrid",
}
PROPERTIES = {"PERM": {"name": "Klogh"}}
SELECTORS = ["Zonelog", "Facies_log"]
REPORT = "../output/qc/somefile.csv"

def extract_statistics():

    qcp = QCProperties()

    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "wells": WELLS,
    }

    qcp.get_bwell_statistics(data=usedata, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
    print("Done")

Example when executed from files:

To come….

Comparison of data from different sources

Advice when comparing data from different sources

When extracting statistics from different sources there are several tips for enabling easy comparison in the post-analysis of the data in e.g. WebViz:

  • Input “properties” and “selectors” as dictionaries and keep property and selector keys identical between the sources. The keys will be the names seen in the dataframe.

  • Try to use the same selectors for all sources

  • Keep the option “selector_combos” at True to get as much overlapping data as possible. For example, if well statistics only have ZONE as selector and the grid properties are calculated with selectors ZONE and REGION and “selector_combos” is True, the ZONE level statistics can be compared.

  • Use the “codes” field on the selectors to align and match the codenames for each selector. For example if the zone codes are coarser in the grid than in the zonelogs from the wells, this field can be used to merge codes in the zonelog together under one name.
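A quick way to check the first piece of advice before running anything is to compare the column names that each source would produce. The sketch below is a hypothetical helper, not part of fmu.tools; it assumes that list entries are used as column names when “properties” or “selectors” are given as lists.

```python
def column_names(spec):
    """Column names a properties/selectors spec is assumed to yield."""
    # For a list the entries are used; for a dictionary the keys are used.
    return set(spec)


def inconsistent_sources(datasets, field):
    """Return the sources whose column names differ from the first source."""
    names = {label: column_names(data[field]) for label, data in datasets.items()}
    reference = next(iter(names.values()))
    return {label: cols for label, cols in names.items() if cols != reference}


# Hypothetical inputs; note the "PERM" key that breaks the alignment.
DATASETS = {
    "geogrid": {
        "properties": ["Poro", "Perm"],
        "selectors": {"ZONE": {"name": "Zone"}},
    },
    "wells": {
        "properties": {"Poro": {"name": "Phit"}, "PERM": {"name": "Klogh"}},
        "selectors": {"ZONE": {"name": "Zonelog"}},
    },
}

bad_properties = inconsistent_sources(DATASETS, "properties")
bad_selectors = inconsistent_sources(DATASETS, "selectors")
print(bad_properties, bad_selectors)
```

Here the selectors line up, but the wells dataset would produce a “PERM” column that cannot be compared directly with “Perm” from the grid.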

Example

The example below collects statistical data from four different sources and writes the result to a csv-file. Several steps have been taken to ensure consistency between the sources, making the resulting csv-file easy to compare:

  • “Poro” and “Perm” will be the property names

  • “ZONE” will be the column name for the selector

  • The zone codes “UpperReek”, “MidReek” and “LowerReek” are present in the two grids. To get the same codes in the wells, the codes are updated and redundant codes are excluded.

from fmu.tools import QCProperties

REPORT = "../output/qc/somefile.csv"

GEOGRIDDATA = {
    "properties": ["Poro", "Perm"],
    "selectors": {"ZONE": {"name":"Zone"}},
    "grid": "GeoGrid",
}
SIMGRIDDATA = {
    "properties": {"Poro": {"name":"PORO"}, "Perm": {"name":"PERMX"}},
    "selectors": {"ZONE": {"name":"Zone"}},
    "grid": "SimGrid",
}
BWDATA = {
    "properties": {"Poro": {"name": "Phit"}, "Perm": {"name": "Klogh"}},
    "selectors": {"ZONE": {"name": "Zonelog", "codes": {1:"UpperReek", 2:"MidReek", 3:"LowerReek"}, "exclude": ["Above_TopUpperReek", "Below_BaseLowerReek"]}},
    "wells": {"bwname": "BW", "grid": "GeoGrid"},
}

WDATA = BWDATA.copy()
WDATA["wells"] = {"logrun": "log", "trajectory": "Drilled trajectory"}

def extract_statistics():

    qcp = QCProperties()

    qcp.get_grid_statistics(data=GEOGRIDDATA, project=project)
    qcp.get_grid_statistics(data=SIMGRIDDATA, project=project)
    qcp.get_bwell_statistics(data=BWDATA, project=project)
    qcp.get_well_statistics(data=WDATA, project=project)

    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()

See also

The section below for an example of using the same configuration but with yaml input.

Using yaml input for auto execution

A yaml-configuration file can be used with the method from_yaml to enable an automatic run of the methods. This is especially useful if the user wants to run multiple extractions of statistics with minimal code input.

The code evaluates what method to execute based on the value of the first level in the yaml file. The second level is a list of input ‘data’ objects, and statistics will be calculated for each list element.

Three fields are available for the first level:

  • grid: the get_grid_statistics method is executed on elements in this level

  • wells: the get_well_statistics method is executed on elements in this level

  • blockedwells: the get_bwell_statistics method is executed on elements in this level
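The mapping from first-level keys to methods can be pictured as a small dispatch table. The plan_runs helper below is a hypothetical sketch working on an already parsed configuration (a plain dictionary); it only lists which method would be called with which ‘data’ object and does not call QCProperties itself.

```python
# First-level key -> name of the method run for each list element.
METHOD_FOR_KEY = {
    "grid": "get_grid_statistics",
    "wells": "get_well_statistics",
    "blockedwells": "get_bwell_statistics",
}


def plan_runs(config):
    """Return (method name, data) pairs in the order they would be run."""
    runs = []
    for key, datasets in config.items():
        if key not in METHOD_FOR_KEY:
            raise ValueError(f"Unknown first-level key: {key}")
        for data in datasets:
            runs.append((METHOD_FOR_KEY[key], data))
    return runs


# A parsed configuration with two grid datasets and one blocked well dataset.
CONFIG = {
    "grid": [
        {"grid": "GeoGrid", "properties": ["Poro", "Perm"]},
        {"grid": "SimGrid", "properties": {"Poro": {"name": "PORO"}}},
    ],
    "blockedwells": [
        {"wells": {"grid": "GeoGrid", "bwname": "BW"}, "properties": ["Phit"]},
    ],
}

runs = plan_runs(CONFIG)
for method, data in runs:
    print(method, sorted(data))
```

Each list element yields one run, so the configuration above corresponds to three method calls.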

Example in RMS with setting from a YAML file:

Example using yaml input in RMS for extracting statistics for porosity and permeability from four data sources (geogrid, simgrid, wells and blocked wells). The resulting combined dataframe is written to csv.

from fmu.tools import QCProperties

YAML_PATH = "../input/qc/somefile.yml"
REPORT = "../output/qc/somefile.csv"

def extract_statistics():
    qcp = QCProperties()
    qcp.from_yaml(YAML_PATH, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()

The YAML file may in this case look like:

grid:
  - grid: GeoGrid
    properties:
      - Poro
      - Perm
    selectors:
      ZONE:
        name: Zone

  - grid: SimGrid
    properties:
      Poro:
        name: PORO
      Perm:
        name: PERMX
    selectors:
      ZONE:
        name: Zone

wells:
  - wells:
      logrun: log
      trajectory: Drilled trajectory
    properties:
      Poro:
        name: Phit
      Perm:
        name: Klogh
    selectors:
      ZONE:
        name: Zonelog
        codes:
          1: UpperReek
          2: MidReek
          3: LowerReek
        exclude:
          - Above_TopUpperReek
          - Below_BaseLowerReek

blockedwells:
  - wells:
      grid: GeoGrid
      bwname: BW
    properties:
      Poro:
        name: Phit
      Perm:
        name: Klogh
    selectors:
      ZONE:
        name: Zonelog
        codes:
          1: UpperReek
          2: MidReek
          3: LowerReek
        exclude:
          - Above_TopUpperReek
          - Below_BaseLowerReek

Additional Notes

Advice on performance

There are several settings that have an influence on performance:

  • Filters can be used to remove unnecessary data. This limits the input data before statistics are calculated and speeds up execution.

  • If there are many selectors, the option selector_combos can have a high impact on performance

Comparison with statistics in RMS

  • To avoid bias in the calculation, the code removes duplicates from both well and blocked well data before calculating statistics. Duplicates are data points that have the same coordinates and property values. For blocked wells this refers to cells that are penetrated by multiple wells; for raw wells this can happen if branches of multilateral wells have overlapping logs.

    This is the same as RMS does when calculating statistics for blocked wells, and statistical values extracted with this code will be identical to RMS. However, RMS does not remove duplicates when calculating statistics for raw wells, and minor differences in statistical values are possible.
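The effect of removing duplicates can be illustrated with a few made-up samples. The column names (X, Y, Z, PORO), the well names and the drop_duplicates helper are hypothetical; the sketch only demonstrates the definition given above, where a duplicate has the same coordinates and property values.

```python
def drop_duplicates(samples, keys=("X", "Y", "Z", "PORO")):
    """Keep the first sample for each unique set of coordinates and values."""
    seen = set()
    unique = []
    for sample in samples:
        fingerprint = tuple(sample[key] for key in keys)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(sample)
    return unique


# Made-up blocked well samples: wells A-1 and A-2 penetrate the same cell.
samples = [
    {"WELL": "A-1", "X": 100.0, "Y": 200.0, "Z": 1500.0, "PORO": 0.30},
    {"WELL": "A-2", "X": 100.0, "Y": 200.0, "Z": 1500.0, "PORO": 0.30},
    {"WELL": "A-2", "X": 150.0, "Y": 200.0, "Z": 1510.0, "PORO": 0.10},
]

unique = drop_duplicates(samples)
mean_all = sum(s["PORO"] for s in samples) / len(samples)
mean_unique = sum(s["PORO"] for s in unique) / len(unique)
print(len(unique), mean_all, mean_unique)
```

Without the removal, the shared cell is counted twice and pulls the average towards its value.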