The qcproperties class
The qcproperties class provides a set of methods for extracting property statistics from 3D grids, raw wells and blocked wells.
Statistics can be extracted for both continuous and discrete properties. Different statistics are calculated depending on the property type, which is auto-detected.
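As a rough illustration of this distinction (not the class's actual implementation), the sketch below computes an average and a standard deviation for a continuous property, and percentages per code for a discrete one. The helper names `continuous_stats` and `discrete_stats` are hypothetical, the property type is chosen by the caller rather than auto-detected, and the use of the population standard deviation is an assumption.

```python
from collections import Counter
from statistics import fmean, pstdev


def continuous_stats(values):
    """Return basic statistics for a continuous property (hypothetical helper)."""
    return {"avg": fmean(values), "stddev": pstdev(values)}


def discrete_stats(codes):
    """Return the percentage of samples per code for a discrete property."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {code: 100.0 * n / total for code, n in counts.items()}


poro = [0.20, 0.25, 0.30]                   # continuous property values
facies = ["Sand", "Sand", "Shale", "Sand"]  # discrete property values

print(continuous_stats(poro))
print(discrete_stats(facies))
```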
If several statistics extraction methods have been run within the instance, a merged dataframe is available through the ‘dataframe’ property.
The methods for statistics extraction can be run individually, or a yaml-configuration file can be used to enable an automatic run of the methods.
All methods can be run from either RMS python, or from files (e.g. from an ERT job).
XTGeo is used to get a dataframe from the input parameter data. XTGeo data is reused within the instance to increase performance.
Methods for extracting property statistics
Three methods exist for extracting property statistics. The method to select depends on the input data source (3D grid properties, wells or blocked wells). The arguments for the methods are similar and are described in the Arguments section below.
get_grid_statistics
: This method extracts property statistics from 3D grid data.
get_well_statistics
: This method extracts property statistics from well logs.
get_bwell_statistics
: This method extracts property statistics from blocked well logs.
from_yaml
: Use a yaml-configuration file to enable an automatic run of the methods above.
All methods return a Pandas DataFrame for the run in question. If several statistics extraction methods have been run within the instance, a merged dataframe is available through the ‘dataframe’ property.
See also
The "Using yaml input for auto execution" section describes how to use a yaml-configuration file to run the different methods automatically.
Other methods
Note: The methods below are only applicable if at least one method for extracting statistics has been run within the QCProperties instance.
- dataframe
A merged dataframe with statistical data for continuous properties from all runs of statistics extractions within the instance.
- to_csv
Used to write the dataframe with statistics to a csv-file. Takes one argument:
csvfile
: String with desired filename (required).
Arguments
The input data is given in a python dictionary (or a YAML file) and will be somewhat different for the three methods, and for the two run environments (inside/outside RMS).
Input arguments:
data
: The input data as a Python dictionary (required). See valid keys below.
project
: Required for usage inside RMS
Valid fields in the ‘data’ argument:
- Method specific fields:
- grid
Name of grid icon in RMS, or name of grid file if run outside RMS. Required with the get_grid_statistics method.
- wells
Required with the get_well_statistics and the get_bwell_statistics methods. Outside RMS, “wells” is a list of files in RMS ascii well format. Inside RMS, “wells” is a dictionary with 3 fields that depend on the method:
get_well_statistics:
names
: List of wellnames (optional). Default is all wells.
logrun
: Name of logrun.
trajectory
: Name of trajectory.
get_bwell_statistics:
names
: List of wellnames (optional). Default is all wells.
bwname
: Name of BW object in RMS.
grid
: Name of grid that contains the BW object.
Note
Wildcards are supported when running from files, and valid Python regular expressions are supported in “names”; see the examples.
- Common fields:
- properties
Properties to compute statistics for. Both continuous and discrete properties are supported. Standard statistics, e.g. “avg” and “stddev”, are computed for continuous properties, while percentages are calculated for discrete properties.
Can be given as a list or as a dictionary. If a dictionary, the key will be the column name in the output dataframe, and the value will be a dictionary with valid options:
name
: The actual name (or path) of the property / log.
weight
: A weight parameter (name or path if outside RMS) (optional)
pfile
: Name (or path) to file containing the parameter e.g. INIT file (optional)
- selectors
Selectors are discrete properties/logs, e.g. Zone, that are used to extract statistics for groups of the data (optional).
Can be given as a list or as a dictionary. If a dictionary, the key will be the column name in the output dataframe, and the value will be a dictionary with valid options:
name
: The actual name (or path) of the property / log.
include
: List of values to include (optional)
exclude
: List of values to exclude (optional)
codes
: A dictionary of codenames to update some/all existing codenames (optional).
pfile
: Name (or path) to file containing the parameter e.g. INIT file (optional)
Note
The “codes” field can be used to merge code values that the user wants to extract combined statistics from. This is done by setting the same name on several code values, as it is the name that is used to group the data.
- filters
Dictionary with additional filters (optional).
The key is the name (or path) to the filter parameter / log, and the value is a dictionary with options:
include
: List of values to include for discrete parameters
exclude
: List of values to exclude for discrete parameters
range
: List with two entries, defining minimum and maximum values to use for continuous parameters
pfile
: Name (or path) to file containing the parameter e.g. INIT file
Note
If a selector or property is input as a filter, this will override any existing filters specified directly on the selector/property.
See also
The option "multiple_filters" below, which can be used to extract statistics multiple times with different filters.
- multiple_filters
Option that can be used to extract statistics multiple times with different filters (optional).
The input is a dictionary where each key is the “name” (ID string) for the dataset, and each value is a dictionary of filters (same format as filters above). See examples.
- path
Path to where files are located (optional)
- selector_combos
Bool to turn on/off calculation of statistics for every combination of selectors (optional). Default is True. For example, if True and both a ZONE and a REGION parameter are given as selectors, statistics for three groups will be calculated: ["ZONE", "REGION"], ["ZONE"] and ["REGION"]. If False, the data will only be extracted for one group: ["ZONE", "REGION"], hence no data is available if the user wants to evaluate statistics per ZONE (or REGION) for the global grid.
Depending on the number of selectors and the size of the grid, this process may be time consuming.
- source
Source string (optional). Default values depend on the method being executed:
For grid statistics default is the gridname
For blocked wells statistics default is the name of the blocked wells object if inside RMS and bwells if outside
For well statistics default is wells
- name
ID string for the dataset (optional). Recommended; if not given, it will be set equal to the source string.
- verbosity
Level of output while running: None, “info” or “debug” (optional). Default is None.
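The grouping behind the selector_combos option can be sketched as follows. This is a minimal illustration and not the library's code: the helper `selector_groups` is hypothetical, and it assumes that every non-empty subset of the selectors forms a group, which agrees with the two-selector case described above but is an assumption for three or more selectors.

```python
from itertools import combinations


def selector_groups(selectors, selector_combos=True):
    """List the selector groups statistics would be grouped by (hypothetical helper)."""
    if not selector_combos:
        # Only the full combination of all selectors.
        return [list(selectors)]
    groups = []
    # Largest groups first, down to the single selectors.
    for size in range(len(selectors), 0, -1):
        groups.extend(list(combo) for combo in combinations(selectors, size))
    return groups


print(selector_groups(["ZONE", "REGION"]))
print(selector_groups(["ZONE", "REGION"], selector_combos=False))
```

Under this assumption n selectors give 2^n - 1 groups, which is one way to see why the option can become costly when many selectors are used.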
Examples
get_grid_statistics examples
Example in RMS (continous properties - basic):
Example extracting statistics for porosity and permeability for each zone and facies. Result is written to csv.
from fmu.tools import QCProperties
GRID = "GeoGrid"
PROPERTIES = ["Poro", "Perm"]
SELECTORS = ["Zone", "Facies"]
REPORT = "../output/qc/somefile.csv"
def extract_statistics():
    qcp = QCProperties()
    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "grid": GRID,
        "verbosity": 1,
    }
    qcp.get_grid_statistics(data=usedata, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
    print("Done")
Example in RMS (continous properties - more settings):
Example extracting statistics for porosity per region. Filters are used to extract statistics for the HC zone and the water zone separately. Statistics will be combined for regions with code values 2 and 3. The property is weighted on a Total_bulk parameter. Result is written to csv.
from fmu.tools import QCProperties
GRID = "GeoGrid"
PROPERTIES = {
    "PORO": {"name": "PHIT", "weight": "Total_bulk"},
}
SELECTORS = {
    "REGION": {
        "name": "Regions",
        "exclude": ["Surroundings"],
        "codes": {2: "NS", 3: "NS"},
    }
}
REPORT = "../output/qc/continous_stats.csv"
FLUID_FILTERS = {
    "HC_zone": {"Fluid": {"include": ["oil", "gas"]}},
    "Water_zone": {"Fluid": {"include": ["water"]}},
}

def extract_statistics():
    qcp = QCProperties()
    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "grid": GRID,
        "multiple_filters": FLUID_FILTERS,
        "verbosity": 1,
    }
    qcp.get_grid_statistics(data=usedata, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
    print("Done")
Note
The code is executed twice, filtering on the HC-zone first then the water-zone in a second run. Alternatively the fluid parameter could have been used as a selector, for extracting statistics in one run.
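To make the idea of one extraction per named filter set concrete, here is a minimal sketch on plain Python rows. It is not how QCProperties is implemented: the helpers `passes` and `run_multiple_filters` are hypothetical, the sample rows are made up, and treating "range" as inclusive at both ends is an assumption.

```python
def passes(row, filters):
    """Return True if a data row satisfies every filter (hypothetical helper)."""
    for key, opts in filters.items():
        value = row[key]
        if "include" in opts and value not in opts["include"]:
            return False
        if "exclude" in opts and value in opts["exclude"]:
            return False
        if "range" in opts:
            low, high = opts["range"]
            # Assumption: both range limits are inclusive.
            if not low <= value <= high:
                return False
    return True


def run_multiple_filters(rows, multiple_filters):
    """Apply each named filter set in turn and collect the surviving rows."""
    return {
        name: [row for row in rows if passes(row, filters)]
        for name, filters in multiple_filters.items()
    }


rows = [
    {"Fluid": "oil", "PORO": 0.25},
    {"Fluid": "gas", "PORO": 0.30},
    {"Fluid": "water", "PORO": 0.20},
]
FLUID_FILTERS = {
    "HC_zone": {"Fluid": {"include": ["oil", "gas"]}},
    "Water_zone": {"Fluid": {"include": ["water"]}},
}
subsets = run_multiple_filters(rows, FLUID_FILTERS)
for name, subset in subsets.items():
    print(name, len(subset))
```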
Example in RMS (discrete properties):
Example extracting statistics for a discrete facies parameter for each region. The facies parameter is weighted on a Total_bulk parameter.
The result is written out to csv.
from fmu.tools import QCProperties
GRID = "GeoGrid"
PROPERTIES = {
    "FACIES": {"name": "Facies", "weight": "Total_bulk"},
}
SELECTORS = ["Regions"]
REPORT = "../output/qc/discrete_stats.csv"

def extract_statistics():
    qcp = QCProperties()
    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "grid": GRID,
        "verbosity": 1,
    }
    qcp.get_grid_statistics(data=usedata, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
    print("Done")
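What weighting a discrete property could mean is sketched below, under the assumption that each cell contributes its weight (rather than a count of one) to the percentage of its facies code. The helper `weighted_percentages` and the sample values are hypothetical and only illustrate the idea; the actual calculation in QCProperties may differ.

```python
from collections import defaultdict


def weighted_percentages(codes, weights):
    """Percentage per code, where each sample counts by its weight (hypothetical helper)."""
    totals = defaultdict(float)
    for code, weight in zip(codes, weights):
        totals[code] += weight
    grand_total = sum(totals.values())
    return {code: 100.0 * total / grand_total for code, total in totals.items()}


facies = ["Sand", "Shale", "Sand"]
total_bulk = [100.0, 300.0, 100.0]  # made-up bulk volume per cell

print(weighted_percentages(facies, total_bulk))
```

In this made-up case two of three cells are sand, but sand makes up less than half of the weighted result because the shale cell carries most of the bulk volume.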
Example when executed from files:
from fmu.tools import QCProperties
PATH = "../input/qc/"
GRID = "grid.roff"
PROPERTIES = {"PORO": {"name": "poro.roff"}}
SELECTORS = {
    "ZONE": {
        "name": "zone.roff",
    },
    "FACIES": {
        "name": "facies.roff",
        "exclude": ["Carbonate"],
    },
}
REPORT = "../output/qc/somefile.csv"

def extract_statistics():
    qcp = QCProperties()
    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "path": PATH,
        "grid": GRID,
        "name": "MYDATA",
    }
    qcp.get_grid_statistics(data=usedata)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
Example when executed from file using Eclipse INIT-file as input:
from fmu.tools import QCProperties
PATH = "../input/qc/"
GRID = "ECLIPSE.EGRID"
PROPERTIES = {"PERMX": {"name": "PERMX", "pfile": "ECLIPSE.INIT"}}
SELECTORS = {
    "FIPNUM": {
        "name": "FIPNUM",
        "pfile": "ECLIPSE.INIT",
    },
}
REPORT = "../output/qc/somefile.csv"

def extract_statistics():
    qcp = QCProperties()
    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "path": PATH,
        "grid": GRID,
        "name": "from_eclipse",
    }
    qcp.get_grid_statistics(data=usedata)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
get_well_statistics examples
Example in RMS:
Example extracting statistics for permeability for each zone and facies. All wells starting with 33_10 and all 34_11 wells containing “A” will be included in statistics. Note the use of python regular expressions! Result is written to csv.
from fmu.tools import QCProperties
WELLS = {
    "names": ["33_10.*", "34_11-.*A.*"],
    "logrun": "log",
    "trajectory": "Drilled trajectory",
}
PROPERTIES = {"PERM": {"name": "Klogh"}}
SELECTORS = ["Zonelog", "Facies_log"]
REPORT = "../output/qc/somefile.csv"

def extract_statistics():
    qcp = QCProperties()
    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "wells": WELLS,
    }
    qcp.get_well_statistics(data=usedata, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
    print("Done")
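How the two patterns in “names” pick out wells can be checked with the standard `re` module. The sketch below is only an illustration: the helper `select_wells` and the well names are hypothetical, and matching each pattern against the whole well name (`fullmatch`) is an assumption about how the patterns are applied.

```python
import re


def select_wells(all_wells, patterns):
    """Keep wells whose full name matches at least one pattern (hypothetical helper)."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [
        well for well in all_wells
        if any(regex.fullmatch(well) for regex in compiled)
    ]


# Made-up well names for illustration.
wells = ["33_10-1", "33_10-2A", "34_11-1", "34_11-2A", "34_11-3AT2", "35_1-1"]
print(select_wells(wells, ["33_10.*", "34_11-.*A.*"]))
```

Note that `.*` is the regular-expression form of "anything"; a bare `*` as used in file wildcards is not a valid pattern on its own.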
Example when executed from files:
Example extracting statistics for permeability for each zone and facies. First extracting statistics for wells starting with “34_10-A”, then wells starting with “34_10-B” in a subsequent run. Result is written to csv.
from fmu.tools import QCProperties
WELLS = ["34_10-A.*"]
PATH = "../input/qc/"
PROPERTIES = ["Phit", "Klogh"]
SELECTORS = ["Zonelog", "Facies_log"]
REPORT = "../output/qc/somefile.csv"
def extract_statistics():
    qcp = QCProperties()
    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "wells": WELLS,
        "path": PATH,
        "name": "A-wells",
    }
    qcp.get_well_statistics(data=usedata)
    # Second run: same settings, but for the B-wells
    usedata2 = usedata.copy()
    usedata2["wells"] = ["34_10-B.*"]
    usedata2["name"] = "B-wells"
    qcp.get_well_statistics(data=usedata2)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
get_bwell_statistics examples
Example in RMS:
Example extracting statistics for permeability for each zone and facies. All blocked wells will be included in statistics. Result is written to csv.
from fmu.tools import QCProperties
WELLS = {
    "bwname": "BW",
    "grid": "GeoGrid",
}
PROPERTIES = {"PERM": {"name": "Klogh"}}
SELECTORS = ["Zonelog", "Facies_log"]
REPORT = "../output/qc/somefile.csv"

def extract_statistics():
    qcp = QCProperties()
    usedata = {
        "properties": PROPERTIES,
        "selectors": SELECTORS,
        "wells": WELLS,
    }
    qcp.get_bwell_statistics(data=usedata, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
    print("Done")
Example when executed from files:
To come….
Comparison of data from different sources
Advice when comparing data from different sources
When extracting statistics from different sources there are several tips for enabling easy comparison in the post-analysis of the data in e.g. WebViz:
Input “properties” and “selectors” as dictionaries and keep property and selector keys identical between the sources. The keys will be the names seen in the dataframe.
Try to use the same selectors for all sources
Keep the option “selector_combos” at True to get as much overlapping data as possible. For example, if well statistics only have ZONE as selector and the grid properties are calculated with selectors ZONE and REGION and “selector_combos” was True, the ZONE level statistics can be compared.
Use the “codes” field on the selectors to align and match the codenames for each selector. For example if the zone codes are coarser in the grid than in the zonelogs from the wells, this field can be used to merge codes in the zonelog together under one name.
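The effect of giving several code values the same name can be sketched with plain Python. This is an illustration only: the helper `mean_per_codename` and the sample data are hypothetical, and unlike the real “codes” field, which may update only some codenames, the sketch expects every code to be present in the dictionary.

```python
from collections import defaultdict
from statistics import fmean


def mean_per_codename(code_values, prop_values, codes):
    """Average a property per code name; codes sharing a name are merged (hypothetical)."""
    grouped = defaultdict(list)
    for code, value in zip(code_values, prop_values):
        grouped[codes[code]].append(value)
    return {name: fmean(values) for name, values in grouped.items()}


zonelog = [1, 2, 3, 3]           # made-up zone codes per sample
poro = [0.30, 0.20, 0.10, 0.20]  # made-up porosity per sample
# Zone codes 1 and 2 get the same name, so they are treated as one group.
codes = {1: "UpperReek", 2: "UpperReek", 3: "LowerReek"}

print(mean_per_codename(zonelog, poro, codes))
```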
Example
The example below collects statistical data from four different sources and writes the result to a csv-file. Several steps have been taken to ensure consistency between the sources, making the data in the resulting csv-file easy to compare:
“Poro” and “Perm” will be the property names
“ZONE” will be the column name for the selector
The zone codes “UpperReek”, “MidReek” and “LowerReek” are present in the two grids. To get the same codes in the wells, the codes are updated and redundant codes are excluded.
from fmu.tools import QCProperties
REPORT = "../output/qc/somefile.csv"
GEOGRIDDATA = {
    "properties": ["Poro", "Perm"],
    "selectors": {"ZONE": {"name": "Zone"}},
    "grid": "GeoGrid",
}
SIMGRIDDATA = {
    "properties": {"Poro": {"name": "PORO"}, "Perm": {"name": "PERMX"}},
    "selectors": {"ZONE": {"name": "Zone"}},
    "grid": "SimGrid",
}
BWDATA = {
    "properties": {"Poro": {"name": "Phit"}, "Perm": {"name": "Klogh"}},
    "selectors": {
        "ZONE": {
            "name": "Zonelog",
            "codes": {1: "UpperReek", 2: "MidReek", 3: "LowerReek"},
            "exclude": ["Above_TopUpperReek", "Below_BaseLowerReek"],
        }
    },
    "wells": {"bwname": "BW", "grid": "GeoGrid"},
}
# Same properties and selectors as the blocked wells, but with raw well input
WDATA = BWDATA.copy()
WDATA["wells"] = {"logrun": "log", "trajectory": "Drilled trajectory"}

def extract_statistics():
    qcp = QCProperties()
    qcp.get_grid_statistics(data=GEOGRIDDATA, project=project)
    qcp.get_grid_statistics(data=SIMGRIDDATA, project=project)
    qcp.get_bwell_statistics(data=BWDATA, project=project)
    qcp.get_well_statistics(data=WDATA, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
See also
The section below for an example of using the same configuration with yaml input.
Using yaml input for auto execution
A yaml-configuration file can be used with the method from_yaml to enable an automatic run of the methods.
This is especially useful if the user wants to run multiple extractions of statistics with minimal code input.
The code decides which method to execute based on the key at the first level of the yaml file. The second level is a list of input ‘data’ objects, and statistics will be calculated for each list element.
Three fields are available for the first level:
grid
: the get_grid_statistics method is executed on elements in this level
wells
: the get_well_statistics method is executed on elements in this level
blockedwells
: the get_bwell_statistics method is executed on elements in this level
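The mapping from first-level key to method can be sketched as a simple lookup over an already-parsed configuration. This is not the from_yaml implementation: the helper `plan_runs` and the lookup table are hypothetical, a plain dictionary stands in for the parsed yaml file, and the sketch only lists which method would run on which ‘data’ object rather than calling QCProperties.

```python
# First-level key -> name of the method run on each element (from the list above).
METHOD_FOR_KEY = {
    "grid": "get_grid_statistics",
    "wells": "get_well_statistics",
    "blockedwells": "get_bwell_statistics",
}


def plan_runs(config):
    """Return (method name, data dict) pairs in the order they would run (hypothetical)."""
    runs = []
    for key, items in config.items():
        method = METHOD_FOR_KEY[key]  # raises KeyError for unknown first-level keys
        for data in items:
            runs.append((method, data))
    return runs


# Stand-in for a parsed yaml file: the first level selects the method,
# the second level is a list of 'data' objects.
config = {
    "grid": [{"grid": "GeoGrid"}, {"grid": "SimGrid"}],
    "blockedwells": [{"wells": {"bwname": "BW", "grid": "GeoGrid"}}],
}
for method, data in plan_runs(config):
    print(method, data)
```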
Example in RMS with setting from a YAML file:
Example using yaml input in RMS for extracting statistics for porosity and permeability from four data sources (geogrid, simgrid, wells and blocked wells). The resulting combined dataframe is written to csv.
from fmu.tools import QCProperties
YAML_PATH = "../input/qc/somefile.yml"
REPORT = "../output/qc/somefile.csv"
def extract_statistics():
    qcp = QCProperties()
    qcp.from_yaml(YAML_PATH, project=project)
    qcp.to_csv(REPORT)

if __name__ == "__main__":
    extract_statistics()
The YAML file may in this case look like:
grid:
  - grid: GeoGrid
    properties:
      - Poro
      - Perm
    selectors:
      ZONE:
        name: Zone
  - grid: SimGrid
    properties:
      Poro:
        name: PORO
      Perm:
        name: PERMX
    selectors:
      ZONE:
        name: Zone
wells:
  - wells:
      logrun: log
      trajectory: Drilled trajectory
    properties:
      Poro:
        name: Phit
      Perm:
        name: Klogh
    selectors:
      ZONE:
        name: Zonelog
        codes:
          1: UpperReek
          2: MidReek
          3: LowerReek
        exclude:
          - Above_TopUpperReek
          - Below_BaseLowerReek
blockedwells:
  - wells:
      grid: GeoGrid
      bwname: BW
    properties:
      Poro:
        name: Phit
      Perm:
        name: Klogh
    selectors:
      ZONE:
        name: Zonelog
        codes:
          1: UpperReek
          2: MidReek
          3: LowerReek
        exclude:
          - Above_TopUpperReek
          - Below_BaseLowerReek
Additional Notes
Advice on performance
There are several settings that influence performance:
Filters can be used to remove unnecessary data. This limits the input data before statistics are calculated and speeds up execution.
If there are many selectors, the option selector_combos can have a high impact on performance.
Comparison with statistics in RMS
To avoid bias in the calculation, the code removes duplicates from both well and blocked well data before calculating statistics. Duplicates are data points that have the same coordinates and property values. For blocked wells this refers to cells that are penetrated by multiple wells, for raw wells this can happen if branches of multilateral wells have overlapping logs.
This is the same as RMS does when calculating statistics for blocked wells, and statistical values extracted with this code will be identical to RMS. However RMS does not remove duplicates when calculating statistics for raw wells, and minor differences in statistical values are possible.
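The duplicate rule described above (same coordinates and same property values) can be sketched as follows. The helper `drop_duplicates`, the field names `x`, `y`, `z` and `poro`, and the sample data are hypothetical, and keeping the first occurrence is an assumption; the actual code may handle this differently.

```python
def drop_duplicates(samples):
    """Remove samples with identical coordinates and property values, keeping the first."""
    seen = set()
    unique = []
    for sample in samples:
        key = (sample["x"], sample["y"], sample["z"], sample["poro"])
        if key not in seen:
            seen.add(key)
            unique.append(sample)
    return unique


samples = [
    {"well": "A", "x": 1.0, "y": 2.0, "z": 3.0, "poro": 0.25},
    {"well": "A_T2", "x": 1.0, "y": 2.0, "z": 3.0, "poro": 0.25},  # overlapping branch
    {"well": "A", "x": 1.0, "y": 2.0, "z": 4.0, "poro": 0.20},
]
print(len(drop_duplicates(samples)))
```

The well name is deliberately left out of the comparison key, so that overlapping logs from different branches count as the same data point.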