Data#
The geoh5 format allows storing data (values) on different parts of an Object. The data types currently supported by geoh5py are:
Float
Integer
Text
Colormap
Well log
import numpy as np
from geoh5py import Workspace
from geoh5py.objects import Curve
# Re-use the previous workspace
workspace = Workspace.create("my_project.geoh5")
# Create some curve object for demo
curve = Curve.create(workspace, vertices=np.c_[np.arange(100), np.zeros((100, 2))])
Float#
Numerical float data can be attached to the various elements making up an object. Data can be added to an Object entity using the add_data method.
curve.add_data(
    {
        "my_cell_values": {
            "association": "CELL",
            "values": np.random.randn(curve.n_cells),
        }
    }
)
<geoh5py.data.float_data.FloatData at 0x70c5769eece0>
The association can be one of:
OBJECT: Single element characterizing the parent object
VERTEX: Array of values associated with the parent object vertices
CELL: Array of values associated with the parent object cells
The length and order of the array of values must be consistent with the corresponding element of association. If the association argument is omitted, geoh5py will attempt to assign the data to the correct part based on the shape of the data values, either object.n_vertices or object.n_cells.
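The shape-based fallback can be pictured with a small stand-alone sketch. The helper `infer_association` below is hypothetical and is not part of geoh5py; it only illustrates the idea of comparing the number of values against the vertex and cell counts, and it assumes that an ambiguous or unmatched length should raise an error.

```python
import numpy as np


def infer_association(values, n_vertices, n_cells):
    """Guess an association from the number of values (hypothetical helper)."""
    n_values = np.asarray(values).shape[0]
    if n_vertices == n_cells and n_values == n_vertices:
        # Both counts match, so the shape alone cannot decide
        raise ValueError("Ambiguous length; pass the association explicitly.")
    if n_values == n_vertices:
        return "VERTEX"
    if n_values == n_cells:
        return "CELL"
    raise ValueError(f"{n_values} values match neither vertices nor cells.")


# An open polyline with 100 vertices has 99 segments
vertex_guess = infer_association(np.zeros(100), n_vertices=100, n_cells=99)
cell_guess = infer_association(np.zeros(99), n_vertices=100, n_cells=99)
print(vertex_guess, cell_guess)
```

Passing the association explicitly, as in the examples on this page, avoids relying on any such guess.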
# Add multiple data vectors on a single call
data = {}
for ii in range(8):
    data[f"Period:{ii}"] = {
        "association": "VERTEX",
        "values": (ii + 1)
        * np.cos(ii * curve.vertices[:, 0] * np.pi / curve.vertices[:, 0].max() / 4.0),
    }
data_list = curve.add_data(data)
print([obj.name for obj in data_list])
['Period:0', 'Period:1', 'Period:2', 'Period:3', 'Period:4', 'Period:5', 'Period:6', 'Period:7']
The newly created data is directly added to the project’s geoh5 file and available for visualization.
Integer#
Same implementation as for the Float data type, but with values provided as integers (int32).
Text#
Text (string) data can only be associated with the object itself.
curve.add_data({"my_comment": {"association": "OBJECT", "values": "hello_world"}})
<geoh5py.data.text_data.TextData at 0x70c56c7fe5f0>
Colormap#
The colormap data type can be used to store or customize the color palette used by Geoscience ANALYST.
from geoh5py.data.color_map import ColorMap
from geoh5py.objects import Grid2D
# Create some data on a Grid2D entity.
# Create the Grid2D object
grid = Grid2D.create(
    workspace,
    u_cell_size=2.5,
    v_cell_size=2.5,
    u_count=64,
    v_count=16,
)
# Add data
radius = grid.add_data({"radial": {"values": np.linalg.norm(grid.centroids, axis=1)}})
# Create a simple colormap that spans the data range
nc = 10
rgba = np.vstack(
    [
        np.linspace(radius.values.min(), radius.values.max(), nc),  # Values
        np.linspace(0, 255, nc),  # Red
        np.linspace(255, 0, nc),  # Green
        np.linspace(125, 15, nc),  # Blue
        np.ones(nc) * 255,  # Alpha
    ]
).T
We now have an array that contains a range of values for red, green, blue and alpha (RGBA) over the span of the data values. This array can be used to implicitly create a ColorMap from the EntityType.
# Assign the colormap to the data type
radius.entity_type.color_map = rgba
The resulting ColorMap stores the values to geoh5 as a numpy.recarray with fields for Value, Red, Green, Blue and Alpha.
radius.entity_type.color_map._values
rec.array([( 1.76776695, 0, 255, 125, 255),
( 19.72811601, 28, 226, 112, 255),
( 37.68846506, 56, 198, 100, 255),
( 55.64881412, 85, 170, 88, 255),
( 73.60916317, 113, 141, 76, 255),
( 91.56951223, 141, 113, 63, 255),
(109.52986128, 170, 85, 51, 255),
(127.49021034, 198, 56, 39, 255),
(145.45055939, 226, 28, 27, 255),
(163.41090845, 255, 0, 15, 255)],
dtype=[('Value', '<f8'), ('Red', 'u1'), ('Green', 'u1'), ('Blue', 'u1'), ('Alpha', 'u1')])
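For readers unfamiliar with record arrays, the sketch below builds a table with the same five fields using NumPy alone. It does not call geoh5py; the value range (0 to 100) is made up, and casting the colour channels to unsigned 8-bit integers is an assumption chosen to mirror the dtype shown in the output above.

```python
import numpy as np

nc = 10
rgba = np.vstack(
    [
        np.linspace(0.0, 100.0, nc),  # Values (made-up range)
        np.linspace(0, 255, nc),  # Red
        np.linspace(255, 0, nc),  # Green
        np.linspace(125, 15, nc),  # Blue
        np.ones(nc) * 255,  # Alpha
    ]
).T

fields = [
    ("Value", "<f8"),
    ("Red", "u1"),
    ("Green", "u1"),
    ("Blue", "u1"),
    ("Alpha", "u1"),
]
# Keep the values as floats and cast each colour channel to uint8
columns = [rgba[:, 0]] + [rgba[:, ii].astype("u1") for ii in range(1, 5)]
color_table = np.rec.fromarrays(columns, dtype=fields)

print(color_table.dtype.names)
print(color_table.Red[0], color_table.Red[-1])
```

Each field can then be read by name, for example `color_table.Value` or `color_table["Alpha"]`.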
Files#
Raw files can be added to groups and objects and stored as blob (bytes) data in geoh5.
with open("docs.txt", mode="w") as file:
    file.write("Hello world")
file_data = grid.add_file("docs.txt")
The information can easily be re-exported to disk with the save_file method.
file_data.save_file(path="./temp", name="exported.txt")
import shutil
shutil.rmtree("./temp")
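The idea of storing a file as a blob and writing it back out can be illustrated with the standard library only. This sketch does not involve geoh5py or the geoh5 format; the file names are reused from the example above purely for illustration, and a temporary directory stands in for the project.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "docs.txt"
    source.write_text("Hello world")

    # Read the raw content as bytes: this is the "blob"
    blob = source.read_bytes()

    # Write the same bytes back to a new file and read it as text
    exported = Path(tmp) / "exported.txt"
    exported.write_bytes(blob)
    round_trip = exported.read_text()

print(blob, round_trip)
```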
Well Data#
In the case of Drillhole objects, data are always stored as from-to interval values.
Depth Data#
Depth data are used to represent measurements recorded at discrete depths along the well path. A depth attribute is required on creation. Depth markers are converted internally to from-to intervals by adding a small depth increment defined by the collocation_distance. If the Drillhole object already holds depth data at the same location, geoh5py will group the datasets under the same PropertyGroup.
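As a rough illustration of that conversion, the sketch below turns depth markers into from-to pairs with NumPy. The function name `depths_to_from_to` is hypothetical and the arithmetic used inside geoh5py may differ; the only assumption taken from the text is that the TO depth equals the marker plus the collocation distance.

```python
import numpy as np


def depths_to_from_to(depths, collocation_distance=1e-2):
    """Pair each depth marker with a TO depth slightly below it (hypothetical)."""
    depths = np.asarray(depths, dtype=float)
    return np.c_[depths, depths + collocation_distance]


intervals = depths_to_from_to([0.0, 1.0, 2.0])
print(intervals.shape)
```

The result has one row per depth marker and two columns (FROM, TO).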
from geoh5py.groups import DrillholeGroup
from geoh5py.objects import Drillhole
dh_group = DrillholeGroup.create(workspace)
well = Drillhole.create(workspace, collar=(0, 0, 0), parent=dh_group)
depths_A = np.arange(0, 50.0)  # First list of depths
# Second list, slightly offset from the first
depths_B = np.arange(0.01, 50.01)
# Add both sets of log data
well.add_data(
    {
        "my_log_values": {
            "depth": depths_A,
            "values": np.random.randn(depths_A.shape[0]),
        },
        "log_wt_tolerance": {
            "depth": depths_B,
            "values": np.random.randn(depths_B.shape[0]),
        },
    }
)
[<abc.ConcatenatedFloatData at 0x70c56c49d150>,
<abc.ConcatenatedFloatData at 0x70c56c49c190>]
Interval (From-To) Data#
Interval data are defined by constant values bounded by a start (FROM) and an end (TO) depth. A from-to attribute defined as a numpy.ndarray of shape (nD, 2) is expected on creation. Subsequent data are appended to the same interval PropertyGroup if the from-to values match within the collocation distance parameter. Users can control the tolerance for matching intervals by supplying a collocation_distance argument in meters, or by setting the default on the drillhole entity (default_collocation_distance = 1e-2 meters).
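The matching rule can be sketched as a tolerance comparison. The helper `intervals_match` is hypothetical; it assumes that two from-to arrays match when they have the same shape and every bound agrees within the collocation distance, which may be simpler than the check geoh5py actually performs.

```python
import numpy as np


def intervals_match(from_to_a, from_to_b, collocation_distance=1e-2):
    """Return True if both (nD, 2) arrays agree within the tolerance."""
    a = np.asarray(from_to_a, dtype=float)
    b = np.asarray(from_to_b, dtype=float)
    return a.shape == b.shape and bool(
        np.all(np.abs(a - b) <= collocation_distance)
    )


reference = np.array([[0.25, 25.5], [30.1, 55.5], [56.5, 80.2]])
same_group = intervals_match(reference, reference + 0.005)  # 5 mm shift
new_group = intervals_match(reference, reference + 0.5)  # 50 cm shift
print(same_group, new_group)
```

Under this assumption, a shift smaller than the tolerance keeps the data in the same group, while a larger shift would start a new one.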
# Define a from-to array
from_to = np.vstack([[0.25, 25.5], [30.1, 55.5], [56.5, 80.2]])
# Add some reference data
well.add_data(
    {
        "interval_values": {
            "values": np.asarray([1, 2, 3]),
            "from-to": from_to,
            "value_map": {1: "Unit_A", 2: "Unit_B", 3: "Unit_C"},
            "type": "referenced",
        }
    }
)
# Add float data on the same intervals
well.add_data(
    {
        "random_values": {
            "values": np.random.randn(from_to.shape[0]),
            "from-to": from_to,
        }
    }
)
<abc.ConcatenatedFloatData at 0x70c56c49d5d0>
Get data#
Just like any Entity, data can be retrieved from the Workspace using the get_entity method. For convenience, Objects also have get_data_list and get_data methods that focus only on their respective children Data.
my_list = curve.get_data_list()
print(my_list, curve.get_data(my_list[0]))
['Period:0', 'Period:1', 'Period:2', 'Period:3', 'Period:4', 'Period:5', 'Period:6', 'Period:7', 'my_cell_values', 'my_comment'] [<geoh5py.data.float_data.FloatData object at 0x70c56c7fed40>]
Property Groups#
Data entities sharing the same parent Object and association can be linked within a property group (property_groups) and made available through profiling. This can be used to group data that would normally be stored as a 2D array.
# Create a group from the VERTEX data added previously
curve.add_data_to_group([obj.uid for obj in data_list], "my_trig_group")
<geoh5py.groups.property_group.PropertyGroup at 0x70c56c49dfc0>
workspace.close()
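To make the 2D-array analogy concrete, the sketch below stacks eight vertex vectors, computed the same way as the Period data earlier on this page, into a single array with one column per data entity. It uses NumPy only and says nothing about how geoh5py stores a property group on disk.

```python
import numpy as np

x = np.arange(100.0)  # Same x-coordinates as the demo curve vertices
columns = [
    (ii + 1) * np.cos(ii * x * np.pi / x.max() / 4.0) for ii in range(8)
]
# One row per vertex, one column per data entity in the group
table = np.column_stack(columns)
print(table.shape)
```

A property group plays a similar role: it keeps the individual vectors as separate Data entities while recording that they belong together.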