Updating schemas
This document contains instructions and guidelines for making changes to schemas.
Changing a schema
This section collects some things to do, or not to do, when adding fields to a schema or changing existing ones. Some of these are generic, but some are specific to how Pydantic translates its models into JSON schemas.
Tips and guidelines
Add a docstring to every class and field.
These docstrings are built into the data model documentation and included in the schema as description fields. Try to align them to existing examples!
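As a rough illustration of how descriptions end up in the schema, the sketch below assumes Pydantic v2. The model and field names are hypothetical, and passing the text through Field(description=...) is only one way to attach a field description; it is not necessarily the mechanism this project uses.

```python
from pydantic import BaseModel, Field


class WellInfo(BaseModel):
    """Hypothetical description of a well, used only for illustration."""

    # Hypothetical field; the description is copied into the JSON schema.
    well_name: str = Field(description="Hypothetical name of the well.")


schema = WellInfo.model_json_schema()

# The class docstring becomes the top-level "description".
print(schema["description"])
# The field description appears under the property of the same name.
print(schema["properties"]["well_name"]["description"])
```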
Avoid free text fields as much as possible.
Use string enums if the string should come from a controlled vocabulary. If a string must follow a particular form, apply a regex where possible.
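A minimal sketch of both techniques, assuming Pydantic v2 (where the regex argument of Field is called pattern). The enum, the model, and the date format are hypothetical and chosen only for illustration.

```python
from enum import Enum

from pydantic import BaseModel, Field, ValidationError


class VerticalDomain(str, Enum):
    """Hypothetical controlled vocabulary."""

    depth = "depth"
    time = "time"


class Example(BaseModel):
    """Hypothetical model used only for illustration."""

    # Only values from the enum are accepted.
    domain: VerticalDomain = Field(description="Hypothetical vertical domain.")
    # The string must look like YYYY-MM-DD; the format itself is an assumption.
    date: str = Field(
        pattern=r"^\d{4}-\d{2}-\d{2}$",
        description="Hypothetical date string.",
    )


def is_valid(**data) -> bool:
    """Return True if the data validates against Example."""
    try:
        Example(**data)
    except ValidationError:
        return False
    return True


print(is_valid(domain="depth", date="2030-01-01"))    # True
print(is_valid(domain="height", date="2030-01-01"))   # False: not in the enum
print(is_valid(domain="depth", date="January 2030"))  # False: fails the regex
```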
Prefer to make fields required.
At the time of writing we have to annotate optional fields with Optional, but this term is slightly deceiving. As a schema this makes the field nullable, meaning that it is a required field that is either some type T or null. A truly optional field must be a union between two types: one with the optional field, and one without it.
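The nullable behaviour can be observed in the schema Pydantic generates. The sketch below assumes Pydantic v2; the models and the depth field are hypothetical. It contrasts an Optional field without a default with the same field given a default of None.

```python
from typing import Optional

from pydantic import BaseModel, Field


class Nullable(BaseModel):
    """Hypothetical model: depth may be null, but the key must be present."""

    depth: Optional[float] = Field(description="Hypothetical depth value.")


class NullableWithDefault(BaseModel):
    """Hypothetical model: depth may be null and defaults to None."""

    depth: Optional[float] = Field(default=None, description="Hypothetical depth value.")


nullable_schema = Nullable.model_json_schema()
defaulted_schema = NullableWithDefault.model_json_schema()

# Without a default the field is listed as required, even though it accepts null.
print(nullable_schema.get("required"))   # ['depth']
# With default=None the field is no longer listed as required.
print(defaulted_schema.get("required"))  # None
# In both cases the type is "number or null".
print(nullable_schema["properties"]["depth"]["anyOf"])
```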
Ensure Optional fields have default=None.
There are issues in the JSON schema that occur when an optional field is not given a default of None. A test should catch when you forget to do this, but you should remember to do it.
Apply numerical validation.
Many numerics have known ranges due to representing things physical and geometric. A cube cannot have a z depth of 0, and a thickness cannot be negative. (Or can it? 🧐) Pydantic Fields make this sort of validation simple to add; see its numeric constraints documentation.
Take union orderings as important.
Pydantic resolves union types using one of several modes. When you create a field with a union type, keep in mind that validating the union may be more nuanced than it appears, and that the order of the union members can affect the result.
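The last two tips can be sketched as follows, assuming Pydantic v2 (union_mode is a Field argument in recent v2 releases). All model and field names are hypothetical. The first model applies numeric constraints; the other two show how the same union can validate the same input differently depending on the union mode.

```python
from typing import Union

from pydantic import BaseModel, Field, ValidationError


class Cube(BaseModel):
    """Hypothetical model with numeric constraints."""

    z_depth: float = Field(gt=0, description="Hypothetical depth, strictly positive.")
    thickness: float = Field(ge=0, description="Hypothetical thickness, non-negative.")


class Smart(BaseModel):
    """Hypothetical model using the default ("smart") union mode."""

    value: Union[int, str] = Field(description="Hypothetical value.")


class LeftToRight(BaseModel):
    """Hypothetical model that tries union members strictly in order."""

    value: Union[int, str] = Field(
        union_mode="left_to_right",
        description="Hypothetical value.",
    )


try:
    Cube(z_depth=0, thickness=1.0)
except ValidationError as err:
    print(err.error_count())  # 1: z_depth must be greater than 0

# Smart mode keeps the string because it already matches str exactly.
print(repr(Smart(value="1").value))        # '1'
# Left-to-right mode tries int first and coerces the string.
print(repr(LeftToRight(value="1").value))  # 1
```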
Running the update script¶
To update schemas, use the tool included in fmu-dataio.
./tools/update-schemas
If any schemas have changed, this command will fail and alert you that a
version bump is required for said schemas. You can also include the --diff
flag to display what has changed.
Under normal circumstances this means you must update the schema version in
accordance with the schema versioning protocol.
Schema versions are located in the code, as a class variable constant VERSION in the schema class derived from SchemaBase. Changing a schema version is cause for a discussion among the team and possibly stakeholders.
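To make the shape of this concrete, here is a purely illustrative sketch. The SchemaBase below is a hypothetical stand-in, not the real fmu-dataio class, and the MAJOR.MINOR.PATCH version format and the bump helper are assumptions; consult the schema versioning protocol for the actual rules.

```python
from typing import ClassVar


class SchemaBase:
    """Hypothetical stand-in for the real base class (not the fmu-dataio one)."""

    VERSION: ClassVar[str]


class ExampleSchema(SchemaBase):
    """Hypothetical schema class carrying its version as a class constant."""

    VERSION: ClassVar[str] = "0.1.0"


def bump(version: str, part: str) -> str:
    """Return a bumped version string, assuming a MAJOR.MINOR.PATCH format."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


print(bump(ExampleSchema.VERSION, "minor"))  # 0.2.0
```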
Under some circumstances, you may need to force update the schemas with some changes. You can give the --force flag for this.
Preparing schemas for production
Getting schemas ready for production involves a few steps.
Check out the commit or tagged version from upstream/main with the correct schema changes:
git fetch upstream
git checkout 3.2.1  # for a tagged version
# or
git checkout 1a2b3c4  # for a particular commit
Branch off from this
git checkout -b schema-release-2030.01
Rebase onto the upstream/staging branch
git rebase upstream/staging
git log  # Ensure it looks right
Check that the schema URLs are changing for the production version
./tools/update-schemas --prod --diff
Update the schemas for production
./tools/update-schemas --prod --force
Build and inspect the documentation relevant for the schema locally and ensure the information (version, fields, etc.) is up to date. Append the changes that occurred to the changelog of each schema.
sphinx-build -b html docs/src build/docs/html -j auto
open build/docs/html/index.html  # in macOS
Commit and push to your fork. Create a PR that merges into the staging (!) branch, not the main branch.
git add schemas/ docs/
git commit -m "REL: Update schema X"
git push -u origin HEAD
Carefully look over the changes in the schema to ensure nothing looks out of order (typos, URLs pointing to dev rather than prod, etc.).
Once merged, this will create a new build on the staging Radix environment. Once tested and validated by all relevant parties, it can be promoted to the production environment.
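The manual check for URLs that still point to dev could be supplemented with a small script. The following is a minimal sketch using only the standard library. It assumes development URLs can be recognised by a substring; the marker and the example.com URLs below are hypothetical and should be replaced with whatever distinguishes the real environments.

```python
import json
from typing import Any, Iterator

# Hypothetical marker identifying a development URL.
DEV_MARKER = "schemas-dev."


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string value found anywhere in a parsed JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def find_dev_urls(schema: Any, marker: str = DEV_MARKER) -> list[str]:
    """Return the sorted, de-duplicated URLs in the schema containing the marker."""
    return sorted(
        {text for text in iter_strings(schema) if text.startswith("http") and marker in text}
    )


# Hypothetical schema fragment with one dev URL and one prod URL.
example = json.loads(
    """
    {
      "$id": "https://schemas-dev.example.com/x.json",
      "properties": {"a": {"$ref": "https://schemas.example.com/y.json"}}
    }
    """
)

print(find_dev_urls(example))  # ['https://schemas-dev.example.com/x.json']
```

An empty result for every schema file would suggest that no development URLs remain, though it does not replace reading the diff.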