Data Contracts are agreements that define how data is structured, formatted, and
managed when different parts of a Unified Namespace (UNS) architecture
communicate. They cover metadata, data models, and service levels to ensure that
all systems work together smoothly and reliably.
Simply put, data contracts specify where a message is going, the format it must
follow, how it’s delivered, and what happens when it arrives - all based on
agreed-upon rules and services. It is similar to an API: you send a specific message, and
it triggers a predefined action. For example, sending data to _historian automatically
stores it in TimescaleDB, just as a REST API's POST endpoint stores data in its database.
Example Historian
To give you a simple example, just think about the _historian schema. Perhaps
without realizing it, you have already used the Historian Data Contract by using
this schema.
Whenever you send a message to a topic that contains the _historian schema via
MQTT, you know that it will be bridged to Kafka and end up in TimescaleDB.
You could also send it directly into Kafka, and you know that it gets
bridged to MQTT as well.
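As a small illustration (the location fields and tag name here are made up), publishing the following payload via MQTT to a topic such as umh/v1/acme-corporation/cologne/_historian would have it bridged to Kafka and stored as a row in TimescaleDB:

```json
{
  "timestamp_ms": 1732280023697,
  "temperature": 23.5
}
```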
But you also have to follow the correct payload and topic structure that we at UMH have
defined. If there is an issue, such as a missing timestamp in the message, you can look it
up in the Management Console.
These rules ensure that the data can be written into the intended database
tables without causing errors, and that the data can be read by other programs,
as it is known what data and structure to expect.
For example, the timestamp is an easy way to avoid errors by making each message
idempotent (can be safely processed multiple times without changing the result).
Each data point associated with a tag is made completely unique by its timestamp, which is
critical because messages are sent using “at least once” semantics, which can
lead to duplicates. With idempotency, duplicate messages are ignored, ensuring
that each message is only stored once in the database.
If you want a lot more information and really dive into the reasons for this
approach, we recommend our article about
Data Modeling in the UNS
on our Learn page.
Rules of a Data Contract
Data Contracts can enforce a number of rules. This section provides an overview
of the two rules that are enforced by default. The specifics can vary between
Data Contracts; therefore, detailed information about the
Historian Data Contract
and Custom Data Contracts
is provided on their respective pages.
Topic Structure
As mentioned in the example, messages in the UMH must follow our ISA-95
compliant structure in order to be processed. The structure itself can be
divided into several sections.
You can check if your topics are correct in the validator below.
Topic validator
Prefix
The first section is the mandatory prefix: umh.v1. It ensures that the
structure can evolve over time without causing confusion or compatibility
problems.
Location
The next section is the Location, which consists of six parts:
enterprise.site.area.productionLine.workCell.originID.
You may be familiar with this structure as it is used by your instances and
connections. Here the enterprise field is mandatory.
When you create a Protocol Converter, it uses the Location of the instance and
the connection to prefill the topic, but you can add the unused parts or change
the prefilled ones.
Schemas
The schema, for example _historian, tells the UMH which data contract to
apply to the message. It is specified after the Location section and is
highlighted with an underscore to make it parsable for the UMH
and to clearly separate it from the location fields.
There is currently only one default schema in the UMH: _historian; for more
detailed information, see the
Historian Data Contract
page.
Depending on the schema used, the next parts of the topic may differ. For
example, in the _historian schema, you can either attach your payload
directly or continue to group tags.
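For example, both of the following are valid _historian topics (the location fields and the motor tag group are hypothetical); the first attaches the payload directly after the schema, the second adds a tag group:

```
umh.v1.acme-corporation._historian
umh.v1.acme-corporation.new-york.packaging.line-1._historian.motor
```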
Allowed Characters
Topics can consist of letters (a-z, A-Z), numbers (0-9), and the symbols - and _.
Note that _ cannot be used as the first character in the Location section.
Be careful to avoid ., +, #, or / as these are
special symbols in Kafka or MQTT.
Note that our topics are case-sensitive, so umh.v1.ACMEIncorporated is
not the same as umh.v1.acmeincorporated.
Payload Structure
A Data Contract can include payload rules. For example, in the Historian Data
Contract, you must include a timestamp in milliseconds and a key-value pair.
These requirements are unique to each Data Contract.
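For instance, under the Historian Data Contract a message without a timestamp_ms key is flagged as invalid, while a minimal payload like the second one below is accepted (the tag name is made up):

```
Invalid (missing timestamp_ms):
{ "temperature": 23.5 }

Valid:
{ "timestamp_ms": 1732280023697, "temperature": 23.5 }
```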
Components of a Data Contract
In addition to the rules, a Data Contract consists of individual components.
The specifics can vary between Data Contracts; therefore, detailed information
about the Historian Data Contract
and Custom Data Contracts
is provided on their respective pages.
Data Flow Components
As the name implies, a Data Flow Component manages the movement and
transformation of data within the Unified Namespace architecture.
Data Flow Components can be of three different types: Protocol Converter, Data
Bridge, or Custom Data Flow Component. All are based on
BenthosUMH.
Protocol Converter
You have probably already created a Protocol Converter and are familiar with
its purpose: get data from different sources into your instances. You format
the data into the correct payload structure and send it to the correct topics.
When you add a Protocol Converter, the Management Console uses the configuration
of the underlying Connection and instance to automatically generate most of the
configuration for the Protocol Converter.
Data Bridges
Data Bridges are placed between two components of the Unified Namespace, such as
Kafka and MQTT, and allow messages to be passed between them. The default Data
Bridges are the two between MQTT and Kafka for the _historian schema, and the
bridge between Kafka and the database. Each Data Bridge is unidirectional and
specific to one schema.
Custom Data Flow Components
To meet everyone’s needs and enable stream processing, you can add Custom Data
Flow Components (creative naming is our passion). Unlike Protocol Converters or
Data Bridges, you have full control over their configuration, which makes them
incredibly versatile, but also complicated to set up. Therefore, they must be
manually enabled by switching to Advanced Mode in the Management Console Settings.
Other Data Contracts
Data Contracts can build on existing contracts. For example, if you use a Custom
Data Contract to automatically calculate KPIs, you can send the raw data to
_historian, process it with a Custom Data Flow Component, and publish it to a
new schema. The new Data Contract uses the Historian to collect data from the
machines and store it in the database.
1 - Historian Data Contract
This page is a deep dive into the Historian Data Contract of the UMH, including the configuration and rules associated with it.
This section focuses on the specific details and configurations of the
Historian Data Contract. If you are not familiar with Data Contracts, you
should first read the
Data Contracts / API page.
Historian
The purpose of the Historian Data Contract is to govern the flow of data from
the Protocol Converter to the database.
It enforces rules for the structure of payloads and topics, and provides the
necessary infrastructure to bridge data in the Unified Namespace and write it
to the database.
This ensures that data is only stored in a format accepted by the database,
and makes it easier to integrate services like Grafana because the data
structure is already known.
It also ensures that each message is idempotent (can be safely processed
multiple times without changing the result), by making each message within a
tag completely unique by its timestamp.
This is critical because messages are sent using “at least once” semantics,
which can lead to duplicates.
With idempotency, duplicate messages are ignored, ensuring that each message
is only stored once in the database.
Topic Structure in the Historian Data Contract
The prefix and Location of the topic in the Historian Data Contract follow
the same rules as already described on the general
Data Contracts
page.
Prefix
The first section is the mandatory prefix: umh.v1.
It ensures that the structure can evolve over time without causing confusion or
compatibility problems.
Location
The next section is the Location, which consists of six parts:
enterprise.site.area.productionLine.workCell.originID.
You may be familiar with this structure as it is used by your instances and
connections. Here, the enterprise field is mandatory.
When you create a Protocol Converter, it uses the Location of the instance and
the connection to prefill the topic, but you can add the unused parts or change
the prefilled ones.
Schema: _historian
The only schema in the Historian Data Contract is _historian.
Without it, your messages will not be processed.
Tag groups
In addition to the Location, you can also use tag groups.
A tag group is simply an additional part of the topic that follows the schema.
In the tag browser, a tag group will look like any field in the Location, except
that it is located after the schema.
Tag groups can be useful for adding context to your tags or for keeping track
of them in the tag browser. For example, you might use them to categorize the
sensors on a CNC mill.
The Historian Data Contract requires that your messages be JSON documents with a
specific structure, containing a timestamp and at least one tag with a value,
both as key-value pairs. The most basic message looks like this:
{
  "timestamp_ms": 1732280023697,
  "tagname": 42
}
The timestamp must be called "timestamp_ms" and contain the timestamp in
milliseconds. The value of a tag can be either a number ("tagname": 123) or a
string ("tagname": "string"). The tag name is what appears in the tag browser
and in Grafana.
It is also possible to include multiple tags in a single payload.
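In that case, all tags share one timestamp; for example (with made-up tag names):

```json
{
  "timestamp_ms": 1732280023697,
  "temperature": 23.5,
  "pressure": 4.2
}
```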
The Historian Data Contract enables data acquisition and processing through the
use of Protocol Converters and the automatic deployment of three Data Bridges.
Data Bridges
There are three Data Bridges in the Historian Data Contract, which are
automatically created and configured when the instance is created.
The first bridge routes messages from Kafka to MQTT, the second from MQTT to Kafka.
The third Data Bridge bridges messages from Kafka to the TimescaleDB database.
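Schematically, the three default bridges can be pictured like this:

```
Kafka --(_historian)-->  MQTT
MQTT  --(_historian)-->  Kafka
Kafka --(_historian)-->  TimescaleDB
```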
The Data Bridges are responsible for validating the topic and payload, and
adding error logs in case a message is not valid.
Their configurations are not editable in the Management Console.
Protocol Converters
The easiest way to get data into your UNS is to use a Protocol Converter.
If you want to learn how to do this, you can follow our Get Started guide.
The configuration of a Protocol Converter consists of three sections:
Input: Here you specify the address, protocol used, authentication, and
the “location” of the data on the connected device. This could be the NodeID on
an OPC UA PLC.
Processing: In this section, you manipulate the data, build the
timestamped payload, and specify the topic.
Output: The output is completely auto-generated and cannot be modified.
The data is always sent to the instance’s Kafka broker.
Information specific to the selected protocol and section can be found by clicking on the vertical PROTOCOL CONVERTER button on the right edge of the window.
Verified Protocols
Our Protocol Converters are compatible with a long list of protocols.
The most important ones are considered verified by us; look for the check mark
next to the protocol name when selecting the protocol on the Edit Protocol
Converter page in the Management Console.
If you are using one of the verified protocols, many of the fields will be
populated automatically based on the underlying connection and instance.
The input section uses the address of the connection and adds prefixes and
suffixes as necessary. If you are using OPC UA, the username and password are
autofilled. The preconfigured processing section will use the location of the
instance and the connection to build the topic and use the name of the original
tag as the tag name. It will also automatically generate a payload with a
timestamp and the value of the incoming message.
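As a hypothetical example, for an instance whose Location is acme-corporation.cologne and an OPC UA tag named Temperature, the pre-generated configuration would publish a message roughly like the following to umh.v1.acme-corporation.cologne._historian:

```json
{
  "timestamp_ms": 1732280023697,
  "Temperature": 23.5
}
```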
If the preconfiguration does not meet your needs, you can change it.
Database
We use TimescaleDB as the database in the UMH. By default, only tags from the
Historian Data Contract are written to the database.
Our database for the Historian Data Contract consists of three tables. We chose
this layout to allow easy lookups based on the asset, while maintaining
separation between data and names. The separation into tag and tag_string
prevents accidental lookups of the wrong data type, which could break queries
such as aggregations or averages.
asset
An asset to us is the unique combination of the parts of the Location:
enterprise, site, area, line, workcell, and origin_id. Each asset
has an id that is automatically assigned.
All keys except id and enterprise are optional.
The example below shows how the table might look.
A new asset is added to the bottom of the table.
| id | enterprise | site | area | line | workcell | origin_id |
|----|------------|------|------|------|----------|-----------|
| 1 | acme-corporation | | | | | |
| 2 | acme-corporation | new-york | | | | |
| 3 | acme-corporation | london | north | assembly | | |
| 4 | stark-industries | berlin | south | fabrication | cell-a1 | 3002 |
| 5 | stark-industries | tokyo | east | testing | cell-b3 | 3005 |
| 6 | stark-industries | paris | west | packaging | cell-c2 | 3009 |
| 7 | umh | cologne | office | dev | server1 | sensor0 |
| 8 | cuttingincorporated | cologne | cnc-cutter | | | |
tag
This table is a Timescale hypertable. Hypertables are optimized to hold large amounts of data that are roughly sorted by time.
For example, we send data to umh/v1/cuttingincorporated/cologne/cnc-cutter/_historian/head using a JSON payload like the one below (the tag names and values are illustrative):
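```json
{
  "timestamp_ms": 1670001234567,
  "temperature": 23.5,
  "spindle_speed": 12000
}
```

This would result in rows like the following in the tag table:

| timestamp | name | origin | asset_id | value |
|-----------|------|--------|----------|-------|
| 1670001234567 | head.temperature | unknown | 8 | 23.5 |
| 1670001234567 | head.spindle_speed | unknown | 8 | 12000 |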
All tags have the same asset_id because each topic contains the same Location.
The tag groups are not part of the asset and are prefixed to the tag name.
The origin is a placeholder for a later feature, and currently defaults to unknown.
tag_string
This table is similar to the tag table, but is used for string data.
For example, a CNC cutter could also output the G-code being processed.
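Such a message could look like this (reconstructed from the table entry below):

```json
{
  "timestamp_ms": 1670001247568,
  "g-code": "G01 X10 Y10 Z0"
}
```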
Posting this message to the topic from above would result in this entry:
| timestamp | name | origin | asset_id | value |
|-----------|------|--------|----------|-------|
| 1670001247568 | g-code | unknown | 8 | G01 X10 Y10 Z0 |
2 - Custom Data Contracts
In addition to the standard data contracts provided, you can add your own.
This section focuses on Custom Data Contracts.
If you are not familiar with Data Contracts, you should first read the
Data Contracts / API page.
We are currently working on a blog post that will explain the concept of Custom Data Contracts in more detail.
Why Custom Data Contracts
The only Data Contract that exists by default in the UMH is the Historian Data Contract.
Custom Data Contracts let you add additional functionality to your UMH, such as automatically calculating KPIs or further processing of data.
Example of a custom Data Contract
One example of a Custom Data Contract is the automated interaction between an MES and PLCs.
Every time a machine stops, the latest order ID from the MES needs to be automatically written into the PLC.
We begin by utilizing the existing _historian data contract to continuously send and store the latest order ID from the MES in the UNS.
Additionally, a custom schema (for example, _action) is required to handle action requests and responses, enabling commands like writing data to the PLC.
The next step is to implement Protocol Converters to facilitate communication between systems.
For incoming messages, a Protocol Converter fetches the latest order ID from the MES and publishes it to the UNS using the _historian data contract.
For outgoing messages, another Protocol Converter listens for action requests in the manually added _action data contract and executes them by fetching the latest order ID from the UNS and writing it to the PLC.
Protocol Converters can be seen as an interface between the UMH and external systems.
Finally, we have to set up a Custom Data Flow Component as a stream processor that monitors the UNS for specific conditions, such as a machine stoppage. When such a condition is detected, it generates an action request in the _action data contract for the output protocol converter to process.
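As a sketch (the topic, the action field, and all values are assumptions for this example), such an action request could look like this:

```json
{
  "timestamp_ms": 1732280023697,
  "action": "write-order-id",
  "order_id": "ORD-4711"
}
```

The output Protocol Converter described above would pick up this request and write the order ID to the PLC.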
Additionally, we have to add Data Bridges for the _action schema.
In these you enforce a specific topic and payload structure.
The combination of the Historian Data Contract, the additional _action schema, the custom Data Bridges, the two Protocol Converters, the stream processor, and the enforced payload and topic structure forms this new Data Contract.
Topic Structure in Custom Data Contracts
The topic structure follows the same rules as specified on the Data Contracts / API page, up to the schema-dependent part.
That part depends on the configuration of your deployed custom Data Bridges.
Add custom schema
More information about custom schemas will be added here when the feature is ready to use.