Describes all available _schemas and their structure
Data Model (v1)
The Data Infrastructure of the UMH consists of three components: Connectivity, Unified Namespace, and Historian (see also Architecture). Each component has its own standards and best practices, so a consistent data model across multiple building blocks needs to combine all of them.
Incoming data is often unstructured, so our standard accepts either data conformant to our
_historian schema, or any kind of data in any other schema.
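As an illustration, a conformant _historian message could carry a flat JSON payload of tag values plus a timestamp. The topic hierarchy and field names below (`umh/v1/...`, `timestamp_ms`, `temperature`) are assumptions chosen for this sketch, not a normative definition of the schema:

```python
import json
import time

# Hypothetical topic following an ISA-95-style hierarchy, ending in the schema name.
# The exact hierarchy and field names here are illustrative assumptions.
topic = "umh/v1/acme/plant1/line4/_historian"

payload = {
    "timestamp_ms": int(time.time() * 1000),  # event time in milliseconds
    "temperature": 23.5,                      # one or more tag values
}

message = json.dumps(payload)
print(topic, message)
```

Any message that does not match this shape would simply travel under a different schema, untouched.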
Our key considerations were:
- Event-driven architecture: we only look at changes, reducing network and system load
- Ease of use: We allow any data in, allowing OT & IT to process it as they wish
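The event-driven point can be sketched as report-by-exception: a value is only forwarded when it differs from the last one seen. This is a minimal illustration of the idea, not UMH's actual implementation:

```python
# Minimal report-by-exception filter: forward a tag value only when it changed.
# Illustrative sketch only, not the actual UMH implementation.
_last_seen: dict[str, object] = {}

def on_new_value(tag: str, value: object) -> bool:
    """Return True (i.e. publish) only if the value differs from the last one."""
    if _last_seen.get(tag) == value:
        return False  # unchanged: suppress, saving network and system load
    _last_seen[tag] = value
    return True

# A mostly constant signal collapses to just its changes:
events = [v for v in [20.0, 20.0, 20.0, 21.5] if on_new_value("temperature", v)]
print(events)  # → [20.0, 21.5]
```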
The UNS employs MQTT and Kafka in a hybrid approach, utilizing MQTT for efficient data collection and Kafka for robust data processing. The UNS is designed to be reliable, scalable, and maintainable, facilitating real-time data processing and seamless integration or removal of system components.
These elements are the foundation for our data model in UNS:
Incoming data based on OT standards: data needs to be contextualized here not by IT people, but by OT people. They want to model their data (topic hierarchy and payloads) according to ISA-95, the Weihenstephaner Standard, Omron PackML, Euromap84, or similar standards, and need e.g. JSON as a payload to better understand it.
Hybrid architecture: combining MQTT's user-friendliness and widespread adoption in Operational Technology (OT) with Kafka's advanced processing capabilities. Topics and payloads cannot be fully interchanged between them due to limitations in MQTT and Kafka, so some trade-offs need to be made.
Processed data based on IT standards: after processing, data is sent to IT systems and needs to adhere to their standards: the data inside the UNS needs to be easily processable, either for contextualization or for storage in a Historian or Data Lake.
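One concrete example of such a trade-off is topic naming: MQTT separates hierarchy levels with `/`, while Kafka topic names only allow alphanumerics, `.`, `_`, and `-`. A bridge between the two therefore has to translate names, for example along these lines (an illustrative sketch, not the exact UMH mapping):

```python
import re

def mqtt_to_kafka_topic(mqtt_topic: str) -> str:
    """Translate an MQTT topic into a legal Kafka topic name.

    Kafka topic names only allow [a-zA-Z0-9._-], so hierarchy separators
    become dots and any other character is replaced. Illustrative sketch only.
    """
    kafka_topic = mqtt_topic.strip("/").replace("/", ".")
    return re.sub(r"[^a-zA-Z0-9._-]", "_", kafka_topic)

print(mqtt_to_kafka_topic("umh/v1/acme/plant1/line4/_historian"))
# → umh.v1.acme.plant1.line4._historian
```

Note that such a mapping is lossy in general (several MQTT topics could map to one Kafka topic), which is one reason topics cannot be interchanged fully between the two systems.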
We chose TimescaleDB as our primary database.
Key elements we considered:
- IT best practice: we use SQL and PostgreSQL for easy compatibility, and therefore TimescaleDB
- Straightforward queries: we aim for simple SQL queries, so that everyone can build dashboards
- Performance: because of the time-series nature of the data and the typical workload, the database layout is not fully optimized for usability; we made trade-offs that allow it to store millions of data points per second
Data in the _historian schema is stored in TimescaleDB and can be queried via SQL.
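To illustrate the "straightforward queries" goal, the sketch below emulates a simplified narrow tag table (timestamp, tag name, value) in SQLite and computes a per-minute average. The table layout is an assumption for illustration and not the real UMH schema; in TimescaleDB one would typically use `time_bucket()` instead of the `strftime` grouping shown here:

```python
import sqlite3

# In-memory stand-in for a simplified historian "tag" table.
# The real UMH schema differs; this layout is an assumption for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tag (timestamp INTEGER, name TEXT, value REAL)")
con.executemany(
    "INSERT INTO tag VALUES (?, ?, ?)",
    [
        (1700000000, "temperature", 20.0),
        (1700000030, "temperature", 22.0),  # same minute as above
        (1700000090, "temperature", 30.0),  # next minute
    ],
)

# Per-minute average; in TimescaleDB this would use
# time_bucket('1 minute', to_timestamp(timestamp)) instead of strftime.
rows = con.execute(
    """
    SELECT strftime('%Y-%m-%d %H:%M', timestamp, 'unixepoch') AS minute,
           avg(value) AS avg_value
    FROM tag
    WHERE name = 'temperature'
    GROUP BY minute
    ORDER BY minute
    """
).fetchall()
print(rows)
```

The point of the flat layout is that a dashboard query stays a one-liner: filter by tag name, group by a time bucket, aggregate.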