The Data Infrastructure of the UMH consists of three components: Connectivity, Unified Namespace, and Historian (see also Architecture). Each component has its own standards and best practices, so a consistent data model across multiple building blocks needs to combine all of them.
If you would like to learn more about our data model and ADRs, check out our learn article.
Connectivity
Incoming data is often unstructured. Our standard therefore allows either conformant data in our _historian schema or any kind of data in any other schema.
Our key considerations were:
Event-driven architecture: We only look at changes, reducing network and system load
Ease of use: We allow any data in, allowing OT & IT to process it as they wish
The UNS employs MQTT and Kafka in a hybrid approach, utilizing MQTT for efficient data collection and Kafka for robust data processing.
The UNS is designed to be reliable, scalable, and maintainable, facilitating real-time data processing and seamless integration or removal of system components.
These elements are the foundation for our data model in UNS:
Incoming data based on OT standards: Data needs to be contextualized here not by IT people, but by OT people.
They want to model their data (topic hierarchy and payloads) according to ISA-95, Weihenstephaner Standard, OMAC PackML, EUROMAP 84, or similar standards, and need, for example, JSON as the payload to understand it better.
Hybrid Architecture: Combining MQTT’s user-friendliness and widespread adoption in Operational Technology (OT) with Kafka’s advanced processing capabilities.
Topics and payloads cannot be fully interchanged between them due to limitations in MQTT and Kafka, so some trade-offs need to be made.
Processed data based on IT standards: Data is sent to IT systems after processing and needs to adhere to their standards. The data inside the UNS needs to be easy to process, whether for contextualization or for storing it in a Historian or Data Lake.
IT best practice: we use SQL and Postgres for easy compatibility, and therefore TimescaleDB
Straightforward queries: we aim for simple SQL queries, so that everyone can build dashboards
Performance: because of the time-series nature and typical workload, the database layout might not be fully optimized for usability, but we made some trade-offs that allow it to store millions of data points per second
1 - Unified Namespace
Describes all available _schema and their structure
Topic structure
Versioning Prefix
The umh/v1 at the beginning is obligatory. It ensures that the structure can evolve over time without causing confusion or compatibility issues.
Topic Names & Rules
All parts of this structure except for enterprise and _schema are optional.
They can consist of any letters (a-z, A-Z), numbers (0-9), and the symbols - and _.
Be careful to avoid ., +, #, or /, as these are special symbols in Kafka or MQTT.
Ensure that your topic always begins with umh/v1, otherwise our system will ignore your messages.
Be aware that our topics are case-sensitive; umh.v1.ACMEIncorperated is therefore not the same as umh.v1.acmeincorperated.
Throughout this documentation we use the MQTT syntax for topics (umh/v1). The corresponding Kafka topic names are the same, but with / replaced by .
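The rules above can be sketched as a small validator. This is a minimal illustration, not our official topic validator: the function names validate_topic and mqtt_to_kafka are hypothetical, and the limit of five optional levels between enterprise and _schema (site, area, productionLine, workCell, originID) is an assumption based on the level names used elsewhere in this documentation.

```python
import re

# Allowed characters per topic part: letters, digits, "-" and "_".
PART = re.compile(r"[A-Za-z0-9_-]+")
# Assumption: at most five optional levels between enterprise and _schema.
MAX_OPTIONAL_LEVELS = 5


def validate_topic(topic: str) -> bool:
    """Return True if an MQTT-style topic follows the rules described above."""
    parts = topic.split("/")
    # The versioning prefix umh/v1 is obligatory.
    if parts[:2] != ["umh", "v1"]:
        return False
    body = parts[2:]
    # enterprise and _schema are mandatory, so at least two more parts are needed.
    if len(body) < 2:
        return False
    # Rejects ".", "+", "#" and empty parts, among others.
    if not all(PART.fullmatch(p) for p in body):
        return False
    # Assumption: the first part after enterprise starting with "_" is the schema.
    schema_index = next(
        (i for i, p in enumerate(body) if i > 0 and p.startswith("_")), None
    )
    if schema_index is None:
        return False
    return schema_index - 1 <= MAX_OPTIONAL_LEVELS


def mqtt_to_kafka(topic: str) -> str:
    """Convert MQTT topic syntax to the corresponding Kafka topic name."""
    return topic.replace("/", ".")
```

With this sketch, validate_topic("umh/v1/acme/cologne/_historian") is accepted, while a topic that does not start with umh/v1 or that contains a . inside a part is rejected.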
Topic validator
OriginID
This part identifies where the data is coming from.
Good options include the sender's MAC address, hostname, or container ID.
Examples for originID: 00-80-41-ae-fd-7e, E588974, e5f484a1791d
Messages tagged with _analytics will be processed by our analytics pipeline.
They are used for the automatic calculation of KPIs and other statistics.
_local
This key may contain any data that you do not want to bridge to other nodes (it will, however, be MQTT-Kafka bridged on its own node).
For example, this could be data you want to pre-process on your local node and then put into another _schema.
This data does not necessarily have to be JSON.
Other
Any other schema, which starts with an underscore (for example: _images), will be forwarded by both MQTT-Kafka & Kafka-Kafka bridges but never processed or stored.
This data does not necessarily have to be JSON.
Converting other data models
Most data models already follow a location-based naming structure.
KKS Identification System for Power Stations
KKS (Kraftwerk-Kennzeichensystem) is a standardized system for identifying and classifying equipment and systems in power plants, particularly in German-speaking countries.
In a flow diagram, the designation is: 1 2LAC03 CT002 QT12
Level 0 Classification:
Block 1 of a power plant site is designated as 1 in this level.
Level 1 Classification:
The designation for the 3rd feedwater pump in the 2nd steam-water circuit is 2LAC03. This means:
Main group 2L: 2nd steam, water, gas circuit
Subgroup (2L)A: Feedwater system
Subgroup (2LA)C: Feedwater pump system
Counter (2LAC)03: third feedwater pump system
Level 2 Classification:
For the 2nd temperature measurement, the designation CT002 is used. This means:
Main group C: Direct measurement
Subgroup (C)T: Temperature measurement
Counter (CT)002: second temperature measurement
Level 3 Classification:
For the 12th immersion sleeve as a sensor protection, the designation QT12 is used. This means:
Main group Q: Control technology equipment
Subgroup (Q)T: Protective tubes and immersion sleeves as sensor protection
Counter (QT)12: twelfth protective tube or immersion sleeve
The above example refers to the 12th immersion sleeve at the 2nd temperature measurement of the 3rd feed pump in block 1 of a power plant site.
Translating this in our data model could result in:
umh/v1/nuclearCo/1/2LAC03/CT002/QT12/_schema
Where:
nuclearCo: Represents the enterprise or the name of the nuclear company.
1: Maps to the site, corresponding to Block 1 of the power plant as per the KKS number.
2LAC03: Fits into the area, representing the 3rd feedwater pump in the 2nd steam-water circuit.
CT002: Aligns with productionLine, indicating the 2nd temperature measurement in this context.
QT12: Serves as the workCell or originID, denoting the 12th immersion sleeve.
_schema: Placeholder for the specific data schema being applied.
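The mapping above can be sketched in a few lines. This is only an illustration under the assumption that the KKS designation is given as a whitespace-separated string, as in the flow diagram example; the helper name kks_to_topic is hypothetical.

```python
def kks_to_topic(enterprise: str, kks_designation: str, schema: str) -> str:
    """Build a umh/v1 topic from a whitespace-separated KKS designation."""
    if not schema.startswith("_"):
        raise ValueError("schema must start with an underscore")
    # Each KKS level (0 to 3) becomes one level of the topic hierarchy.
    levels = kks_designation.split()
    return "/".join(["umh", "v1", enterprise, *levels, schema])


print(kks_to_topic("nuclearCo", "1 2LAC03 CT002 QT12", "_historian"))
# prints: umh/v1/nuclearCo/1/2LAC03/CT002/QT12/_historian
```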
1.1 - _analytics
Messages for our analytics feature
Topic structure
Work Order
Create
Use this topic to create a new work order.
This replaces the addOrder message from our v0 data model.
Fields
external_work_order_id (string): The work order ID from your MES or ERP system.
product (object): The product being produced.
external_product_id (string): The product ID from your MES or ERP system.
cycle_time_ms (number) (optional): The cycle time for the product in milliseconds. Only include this if the product has not been previously created.
quantity (number): The quantity of the product to be produced.
status (number) (optional): The status of the work order. Defaults to 0 (planned).
0 - Planned
1 - In progress
2 - Completed
start_time_unix_ms (number) (optional): The start time of the work order. Will be set by the corresponding start message if not provided.
end_time_unix_ms (number) (optional): The end time of the work order. Will be set by the corresponding stop message if not provided.
start_time_unix_ms (number): The start time of the state.
end_time_unix_ms (number): The end time of the state.
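As an illustration of the work order fields listed above, the following sketch assembles a create message and leaves out optional fields that are not set. The function name build_work_order_create is hypothetical, and the example values are taken from the sample tables in this documentation; how you publish the resulting JSON is up to your MQTT or Kafka client.

```python
import json
from typing import Optional


def build_work_order_create(
    external_work_order_id: str,
    external_product_id: str,
    quantity: int,
    cycle_time_ms: Optional[float] = None,
    status: Optional[int] = None,
    start_time_unix_ms: Optional[int] = None,
    end_time_unix_ms: Optional[int] = None,
) -> str:
    """Serialize a work order create message, omitting unset optional fields."""
    if status is not None and status not in (0, 1, 2):
        raise ValueError("status must be 0 (planned), 1 (in progress) or 2 (completed)")
    product: dict = {"external_product_id": external_product_id}
    if cycle_time_ms is not None:
        # Only needed if the product has not been created before.
        product["cycle_time_ms"] = cycle_time_ms
    message: dict = {
        "external_work_order_id": external_work_order_id,
        "product": product,
        "quantity": quantity,
    }
    optional = {
        "status": status,
        "start_time_unix_ms": start_time_unix_ms,
        "end_time_unix_ms": end_time_unix_ms,
    }
    message.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(message)


payload = build_work_order_create("#2475", "desk-leg-0112", 100)
```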
1.2 - _historian
Messages for our historian feature
Topic structure
Message structure
Our _historian messages are JSON containing a Unix timestamp in milliseconds (timestamp_ms) and one or more key-value pairs.
Each key-value pair will be inserted into the database at the given timestamp.
If you use a boolean value, it will be interpreted as a number.
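A message of this shape could be assembled as follows. This is a minimal sketch: the helper name build_historian_payload and the example key x are hypothetical, and the current system time is used when no timestamp is passed.

```python
import json
import time
from typing import Optional


def build_historian_payload(values: dict, timestamp_ms: Optional[int] = None) -> str:
    """Build a _historian JSON message from one or more key-value pairs."""
    if not values:
        raise ValueError("at least one key-value pair is required")
    if "timestamp_ms" in values:
        raise ValueError("timestamp_ms is reserved for the timestamp")
    if timestamp_ms is None:
        # Unix timestamp in milliseconds.
        timestamp_ms = int(time.time() * 1000)
    return json.dumps({"timestamp_ms": timestamp_ms, **values})


message = build_historian_payload({"x": 12.5}, timestamp_ms=1670001234567)
```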
Tag grouping
Sometimes it makes sense to further group data together.
In the following example we have a CNC cutter emitting data about its head position.
If we want to group this for easier access in Grafana, we could use two types of grouping.
Using Tags / Tag Groups in the Topic:
This will result in 3 new database entries, grouped by head & pos.
This can be useful if we also want to monitor the cutter head temperature and other attributes, while still preserving most of the readability of the above method.
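To make the grouping idea more concrete, here is a sketch that combines tag groups from the topic with nested JSON objects in the payload. It rests on several assumptions: the example topic ending in /_historian/head/pos, the keys x, y, z and temperature, and the function name flatten_tags are all hypothetical, and the tags are returned as tuples of path components because the naming of the resulting database entries is not specified here.

```python
def flatten_tags(topic: str, payload: dict) -> dict:
    """Map each value to its full tag path (topic groups plus nested payload keys)."""
    parts = topic.split("/")
    # Everything after the _historian part of the topic is treated as a tag group.
    groups = tuple(parts[parts.index("_historian") + 1:])
    values = {k: v for k, v in payload.items() if k != "timestamp_ms"}
    flat = {}

    def walk(prefix: tuple, obj: dict) -> None:
        for key, value in obj.items():
            if isinstance(value, dict):
                # A nested JSON object acts as a further tag group.
                walk(prefix + (key,), value)
            else:
                flat[prefix + (key,)] = value

    walk(groups, values)
    return flat


tags = flatten_tags(
    "umh/v1/cuttingincorperated/cologne/cnc-cutter/_historian/head/pos",
    {"timestamp_ms": 1670001234567, "x": 12.5, "y": 7.3, "z": 5.0},
)
# tags has three entries, all grouped under ("head", "pos").
```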
This function is an optimized version of get_asset_id that is defined as immutable.
It is the fastest of the three functions and should be used for all queries, except when you plan to manually modify values inside the asset table.
This function returns the id of the given asset.
It takes a variable number of arguments, where only the first (enterprise) is mandatory.
This function is only kept for compatibility reasons and should not be used in new queries; see get_asset_id_stable or get_asset_id_immutable instead.
There is no immutable version of get_asset_ids, as the returned values will probably change over time.
[Legacy] get_asset_ids
This function returns the ids of the given assets.
It takes a variable number of arguments, where only the first (enterprise) is mandatory.
It is only kept for compatibility reasons and should not be used in new queries; see get_asset_ids_stable instead.
This table holds all assets.
An asset for us is the unique combination of enterprise, site, area, line, workcell & origin_id.
All keys except for id and enterprise are optional.
In our example we have just started our CNC cutter, so its unique asset will be inserted into the database.
The table already contains some data we inserted before, so the new asset will be inserted at id: 8
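| id | enterprise | site | area | line | workcell | origin_id |
|----|------------|------|------|------|----------|-----------|
| 1 | acme-corporation | | | | | |
| 2 | acme-corporation | new-york | | | | |
| 3 | acme-corporation | london | north | assembly | | |
| 4 | stark-industries | berlin | south | fabrication | cell-a1 | 3002 |
| 5 | stark-industries | tokyo | east | testing | cell-b3 | 3005 |
| 6 | stark-industries | paris | west | packaging | cell-c2 | 3009 |
| 7 | umh | cologne | office | dev | server1 | sensor0 |
| 8 | cuttingincoperated | cologne | cnc-cutter | | | |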
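The way an asset id is tied to the unique combination of its keys can be sketched with an in-memory stand-in for the asset table. This is not how the database is implemented; the class AssetRegistry and its get_or_create method are hypothetical and only illustrate the uniqueness rule, with enterprise as the single mandatory key.

```python
from typing import Optional


class AssetRegistry:
    """In-memory stand-in for the asset table (not the real database)."""

    def __init__(self) -> None:
        self._ids: dict = {}
        self._next_id = 1

    def get_or_create(
        self,
        enterprise: str,
        site: Optional[str] = None,
        area: Optional[str] = None,
        line: Optional[str] = None,
        workcell: Optional[str] = None,
        origin_id: Optional[str] = None,
    ) -> int:
        """Return the id for this key combination, creating it if it is new."""
        if not enterprise:
            raise ValueError("enterprise is mandatory")
        key = (enterprise, site, area, line, workcell, origin_id)
        if key not in self._ids:
            self._ids[key] = self._next_id
            self._next_id += 1
        return self._ids[key]


registry = AssetRegistry()
registry.get_or_create("acme-corporation")
registry.get_or_create("acme-corporation", "new-york")
registry.get_or_create("acme-corporation", "london", "north", "assembly")
registry.get_or_create("stark-industries", "berlin", "south", "fabrication", "cell-a1", "3002")
registry.get_or_create("stark-industries", "tokyo", "east", "testing", "cell-b3", "3005")
registry.get_or_create("stark-industries", "paris", "west", "packaging", "cell-c2", "3009")
registry.get_or_create("umh", "cologne", "office", "dev", "server1", "sensor0")
cutter_id = registry.get_or_create("cuttingincoperated", "cologne", "cnc-cutter")
```

After the seven existing assets, the CNC cutter receives id 8, and asking for the same combination again returns the same id.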
work_order
This table holds all work orders.
A work order is a unique combination of external_work_order_id and asset_id.
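| work_order_id | external_work_order_id | asset_id | product_type_id | quantity | status | start_time | end_time |
|---------------|------------------------|----------|-----------------|----------|--------|------------|----------|
| 1 | #2475 | 8 | 1 | 100 | 0 | 2022-01-01T08:00:00Z | 2022-01-01T18:00:00Z |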
product_type
This table holds all product types.
A product type is a unique combination of external_product_type_id and asset_id.
| product_type_id | external_product_type_id | cycle_time_ms | asset_id |
|-----------------|--------------------------|---------------|----------|
| 1 | desk-leg-0112 | 10.0 | 8 |
product
This table holds all products.
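| product_type_id | product_batch_id | asset_id | start_time | end_time | quantity | bad_quantity |
|-----------------|------------------|----------|------------|----------|----------|--------------|
| 1 | batch-n113 | 8 | 2022-01-01T08:00:00Z | 2022-01-01T08:10:00Z | 100 | 7 |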
shift
This table holds all shifts.
A shift is a unique combination of asset_id and start_time.
| shiftId | asset_id | start_time | end_time |
|---------|----------|------------|----------|
| 1 | 8 | 2022-01-01T08:00:00Z | 2022-01-01T19:00:00Z |
state
This table holds all states.
A state is a unique combination of asset_id and start_time.
| asset_id | start_time | state |
|----------|------------|-------|
| 8 | 2022-01-01T08:00:00Z | 20000 |
| 8 | 2022-01-01T08:10:00Z | 10000 |
2.2 - Historian
How _historian data is stored and can be queried
Our database for the umh.v1 _historian data model currently consists of three tables, which are used for the _historian schema.
We chose this layout to enable easy lookups based on the asset features, while maintaining separation between data and names.
The split into tag & tag_string prevents accidental lookups of the wrong datatype, which might break queries such as aggregations, averages, …
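The idea behind the split can be sketched as follows: numeric values go to one collection and strings to another, so an average over numeric tags never encounters a string. This is an illustrative sketch, not our actual ingestion code; the function name split_by_table is hypothetical, and mapping true to 1 and false to 0 is an assumption about how booleans are interpreted as numbers.

```python
def split_by_table(payload: dict) -> tuple:
    """Split a _historian payload into values for tag and for tag_string."""
    numeric: dict = {}
    strings: dict = {}
    for key, value in payload.items():
        if key == "timestamp_ms":
            continue
        if isinstance(value, bool):
            # Assumption: booleans are stored as 1 (true) or 0 (false).
            numeric[key] = float(value)
        elif isinstance(value, (int, float)):
            numeric[key] = float(value)
        elif isinstance(value, str):
            strings[key] = value
        else:
            raise TypeError(f"unsupported value type for {key!r}")
    return numeric, strings


tag_values, tag_string_values = split_by_table(
    {"timestamp_ms": 1670001234567, "x": 12.5, "running": True, "gcode": "G01 X10"}
)
```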
asset
This table holds all assets.
An asset for us is the unique combination of enterprise, site, area, line, workcell & origin_id.
All keys except for id and enterprise are optional.
In our example we have just started our CNC cutter, so its unique asset will be inserted into the database.
The table already contains some data we inserted before, so the new asset will be inserted at id: 8
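| id | enterprise | site | area | line | workcell | origin_id |
|----|------------|------|------|------|----------|-----------|
| 1 | acme-corporation | | | | | |
| 2 | acme-corporation | new-york | | | | |
| 3 | acme-corporation | london | north | assembly | | |
| 4 | stark-industries | berlin | south | fabrication | cell-a1 | 3002 |
| 5 | stark-industries | tokyo | east | testing | cell-b3 | 3005 |
| 6 | stark-industries | paris | west | packaging | cell-c2 | 3009 |
| 7 | umh | cologne | office | dev | server1 | sensor0 |
| 8 | cuttingincoperated | cologne | cnc-cutter | | | |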
tag
This table is a TimescaleDB hypertable.
Hypertables are optimized to hold large amounts of data that is roughly sorted by time.
In our example we send data to umh/v1/cuttingincorperated/cologne/cnc-cutter/_historian/head using the following JSON:
The origin is a placeholder for a later feature, and currently defaults to unknown.
tag_string
This table is the same as tag, but for string data.
Our CNC cutter also emits the G-Code currently processed.
umh/v1/cuttingincorperated/cologne/cnc-cutter/_historian
Unknown (30000-59999): These states represent that the asset is in an unspecified state.
Glossary
OEE: Overall Equipment Effectiveness
KPI: Key Performance Indicator
Conclusion
This documentation provides a comprehensive overview of the states used in the United Manufacturing Hub software stack and their respective categories. For more information on each state category and its individual states, please refer to the corresponding subpages.
3.1 - Active (10000-29999)
These states represent that the asset is actively producing
10000: ProducingAtFullSpeedState
This asset is running at full speed.
Examples for ProducingAtFullSpeedState
WS_Cur_State: Operating
PackML/Tobacco: Execute
20000: ProducingAtLowerThanFullSpeedState
Asset is producing, but not at full speed.
Examples for ProducingAtLowerThanFullSpeedState
WS_Cur_Prog: StartUp
WS_Cur_Prog: RunDown
WS_Cur_State: Stopping
PackML/Tobacco: Stopping
WS_Cur_State: Aborting
PackML/Tobacco: Aborting
WS_Cur_State: Holding
WS_Cur_State: Unholding
PackML/Tobacco: Unholding
WS_Cur_State: Suspending
PackML/Tobacco: Suspending
WS_Cur_State: Unsuspending
PackML/Tobacco: Unsuspending
PackML/Tobacco: Completing
WS_Cur_Prog: Production
EUROMAP: MANUAL_RUN
EUROMAP: CONTROLLED_RUN
Currently not included:
WS_Prog_Step: all
3.2 - Unknown (30000-59999)
These states represent that the asset is in an unspecified state
30000: UnknownState
Data for that particular asset is not available (e.g. connection to the PLC is disrupted)
Examples for UnknownState
WS_Cur_Prog: Undefined
EUROMAP: Offline
40000: UnspecifiedStopState
The asset is not producing, but the reason is unknown at the time.
Examples for UnspecifiedStopState
WS_Cur_State: Clearing
PackML/Tobacco: Clearing
WS_Cur_State: Emergency Stop
WS_Cur_State: Resetting
PackML/Tobacco: Resetting
WS_Cur_State: Held
EUROMAP: Idle
Tobacco: Other
WS_Cur_State: Stopped
PackML/Tobacco: Stopped
WS_Cur_State: Starting
PackML/Tobacco: Starting
WS_Cur_State: Prepared
WS_Cur_State: Idle
PackML/Tobacco: Idle
PackML/Tobacco: Complete
EUROMAP: READY_TO_RUN
50000: MicrostopState
The asset is not producing for a short period (typically around five minutes), but the reason is unknown at the time.
3.3 - Material (60000-99999)
These states represent that the asset has issues regarding materials.
60000: InletJamState
The machine does not perform its intended function due to a lack of material flow in the infeed of the machine, detected by the sensor system of the control system (machine stop). In the case of machines that have several inlets, the condition of lack in the inlet refers to the main flow, i.e. to the material (crate, bottle) that is fed in the direction of the filling machine (central machine). The defect in the infeed is an external fault, but because of its importance for visualization and technical reporting, it is recorded separately.
Examples for InletJamState
WS_Cur_State: Lack
70000: OutletJamState
The machine does not perform its intended function as a result of a jam in the good flow discharge of the machine, detected by the sensor system of the control system (machine stop). In the case of machines that have several discharges, the jam in the discharge condition refers to the main flow, i.e. to the good (crate, bottle) that is fed in the direction of the filling machine (central machine) or is fed away from the filling machine. The jam in the outfeed is an external fault, but it is recorded separately because of its importance for visualization and technical reporting.
Examples for OutletJamState
WS_Cur_State: Tailback
80000: CongestionBypassState
The machine does not perform its intended function due to a shortage in the bypass supply or a jam in the bypass discharge of the machine, detected by the sensor system of the control system (machine stop). This condition can only occur in machines with two outlets or inlets and in which the bypass is in turn the inlet or outlet of an upstream or downstream machine of the filling line (packaging and palletizing machines). The jam/shortage in the auxiliary flow is an external fault, but it is recorded separately due to its importance for visualization and technical reporting.
Examples for CongestionBypassState
WS_Cur_State: Lack/Tailback Branch Line
90000: MaterialIssueOtherState
The asset has a material issue, but it is not further specified.
Examples for MaterialIssueOtherState
WS_Mat_Ready (Information of which material is lacking)
PackML/Tobacco: Suspended
3.4 - Process (100000-139999)
These states represent that the asset is in a stop, which belongs to the process and cannot be avoided.
100000: ChangeoverState
The asset is in a changeover process between products.
Examples for ChangeoverState
WS_Cur_Prog: Program-Changeover
Tobacco: CHANGE OVER
110000: CleaningState
The asset is currently in a cleaning process.
Examples for CleaningState
WS_Cur_Prog: Program-Cleaning
Tobacco: CLEAN
120000: EmptyingState
The asset is currently being emptied, e.g. to prevent mold in food products over long breaks such as the weekend.
Examples for EmptyingState
Tobacco: EMPTY OUT
130000: SettingUpState
This machine is currently preparing itself for production, e.g. heating up.
Examples for SettingUpState
EUROMAP: PREPARING
3.5 - Operator (140000-159999)
These states represent that the asset is stopped because of operator related issues.
140000: OperatorNotAtMachineState
The operator is not at the machine.
150000: OperatorBreakState
The operator is taking a break.
This is different from a planned shift as it could contribute to performance losses.
Examples for OperatorBreakState
WS_Cur_Prog: Program-Break
3.6 - Planning (160000-179999)
These states represent that the asset is stopped because the stop is planned (planned idle time).
160000: NoShiftState
There is no shift planned for that asset.
170000: NoOrderState
There is no order planned for that asset.
3.7 - Technical (180000-229999)
These states represent that the asset has a technical issue.
180000: EquipmentFailureState
The asset itself is defective, e.g. a broken engine.
Examples for EquipmentFailureState
WS_Cur_State: Equipment Failure
190000: ExternalFailureState
There is an external failure, e.g. missing compressed air.
Examples for ExternalFailureState
WS_Cur_State: External Failure
200000: ExternalInterferenceState
There is an external interference, e.g. the crane to move the material is currently unavailable.
210000: PreventiveMaintenanceStop
A planned maintenance action.
Examples for PreventiveMaintenanceStop
WS_Cur_Prog: Program-Maintenance
PackML: Maintenance
EUROMAP: MAINTENANCE
Tobacco: MAINTENANCE
220000: TechnicalOtherStop
The asset has a technical issue, but it is not specified further.
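The category ranges used in the section headings above can be sketched as a simple lookup, which may be handy when grouping states in dashboards. The names STATE_CATEGORIES and state_category are hypothetical; the ranges themselves are the ones given in the headings.

```python
# (lower bound, upper bound, category), taken from the section headings above.
STATE_CATEGORIES = [
    (10000, 29999, "Active"),
    (30000, 59999, "Unknown"),
    (60000, 99999, "Material"),
    (100000, 139999, "Process"),
    (140000, 159999, "Operator"),
    (160000, 179999, "Planning"),
    (180000, 229999, "Technical"),
]


def state_category(state: int) -> str:
    """Return the category name for a numeric state code."""
    for low, high, name in STATE_CATEGORIES:
        if low <= state <= high:
            return name
    raise ValueError(f"state {state} is outside all documented ranges")


print(state_category(20000))   # prints: Active
print(state_category(170000))  # prints: Planning
```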