The United Manufacturing Hub’s Data Infrastructure is where all data converges.
It extends the ISA-95 Automation Pyramid, the usual model for data flow in factory
settings. This infrastructure links each level of the traditional pyramid to the
Unified Namespace (UNS), incorporating extra data sources that the typical automation
pyramid doesn’t include. The data is then organized, stored, and analyzed to offer
useful information for frontline workers. Afterwards, it can be sent to a data
lake or analytics platform, where business analysts can access it for deeper insights.
It comprises three primary elements:
Data Connectivity:
An array of tools and services designed to connect various systems and sensors
on the shop floor, facilitating the flow of data into the Unified Namespace.
Unified Namespace:
Acts as the central hub for all events and messages on the shop floor, ensuring
data consistency and accessibility.
Historian:
Stores events in a time-series database and provides tools for data
visualization, enabling both real-time and historical analytics.
Together, these elements provide a comprehensive framework for collecting,
storing, and analyzing data, enhancing the operational efficiency and
decision-making processes on the shop floor.
1 - Data Connectivity
Learn about the tools and services in UMH’s Data Connectivity for integrating
shop floor systems.
The Data Connectivity module in the United Manufacturing Hub is designed to enable
seamless integration of various data sources from the manufacturing environment
into the Unified Namespace. Key components include:
Node-RED:
A versatile programming tool that links hardware devices, APIs, and online services.
barcodereader:
Connects to USB barcode readers, pushing data to the message broker.
benthos-umh: A specialized version of benthos featuring an OPC UA plugin for
efficient data extraction.
sensorconnect:
Integrates with IO-Link Masters and their sensors, relaying data to the message broker.
These tools collectively facilitate the extraction and contextualization of data
from diverse sources, adhering to the ISA-95 automation pyramid model, and
enhancing the Management Console’s capability to monitor and manage data flow
within the UMH ecosystem.
1.1 - Barcodereader
This microservice is still in development and is not considered stable for production use.
Barcodereader is a microservice that reads barcodes and sends the data to the Kafka broker.
How it works
Connect a barcode scanner to the system and the microservice will read the barcodes and send the data to the Kafka broker.
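Most USB barcode scanners present themselves as keyboards, emitting the barcode characters followed by Enter. A minimal sketch of turning one raw scan line into a broker-ready payload (the helper and field names are illustrative, not the microservice's actual code):

```python
import json
import time

def scan_to_message(raw_line: str, asset_id: str = "barcodereader") -> dict:
    """Strip the trailing newline a scanner sends and wrap the
    barcode in a payload with a millisecond timestamp."""
    barcode = raw_line.rstrip("\r\n")
    return {
        "asset_id": asset_id,
        "barcode": barcode,
        "timestamp_ms": int(time.time() * 1000),
    }

# The payload would then be serialized and produced to the Kafka broker:
msg = scan_to_message("4006381333931\n")
payload = json.dumps(msg)
```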
What’s next
Read the Barcodereader reference
documentation to learn more about the technical details of the Barcodereader
microservice.
1.2 - Node Red
Node-RED is a programming tool for wiring together
hardware devices, APIs and online services in new and interesting ways. It
provides a browser-based editor that makes it easy to wire together flows using
the wide range of nodes in the Node-RED library.
How it works
Node-RED is a JavaScript-based tool that can be used to create flows that
interact with the other microservices in the United Manufacturing Hub or
external services.
What’s next
Read the Node-RED reference
documentation to learn more about the technical details of the Node-RED
microservice.
1.3 - Sensorconnect
Sensorconnect automatically detects ifm gateways
connected to the network and reads data from the connected IO-Link
sensors.
How it works
Sensorconnect continuously scans the given IP range for gateways, making it
effectively a plug-and-play solution. Once a gateway is found, it automatically
downloads the IODD files for the connected sensors and starts reading the data at
the configured interval. It then processes the data and sends it to the MQTT or
Kafka broker, to be consumed by other microservices.
If you want to learn more about how to use sensors in your assets, check out the
retrofitting section of the UMH Learn
website.
IODD files
The IODD files are used to describe the sensors connected to the gateway. They
contain information about the data type, the unit of measurement, the minimum and
maximum values, etc. The IODD files are downloaded automatically from
IODDFinder once a sensor is found, and are
stored in a Persistent Volume. If downloading from the internet is not possible,
for example in a closed network, you can download the IODD files manually and
store them in the folder specified by the IODD_FILE_PATH environment variable.
If no IODD file is found for a sensor, the data will not be processed, but sent
to the broker as-is.
What’s next
Read the Sensorconnect reference
documentation to learn more about the technical details of the Sensorconnect
microservice.
2 - Unified Namespace
Discover the Unified Namespace’s role as a central hub for shop floor data in
UMH.
The Unified Namespace (UNS) within the United Manufacturing Hub is a vital module
facilitating the streamlined flow and management of data. It comprises various
microservices:
data-bridge:
Bridges data between MQTT and Kafka and between multiple Kafka instances, ensuring
efficient data transmission.
HiveMQ:
An MQTT broker crucial for receiving data from IoT devices on the shop floor.
Redpanda (Kafka):
Manages large-scale data processing and orchestrates communication between microservices.
Redpanda Console:
Offers a graphical interface for monitoring Kafka topics and messages.
The UNS serves as a pivotal point in the UMH architecture, ensuring data from shop
floor systems and sensors (gathered via the Data Connectivity module) is effectively
processed and relayed to the Historian and external Data Warehouses/Data Lakes
for storage and analysis.
2.1 - Data Bridge
Data-bridge is a microservice specifically tailored to adhere to the
UNS
data model. It consumes topics from a message broker, translates them to
the proper format and publishes them to the other message broker.
How it works
Data-bridge connects to the source broker, which can be either Kafka or MQTT,
and subscribes to the topics specified in the configuration. It then processes
the messages and publishes them to the destination broker, which can also be
either Kafka or MQTT.
In the case where the destination broker is Kafka, messages from multiple topics
can be merged into a single topic, making use of the message key to identify
the source topic.
For example, subscribing to a topic using a wildcard, such as
umh.v1.acme.anytown..*, and a merge point of 4, will result in
messages from the topics umh.v1.acme.anytown.foo.bar,
umh.v1.acme.anytown.foo.baz, umh.v1.acme.anytown and umh.v1.acme.anytown.frob
being merged into a single topic, umh.v1.acme.anytown, with the message key
being the missing part of the topic name, in this case foo.bar, foo.baz, etc.
Here is a diagram showing the flow of messages:
The value of the message is not changed, only the topic and key are modified.
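The merge-point rule above can be sketched as a pure function: the first merge_point segments of the topic become the destination topic, and whatever remains becomes the message key (the helper name is illustrative, not data-bridge's actual code):

```python
def split_at_merge_point(topic: str, merge_point: int) -> tuple[str, str]:
    """Split a UNS topic: the first `merge_point` segments form the
    destination topic, the remainder becomes the message key."""
    parts = topic.split(".")
    merged = ".".join(parts[:merge_point])
    key = ".".join(parts[merge_point:])
    return merged, key

print(split_at_merge_point("umh.v1.acme.anytown.foo.bar", 4))
# → ('umh.v1.acme.anytown', 'foo.bar')
```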
It is also possible to configure multiple data bridges, each with its own
source and destination brokers and its own set of topics to subscribe to and
merge point.
The brokers can be local or remote, and, in case of MQTT, they can be secured
using TLS.
What’s next
Read the Data Bridge reference
documentation to learn more about the technical details of the data-bridge microservice.
2.2 - Kafka Broker
The Kafka broker in the United Manufacturing Hub is Redpanda,
a Kafka-compatible event streaming platform. It’s used to store and process
messages, in order to stream real-time data between the microservices.
How it works
Redpanda is a distributed system that is made up of a cluster of brokers,
designed for maximum performance and reliability. It does not depend on external
systems like ZooKeeper, as it’s shipped as a single binary.
What’s next
Read the Kafka Broker reference documentation
to learn more about the technical details of the Kafka Broker microservice.
2.3 - Kafka Console
Kafka-console uses Redpanda Console
to help you manage and debug your Kafka workloads effortlessly.
With it, you can explore your Kafka topics, view messages, list the active
consumers, and more.
How it works
You can access the Kafka console via its Service.
It’s automatically connected to the Kafka broker, so you can start using it
right away.
You can view the Kafka broker configuration in the Broker tab, and explore the
topics in the Topics tab.
What’s next
Read the Kafka Console reference documentation
to learn more about the technical details of the Kafka Console microservice.
2.4 - MQTT Broker
The MQTT broker in the United Manufacturing Hub is HiveMQ
and is customized to fit the needs of the stack. It’s a core component of
the stack and is used to communicate between the different microservices.
How it works
The MQTT broker is responsible for receiving MQTT messages from the
different microservices and forwarding them to the
MQTT Kafka bridge.
What’s next
Read the MQTT Broker reference documentation
to learn more about the technical details of the MQTT Broker microservice.
3 - Historian
Insight into the Historian’s role in storing and visualizing data within the
UMH ecosystem.
The Historian in the United Manufacturing Hub serves as a comprehensive data
management and visualization system. It includes:
kafka-to-postgresql-v2:
Archives Kafka messages adhering to the Data Model V2 schema into the database.
TimescaleDB:
An open-source SQL database specialized in time-series data storage.
Grafana:
A software tool for data visualization and analytics.
factoryinsight:
An analytics tool designed for data analysis, including calculating operational efficiency metrics like OEE.
Redis:
Utilized as an in-memory data structure store for caching purposes.
This structure ensures that data from the Unified Namespace is systematically
stored, processed, and made visually accessible, providing OT professionals with
real-time insights and analytics on shop floor operations.
3.1 - Cache
The cache in the United Manufacturing Hub is Redis, a
key-value store that is used as a cache for the other microservices.
How it works
Recently used data is stored in the cache to reduce the load on the database.
All the microservices that need to access the database will first check if the
data is available in the cache. If it is, it will be used, otherwise the
microservice will query the database and store the result in the cache.
By default, Redis is configured to run in standalone mode, which means that it
will only have one master node.
What’s next
Read the Cache reference documentation
to learn more about the technical details of the cache microservice.
3.2 - Database
The database microservice is the central component of the United Manufacturing
Hub and is based on TimescaleDB, an open-source relational database built for
handling time-series data. TimescaleDB is designed to provide scalable and
efficient storage, processing, and analysis of time-series data.
You can find more information on the datamodel of the database in the
Data Model section, and read
about the choice to use TimescaleDB in the
blog article.
How it works
When deployed, the database microservice will create two databases, with the
related usernames and passwords:
grafana: This database is used by Grafana to store the dashboards and
other data.
factoryinsight: This database is the main database of the United Manufacturing
Hub. It contains all the data that is collected by the microservices.
Read the Database reference documentation
to learn more about the technical details of the database microservice.
3.3 - Factoryinsight
Factoryinsight is a microservice that provides a set of REST APIs to access the
data from the database. It is particularly useful to calculate the Key
Performance Indicators (KPIs) of the factories.
How it works
Factoryinsight exposes REST APIs to access the data from the database or calculate
the KPIs. By default, it’s only accessible from the internal network of the
cluster, but it can be configured to be
accessible from the external network.
The APIs require authentication, which can be either Basic Auth or a Bearer
token. Both can be found in the Secret factoryinsight-secret.
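Building the Authorization header for a Basic Auth call can be sketched as below; the endpoint path in the comment is hypothetical, and the real credentials live in the factoryinsight-secret Secret:

```python
import base64

def basic_auth_header(user: str, password: str) -> dict[str, str]:
    """Encode user:password as an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}

headers = basic_auth_header("factoryinsight", "example-password")
# A request would then look like (hypothetical endpoint path):
# requests.get(
#     "http://united-manufacturing-hub-factoryinsight-service/api/v2/treeStructure",
#     headers=headers,
# )
```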
What’s next
Read the Factoryinsight reference documentation
to learn more about the technical details of the Factoryinsight microservice.
3.4 - Grafana
The grafana microservice is a web application that provides visualization and
analytics capabilities. Grafana allows you to query, visualize, alert on and
understand your metrics no matter where they are stored.
It has a rich ecosystem of plugins that allow you to extend its functionality
beyond the core features.
How it works
Grafana is a web application that can be accessed through a web browser. It
lets you create dashboards that can be used to visualize data from the database.
Thanks to some custom datasource plugins,
Grafana can use the various APIs of the United Manufacturing Hub to query the
database and display useful information.
What’s next
Read the Grafana reference documentation
to learn more about the technical details of the grafana microservice.
3.5 - Kafka to Postgresql V2
The Kafka to PostgreSQL v2 microservice plays a crucial role in consuming and
translating Kafka messages for storage in a PostgreSQL database. It aligns with
the specifications outlined in the Data Model v2.
How it works
Utilizing Data Model v2, Kafka to PostgreSQL v2 is specifically configured to
process messages from topics beginning with umh.v1.. Each new topic undergoes
validation against Data Model v2 before message consumption begins. This ensures
adherence to the defined data structure and standards.
Message payloads are scrutinized for structural validity prior to database insertion.
Messages with invalid payloads are systematically rejected to maintain data integrity.
The microservice then evaluates the payload to determine the appropriate table
for insertion within the PostgreSQL database. The decision is based on the data
type of the payload field, adhering to the following rules:
Numeric data types are directed to the tag table.
String data types are directed to the tag_string table.
What’s next
Read the Kafka to Postgresql v2
reference documentation to learn more about the technical details of the
Kafka to Postgresql v2 microservice.
3.6 - Umh Datasource V2
The plugin, umh-datasource-v2, is a Grafana data source plugin that allows you to fetch
resources from a database and build queries for your dashboard.
How it works
When creating a new panel, select umh-datasource-v2 from the Data source drop-down menu. It will then fetch the resources
from the database. The loading time may depend on your internet speed.
Select the resources in the cascade menu to build your query. DefaultArea and DefaultProductionLine are placeholders
for the future implementation of the new data model.
Only the available values for the specified work cell will be fetched from the database. You can then select which data value you want to query.
Next you can specify how to transform the data, depending on what value you selected.
For example, all the custom tags have aggregation options available; if you query a processValue:
Time bucket: lets you group data in a time bucket
Aggregates: common statistical aggregations (maximum, minimum, sum or count)
Handling missing values: lets you choose how missing data should be handled
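The time-bucket plus aggregate transformation can be sketched in plain Python: timestamps are floored to the bucket width and an aggregate is applied per bucket. This is a simplification of what the datasource asks the backend to compute:

```python
def time_bucket_max(samples: list[tuple[int, float]],
                    bucket_s: int) -> dict[int, float]:
    """Group (unix_ts, value) samples into buckets of bucket_s seconds
    and keep the maximum value per bucket."""
    buckets: dict[int, float] = {}
    for ts, value in samples:
        bucket = ts - (ts % bucket_s)  # floor timestamp to bucket start
        buckets[bucket] = max(value, buckets.get(bucket, float("-inf")))
    return buckets

data = [(0, 1.0), (30, 5.0), (61, 2.0)]
print(time_bucket_max(data, 60))
# → {0: 5.0, 60: 2.0}
```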
Configuration
In Grafana, navigate to the Data sources configuration panel.
Select umh-v2-datasource to configure it.
Configurations:
Base URL: the URL for the factoryinsight backend. Defaults to http://united-manufacturing-hub-factoryinsight-service/.
Enterprise name: previously customerID for the old datasource plugin. Defaults to factoryinsight.
API Key: authenticates the API calls to factoryinsight.
Can be found with UMHLens by going to Secrets → factoryinsight-secret → apiKey. It should follow the format Basic xxxxxxxx.