Customer story

Building Data Pipelines in Azure Cloud

How data pipelines are used in Azure to analyze machine data

Company: Fresenius Medical Care
Services: Software Development
Industry: Healthcare
Customer since: 2022

Greater flexibility

Through the modular construction of the cloud architecture.

Scalability

Through Azure Functions and auto-scaling in the cloud.

Robustness

Through the use of Infrastructure as Code (Pulumi).

Faster release cycles

Through the use of CI/CD and GitHub Actions pipelines.

Modern Secrets Management

Through the use of Key Vaults and Infrastructure as Code.

Faster analysis

Through the automated provisioning of data.

In a Nutshell:

  • Sector: Healthcare
  • Task: Make machine data on treatments available for analytics and machine learning.
  • Team: about 8 people in total, mainly data scientists, alongside engineers who develop the machines and use the insights
  • Project duration: 12+ months

Challenges:

  • Dialysis machines collect data locally in the clinic, which makes the data hard to analyze centrally
  • Complex parsers are needed to decode the machine data
  • Azure data pipelines must be built to consolidate the important data
  • Different data sources must be merged so that further KPIs can be calculated
  • Data pipelines should be idempotent and modular
  • Scaling: pipelines should be designed for increasing data volumes
  • Privacy: patient data must be handled with the utmost care
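
To illustrate the idempotency requirement, here is a minimal sketch in Python. It is not the project's actual code. The folder layout, the file name and the helper names (`target_path`, `write_idempotent`, `checksum`) are assumptions made for illustration, and a local directory stands in for the data lake. The idea is that each (machine, day) partition maps to exactly one output file that is overwritten on every run, so re-running a step never duplicates data.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def target_path(lake_root: Path, machine_id: str, day: str) -> Path:
    """Deterministic location: the same (machine, day) always maps to the same file."""
    return lake_root / "staging" / f"machine={machine_id}" / f"date={day}" / "records.json"


def write_idempotent(lake_root: Path, machine_id: str, day: str, records: list) -> Path:
    """Overwrite the whole partition instead of appending to it."""
    path = target_path(lake_root, machine_id, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Sort the records so the output does not depend on input order.
    payload = json.dumps(sorted(records, key=lambda r: r["ts"]), sort_keys=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(payload)
    tmp.replace(path)  # swap the finished file into place
    return path


def checksum(path: Path) -> str:
    """SHA-256 of the file content, used to compare two runs."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    rows = [{"ts": 2, "value": 1.5}, {"ts": 1, "value": 0.5}]
    first = checksum(write_idempotent(root, "m-001", "2023-01-01", rows))
    second = checksum(write_idempotent(root, "m-001", "2023-01-01", rows))
    print(first == second)
```

Running the step twice leaves a single file with identical content, which is the property that makes a failed pipeline run safe to retry.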

Solutions:

  • Running the parsers as Docker containers on Azure Functions
  • Uploading the various data sources to Azure Data Lake Gen2 (Parquet)
  • Using Azure Synapse Analytics to query the data with SQL
  • Auto-scaling through the use of Azure Functions
  • Modular, simple and scalable pipelines in Azure Data Factory
  • Pipeline orchestration through Azure Data Factory
  • Separate data layers in the Azure Data Lake (Raw, Staging, Core, Presentation)
  • Fully automated provisioning of the complete infrastructure with Infrastructure as Code (Pulumi)
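
The four data layers named above can be sketched as a small path convention. This is an illustrative assumption, not the project's real layout: the folder names, the `date=` partition scheme, the file name `data.parquet` and the helpers `lake_path` and `next_layer` are all hypothetical.

```python
from enum import Enum
from pathlib import PurePosixPath
from typing import Optional


class Layer(Enum):
    """Data lake layers, in the order data flows through them."""
    RAW = "raw"
    STAGING = "staging"
    CORE = "core"
    PRESENTATION = "presentation"


def lake_path(layer: Layer, dataset: str, day: str) -> str:
    """Build the (assumed) path of one daily partition inside a layer."""
    return str(PurePosixPath(layer.value) / dataset / f"date={day}" / "data.parquet")


def next_layer(layer: Layer) -> Optional[Layer]:
    """Return the layer that follows `layer`, or None for the last one."""
    order = list(Layer)  # Enum members iterate in definition order
    index = order.index(layer)
    return order[index + 1] if index + 1 < len(order) else None


if __name__ == "__main__":
    print(lake_path(Layer.RAW, "treatments", "2023-01-01"))
    print(next_layer(Layer.CORE))
```

Keeping the convention in one place means every pipeline step reads from one layer and writes to the next without hard-coding paths.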

Results:

  • Greater flexibility due to the modular structure of the cloud architecture
  • Scalability through Azure Functions and auto-scaling in the cloud
  • Cost-effective storage of data in Azure Data Lake Gen2
  • Robustness through the use of Infrastructure as Code (Pulumi)
  • Faster release cycles through the use of CI/CD and GitHub Actions pipelines
  • Modern secrets management through the use of Key Vaults and Infrastructure as Code
  • Faster analysis through the automated provisioning of data
  • Secure and encrypted data storage in the cloud

Project events:

Customer requirements
The goal of the project was to collect data from various local machines, transfer it to the cloud and analyze it there. Azure Data Lake, Azure Data Factory, Azure Synapse and custom Python code were used for this purpose. Data privacy and secure pipeline design were important to the customer. The machines also produce complex data formats, for which custom Python code and a parser were developed. Finally, the customer required the Azure infrastructure to be scalable so that it can handle increasing data volumes in the future.
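
The real machine formats are not described in this story, so the following is only a sketch of what a custom parser for a binary format can look like in Python. The record layout (a 32-bit timestamp, a 16-bit sensor id and a 32-bit float value, little-endian) and the names `RECORD`, `Reading` and `parse_readings` are invented for illustration.

```python
import struct
from dataclasses import dataclass
from typing import List

# Hypothetical layout: uint32 timestamp, uint16 sensor id, float32 value (10 bytes).
RECORD = struct.Struct("<IHf")


@dataclass(frozen=True)
class Reading:
    timestamp: int
    sensor_id: int
    value: float


def parse_readings(blob: bytes) -> List[Reading]:
    """Decode a byte string made of back-to-back fixed-size records."""
    if len(blob) % RECORD.size != 0:
        raise ValueError(
            f"truncated input: {len(blob)} bytes is not a multiple of {RECORD.size}"
        )
    return [Reading(ts, sensor, value) for ts, sensor, value in RECORD.iter_unpack(blob)]


if __name__ == "__main__":
    sample = RECORD.pack(1700000000, 7, 36.5) + RECORD.pack(1700000060, 7, 37.0)
    for reading in parse_readings(sample):
        print(reading)
```

Rejecting truncated input up front, instead of silently dropping a partial record, keeps corrupt files out of the later layers.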

Adjustment and construction of the pipelines
The pipelines were primarily developed in Azure Data Factory. Azure Data Lake Gen2 served as the data source and was regularly extended with new data from local machines. Various Azure Functions were used to read and further process the data with a custom C parser. Data scientists then took over further processing, working primarily in Azure Databricks. Azure Synapse was used to query all the data, accessing both the data in the data lake and the data in Azure Databricks. Building separate release and deployment pipelines significantly reduced the average delivery time for new features. The pipelines were modularized so that sub-teams could use them individually, depending on their specific application.

Robust, modular cloud infrastructure and data security
Another important requirement was to make the cloud infrastructure scalable. Infrastructure as Code gave a clearly defined, central overview of the cloud resources and infrastructure. It also improved developer workflows and integration with various DevOps processes. The entire infrastructure was built to scale with growing data volumes. Storing the data in Azure Data Lake and transferring it securely ensured the security of the customer's data at all times. Data from the USA is always hosted in US regions and data from Europe in European regions.
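
The data residency rule in the last sentence can be expressed as a small lookup that fails closed. This is a sketch under assumptions: the story does not say which Azure regions are used, so `eastus` and `westeurope` are placeholders, and `REGION_BY_ORIGIN` and `storage_region` are hypothetical names.

```python
# Assumed mapping from data origin to Azure region; the concrete regions
# used in the project are not stated in the source.
REGION_BY_ORIGIN = {
    "US": "eastus",
    "EU": "westeurope",
}


def storage_region(origin: str) -> str:
    """Return the region where data from `origin` may be stored.

    Unknown origins raise an error instead of falling back to a default,
    so data never lands in a region that was not explicitly approved.
    """
    try:
        return REGION_BY_ORIGIN[origin.upper()]
    except KeyError:
        raise ValueError(f"no approved storage region for origin {origin!r}") from None


if __name__ == "__main__":
    print(storage_region("us"))
    print(storage_region("EU"))
```

Raising on an unknown origin is a deliberate choice here: a silent default would be the easiest way to violate the residency rule by accident.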

Project status and results
The customer is very happy with the choice of Azure as the cloud infrastructure. The project is still running today and scales well with the increasing data volumes. By optimizing various DevOps processes and building the pipelines, all sub-teams became significantly faster at testing, releasing and deploying the software. Internal departments were therefore able to see results sooner and adapt to changing customer requirements within the sprints. Infrastructure as Code made it possible to create a robust and modular cloud infrastructure. Because the pipelines are dynamic, different in-house teams at the customer can access the analyzed data and build further analyses.

Technology Stack:

Cloud Infrastructure:

  • Azure
  • Azure Synapse Analytics
  • Azure Functions

Data Services:

  • Azure Synapse Analytics
  • Azure Blob Storage
  • Azure Data Factory

CI/CD & IaC:

  • GitHub Actions
  • Azure Key Vault
  • Azure Resource Groups
  • Infrastructure as Code – Pulumi

Software Development:

  • Python

Why choose Pexon Consulting?

Pexon Consulting is fully committed to your success and we believe in always going the extra mile for each of our clients:

Commitment to success

We believe success is threefold: for our customers, their customers and the communities they impact.

Focus on performance

We become your dedicated partner. This means that we only complete a project when both of us are 100% satisfied.

Engineering with passion

We are a network of innovators. We develop daring solutions to our customers' most complicated challenges.

Your contact persons

Send us a message using the contact form on our contact page and we will respond within a few business days. All information submitted will be treated confidentially.

Paul Niebler
Managing Director - Management, HR

Marco Schwarz
Head of Google Cloud

Phillip Pham
Managing Director - Delivery, Sales, Finance

Alex Nenninger
Head of Sales

Florian Schmidl
Enterprise Architect

Max Hänsel
Head of AWS

David das Neves
Head of Azure

Are you looking for a partner for your project?

We will do our best to meet your expectations.