
Data Engineering in the Context of Panelist Management in Google Cloud
RTL Deutschland GmbH
Software Development & Cloud Engineering
Media
Since 2021
In a Nutshell:
- Sector: Media
- Task: Development of a tool to determine the estimated reach of advertising placements across different formats, such as linear television and streaming
- Team:
  - 3 Data Engineers
  - 1 UX Designer
  - 1 Frontend Developer
  - 1 Product Owner
  - 1 Scrum Master
  - 1 Data Analyst
- Project duration: 5+ months
Business Use Case:
- Tool for determining the estimated reach of advertising placements across various formats such as linear TV and streaming
- Development of a dashboard for the tool's end users
- In the dashboard, users can select which population segments, e.g. women aged 18-29, the advertising should address (targeting) or address a second time (retargeting); a minimal sketch of such a request follows this list
- Furthermore, development of a tool for target group validation, i.e. analysis of which target groups a campaign has effectively reached
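The following is a minimal sketch of what such a targeting request to the backend could look like, assuming a FastAPI/Pydantic data model; all names (ReachRequest, Segment, AdFormat and the field names) are illustrative assumptions, not the project's actual schema.

```python
# Minimal sketch of a reach-estimation request model, assuming a FastAPI/Pydantic
# backend. All names (ReachRequest, Segment, AdFormat, ...) are illustrative only.
from enum import Enum
from typing import List, Optional

from pydantic import BaseModel, Field


class AdFormat(str, Enum):
    LINEAR_TV = "linear_tv"
    STREAMING = "streaming"


class Segment(BaseModel):
    """A population segment, e.g. women aged 18-29."""
    gender: Optional[str] = None
    age_min: int = Field(ge=0)
    age_max: int = Field(ge=0)


class ReachRequest(BaseModel):
    """Input for the estimated-reach calculation (targeting / retargeting)."""
    segments: List[Segment]        # population segments the advertising should address
    formats: List[AdFormat]        # placement formats, e.g. linear TV and streaming
    retargeting: bool = False      # address the selected segments a second time?
```

A model like this would let the Angular dashboard and the Python REST API share a clearly validated contract for reach requests.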
Challenges:
- Big Data – large volumes of data that must be analyzed daily
- Data quality – removing unrealistic cases and correctly identifying users across different data sources (see the sketch after this list)
- High application availability
- Development of a precise and certifiable statistical model
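To make the data-quality challenge more concrete, here is a hedged sketch of how unrealistic records could be removed and usage events deduplicated per user with Apache Spark (Dataproc is part of the stack below); bucket paths, column names and thresholds are assumptions for illustration, and real cross-source user identification would involve more than a simple deduplication.

```python
# Sketch of a data-quality step on Spark (Dataproc). Paths, columns such as
# user_id, source, watch_seconds, and thresholds are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("panelist-data-quality").getOrCreate()

events = spark.read.parquet("gs://example-bucket/raw/usage_events/")  # placeholder path

# 1) Remove unrealistic cases, e.g. non-positive or implausibly long viewing times.
plausible = events.filter(
    (F.col("watch_seconds") > 0) & (F.col("watch_seconds") <= 24 * 3600)
)

# 2) Simplified stand-in for identifying users across data sources:
#    keep one row per user, day and source.
deduplicated = (
    plausible
    .withColumn("event_date", F.to_date("event_time"))
    .dropDuplicates(["user_id", "event_date", "source"])
)

deduplicated.write.mode("overwrite").parquet("gs://example-bucket/clean/usage_events/")
```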
Project Events:
The applications developed in the project are rolled out to three different environments via DevOps, each deployed in its own Kubernetes cluster. We distinguish between an Angular frontend dashboard and a Python REST API for data communication. Permissioning is controlled entirely via Google Cloud. All critical components, such as credentials, are secured through a dedicated process between HashiCorp Vault and Google Secret Manager.
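As an illustration of the credentials path, the sketch below reads a secret with the Google Secret Manager Python client at API startup; the project ID and secret name are placeholders, and the synchronization from HashiCorp Vault into Secret Manager is not shown here.

```python
# Sketch: reading a database credential from Google Secret Manager at API startup.
# Project ID and secret name are placeholders; the Vault side is not shown.
from google.cloud import secretmanager


def read_secret(project_id: str, secret_id: str, version: str = "latest") -> str:
    client = secretmanager.SecretManagerServiceClient()
    name = f"projects/{project_id}/secrets/{secret_id}/versions/{version}"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")


# Example usage inside the FastAPI application (names are illustrative):
# DB_PASSWORD = read_secret("example-project", "mysql-password")
```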
The general development and deployment workflow is implemented with DevOps and CI/CD pipelines in GitLab. There, the applications are thoroughly tested: linting, quality gates and unit tests. If an error occurs, the deployment is stopped, which prevents faulty software from going live. Another challenge was the high availability of our applications. We achieved this through stable deployments in Kubernetes: thanks to redundancy, a crashed instance is replaced immediately.
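Below is a minimal PyTest example of the kind of unit test that acts as a quality gate in the GitLab pipeline; the function under test and its contract are assumptions made purely for illustration.

```python
# Minimal PyTest sketch for the CI quality gate. The function under test
# (estimate_reach) and its contract are illustrative assumptions.
import pytest


def estimate_reach(segment_size: int, contact_probability: float) -> float:
    """Toy stand-in for the real reach-estimation logic."""
    if not 0.0 <= contact_probability <= 1.0:
        raise ValueError("contact_probability must be between 0 and 1")
    return segment_size * contact_probability


def test_estimate_reach_scales_with_segment_size():
    assert estimate_reach(1_000, 0.5) == 500.0


def test_estimate_reach_rejects_invalid_probability():
    with pytest.raises(ValueError):
        estimate_reach(1_000, 1.5)
```

Tests like these run on every push; a failing test stops the pipeline before deployment.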
Technology Stack:
Google Cloud Platform (GCP) for Data Engineering:
- BigQuery (Data Warehouse)
- Cloud Composer (Apache Airflow)
- Dataproc (Apache Spark)
- Dataflow (Apache Beam)
- GKE (Google Kubernetes Engine) | Docker
- Cloud Storage (Data Lake)
- VPC (Virtual Private Cloud)
- MySQL
- Secret Manager
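To indicate how some of these components interact, here is a hedged Cloud Composer (Airflow) sketch that runs a daily BigQuery aggregation; the DAG, dataset and table names are placeholders rather than the project's actual objects.

```python
# Sketch of a daily Cloud Composer (Airflow) DAG that aggregates raw usage data
# in BigQuery. DAG, dataset and table names are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

AGGREGATION_SQL = """
CREATE OR REPLACE TABLE `example_dataset.daily_reach` AS
SELECT segment_id, event_date, COUNT(DISTINCT user_id) AS reached_users
FROM `example_dataset.usage_events`
GROUP BY segment_id, event_date
"""

with DAG(
    dag_id="daily_reach_aggregation",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    aggregate_reach = BigQueryInsertJobOperator(
        task_id="aggregate_reach",
        configuration={"query": {"query": AGGREGATION_SQL, "useLegacySql": False}},
    )
```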
Infrastructure as Code:
- HashiCorp Terraform + Vault
DevOps:
- GitLab CI/CD
- PyTest
- Helm
Coding:
- Python as the language for developing a REST API with FastAPI
- Angular for the development of the frontend
- SQL
Why Choose Pexon Consulting?
Pexon Consulting is fully committed to your success and we believe in always going the extra mile for each of our clients:

Commitment to Success

Focus on Performance

Engineering with Passion
Your points of contact
Send us a message using the contact form on our contact page and we will respond within a few business days. All information submitted will be treated confidentially.
Are you looking for a partner for your project?
We will do our best to meet your expectations.