Data Engineering in the Context of Panelist Management in Google Cloud
RTL Deutschland GmbH
Software Development & Cloud Engineering
In A Nutshell:
Task: Development of a tool to determine estimated reach for advertising placements of different formats, such as linear television and streaming
- 3 Data Engineers
- 1 UX Designer
- 1 Frontend Developer
- 1 Product Owner
- 1 Scrum Master
- 1 Data Analyst
Project duration: 5+ months
Business Use Case:
- Tool for determining estimated reach for advertising placements for various formats such as linear TV and streaming
- Development of a dashboard for end users of the tool
- In the dashboard, users select which population segments (e.g. women aged 18-29) the advertising should address (targeting) or address a second time (retargeting)
- Additionally, development of a tool for target group validation, i.e. analysis of which target groups a campaign actually reached
Challenges:
- Big data: a large volume of data must be analyzed daily
- Data quality: removing unrealistic cases and correctly identifying users across different data sources
- High application availability
- Development of a precise, certifiable statistical model
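The data-quality challenge of deduplicating users across sources can be illustrated with a small sketch. This is a hypothetical simplification assuming user IDs have already been matched across sources; the function name and sample data are invented for illustration, and the real product uses a statistical reach model rather than a plain set union.

```python
# Hypothetical sketch: deduplicated reach for a target segment across sources.
# All names and data are illustrative, not the project's actual model.

def estimate_reach(exposures: dict[str, set[str]], segment: set[str]) -> int:
    """Count unique users in `segment` exposed via at least one source.

    exposures: source name -> set of user IDs exposed to the campaign
    segment:   user IDs belonging to the selected population segment
    """
    seen: set[str] = set()
    for users in exposures.values():
        seen |= users & segment  # union deduplicates users seen in several sources
    return len(seen)

exposures = {
    "linear_tv": {"u1", "u2", "u3"},
    "streaming": {"u2", "u4"},
}
segment = {"u1", "u2", "u4", "u9"}  # e.g. women aged 18-29
print(estimate_reach(exposures, segment))  # -> 3 (u1, u2, u4)
```

Counting `u2` only once, although it appears in both linear TV and streaming, is exactly the cross-source identification problem described above.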
The applications are rolled out to three different environments via DevOps, each deployed in its own Kubernetes cluster. The system comprises an Angular frontend dashboard and a Python REST API for data communication. Permissions are managed entirely in Google Cloud, and all critical secrets, such as credentials, are exchanged through a secure process between HashiCorp Vault and Google Secret Manager.
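As a sketch of how such a redundant Kubernetes deployment of the REST API might look, the following manifest runs the service with multiple replicas; the name, image, port, and replica count are assumptions, not the project's actual configuration.

```yaml
# Illustrative Deployment manifest; all names and values are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reach-api            # hypothetical name for the Python REST API
spec:
  replicas: 3                # redundancy: a crashed pod is replaced automatically
  selector:
    matchLabels:
      app: reach-api
  template:
    metadata:
      labels:
        app: reach-api
    spec:
      containers:
        - name: reach-api
          image: registry.example.com/reach-api:latest
          ports:
            - containerPort: 8000
```

With `replicas: 3`, Kubernetes keeps three instances running; if one crashes, traffic continues to be served by the remaining replicas while a replacement starts.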
The development and deployment workflow runs through DevOps, i.e. CI/CD pipelines in GitLab. There, the applications are thoroughly tested: linting, quality gates, and unit tests. If errors occur, the deployment stops, preventing faulty software from going live. Another challenge was the high availability of our applications, which we achieved through stable deployments in Kubernetes: thanks to redundancy, a crashed instance is compensated for immediately.
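A minimal `.gitlab-ci.yml` along these lines could look as follows; the stage names, images, and commands are illustrative assumptions. A failing lint or test job stops the pipeline before the deploy stage runs.

```yaml
# Illustrative .gitlab-ci.yml sketch; jobs and commands are assumptions.
stages:
  - lint
  - test
  - deploy

lint:
  stage: lint
  image: python:3.11
  script:
    - pip install ruff
    - ruff check .

unit-tests:
  stage: test
  image: python:3.11
  script:
    - pip install -r requirements.txt pytest
    - pytest

deploy:
  stage: deploy
  script:
    - kubectl apply -f k8s/   # only reached if all earlier stages passed
  environment: production
  when: on_success
```

Because stages run in order and a failed job aborts the pipeline, faulty builds never reach the `deploy` stage.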
Google Cloud Platform (GCP) for Data Engineering
BigQuery (Data Warehouse)
Cloud Composer (Apache Airflow)
Dataproc (Apache Spark)
Dataflow (Apache Beam)
GKE (Google Kubernetes Engine) | Docker
Cloud Storage (Data Lake)
VPC (Virtual Private Cloud)
Infrastructure as Code:
HashiCorp Terraform + Vault
GitLab CI/CD
Python with FastAPI for developing the REST API
Angular for developing the frontend
Why Choose Pexon Consulting?
Pexon Consulting is fully committed to your success, and we believe in going the extra mile for every client:
Commitment to Success
Focus on Performance
Engineering with Passion
Your contacts
Send us a message using the contact form on our contact page and we will respond within a few business days. All information submitted will be treated confidentially.
Are you looking for a partner for your project?
We will do our utmost to meet your expectations.