Rachid EL MAAZOUZ

Software Engineer - Senior Data Engineer

relmaazouz@proton.me
+33 6 24 01 00 36
Summary

Data Engineer & Software Engineer with more than a decade of experience designing, building, and modernizing scalable data platforms. Strong expertise in Apache Spark, Azure Data services, and the Databricks Lakehouse ecosystem (Delta Lake, Unity Catalog, Workflows). Experienced in cloud-native architecture, large-scale data processing, and ETL/ELT pipelines. Engineering-driven mindset focused on automation, CI/CD, governance, and delivering reliable, high-performance data products that support business and analytics at scale.

Professional Experience

Senior Data Engineer

Candriam

Aug 2024 - Present

  • Designed and implemented scalable, metadata-driven data ingestion frameworks in PySpark for large-scale pipelines (see sketch below)
  • Developed a robust data quality framework ensuring completeness, consistency, and reliability
  • Built and optimized ETL/ELT pipelines (ingestion, transformation, loading) following medallion/lakehouse architecture
  • Engineered and deployed data APIs in C#, enabling low-latency access for quants and analysts
  • Designed and delivered a data testing framework for automated validation and performance benchmarking
  • Worked in an Azure cloud environment (Data Lake, Synapse, Delta, Pipelines, DevOps)
  • Collaborated with risk teams to deliver fit-for-purpose data products supporting analytics and reporting
PySpark · Python · Azure · Synapse · Azure DevOps · .NET · C# · Git · Delta · Parquet
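
A minimal sketch of the metadata-driven ingestion pattern above, assuming a hypothetical control table (ops.ingestion_metadata) whose rows describe each source's format, path, write mode, and target Delta table; the real framework's schema and naming are not reproduced here.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("metadata-driven-ingestion").getOrCreate()

    # Hypothetical control table: one row per source to ingest.
    sources = spark.read.table("ops.ingestion_metadata").collect()

    for src in sources:
        df = (spark.read
                .format(src["source_format"])   # e.g. "csv", "parquet", "json"
                .option("header", "true")
                .load(src["source_path"]))
        (df.write
            .format("delta")
            .mode(src["write_mode"])            # "append" or "overwrite"
            .saveAsTable(src["target_table"]))

Driving ingestion from metadata reduces onboarding a new source to inserting a row, rather than writing a new pipeline.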

Senior Data Engineer

Informatique CDC

Jul 2023 - Sep 2024

  • Implemented business use cases: data structures, CI/CD pipelines, PySpark jobs
  • Established development best practices: GitFlow, CI/CD workflows, automated testing
  • Modeled and implemented Data Vault architecture across ingestion and datamart layers
  • Developed large-scale ingestion, transformation, and exposure jobs in PySpark and Python
  • Integrated external market and risk data sources (Bloomberg, ratings, etc.) via Kafka
  • Worked with the Cloudera ecosystem (Hive, Spark, Kafka), using Jenkins, Control-M, and Bitbucket
  • Delivered reliable data products supporting business and risk management
  • Migrated the data lake to Azure Databricks: modeled and delivered the new Data Vault architecture across ingestion and datamart layers, later enhanced with Unity Catalog to enforce centralized governance, lineage, and fine-grained access control (see sketch below)
  • Redesigned data pipelines: refactored on-prem scripted ETL into modular Databricks notebooks, orchestrated through Databricks Jobs and Workflows with Delta Lake optimizations
PySpark · Python · Kafka · Cloudera · Hive · Spark · Jenkins · Control-M · Bitbucket · Databricks · Azure
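
A sketch of an insert-only Data Vault hub load on Databricks using a Delta MERGE; the staging table, hub, and business key (counterparty_id) are illustrative assumptions, not the project's actual model.

    from pyspark.sql import SparkSession, functions as F
    from delta.tables import DeltaTable

    spark = SparkSession.builder.getOrCreate()

    # Stage new business keys with their Data Vault hash key.
    staged = (spark.read.table("staging.counterparties")
                .select("counterparty_id", "load_date", "record_source")
                .withColumn("hash_key",
                            F.sha2(F.col("counterparty_id").cast("string"), 256))
                .dropDuplicates(["hash_key"]))

    hub = DeltaTable.forName(spark, "raw_vault.hub_counterparty")

    # Hubs are insert-only: add unseen business keys, never update existing rows.
    (hub.alias("h")
        .merge(staged.alias("s"), "h.hash_key = s.hash_key")
        .whenNotMatchedInsertAll()
        .execute())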

Data Engineer

Crédit Agricole

Nov 2021 - Jun 2023

  • Operated and maintained a large-scale data lake on MapR (Hive, Hadoop, Sqoop, Tez, Oozie, PySpark)
  • Developed ingestion, transformation, and exposure jobs in PySpark for high-volume datasets
  • Led the migration and redesign of the data lake, introducing new zoning and pipelines
  • Translated legacy Hive SQL scripts into optimized Oracle SQL
  • Automated ODI object generation via Python scripting
  • Implemented automated data validation frameworks (see sketch below)
  • Containerized and deployed applications with Kubernetes and ArgoCD
PySpark · Hive · Hadoop · Sqoop · Tez · Oozie · Oracle ODI · Kubernetes · ArgoCD · Control-M · Git
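
A sketch of the style of automated validation run before promoting data between zones; the dataset, key column, and thresholds are illustrative assumptions.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.read.table("exposure.loans")  # hypothetical dataset

    checks = {
        "no_null_ids": df.filter(F.col("loan_id").isNull()).count() == 0,
        "no_duplicate_ids": df.count() == df.select("loan_id").distinct().count(),
        "positive_amounts": df.filter(F.col("amount") <= 0).count() == 0,
    }

    failed = [name for name, ok in checks.items() if not ok]
    if failed:
        # Fail fast so bad partitions never reach downstream zones.
        raise ValueError(f"Data validation failed: {failed}")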

Consultant Data

Informatique CDC

Nov 2020 - Oct 2021

  • Gathered requirements for eFront FIA migration and defined new data models and workflows
  • Authored detailed technical specifications for transaction mapping and codification framework
  • Extracted and transformed data from FIC using Front Script (funds, investors, companies, instruments, transactions)
  • Implemented a PostgreSQL-based data hub for transaction and fund exchanges with AWS S3 integration (see sketch below)
  • Developed custom Spring Boot web application to support valuation workflows
  • Worked in hybrid environment with SQL Server, PostgreSQL, Kafka, AWS S3, Spring Boot, eFront FIA
PostgreSQL · SQL Server · Kafka · AWS S3 · Spring Boot · eFront FIA
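
A sketch of one exchange flow through the data hub, assuming a hypothetical bucket, key, target table, and connection string: a transactions file is pulled from S3 and bulk-loaded into PostgreSQL with COPY.

    import io

    import boto3
    import psycopg2

    # Hypothetical bucket/key dropped by an upstream exchange.
    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket="datahub-exchange", Key="transactions/2021-06-01.csv")
    payload = io.BytesIO(obj["Body"].read())

    conn = psycopg2.connect("dbname=datahub user=etl")  # hypothetical DSN
    with conn, conn.cursor() as cur:
        # COPY streams the whole file server-side, far faster than row-by-row inserts.
        cur.copy_expert("COPY hub.transactions FROM STDIN WITH CSV HEADER", payload)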

Data Engineer

Consolis Group

Jul 2018 - Sep 2020

  • Designed and developed a financial performance data lake on Google BigQuery (see sketch below)
  • Built ingestion pipelines from subsidiaries using Pub/Sub and enriched data in transformation layers
  • Developed dashboards and reports in Data Studio for business stakeholders
  • Worked within the Google Cloud Platform ecosystem (BigQuery, DataProc, Composer, Cloud Storage)
BigQuery · Pub/Sub · DataProc · Composer · Data Studio · Cloud Storage
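
A sketch of a batch load from Cloud Storage into the BigQuery data lake; the bucket, dataset, and table names are illustrative assumptions.

    from google.cloud import bigquery

    client = bigquery.Client()

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.PARQUET,
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )

    # Hypothetical landing bucket and target table.
    load_job = client.load_table_from_uri(
        "gs://finance-landing/subsidiaries/2020-08/*.parquet",
        "finance_lake.subsidiary_pnl",
        job_config=job_config,
    )
    load_job.result()  # block until the load completes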

Data Engineer / Python Developer

Inwi

Aug 2016 - May 2018

  • Installed and configured Hadoop ecosystem (Ambari, RHEL, Kafka, HBase, Hive)
  • Built real-time ingestion pipelines with Apache Flink to process CRM, CDR, HLR, VLR, and PPS data (see sketch below)
  • Designed datamarts for network, equipment, voice/SMS/MMS, mobile, and customer datasets
  • Prepared enriched customer datasets (social networks, geolocation, call/SMS history) for churn prediction
  • Developed interactive visualizations (time-series, bubble charts, word clouds) for business metrics
Hadoop · Kafka · HBase · Hive · Apache Flink · PySpark · Ambari
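
A minimal PyFlink sketch of the real-time ingestion pattern above, assuming a hypothetical CDR topic and pipe-delimited records; the original jobs' language, schema, and sinks are not reproduced here, and the Kafka connector jar must be on the Flink classpath.

    from pyflink.common.serialization import SimpleStringSchema
    from pyflink.datastream import StreamExecutionEnvironment
    from pyflink.datastream.connectors import FlinkKafkaConsumer

    env = StreamExecutionEnvironment.get_execution_environment()

    consumer = FlinkKafkaConsumer(
        topics="cdr-events",  # hypothetical topic
        deserialization_schema=SimpleStringSchema(),
        properties={"bootstrap.servers": "broker:9092", "group.id": "cdr-ingest"},
    )

    # Split pipe-delimited CDR lines and keep the first three fields
    # (e.g. caller, callee, duration) before handing off to a sink.
    env.add_source(consumer) \
       .map(lambda line: line.split("|")[:3]) \
       .print()

    env.execute("cdr-ingestion")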

ERP Developer

OCP

May 2014 - Aug 2016

  • Developed analytical and operational reports with SQL, PL/SQL, and XML Publisher
  • Customized and built reporting interfaces with Oracle OAF and Oracle Forms
  • Developed APIs and web services to integrate Oracle ERP with legacy systems
Oracle E-Business Suite · SQL · PL/SQL · Oracle OAF · Oracle Forms · XML Publisher

Software Engineer

S2M

Aug 2013 - Apr 2014

  • Designed data schemas and exchange interfaces to integrate core banking data
  • Defined application and data architecture for e-banking system
  • Developed and deployed APIs and services (authentication, transfers, statements, payroll) for secure real-time banking
Java · Spring Web · Hibernate · Oracle DB · PostgreSQL · Apache CXF · Jenkins · HTML5
Education

Master's in Financial Markets and Capital Management

CNAM

September 2025

Software Engineer

Mohammadia School of Engineers

July 2013

Certifications

GCP Professional Data Engineer

Google

August 2023

MongoDB Certified DBA

MongoDB

September 2022

Passionate about building scalable data platforms and solving complex problems through code.