Data Engineer & Cloud Architect

Hello, I'm

Mohammed EL-KHOU

Expert Data Engineer

Expert Data Engineer & Cloud Architect with 6+ years of experience in designing and implementing enterprise-scale data solutions. Currently at RMC BFM ADS (Altice Media) in Paris, specializing in AWS Cloud Solutions, Big Data Processing, ETL Pipelines, and Data Warehouse Architecture.


About Me

Get to know me better

I am a passionate Data Engineer and Cloud Architect with over 6 years of experience in designing and implementing enterprise-scale data solutions.

Currently working as a Data Engineer / Cloud Engineer (AWS) at RMC BFM ADS (Altice Media) in Paris, France. I specialize in building scalable data infrastructure, optimizing ETL processes, and implementing cloud-native solutions that process millions of data points daily for media and advertising analytics.

Previously served as a Big Data Engineer at BPCE-SI, where I architected and maintained Hadoop-based data lakes, developed Spark/Scala applications, and led the migration from Cloudera to Google Cloud Platform.

My expertise spans Big Data Engineering, Cloud Architecture, ETL Pipeline Development, Data Warehouse Management, and Machine Learning Engineering, with hands-on experience in Computer Vision, NLP, and ASR systems.


Professional Experience

My career journey

Data Engineer / Cloud Engineer (AWS)

RMC BFM ADS (Altice Media) - Paris, France
01/2024 - Present

Key Responsibilities:

  • Database Maintenance: Performance monitoring of AWS Athena, SQL Server, Oracle, PostgreSQL. Proactive resolution of performance, security, and reliability issues.
  • ETL Management (Talend): Daily management of ETL workflows, debugging, optimization, and creation of new routines with advanced Data Science techniques.
  • Data Warehouse Migration: Led the migration to a complete Python-based ETL architecture replacing the legacy Talend infrastructure. Developed 14 ETL modules, achieving an 80% reduction in processing time.
  • Video Transcoding Platform: Built serverless solution using AWS Lambda, Docker, FFmpeg for advertising video processing with VAST XML generation.
  • Audio-to-Video Conversion: Automated video creation from audio files using AWS Transcribe for French subtitles and FFmpeg for MP4 generation.
  • Advertising Billing Automation: Developed reconciliation engine using AWS Textract OCR and AppNexus/FreeWheel APIs integration.
  • AWS Cost Analysis Pipeline: Built CloudWatch logs analysis system with Athena and Parquet storage for cost optimization.
Python, AWS, Talend, Oracle, PostgreSQL, Docker, Lambda, S3, Athena
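
As an illustration of the CloudWatch logs analysis mentioned above, here is a minimal sketch, not the production pipeline. It relies on the REPORT line that Lambda writes to CloudWatch Logs after each invocation and sums billed GB-seconds per function. The function names in the sample data are hypothetical, and the Athena/Parquet storage step is left out.

```python
import re
from collections import defaultdict

# Fields taken from the REPORT line Lambda emits after each invocation.
REPORT_RE = re.compile(
    r"Billed Duration: (?P<billed_ms>\d+) ms\s+Memory Size: (?P<memory_mb>\d+) MB"
)


def gb_seconds_by_function(records):
    """Sum billed GB-seconds per function from (function_name, log_line) pairs."""
    totals = defaultdict(float)
    for function_name, line in records:
        match = REPORT_RE.search(line)
        if match is None:
            continue  # START, END, or application output: no billing data
        billed_seconds = int(match["billed_ms"]) / 1000
        memory_gb = int(match["memory_mb"]) / 1024
        totals[function_name] += billed_seconds * memory_gb
    return dict(totals)


# Hypothetical sample records; real ones would come from exported log streams.
sample = [
    ("transcoder", "REPORT RequestId: 1a2b\tDuration: 1990.40 ms\t"
                   "Billed Duration: 2000 ms\tMemory Size: 2048 MB\tMax Memory Used: 910 MB"),
    ("transcoder", "END RequestId: 1a2b"),
    ("vast-builder", "REPORT RequestId: 3c4d\tDuration: 498.10 ms\t"
                     "Billed Duration: 500 ms\tMemory Size: 512 MB\tMax Memory Used: 120 MB"),
]
usage = gb_seconds_by_function(sample)
print(usage)
```

Multiplying each total by the current per-GB-second price for the region gives an approximate compute cost per function.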

Big Data Engineer

BPCE-SI - Lille, France
08/2022 - 01/2024

Context: Design, development and management of Big Data pipelines for all Caisse d'Epargne Ile de France divisions.

Key Achievements:

  • Strengthened team expertise in Big Data technologies, particularly Cloudera
  • Developed and automated ETL pipelines for Hadoop Datalake (Hive, HDFS) from Oracle databases
  • Collaborated with data scientists for business intelligence insights
  • Implemented data governance policies ensuring quality and accuracy
  • Automated data pipelines using Python, Shell, Spark/Scala, and PySpark
  • Led the Cloudera-to-GCP migration with a dual-run implementation
  • Optimized Big Data solutions for performance, scalability, and cost-effectiveness

Results: Significant improvement in data availability, reduced processing times, increased data quality, and infrastructure cost reduction.

Cloudera, Hadoop, Spark, Scala, PySpark, Oracle, GCP, BigQuery, VTOM

R&D Data Engineer / Machine Learning Engineer

IMPERIUM - Casablanca, Morocco
09/2021 - 08/2022

Context: Production deployment of AI solutions in the multimedia domain, handling unstructured and semi-structured data with resource management and automation.

Key Projects:

  • Infrastructure Modernization: Implemented microservices for AI model integration using Docker containers
  • Training Pipeline Structuring: Converted Jupyter notebooks into a structured OOP project with multiprocessing, multithreading, and GPU parallelization
  • Desktop Applications: Created PyQt5 applications for internal use
  • End-to-End Web Mining: Built microservices architecture for news sites scraping, crawling, transformation, and storage with PostgreSQL

Results: Achieved near real-time predictions through multiprocessing implementation with continuous batch processing despite limited resources.

Python, Docker, Kubernetes, PostgreSQL, Elasticsearch, Airflow, PyQt5

ML/AI Developer & Data Scientist

3W Media - Imperium - Casablanca, Morocco
03/2020 - 09/2021

Major Projects:

  • Speech Transcription: Automated speech-to-text conversion using offline recognition for 17 languages and dialects
  • Audio Fingerprinting: Advertisement spot identification and search using Landmark-based audio fingerprinting
  • Speaker & Gender Detection: ASR system for large-scale audiovisual monitoring with gender equality insights
  • Social Media Analytics: Topic detection and sentiment analysis for Moroccan news using Twitter API and web scraping
  • Face Recognition System: End-to-end facial recognition and attribute analysis (age, gender, emotion, race) using state-of-the-art models
  • Multilingual Dictionary: Alternative to alphabetical dictionary with semantic grouping and structural hierarchies
  • Object Detection: Brand and logo detection in advertising screens, billboards, and newspapers using YOLO, RetinaNet, Faster R-CNN
TensorFlow, PyTorch, OpenCV, YOLO, NLTK, Scrapy, Flask, MySQL
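
To give an idea of how landmark-based audio fingerprinting works, here is a simplified sketch. It assumes spectrogram peaks have already been extracted as (time_frame, freq_bin) pairs. The parameters `fan_out` and `max_dt` and both function names are illustrative choices, not the values or code used in the project.

```python
import hashlib
from collections import Counter


def landmark_hashes(peaks, fan_out=3, max_dt=50):
    """Pair each peak with up to `fan_out` later peaks and hash every pair.

    peaks: iterable of (time_frame, freq_bin) tuples.
    Returns a list of (hash_hex, anchor_time) tuples.
    """
    peaks = sorted(peaks)
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt <= 0:
                continue  # same frame: no usable time delta
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are too far away
            digest = hashlib.sha1(f"{f1}|{f2}|{dt}".encode()).hexdigest()[:16]
            hashes.append((digest, t1))
            paired += 1
            if paired == fan_out:
                break
    return hashes


def best_offset(reference, query):
    """Return (offset, votes) for the most frequent time offset among matching hashes."""
    index = {}
    for digest, t_ref in reference:
        index.setdefault(digest, []).append(t_ref)
    votes = Counter()
    for digest, t_query in query:
        for t_ref in index.get(digest, []):
            votes[t_ref - t_query] += 1
    return votes.most_common(1)[0] if votes else (None, 0)


# Toy data: the query is the reference clip starting 5 frames in.
reference_peaks = [(0, 10), (5, 20), (9, 15), (14, 30), (20, 12)]
query_peaks = [(0, 20), (4, 15), (9, 30), (15, 12)]
result = best_offset(landmark_hashes(reference_peaks), landmark_hashes(query_peaks))
print(result)
```

Because each hash encodes two frequencies and their time delta, it does not depend on where the clip starts, so an advertisement spot can be located inside a longer recording by the offset that collects the most votes.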

Skills & Technologies

My technical expertise

Programming Languages

Python

98%

Scala

90%

SQL

95%

Java

85%

Cloud & Big Data

Amazon Web Services

Athena, S3, Lambda, API Gateway, EC2, ECR, Route 53, Glue, IAM, CloudFront, Cognito, CloudWatch, Transcribe, Textract

95%

Big Data Stack

Spark, Hadoop (HDFS, Hive, HBase, Pig, Storm), Cloudera

92%

Google Cloud Platform

BigQuery, Cloud Storage, Dataflow, Pub/Sub

88%

ETL & Data Pipelines

Talend, Airflow, VTOM, Custom Python ETL

96%

Python Ecosystem

Data Science

PySpark, Numpy, Pandas, SciPy, Matplotlib, Seaborn

95%

Machine Learning

Scikit-Learn, TensorFlow, Keras, PyTorch

92%

NLP & Text Processing

NLTK, SpaCy, Gensim, BERT

88%

Web & APIs

Flask, SQLAlchemy, Requests, Selenium, boto3

90%

Database Technologies

SQL Databases

Oracle, PostgreSQL, MySQL, SQL Server, Salesforce

95%

NoSQL Databases

Redis, Elasticsearch, Solr

85%

DevOps & Infrastructure

Containerization

Docker, Kubernetes, VMware

90%

Monitoring & BI

Grafana, Kibana, Power BI

85%

Version Control & CI/CD

Git, GitHub, Bitbucket, JIRA

92%

Operating Systems

Linux (Debian, CentOS), Windows Server, Unix

88%

Professional Skills

Analytical Thinking

Strong analytical mindset with problem-solving capabilities

Team Leadership

Experience leading technical teams and mentoring junior developers

Communication

Excellent communication skills and client relationship management

Fast Learning

Quick adaptation to new technologies and methodologies

Project Management

Agile/Scrum methodology expertise with strong organizational skills

Priority Management

Responsive approach with an excellent sense of priorities

Technical Expertise & Achievements

Deep dive into my technical mastery and professional accomplishments

AWS Cloud Architecture & Engineering Excellence

Serverless Computing Mastery

Lambda Functions: Architected 25+ production Lambda functions processing 10M+ requests monthly

  • Containerized Lambda with Docker for FFmpeg video processing (2GB memory, 15-minute timeout)
  • Lambda@Edge for global content delivery optimization (sub-100ms response times)
  • Event-driven architectures with S3, SQS, SNS integration
  • Cost optimization achieving 70% reduction through right-sizing and scheduling

API Gateway: RESTful and WebSocket APIs with custom authorizers, request/response transformation

Step Functions: Complex workflow orchestration for multi-step data processing pipelines

Data Services & Analytics

Amazon Athena: Petabyte-scale query optimization, partitioning strategies, cost control

  • Columnar storage with Parquet format reducing query costs by 85%
  • Advanced SQL optimization techniques (window functions, CTEs, complex joins)
  • Automated data cataloging with AWS Glue crawlers
  • Query result caching and lifecycle management
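
As a sketch of the Parquet and partitioning approach listed above, the helper below builds an Athena CTAS statement that rewrites a table as partitioned Parquet. The helper name, table names, bucket, and columns are all hypothetical, and the cost reduction actually obtained depends on the data and the queries.

```python
def build_parquet_ctas(source_table, target_table, s3_location, partition_column, columns):
    """Return an Athena CTAS statement that rewrites a table as partitioned Parquet.

    Identifiers are interpolated directly, so they must come from trusted
    configuration, never from user input.
    """
    # Athena requires partition columns to come last in the SELECT list.
    select_list = ", ".join([*columns, partition_column])
    return (
        f"CREATE TABLE {target_table}\n"
        "WITH (\n"
        "  format = 'PARQUET',\n"
        f"  external_location = '{s3_location}',\n"
        f"  partitioned_by = ARRAY['{partition_column}']\n"
        ")\n"
        f"AS SELECT {select_list}\n"
        f"FROM {source_table}"
    )


# Hypothetical names for illustration only.
query = build_parquet_ctas(
    source_table="raw_logs",
    target_table="logs_parquet",
    s3_location="s3://example-bucket/logs_parquet/",
    partition_column="dt",
    columns=["request_id", "billed_ms"],
)
print(query)
```

Queries that filter on the partition column and select only a few columns then scan less data, which is what Athena bills for.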

S3 Advanced: Multi-region replication, lifecycle policies, event notifications, security controls

CloudWatch: Custom metrics, log insights, automated alerting, cost anomaly detection

Security & Compliance

IAM Expertise: Least privilege access, cross-account roles, SAML federation

  • Custom IAM policies with condition-based access control
  • Service-linked roles and resource-based policies
  • AWS Organizations SCPs for governance at scale
  • Secrets Manager integration for credential rotation

VPC Networking: Multi-AZ architectures, NAT gateways, security groups, NACLs

Encryption: KMS key management, envelope encryption, data-at-rest and in-transit protection

Big Data Engineering & Distributed Systems

Apache Spark Ecosystem

Spark Core: RDD transformations, actions, broadcast variables, accumulators

  • Custom partitioning strategies for optimal data distribution
  • Memory management and garbage collection tuning
  • Dynamic resource allocation and adaptive query execution
  • Catalyst optimizer understanding for query performance

Spark SQL: DataFrame/Dataset APIs, complex window functions, user-defined functions (UDFs)

Spark Streaming: Real-time data processing with micro-batching and structured streaming

MLlib: Distributed machine learning algorithms, feature engineering pipelines

Hadoop Ecosystem Mastery

HDFS: Block replication, rack awareness, federation, high availability

  • Cluster sizing and capacity planning for PB-scale storage
  • Data locality optimization and hotspot mitigation
  • Backup and disaster recovery strategies
  • Performance tuning (block size, replication factor)

Hive: Complex HiveQL queries, partitioning, bucketing, ORC/Parquet optimization

HBase: NoSQL database design, row key optimization, region splitting

Cloudera: Cluster management, security (Kerberos), resource management (YARN)

Stream Processing & Real-time Analytics

Apache Kafka: Topic design, partitioning strategies, consumer group management

  • Kafka Connect for data integration with external systems
  • Schema Registry for Avro/JSON schema evolution
  • Kafka Streams for stream processing applications
  • Monitoring and alerting with JMX metrics

Apache Storm: Topology design, spouts, bolts, guaranteed message processing

Real-time Dashboards: Grafana, Kibana integration with streaming data

Advanced Programming & Software Engineering

Python Mastery

Advanced Python: Metaclasses, decorators, context managers, async/await patterns

  • Multiprocessing and multithreading for CPU/IO-bound tasks
  • Memory profiling and performance optimization
  • Custom data structures and algorithm implementations
  • Package development and distribution (PyPI)

Data Science Stack: NumPy vectorization, Pandas optimization, SciPy statistical functions

Web Frameworks: Flask/FastAPI for REST APIs, SQLAlchemy ORM, Celery for task queues

Scala & Functional Programming

Scala Expertise: Case classes, pattern matching, higher-order functions, implicits

  • Akka actors for concurrent and distributed systems
  • Cats/Scalaz for functional programming abstractions
  • SBT build tool and dependency management
  • Integration with Spark for high-performance data processing

Functional Paradigms: Monads, functors, immutable data structures

Database Systems

SQL Mastery: Complex queries, window functions, CTEs, query optimization

  • Oracle: PL/SQL, partitioning, materialized views, RAC
  • PostgreSQL: Extensions, JSONB, full-text search, replication
  • SQL Server: T-SQL, SSIS, columnstore indexes
  • Performance tuning: indexing strategies, execution plans

NoSQL: Elasticsearch aggregations, Redis data structures, document modeling

Machine Learning & AI Engineering

Deep Learning Frameworks

TensorFlow/Keras: Custom layers, training loops, distributed training, TensorBoard

  • Model optimization: quantization, pruning, knowledge distillation
  • TensorFlow Serving for production model deployment
  • TensorFlow Extended (TFX) for ML pipelines
  • GPU optimization with CUDA and cuDNN

PyTorch: Dynamic computation graphs, custom datasets, distributed training

Model Deployment: ONNX conversion, TensorRT optimization, edge deployment

Computer Vision Excellence

Object Detection: YOLO, R-CNN family, SSD, RetinaNet implementation and optimization

  • Custom dataset creation and annotation workflows
  • Data augmentation strategies for improved generalization
  • Transfer learning and fine-tuning techniques
  • Real-time inference optimization (TensorRT, OpenVINO)

Image Processing: OpenCV advanced techniques, morphological operations, feature extraction

Face Recognition: Multi-model ensemble approaches, embedding optimization

Natural Language Processing

Transformer Models: BERT, GPT, T5 fine-tuning and deployment

  • Attention mechanisms and positional encoding
  • Multi-language model adaptation
  • Sentiment analysis and topic modeling
  • Named entity recognition and relation extraction

Text Processing: NLTK, spaCy, Gensim for advanced text analytics

Speech Processing: ASR systems, audio feature extraction, speaker recognition

DevOps & Infrastructure Engineering

Containerization & Orchestration

Docker Mastery: Multi-stage builds, layer optimization, security scanning

  • Custom base images and distroless containers
  • Docker Compose for local development environments
  • Container registry management (ECR, Docker Hub)
  • Resource limits and health checks

Kubernetes: Pod design, services, ingress, persistent volumes, RBAC

Helm Charts: Template development, dependency management, release management

CI/CD & Automation

GitHub Actions: Workflow automation, matrix builds, custom actions

  • Automated testing pipelines with pytest, coverage reporting
  • Security scanning with Snyk, CodeQL
  • Multi-environment deployments with approval gates
  • Artifact management and versioning strategies

Infrastructure as Code: CloudFormation, Terraform for reproducible deployments

Configuration Management: Ansible playbooks, environment-specific configurations

Monitoring & Observability

Metrics & Alerting: Prometheus, Grafana, custom dashboards, SLA monitoring

  • Application performance monitoring (APM)
  • Log aggregation and analysis (ELK stack)
  • Distributed tracing with Jaeger/Zipkin
  • Incident response and post-mortem analysis

Cost Optimization: Resource utilization analysis, rightsizing recommendations

Education & Certifications

Academic background and professional development

Academic Education

2020

Master 2: Data Science & Artificial Intelligence

Institut Galilée, Université Sorbonne Paris Nord - Paris, France

Advanced specialization in AI, Machine Learning, and Data Science methodologies

2020

Master: Web Intelligence & Data Science

Université Sidi Mohammed Ben Abdellah - Fès, Morocco

Two-year comprehensive training in development, big data processing, data science, and artificial intelligence

2018

Bachelor: Mathematics & Computer Science

Université Sidi Mohammed Ben Abdellah - Fès, Morocco

Fundamental studies in mathematical sciences and computer science

2017

DEUG: Mathematics & Computer Science

Université Sidi Mohammed Ben Abdellah - Fès, Morocco

General university studies diploma in mathematical sciences and computer science

2015

Baccalauréat: Mathematical Sciences - Series B

Lycée EL ADARISSA - Fès, Morocco

High school diploma with specialization in mathematical sciences

Professional Certifications

AWS CloudFront: Serve content from multiple S3 buckets

Coursera, IBM

2024

AWS S3 Basics

Coursera, IBM

2024

Introduction to Big Data with Spark and Hadoop

Coursera, IBM

2024

Deep Learning Specialization

Coursera, DeepLearning.AI

2022

Hadoop Platform and Application Framework

Coursera, UC San Diego

2022

ETL and Data Pipelines with Shell, Airflow and Kafka

Coursera, IBM

2022

Functional Programming Principles in Scala

Coursera, EPFL

2022

Machine Learning with Python - Level 1

IBM Cognitive Class

2019

Data Analysis Track

One Million Arab Coders, Udacity

2019

Languages

  • Arabic: Native
  • French: Fluent
  • English: Professional

Featured Projects & Technical Achievements

Comprehensive portfolio of enterprise-scale solutions

Current Role: RMC BFM ADS (Altice Media) - Advanced Cloud & Data Engineering

Serverless Video Transcoding Platform
Serverless Media Processing

Enterprise Video Transcoding & VAST Generation Platform

Challenge: Process thousands of advertising videos daily with strict broadcast standards (BTVS: 1920×1080, 50fps, -24 LUFS) while ensuring scalability and cost-effectiveness.
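
The broadcast targets above (1920×1080, 50 fps, -24 LUFS) map to FFmpeg options roughly as shown in this sketch. The function name and the codec choices (libx264, AAC) are assumptions for illustration. A single `loudnorm` pass only approximates the loudness target, so a production setup may well use two-pass normalization.

```python
def build_ffmpeg_command(source, target, width=1920, height=1080, fps=50, loudness_lufs=-24):
    """Build an FFmpeg argument list for resolution, frame rate, and loudness targets."""
    return [
        "ffmpeg", "-y",                        # overwrite the output if it exists
        "-i", source,
        "-vf", f"scale={width}:{height}",      # resize to the target resolution
        "-r", str(fps),                        # output frame rate
        "-af", f"loudnorm=I={loudness_lufs}",  # integrated loudness target in LUFS
        "-c:v", "libx264",                     # assumed video codec
        "-c:a", "aac",                         # assumed audio codec
        target,
    ]


command = build_ffmpeg_command("input.mov", "output.mp4")
print(" ".join(command))
```

Inside a container image that ships FFmpeg, the list can be passed to `subprocess.run(command, check=True)`.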

Technical Implementation:

  • Serverless Architecture: Designed containerized AWS Lambda functions using Docker for FFmpeg, OpenCV, and MediaInfo integration
  • Video Processing Pipeline: Automated transcoding with quality validation, duration correction, and format standardization
  • VAST XML Generation: Dynamic creation of Interactive Advertising Bureau (IAB) compliant VAST tags with S3 integration
  • Security & Distribution: Implemented AWS Cognito authentication, Lambda@Edge for edge computing, and CloudFront OAC for content protection
  • Monitoring & Analytics: Real-time event tracking for advertising campaign performance metrics
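
For readers unfamiliar with VAST, the sketch below builds a minimal inline VAST 3.0 document with Python's standard library. The function name, the `AdSystem` value, and the URLs are placeholders. Real tags typically wrap URLs in CDATA and carry more tracking elements than shown here.

```python
import xml.etree.ElementTree as ET


def build_vast(ad_id, title, media_url, duration, impression_url, width=1920, height=1080):
    """Return a minimal inline VAST 3.0 document as a string."""
    vast = ET.Element("VAST", version="3.0")
    ad = ET.SubElement(vast, "Ad", id=ad_id)
    inline = ET.SubElement(ad, "InLine")
    ET.SubElement(inline, "AdSystem").text = "example-ad-server"  # placeholder value
    ET.SubElement(inline, "AdTitle").text = title
    ET.SubElement(inline, "Impression").text = impression_url
    creatives = ET.SubElement(inline, "Creatives")
    creative = ET.SubElement(creatives, "Creative")
    linear = ET.SubElement(creative, "Linear")
    ET.SubElement(linear, "Duration").text = duration  # HH:MM:SS
    media_files = ET.SubElement(linear, "MediaFiles")
    media_file = ET.SubElement(
        media_files, "MediaFile",
        delivery="progressive", type="video/mp4",
        width=str(width), height=str(height),
    )
    media_file.text = media_url
    return ET.tostring(vast, encoding="unicode")


# Placeholder identifiers and URLs.
vast_xml = build_vast(
    ad_id="ad-001",
    title="Demo spot",
    media_url="https://cdn.example.com/spot.mp4",
    duration="00:00:20",
    impression_url="https://track.example.com/impression",
)
print(vast_xml)
```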

Business Impact:

  • Reduced video processing time by 75% through parallel serverless execution
  • Achieved 99.9% uptime with automatic scaling and fault tolerance
  • Decreased operational costs by 60% compared to traditional server-based solutions
  • Enabled real-time advertising content delivery across multiple channels
AWS Lambda, Docker, FFmpeg, OpenCV, CloudFront, API Gateway, S3, Cognito
ETL Migration Architecture
Data Warehouse Migration

Enterprise ETL Architecture Modernization

Challenge: Replace legacy Talend infrastructure with modern Python-based ETL system while maintaining data integrity and improving performance.

Technical Architecture:

  • Modular Design: Developed 14 specialized ETL modules (core, db, helpers, tools) for multi-source integration
  • Data Sources: Oracle, PostgreSQL, Salesforce, AWS Athena with unified connection management
  • Performance Optimization: Eliminated intermediate CSV files and implemented pickle serialization for an 80% speed improvement
  • Error Handling: Robust logging system with structured error tracking and automatic retry mechanisms
  • Incremental Loading: Configurable delta processing with change data capture (CDC) capabilities
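
Two of the ideas above, pickle-based hand-off between stages and automatic retries with logging, can be sketched as follows. This is a simplified illustration under assumed names (`retry`, `save_stage`, `load_stage`, `extract`), not the actual module code, and the simulated failure is artificial.

```python
import logging
import pickle
import tempfile
import time
from functools import wraps
from pathlib import Path

logger = logging.getLogger("etl")


def retry(attempts=3, delay_seconds=0.0):
    """Retry a step on any exception, logging each failure."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    logger.exception("%s failed (attempt %d/%d)",
                                     func.__name__, attempt, attempts)
                    if attempt == attempts:
                        raise  # give up after the last attempt
                    time.sleep(delay_seconds)
        return wrapper
    return decorator


def save_stage(rows, path):
    """Write a stage result as a binary pickle instead of an intermediate CSV."""
    with open(path, "wb") as handle:
        pickle.dump(rows, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_stage(path):
    """Read a stage result back; only use on files this pipeline wrote itself."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


calls = {"count": 0}


@retry(attempts=3)
def extract():
    """Hypothetical extract step that fails once before succeeding."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated transient failure")
    return [{"id": 1, "amount": 12.5}, {"id": 2, "amount": 7.0}]


with tempfile.TemporaryDirectory() as tmp:
    stage_file = Path(tmp) / "extract.pkl"
    save_stage(extract(), stage_file)
    rows = load_stage(stage_file)
print(rows)
```

Pickle preserves Python types without re-parsing text, which is where the gain over CSV comes from. Pickle files should only be loaded from trusted locations, since unpickling can execute arbitrary code.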

Business Results:

  • Eliminated €50K+ annual Talend licensing costs
  • Reduced data processing time from 8 hours to 1.5 hours daily
  • Improved data quality with 99.95% accuracy rate
  • Enhanced maintainability with open-source Python ecosystem
Python 3.9+, Oracle, PostgreSQL, Salesforce, AWS Athena, Pickle, Logging
AI Audio-to-Video System
AI-Powered Media Generation

Intelligent Audio-to-Video Conversion with AI Transcription

Challenge: Automate video content creation from audio files for media campaigns with French subtitle generation and brand integration.

AI & Automation Features:

  • Speech Recognition: AWS Transcribe integration for accurate French language transcription
  • Video Generation: Automated MP4 creation with synchronized audio-visual elements
  • Brand Integration: Dynamic logo overlay and QR code generation for campaign tracking
  • Content Distribution: Automated S3 storage with secure URL generation for multi-platform delivery
  • Quality Assurance: Automated validation of audio-video synchronization and subtitle accuracy

Technical Innovation:

  • Reduced manual video production time by 90%
  • Achieved 95% transcription accuracy for French content
  • Enabled scalable content creation for multiple campaigns simultaneously
  • Integrated campaign tracking through dynamic QR codes
AWS Transcribe, FFmpeg, Python, S3, Lambda, Computer Vision

BPCE-SI - Enterprise Big Data Engineering

Hadoop Data Lake
Big Data Infrastructure

Enterprise Hadoop Data Lake & Cloud Migration

Scope: Managed Big Data pipelines for all Caisse d'Epargne Ile de France divisions, processing terabytes of financial data daily.

Technical Leadership:

  • Cloudera Expertise: Strengthened the team's expertise on the Cloudera platform and optimized Hadoop ecosystem performance
  • ETL Pipeline Development: Automated data ingestion from Oracle databases to Hadoop Data Lake (HDFS, Hive)
  • Spark/Scala Development: Built high-performance data processing applications with PySpark and Scala
  • Cloud Migration: Orchestrated dual-run migration from Cloudera to Google Cloud Platform (BigQuery)
  • Data Governance: Implemented comprehensive data quality policies and monitoring systems

Business Impact:

  • Improved data availability from 85% to 99.5%
  • Reduced processing times by 65% through optimization
  • Decreased infrastructure costs by 40% post-migration
  • Enhanced data quality with automated validation frameworks
Cloudera, Hadoop, Spark, Scala, PySpark, GCP, BigQuery, Oracle

Machine Learning & AI Engineering Portfolio

Object Detection System
Computer Vision

Advanced Object Detection & Brand Recognition System

Objective: Develop state-of-the-art object detection system for brand logos and advertisements across TV, billboards, and print media.

Technical Excellence:

  • Algorithm Comparison: Comprehensive evaluation of YOLO, RetinaNet, Faster R-CNN, Mask R-CNN, and Cascade Mask R-CNN
  • Data Pipeline: End-to-end pipeline from social media scraping to production deployment
  • Data Augmentation: Innovative augmentation strategies specifically designed for object detection challenges
  • Model Training: Custom training pipelines for object detection and instance segmentation
  • Production API: Flask-based REST API for real-time model inference

Performance Achievements:

  • Achieved 94.2% mAP (mean Average Precision) on custom dataset
  • Reduced inference time to 45ms per image on GPU
  • Successfully deployed to production serving 10K+ requests daily
  • Trained team of 5 data annotators for consistent labeling quality
PyTorch, YOLO, Detectron2, OpenCV, Flask, Docker, Selenium
NLP Social Media Analytics
Natural Language Processing

Social Media Intelligence & Sentiment Analysis Platform

Mission: Analyze Moroccan social media trends, detect emerging topics, and perform sentiment analysis on French content at scale.

Advanced NLP Techniques:

  • Topic Modeling: BERT-based topic detection for real-time trend identification
  • Sentiment Analysis: Custom French language sentiment classifier with 91% accuracy
  • Big Data Processing: Real-time processing of 100+ articles from multiple sources daily
  • Feature Engineering: Word2Vec, TF-IDF, and Bag-of-Words implementations for text representation
  • Scalable Architecture: Kubernetes deployment with auto-scaling capabilities
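
As a reminder of what the Bag-of-Words and TF-IDF representations compute, here is a small pure-Python sketch. It uses whitespace tokenization and the plain `tf * log(N / df)` weighting. Library implementations such as scikit-learn apply smoothing and normalization, so their numbers differ. The sample sentences are made up.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)  # the bag-of-words counts for this document
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up French snippets for illustration.
docs = [
    "le match de foot",
    "le budget de la ville",
    "le foot et le budget",
]
vectors = tf_idf(docs)
print(vectors[0])
```

A term that occurs in every document, such as "le" here, gets a weight of zero, while a term unique to one document gets the highest weight.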

Technical Innovation:

  • Processed 50K+ social media posts daily with 95% uptime
  • Achieved 89% accuracy in French sentiment classification
  • Reduced topic detection latency to under 2 seconds
  • Enabled real-time dashboard for trend monitoring
BERT, NLTK, Scikit-learn, Twitter API, Scrapy, PostgreSQL, Kubernetes
Face Recognition System
Biometric AI

Multi-Model Face Recognition & Attribute Analysis System

Innovation: Hybrid framework integrating multiple state-of-the-art face recognition models for comprehensive facial analysis.

Model Integration:

  • Multi-Model Architecture: VGG-Face, Google FaceNet, OpenFace, Facebook DeepFace, DeepID, ArcFace, Dlib, SFace
  • Attribute Analysis: Age estimation, gender classification, emotion recognition, and ethnicity detection
  • Performance Optimization: GPU parallelization and multiprocessing for real-time inference
  • Scalable Deployment: Containerized solution with automated scheduling via cron jobs
  • Data Management: Elasticsearch integration for efficient face embedding storage and retrieval

Performance Metrics:

  • Achieved 99.1% face recognition accuracy on LFW dataset
  • Processed 1000+ faces per minute with GPU acceleration
  • Reduced false positive rate to 0.1% through ensemble methods
  • Deployed across multiple production environments
TensorFlow, Keras, OpenCV, Dlib, GPU Computing, Elasticsearch, Docker

Professional Achievements & Impact Metrics

Quantified results and recognition across my career

Performance & Optimization Achievements

  • 80% ETL Processing Time Reduction: Achieved through Python-based architecture replacing legacy Talend infrastructure
  • €50K+ Annual Cost Savings: Eliminated Talend licensing costs through open-source Python solutions
  • 99.9% System Uptime: Serverless video processing platform with automatic scaling and fault tolerance
  • 10M+ Records Processed Daily: High-volume data processing across multiple enterprise systems
  • 75% Video Processing Speed Improvement: Parallel serverless execution for advertising content processing
  • 99.95% Data Quality Accuracy: Implemented robust validation and error handling mechanisms

Leadership & Team Impact

Technical Team Leadership

  • BPCE-SI: Led the upskilling of the Big Data team on the Cloudera platform and mentored 3 junior engineers
  • IMPERIUM: Managed cross-functional team of 8 members across ML engineering and data science
  • 3W Media: Trained annotation team of 5 specialists for computer vision projects
  • Knowledge Transfer: Conducted 20+ technical workshops on cloud architecture and ML deployment

Project Management Excellence

  • Migration Projects: Successfully led 3 major platform migrations with zero downtime
  • Agile Methodology: Implemented Scrum practices improving delivery speed by 40%
  • Stakeholder Management: Coordinated with C-level executives and technical teams
  • Risk Management: Developed comprehensive disaster recovery and backup strategies

Innovation & Research

  • Patent Applications: 2 pending patents for video processing and AI automation
  • Research Publications: Co-authored paper on "Scalable Video Processing in Cloud Environments"
  • Open Source: Contributed to 5+ open-source projects with 500+ GitHub stars
  • Technical Blogs: Published 15+ technical articles on Medium and LinkedIn

Industry Recognition & Certifications

AWS Expertise

AWS Solutions Architect Associate (In Progress)

Specialized certifications in CloudFront, S3, Big Data with Spark and Hadoop

AWS CloudFront AWS S3 Advanced Big Data Analytics

AI/ML Specialization

Deep Learning Specialization - DeepLearning.AI

Advanced certifications in neural networks, machine learning, and data science

Deep Learning, Neural Networks, Computer Vision

Big Data Engineering

Hadoop Platform Certification - UC San Diego

Expertise in ETL pipelines, Scala programming, and distributed systems

Hadoop Ecosystem, Scala Programming, ETL Pipelines

Business Impact & ROI

2024

RMC BFM ADS - Digital Transformation

  • Reduced operational costs by 60% through serverless architecture adoption
  • Improved content delivery speed by 75% with global CDN optimization
  • Enabled real-time advertising analytics processing 50M+ events daily
  • Achieved 99.9% SLA compliance for critical media processing workflows
2023

BPCE-SI - Big Data Modernization

  • Increased data availability from 85% to 99.5% through platform optimization
  • Reduced infrastructure costs by 40% via cloud migration strategy
  • Improved data processing speed by 65% with Spark optimization
  • Enhanced data quality with automated validation frameworks
2022

IMPERIUM - AI/ML Platform Development

  • Achieved near real-time ML inference with 95% accuracy improvement
  • Reduced model training time by 70% through distributed computing
  • Implemented scalable microservices architecture serving 1M+ requests
  • Developed automated ML pipeline reducing deployment time by 80%
2021

3W Media - Computer Vision Innovation

  • Delivered 94.2% mAP object detection accuracy on production datasets
  • Reduced inference time to 45ms enabling real-time video analysis
  • Processed 10K+ daily requests with 99.8% uptime reliability
  • Established data annotation standards improving model consistency

Get In Touch

Let's work together

Location

Paris, France

Email

m.elkhou@hotmail.com

Phone

+33 6 13 43 51 06