Hello, I'm
Expert Data Engineer & Cloud Architect with 6+ years of experience in designing and implementing enterprise-scale data solutions. Currently at RMC BFM ADS (Altice Media) in Paris, specializing in AWS Cloud Solutions, Big Data Processing, ETL Pipelines, and Data Warehouse Architecture.
Get to know me better
I am a passionate Data Engineer and Cloud Architect with over 6 years of experience in designing and implementing enterprise-scale data solutions.
I currently work as a Data Engineer / Cloud Engineer (AWS) at RMC BFM ADS (Altice Media) in Paris, France. I specialize in building scalable data infrastructure, optimizing ETL processes, and implementing cloud-native solutions that process millions of data points daily for media and advertising analytics.
Previously, I served as a Big Data Engineer at BPCE-SI, where I architected and maintained Hadoop-based data lakes, developed Spark/Scala applications, and led the migration from Cloudera to Google Cloud Platform.
My expertise spans Big Data Engineering, Cloud Architecture, ETL Pipeline Development, Data Warehouse Management, and Machine Learning Engineering, with hands-on experience in Computer Vision, NLP, and ASR systems.
My career journey
Context: Design, development and management of Big Data pipelines for all Caisse d'Epargne Ile de France divisions.
Results: Significantly improved data availability, shorter processing times, higher data quality, and lower infrastructure costs.
Context: Production deployment of AI solutions in the multimedia domain, handling unstructured and semi-structured data with resource management and automation.
Results: Achieved near real-time predictions despite limited resources by combining multiprocessing with continuous batch processing.
My technical expertise
Athena, S3, Lambda, API Gateway, EC2, ECR, Route 53, Glue, IAM, CloudFront, Cognito, CloudWatch, Transcribe, Textract (95%)
Spark, Hadoop (HDFS, Hive, HBase, Pig, Storm), Cloudera (92%)
BigQuery, Cloud Storage, Dataflow, Pub/Sub (88%)
Talend, Airflow, VTOM, Custom Python ETL (96%)
PySpark, NumPy, Pandas, SciPy, Matplotlib, Seaborn (95%)
Scikit-Learn, TensorFlow, Keras, PyTorch (92%)
NLTK, spaCy, Gensim, BERT (88%)
Flask, SQLAlchemy, Requests, Selenium, boto3 (90%)
Oracle, PostgreSQL, MySQL, SQL Server, Salesforce (95%)
Redis, Elasticsearch, Solr (85%)
Docker, Kubernetes, VMware (90%)
Grafana, Kibana, Power BI (85%)
Git, GitHub, Bitbucket, JIRA (92%)
Linux (Debian, CentOS), Windows Server, Unix (88%)
Strong analytical mindset with problem-solving capabilities
Experience leading technical teams and mentoring junior developers
Excellent communication skills and client relationship management
Quick adaptation to new technologies and methodologies
Agile/Scrum methodology expertise with strong organizational skills
Responsive approach with an excellent sense of priorities
Deep dive into my technical mastery and professional accomplishments
Lambda Functions: Architected 25+ production Lambda functions processing 10M+ requests monthly
API Gateway: RESTful and WebSocket APIs with custom authorizers, request/response transformation
Step Functions: Complex workflow orchestration for multi-step data processing pipelines
Amazon Athena: Petabyte-scale query optimization, partitioning strategies, cost control
S3 Advanced: Multi-region replication, lifecycle policies, event notifications, security controls
CloudWatch: Custom metrics, log insights, automated alerting, cost anomaly detection
IAM Expertise: Least privilege access, cross-account roles, SAML federation
VPC Networking: Multi-AZ architectures, NAT gateways, security groups, NACLs
Encryption: KMS key management, envelope encryption, data-at-rest and in-transit protection
Spark Core: RDD transformations, actions, broadcast variables, accumulators
Spark SQL: DataFrame/Dataset APIs, complex window functions, user-defined functions (UDFs)
Spark Streaming: Real-time data processing with micro-batching and Structured Streaming
MLlib: Distributed machine learning algorithms, feature engineering pipelines
HDFS: Block replication, rack awareness, federation, high availability
Hive: Complex HiveQL queries, partitioning, bucketing, ORC/Parquet optimization
HBase: NoSQL database design, row key optimization, region splitting
Cloudera: Cluster management, security (Kerberos), resource management (YARN)
Apache Kafka: Topic design, partitioning strategies, consumer group management
Apache Storm: Topology design, spouts, bolts, guaranteed message processing
Real-time Dashboards: Grafana, Kibana integration with streaming data
Advanced Python: Metaclasses, decorators, context managers, async/await patterns
Data Science Stack: NumPy vectorization, Pandas optimization, SciPy statistical functions
Web Frameworks: Flask/FastAPI for REST APIs, SQLAlchemy ORM, Celery for task queues
Scala Expertise: Case classes, pattern matching, higher-order functions, implicits
Functional Paradigms: Monads, functors, immutable data structures
SQL Mastery: Complex queries, window functions, CTEs, query optimization
NoSQL: Elasticsearch aggregations, Redis data structures, document modeling
TensorFlow/Keras: Custom layers, training loops, distributed training, TensorBoard
PyTorch: Dynamic computation graphs, custom datasets, distributed training
Model Deployment: ONNX conversion, TensorRT optimization, edge deployment
Object Detection: YOLO, R-CNN family, SSD, RetinaNet implementation and optimization
Image Processing: OpenCV advanced techniques, morphological operations, feature extraction
Face Recognition: Multi-model ensemble approaches, embedding optimization
Transformer Models: BERT, GPT, T5 fine-tuning and deployment
Text Processing: NLTK, spaCy, Gensim for advanced text analytics
Speech Processing: ASR systems, audio feature extraction, speaker recognition
Docker Mastery: Multi-stage builds, layer optimization, security scanning
Kubernetes: Pod design, services, ingress, persistent volumes, RBAC
Helm Charts: Template development, dependency management, release management
GitHub Actions: Workflow automation, matrix builds, custom actions
Infrastructure as Code: CloudFormation, Terraform for reproducible deployments
Configuration Management: Ansible playbooks, environment-specific configurations
Metrics & Alerting: Prometheus, Grafana, custom dashboards, SLA monitoring
Cost Optimization: Resource utilization analysis, rightsizing recommendations
Academic background and professional development
Institut Galilée, Université Sorbonne Paris Nord - Paris, France
Advanced specialization in AI, Machine Learning, and Data Science methodologies
Université Sidi Mohammed Ben Abdellah - Fès, Morocco
Two-year comprehensive training in development, big data processing, data science, and artificial intelligence
Université Sidi Mohammed Ben Abdellah - Fès, Morocco
Fundamental studies in mathematical sciences and computer science
Université Sidi Mohammed Ben Abdellah - Fès, Morocco
General university studies diploma in mathematical sciences and computer science
Lycée EL ADARISSA - Fès, Morocco
High school diploma with specialization in mathematical sciences
Coursera, IBM
2024
Coursera, IBM
2024
Coursera, IBM
2024
Coursera, DeepLearning.AI
2022
Coursera, UC San Diego
2022
Coursera, IBM
2022
Coursera, EPFL
2022
IBM Cognitive Class
2019
One Million Arab Coders, Udacity
2019
Comprehensive portfolio of enterprise-scale solutions
Challenge: Process thousands of advertising videos daily to strict broadcast standards (BTVS: 1920×1080, 50 fps, -24 LUFS) while ensuring scalability and cost-effectiveness.
Challenge: Replace the legacy Talend infrastructure with a modern Python-based ETL system while maintaining data integrity and improving performance.
Challenge: Automate video content creation from audio files for media campaigns with French subtitle generation and brand integration.
Scope: Managed Big Data pipelines for all Caisse d'Epargne Ile de France divisions, processing terabytes of financial data daily.
Objective: Develop a state-of-the-art object detection system for brand logos and advertisements across TV, billboards, and print media.
Mission: Analyze Moroccan social media trends, detect emerging topics, and perform sentiment analysis on French content at scale.
Innovation: Hybrid framework integrating multiple state-of-the-art face recognition models for comprehensive facial analysis.
Quantified results and recognition across my career
AWS Solutions Architect Associate (In Progress)
Specialized certifications in CloudFront, S3, Big Data with Spark and Hadoop
Deep Learning Specialization - DeepLearning.AI
Advanced certifications in neural networks, machine learning, and data science
Hadoop Platform Certification - UC San Diego
Expertise in ETL pipelines, Scala programming, and distributed systems
Let's work together
Paris, France
m.elkhou@hotmail.com
+33 (0)6 13 43 51 06