Médéric Hurier (Fmind)

Profile

Overview

  • First name: Médéric
  • Last name: HURIER
  • Pseudo: Fmind
  • City: Luxembourg
  • Country: Luxembourg
  • Industry: Technology, Information and Internet
  • Position: Senior MLOps Engineer at Decathlon Technology
  • Education: PhD in Artificial Intelligence and Computer Security from the University of Luxembourg
  • Headline: Freelancer: AI/FM/MLOps Engineer | Data Scientist | MLOps Community Organizer | MLflow Ambassador | Hacker | PhD
  • Note: I'm not available to work on new missions until the 31st of December 2024.

Websites

About

When I worked as a teacher, I told my students that Artificial Intelligence and Machine Learning are the most effective levers to make a difference. Every day, new AI and ML solutions are released to empower companies and individuals alike. The question is: Is your business ready to provide the best AI/ML products for your customers?

I'm a professional Machine Learning Engineer, Data Scientist, and MLOps Engineer ready to assist you in this quest. I've completed a Ph.D. in Machine Learning and several high-end AI/ML certifications to help you build leading data-driven services. My past experiences include working with companies like Google, BNP Paribas, ArcelorMittal, the European Commission, and Decathlon to frame their needs, create state-of-the-art models, and deliver AI/ML artifacts at scale.

I now work as a freelancer in Luxembourg, and I can carry out missions remotely in other European countries. You can get in touch with me on LinkedIn or at contact@fmind.dev. I'll be happy to collaborate with you or discuss your favored AI/ML topics in the MLOps Community.

Experience

Senior MLOps Engineer for Decathlon Technology

  • Title: Senior MLOps Engineer
  • Company: Decathlon Technology
  • Period: September 2022 - Present
  • Location: Luxembourg (Hybrid)
  • Tasks:
    • Design and implement Decathlon's MLOps platform on top of Databricks and AWS cloud systems.
    • Create code templates and documentation to embed best practices into development processes.
    • Lead Decathlon's MLOps Community through discussions, resource sharing, and a maturity matrix.
    • Provide solutions and guidelines to steer the adoption of AI/ML Monitoring/Observability principles.
    • Bring Generative AI capabilities to data scientists and ML engineers through AWS Bedrock/Databricks.
  • Skills: Artificial Intelligence (AI) · Apache Airflow · AWS SageMaker · MLflow · Python · ChatGPT · Git · Docker · Kubernetes · Agile Methodology · AWS Bedrock · Machine Learning · DataBricks · MLOps · Jira · Apache Spark · Terraform

Mentor for Data Scientist and AI/ML Engineer for OpenClassrooms

Senior Data Scientist & Project Manager at Cronos Europa for the European Commission

  • Title: Senior Data Scientist & Project Manager
  • Company: Cronos Europa
  • Customer: European Commission
  • Period: December 2021 - September 2022
  • Location: Luxembourg (Hybrid)
  • Mission: Enhance the ARACHNE risk scoring tool (fraud detection).
  • Main tasks and responsibilities:
    • Develop a new version of Arachne using data mining techniques
    • Manage the development of the Arachne PoC/Project (SCRUM)
    • Assist data scientists in their projects (Virtual Assistant, NLP, …)
  • Skills: Artificial Intelligence (AI) · Machine Learning · MLOps · Python · Deep Learning · Data Science · Big Data · Agile Methodology · Project Management · Functional Programming · Jupyter · Pandas · Docker · Jira · Git · PostgreSQL · AWS SageMaker · Flask · UML · API REST · Terraform · Transformers · Natural Language Processing (NLP) · Data Engineering · Microsoft Azure Machine Learning · Neo4j

Project Manager & Machine Learning Engineer at SFEIR Luxembourg for Decathlon Technology

  • Title: Project Manager & Machine Learning Engineer
  • Company: SFEIR Luxembourg
  • Customer: Decathlon Technology
  • Period: December 2020 - December 2021
  • Location: Luxembourg (Remote)
  • Mission: Design and implement the next ML/MLOps platform on AWS and GCP.
  • Main tasks and responsibilities:
    • Design the functional & technical architecture of the platform
    • Manage the MLOps@Decathlon initiative (tasks, planning)
    • Select the vendor solutions based on a user need analysis
    • Communicate the progress and success to stakeholders
    • Assist data scientists in their projects (audience, forecast)
  • Technical stack:
    • Data Science: Python, TensorFlow, Spark, sklearn, Jupyter, Airflow
    • Management: Google Workspace, Jira, UML, Terraform, Jenkins
    • Environments: AWS (SageMaker), GCP (Vertex AI), DataBricks
  • Skills: Artificial Intelligence (AI) · Machine Learning · MLOps · Python · Deep Learning · Data Science · Big Data · Agile Methodology · Project Management · Functional Programming · Google Cloud Platform (GCP) · Tensorflow · MLflow · Jupyter · Pandas · Docker · Keras · Jira · Git · DataBricks · Apache Airflow · AWS SageMaker · Flask · UML · Terraform · Data Engineering · Vertex AI (GCP) · Apache Spark · Scikit-Learn · Kubernetes

Data Scientist at SFEIR Luxembourg

  • Title: Data Scientist
  • Company: SFEIR Luxembourg
  • Period: October 2020 - November 2020
  • Location: Luxembourg (Remote)
  • Mission: Improve the visibility and assets of SFEIR's Data Team.
  • Main tasks and responsibilities:
    • Design and create technical interviews for recruiting data scientists.
    • Become a Professional Machine Learning Engineer on Google Cloud.
    • Propose a strategy to improve the online visibility of SFEIR's data team.
    • Share knowledge about data trends with non-technical staff members.
    • Create a group to write tutorials and kata on AI/ML for SFEIR developers.
  • Skills: Artificial Intelligence (AI) · Machine Learning · MLOps · Python · Deep Learning · Data Science · Agile Methodology · Functional Programming · Google Cloud Platform (GCP) · Tensorflow · Jupyter · Pandas · Keras · Git · MongoDB · Vertex AI (GCP) · Apache Spark · Scikit-Learn

Data Scientist at SFEIR Luxembourg for ArcelorMittal

  • Title: Data Scientist
  • Company: SFEIR Luxembourg
  • Customer: ArcelorMittal
  • Period: January 2020 - September 2020
  • Location: Luxembourg (Remote)
  • Mission: Train and optimize machine learning models to recommend steel prices.
  • Main tasks and responsibilities:
    • Create and fine-tune machine-learning models (tree-based)
    • Evaluate the performance of the model on real datasets
    • Communicate the results to business stakeholders
  • Technical stack:
    • Data Science: Python, XGBoost, sklearn, Jupyter, SQL
    • Analytics: Matplotlib, Seaborn, Tableau, Plotly, Dash
    • Environment: MS-SQL, Azure Cloud, Jira, Papermill
  • Skills: Artificial Intelligence (AI) · Machine Learning · MLOps · Python · Data Science · Agile Methodology · Functional Programming · Jupyter · Pandas · Jira · Git · Natural Language Processing (NLP) · Scikit-Learn

Research And Development Specialist at the University of Luxembourg

  • Title: Research And Development Specialist
  • Company: University of Luxembourg
  • Period: September 2019 - January 2020
  • Location: Luxembourg
  • Mission: Management and development of Natural Language Understanding (NLU) projects for BGL BNP Paribas.
  • Skills: Artificial Intelligence (AI) · Machine Learning · Python · Data Science · Big Data · Functional Programming · Tensorflow · Jupyter · Pandas · Docker · Git · PostgreSQL · Ansible · Flask · UML · JSON · API REST · Transformers · Natural Language Processing (NLP) · Apache Spark · Scikit-Learn

Doctoral researcher at the University of Luxembourg

  • Title: Doctoral researcher
  • Company: University of Luxembourg
  • Period: September 2015 - January 2020
  • Location: Luxembourg
  • Missions:
    • Research activities focused on Android security and artificial intelligence.
    • Teaching big data, machine learning and Android programming to students.
    • Collaboration with Google, San Francisco on finding malicious Android artifacts.
  • Skills: Artificial Intelligence (AI) · Machine Learning · Python · Deep Learning · Data Science · Statistics · Big Data · Cybersecurity · Functional Programming · Jupyter · Pandas · Docker · Git · NoSQL · MongoDB · PostgreSQL · ElasticSearch · Ansible · Flask · JSON · Android · API REST · Natural Language Processing (NLP) · Data Engineering · Apache Spark · Scikit-Learn

Mentor for Data Scientist for OpenClassrooms

  • Title: Mentor for Data Scientist
  • Customer: OpenClassrooms
  • Period: August 2018 - December 2019
  • Location: France
  • Mission: Tutoring adult students to become data scientists specializing in machine learning.
  • Skills: Artificial Intelligence (AI) · Machine Learning · Python · Data Science · Jupyter · Pandas · Git · Flask · JSON · API REST · Scikit-Learn

Security engineer specialized in log management and analysis at Clearstream

  • Title: Security engineer specialized in log management and analysis
  • Company: Clearstream
  • Period: April 2014 - August 2015
  • Location: Luxembourg
  • Mission: Selection and deployment of a SIEM solution, participating in security incident response.
  • Skills: Python · Big Data · ISO 27001 · Cybersecurity · Jupyter · Pandas · Git · ElasticSearch · Data Engineering

Web developer and administrator

  • Title: Web developer and administrator
  • Company: Freaxmind
  • Period: August 2011 - August 2013
  • Location: France
  • Mission: Various contracts ranging from web development to software maintenance and debugging.
  • Skills: Python · Object-Oriented Programming (OOP) · Git · Ansible · Flask

Web Developer for Toul'embal (internship)

  • Title: Web Developer (intern)
  • Company: Toul'embal
  • Period: June 2012 - August 2012
  • Location: Toul, France
  • Mission: Extension of a Prestashop e-commerce website and creation of a portfolio website with WordPress.
  • Skills: Object-Oriented Programming (OOP)

Web Programmer at Empreinte Studio

  • Title: Web Programmer
  • Company: Empreinte Studio
  • Period: October 2010 - August 2011
  • Location: Épernay, France
  • Mission: Creation of modern websites in PHP and MySQL with professional writers and graphic designers.
  • Skills: Object-Oriented Programming (OOP) · Git

Software Developer for GEOVARIANCES (apprenticeship)

  • Title: Software Developer (apprentice)
  • Company: GEOVARIANCES
  • Period: September 2009 - September 2010
  • Location: Avon, France
  • Mission: Development of a geostatistics application in C++ and Qt with experienced software engineers.
  • Skills: Object-Oriented Programming (OOP) · Git · UML

Web Developer for CV Champagne Nicolas Feuillatte (internship)

  • Title: Web Developer (intern)
  • Company: CV Champagne Nicolas Feuillatte
  • Period: April 2009 - August 2009
  • Location: Épernay, France
  • Mission: Integration of customer and share management modules to J.D. Edwards with PHP and Oracle.
  • Skills: Object-Oriented Programming (OOP)

Education

Doctor of Philosophy (PhD) in computer security and artificial intelligence

  • School: University of Luxembourg
  • Location: Luxembourg
  • Grade: Very Good
  • Period: 2015 - 2019
  • Activities and Societies: Teach Big Data and Android to students.
  • Thesis title: Creating better ground truth to further understand Android malware

Master's degree in computer and information systems security

  • School: UFR Mathématiques, Informatique, Mécanique et Automatique
  • Location: Metz (France)
  • Period: 2013 - 2014

Bachelor and master years in computer science applied to business informatics

  • School: UFR Mathématiques et Informatique de l’Université de Lorraine
  • Location: Nancy (France)
  • Period: 2011 - 2013

Professional bachelor's degree in computer security and databases

  • School: IUT Sénart-Fontainebleau
  • Location: Fontainebleau (France)
  • Period: 2009 - 2010

Professional bachelor’s degree in web development and integration

  • School: IUT Nancy-Charlemagne
  • Location: Nancy (France)
  • Period: 2008 - 2009

Technical degree in network and software development

  • School: Lycée François 1er
  • Location: Vitry-le-François (France)
  • Period: 2006 - 2008

Baccalauréat général degree in science, specialized in biology

  • School: Lycée Marc Chagall
  • Location: Reims (France)
  • Period: 2003 - 2006

Volunteering

MLOps Community Organizer (Luxembourg)

  • Community: MLOps Community
  • Role: Organizer
  • Location: Luxembourg
  • Period: November 2022 - present
  • Field: Science and Technology
  • Missions:
    • Organize regular meetups and events for the MLOps Community.
    • Coordinate and review the content for the MLOps Writer Community.
  • Partners: Amazon Web Services (AWS) and the University of Luxembourg.
  • Link: https://www.meetup.com/luxembourg-mlops-community/

MLflow Ambassador

  • Community: MLflow
  • Role: Ambassador
  • Period: April 2024 - present
  • Field: Science and Technology
  • Mission: Promote the MLflow tools and projects.
  • Link: https://mlflow.org/ambassador

Licenses & Certifications

Machine Learning Associate

  • Issuer: Databricks
  • Issued: Nov 2022
  • Credential ID: 61461287

Databricks Lakehouse Fundamentals

  • Issuer: Databricks
  • Issued: Oct 2022
  • Credential ID: 61029028

Architecting with Google Kubernetes Engine Specialization

  • Issuer: Google
  • Issued: Sep 2022
  • Credential ID: WLU4DBPSQ4B5

Architecting with Google Kubernetes Engine: Foundations

  • Issuer: Google
  • Issued: Sep 2022
  • Credential ID: DFWAC6BXLNGL

Architecting with Google Kubernetes Engine: Production

  • Issuer: Google
  • Issued: Sep 2022
  • Credential ID: K5SZHUST5HP2

Architecting with Google Kubernetes Engine: Workloads

  • Issuer: Google
  • Issued: Sep 2022
  • Credential ID: ULJQAXGDVKYK

Google Cloud Fundamentals: Core Infrastructure

  • Issuer: Google
  • Issued: Sep 2022
  • Credential ID: 4CE8WQ6AWKFF

Iterative Tools for Data Scientists and Analysts

  • Issuer: Iterative
  • Issued: Aug 2022
  • Credential ID: 62fcb79418f51945ea

Azure Data Scientist Associate

  • Issuer: Microsoft
  • Issued: Jul 2022
  • Credential ID: 992564946

Azure Machine Learning for Data Scientists

  • Issuer: Microsoft
  • Issued: Jun 2022
  • Credential ID: MZKV7LSTQ9HX

Build and Operate Machine Learning Solutions with Azure Microsoft

  • Issuer: Microsoft
  • Issued: Jun 2022
  • Credential ID: 7FBX68MH272C

Create Machine Learning Models in Microsoft Azure

  • Issuer: Microsoft
  • Issued: Jun 2022
  • Credential ID: SHALM9PM3MPX

Microsoft Azure Data Scientist Associate - DP-100 Test Prep Specialization

  • Issuer: Microsoft
  • Issued: Jun 2022
  • Credential ID: L5P3TYLAYLLT

Perform data science with Azure Databricks

  • Issuer: Microsoft
  • Issued: Jun 2022
  • Credential ID: RQ7PLFYZVLXX

Prepare for DP-100: Data Science on Microsoft Azure Exam

  • Issuer: Microsoft
  • Issued: Jun 2022
  • Credential ID: K5KW27AVMYS2

Neo4j Graph Data Science Certified

  • Issuer: Neo4j
  • Issued: Apr 2022
  • Credential ID: 17351346

Microsoft Certified: Azure AI Fundamentals

  • Issuer: Microsoft
  • Issued: Jan 2022
  • Credential ID: 1098-0884

Artificial Intelligence on Microsoft Azure

  • Issuer: Microsoft
  • Issued: Dec 2021
  • Credential ID: Z8FSWXBSAGLD

Computer Vision in Microsoft Azure

  • Issuer: Microsoft
  • Issued: Dec 2021
  • Credential ID: KDDPYLKM2DA5

Microsoft Azure AI Fundamentals AI-900 Exam Prep Specialization

  • Issuer: Microsoft
  • Issued: Dec 2021
  • Credential ID: 96944QKZH9BU

Microsoft Azure Machine Learning

  • Issuer: Microsoft
  • Issued: Dec 2021
  • Credential ID: 32ES25845Q55

Natural Language Processing in Microsoft Azure

  • Issuer: Microsoft
  • Issued: Dec 2021
  • Credential ID: XVN23N8CKRGY

Preparing for AI-900: Microsoft Azure AI Fundamentals exam

  • Issuer: Microsoft
  • Issued: Dec 2021
  • Credential ID: YC83C22L8TBL

Build a Website on Google Cloud

  • Issuer: Google
  • Issued: Aug 2021

Build and Secure Networks in Google Cloud

  • Issuer: Google
  • Issued: Aug 2021

Create ML Models with BigQuery ML

  • Issuer: Google
  • Issued: Aug 2021

Create and Manage Cloud Resources

  • Issuer: Google
  • Issued: Aug 2021

Deploy to Kubernetes in Google Cloud

  • Issuer: Google
  • Issued: Aug 2021

Implement DevOps in Google Cloud

  • Issuer: Google
  • Issued: Aug 2021

Insights from Data with BigQuery

  • Issuer: Google
  • Issued: Aug 2021

Integrate with Machine Learning APIs

  • Issuer: Google
  • Issued: Aug 2021

Perform Foundational Infrastructure Tasks in Google Cloud

  • Issuer: Google
  • Issued: Aug 2021

Apache Spark Associate Developer

  • Issuer: Databricks
  • Issued: Jun 2021
  • Credential ID: fff03919-bbc9-304e-99ad-6f2ed47455ed

Scalable Machine Learning with Apache Spark

  • Issuer: Databricks
  • Issued: May 2021
  • Credential ID: 0f4adf96-0412-32f2-8232-fa50c51c9b47

Apache Spark Programming with Databricks

  • Issuer: Databricks
  • Issued: May 2021
  • Credential ID: 518a1d63-8894-3ab5-aaa5-50a9f169436c

Data Science Professional

  • Issuer: Databricks
  • Issued: May 2021
  • Credential ID: f05164e1-5a78-37f8-9c69-3e996fdbb21f

Delta Lake Fundamentals Accreditation

  • Issuer: Databricks
  • Issued: May 2021
  • Credential ID: 0d042e3f-50d3-3821-b064-f3c12ca6c17f

Deploying a Machine Learning Project with MLflow Projects

  • Issuer: Databricks
  • Issued: May 2021
  • Credential ID: 2afa0c7f-48f4-35af-b366-f7c77d2cd20a

Tracking Experiments with MLflow

  • Issuer: Databricks
  • Issued: May 2021
  • Credential ID: 0cbf87b7-e096-3792-a3b7-62d86aa6380d

Unified Data Analytics Accreditation

  • Issuer: Databricks
  • Issued: May 2021
  • Credential ID: afba5402-b5e4-3f9e-95f2-51d6bbb5fa64

ML Pipelines on Google Cloud

  • Issuer: Google
  • Issued: Mar 2021
  • Credential ID: FN5PYWX5PRCP

Introduction to Trading, Machine Learning & GCP

  • Issuer: Google
  • Issued: Nov 2020
  • Credential ID: YV9H5PF4YPLZ

MLOps (Machine Learning Operations) Fundamentals

  • Issuer: Google
  • Issued: Nov 2020
  • Credential ID: 4BDA24UL7K9Z

Machine Learning for Trading Specialization

  • Issuer: Google
  • Issued: Nov 2020
  • Credential ID: YSNPABSMV6JL

Reinforcement Learning for Trading Strategies

  • Issuer: Google
  • Issued: Nov 2020
  • Credential ID: VHKJLFPLLDLU

Using Machine Learning in Trading and Finance

  • Issuer: Google
  • Issued: Nov 2020
  • Credential ID: X5YYLBMPY4BU

DeepLearning.AI TensorFlow Developer Specialization

  • Issuer: DeepLearning.AI
  • Issued: Oct 2020
  • Credential ID: LQ4GHWJ6URBS

Perform Foundational Data, ML, and AI Tasks in Google Cloud

  • Issuer: Google
  • Issued: Oct 2020

Professional Machine Learning Engineer

  • Issuer: Google
  • Issued: Oct 2020
  • Credential ID: 24896478

Sequences, Time Series and Prediction

  • Issuer: Google
  • Issued: Oct 2020
  • Credential ID: WHBV68C4WJT5

Convolutional Neural Networks in TensorFlow

  • Issuer: Google
  • Issued: Sep 2020
  • Credential ID: 78HJEJZ3T2BB

Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning

  • Issuer: Google
  • Issued: Sep 2020
  • Credential ID: SW885ZMDHTYM

Natural Language Processing in TensorFlow

  • Issuer: Google
  • Issued: Sep 2020
  • Credential ID: JZ9TBHXJFLWM

Advanced Machine Learning with TensorFlow on Google Cloud Platform Specialization

  • Issuer: Google
  • Issued: Jul 2020
  • Credential ID: V492QQ4JJKEB

End-to-End Machine Learning with TensorFlow on GCP

  • Issuer: Google
  • Issued: Jul 2020
  • Credential ID: QLDMNADDBSRR

Image Understanding with TensorFlow on GCP

  • Issuer: Google
  • Issued: Jul 2020
  • Credential ID: HY4HSSY8JSPN

Production Machine Learning Systems

  • Issuer: Google
  • Issued: Jul 2020
  • Credential ID: THZZNW22LHKT

Recommendation Systems with TensorFlow on GCP

  • Issuer: Google
  • Issued: Jul 2020
  • Credential ID: 2D4LT28697TC

Sequence Models for Time Series and Natural Language Processing

  • Issuer: Google
  • Issued: Jul 2020
  • Credential ID: 6XUV7YJFM3ZA

Building Batch Data Pipelines on GCP

  • Issuer: Google
  • Issued: May 2020
  • Credential ID: 5QYSK9E5EAFN

Building Resilient Streaming Analytics Systems on GCP

  • Issuer: Google
  • Issued: May 2020
  • Credential ID: FYQW7D4F6PD4

Data Engineering with Google Cloud Specialization

  • Issuer: Google
  • Issued: May 2020
  • Credential ID: EPZ3WQFC423E

Modernizing Data Lakes and Data Warehouses with GCP

  • Issuer: Google
  • Issued: May 2020
  • Credential ID: 393P3HLZWY8H

Smart Analytics, Machine Learning, and AI on GCP

  • Issuer: Google
  • Issued: May 2020
  • Credential ID: AK77VUVN4ARJ

Google Cloud Platform Big Data and Machine Learning Fundamentals

  • Issuer: Google
  • Issued: Apr 2020
  • Credential ID: 2Q35NYHYMW5E

Devenez Mentor Evaluateur

  • Issuer: OpenClassrooms
  • Issued: Feb 2019
  • Credential ID: 8151214336

Advanced AI: Deep Reinforcement Learning in Python

  • Issuer: Udemy
  • Issued: Aug 2018
  • Credential ID: UC-5FM0CC9S

Artificial Intelligence: Reinforcement Learning in Python

  • Issuer: Udemy
  • Issued: Jul 2018
  • Credential ID: UC-XALJEH7G

Concevez un site avec Flask

  • Issuer: OpenClassrooms
  • Issued: Jul 2018
  • Credential ID: 5343531703

Les étapes de la vie du Mentor

  • Issuer: OpenClassrooms
  • Issued: Jul 2018
  • Credential ID: 8431716200

Devenez Mentor chez OpenClassrooms

  • Issuer: OpenClassrooms
  • Issued: May 2018
  • Credential ID: 6193593386

Complete Guide to ElasticSearch

  • Issuer: Udemy
  • Issued: Mar 2018
  • Credential ID: UC-H5AJQVA3

Introduction to Hadoop

  • Issuer: The Linux Foundation
  • Issued: Oct 2017
  • Credential ID: ad676a8fe7994edea33516b80b540971

Artificial Intelligence Nanodegree

  • Issuer: Udacity
  • Issued: Sep 2017
  • Credential ID: PV7A7EAA

High Performance Computing

  • Issuer: University of Luxembourg
  • Issued: Feb 2017

Machine Learning

  • Issuer: Stanford University
  • Issued: Sep 2015
  • Grade: 97%

TOEIC

  • Skills: Listening, Reading
  • Issued: Jan 2014
  • Score: 975/990

Publications

Make your MLOps code base SOLID with Pydantic and Python’s ABC

MLOps projects are straightforward to initiate, but challenging to perfect. While AI/ML projects often start with a notebook for prototyping, deploying them directly in production is often considered poor practice by the MLOps community. Transitioning to a dedicated Python code base is essential for industrializing the project, yet this move presents several challenges: 1) How can we maintain a code base that is robust yet flexible for agile development? 2) Is it feasible to implement proven design patterns while keeping the code base accessible to all developers? 3) How can we leverage Python’s dynamic nature while adopting strong typing practices akin to static languages?

Throughout my career, I have thoroughly explored various strategies to make my code base both simple and powerful. In 2009, I had the opportunity to collaborate with seasoned developers and enthusiasts of design patterns in object-oriented languages such as C++ and Java. By 2015, I had devoted hundreds of hours to mastering functional programming paradigms with languages like Clojure (LISP) and Haskell. This journey led me to discover both modern and time-tested practices, which I have applied to my AI/ML projects. I am eager to share these practices and reveal the most effective solutions I’ve encountered.

In this article, I propose a method to develop high-quality MLOps projects using Python's ABC and Pydantic. I begin by emphasizing the importance of implementing SOLID software practices in AI/ML codebases. Next, I offer some background on design patterns and the SOLID principles. Then, I recount my experiences with various code architectures and their limitations. Finally, I explain how Python's ABC and Pydantic can enhance the quality of your Python code and facilitate the adoption of sound coding practices.

Maîtrise du monitoring des modèles IA : bonnes pratiques et solutions (Mastering AI Model Monitoring: Best Practices and Solutions)

Monitoring machine learning models is essential to validate their performance and behavior in production. Many solutions exist on the market, each with its own advantages and drawbacks. In this presentation, we review the important criteria for selecting a monitoring solution suited to your environment. Next, we share feedback on the technologies and tools we tested. Finally, we present the architecture choices made for implementing model monitoring at Decathlon.

Become the Maestro of your MLOps Abstractions🤔

In this article, I aim to delineate a roadmap for constructing robust MLOps platforms and projects. Initially, I will underscore the importance of devising and mastering your own MLOps abstractions. Following this, I will outline key design patterns essential for forging simple yet potent abstractions for your projects. Lastly, I will delve into real-world case studies, illustrating the critical role of abstractions in the success of various projects.

How to configure VS Code for AI, ML and MLOps development in Python 🛠️️

In this article, I outline the steps for configuring VS Code for data scientists and machine learning engineers. I start by listing extensions that augment your programming environment. Then, I share some settings and keybindings to enhance your development experience. Finally, I provide tips and tricks to boost your coding efficiency with VS Code.

Is AI/ML Monitoring just Data Engineering? 🤔

While the future of machine learning and MLOps is being debated, practitioners still need to attend to their machine learning models in production. This is no easy task, as ML engineers must constantly assess the quality of the data that enters and exits their pipelines, and ensure that their models generate the correct predictions. To assist ML engineers with this challenge, several AI/ML monitoring solutions have been developed.

In this article, I will discuss the nature of AI/ML monitoring and how it relates to data engineering. First, I will present the similarities between AI/ML monitoring and data engineering. Second, I will enumerate additional features that AI/ML monitoring solutions can provide. Third, I will briefly touch on the topic of AI/ML observability and its relation to AI/ML monitoring. Finally, I will provide my conclusion about the field of AI/ML monitoring and how it should be considered to ensure the success of your AI/ML project.

A great MLOps project should start with a good Python Package 🐍

In this article, I present the implementation of a Python package on GitHub designed to support MLOps initiatives. The goal of this package is to make the coding workflow of data scientists and ML engineers as flexible, robust, and productive as possible. First, I start by motivating the use of Python packages. Then, I provide some tools and tips you can include in your MLOps project. Finally, I explain the follow-up steps required to take this package to the next level and make it work in your environment.

Fixing the MLOps Survey on LLMs with ChatGPT API: Lessons Learned

Large Language Models (LLMs) are such an exciting topic. Since the release of ChatGPT, we have seen a surge of innovation ranging from education mentorship to finance advisory. Each week is a new opportunity for addressing new kinds of problems, increasing human productivity, or improving existing solutions. Yet, we may wonder if this is just a new hype cycle or if organizations are truly adopting LLMs at scale …

In March 2023, the MLOps Community issued a survey about LLMs in production to picture the state of adoption. The survey is full of interesting insights, but there is a catch: 80% of the questions are open-ended, which means respondents answered freely, with anything from a few keywords to full sentences. I volunteered to clean up the answers with the help of ChatGPT and let the community get a grasp of the survey experiences.

In this article, I present the steps and lessons learned from my journey to shed some light on the MLOps survey on LLMs. I’m first going to present the goal and questions of the survey. Then, I will explain how I used ChatGPT to review the data and standardize the content. Finally, I’m going to evaluate the performance of ChatGPT compared to a manual review.

Kubeflow: The Machine Learning Toolkit for Kubernetes

MLflow: An open source platform for the machine learning lifecycle

We need POSIX for MLOps

If you work on MLOps, you must navigate an ever-growing landscape of tools and solutions. This is both an intense source of stimulation and fatigue for MLOps practitioners. Vendors and users face the same problem: How can we combine all these tools without the combinatorial complexity of creating custom integrations?

In this article, I propose a solution analogous to POSIX to address this challenge. First, I motivate the creation of common protocols and schemas for combining MLOps tools. Second, I present a high-level architecture to support implementation. Third, I conclude with the benefits and limitations of standardizing MLOps.

How to install Kubeflow Pipelines v2 on Apple Silicon

Kubeflow Pipelines (KFP) is a powerful platform for building machine learning pipelines at scale with Kubernetes. The platform is well supported on major cloud platforms such as GCP (Vertex AI Pipelines) or AWS (Kubeflow on AWS). However, installing KFP on Apple Silicon (macOS 12.5.1 with Apple M1 Pro) proved to be more challenging than I imagined. Thus, I wanted to share my experience and tips to install KFP as easily as possible on your shiny Mac.

In this article, I present 4 steps to install Kubeflow on Apple Silicon, using Rancher Desktop for setting up Docker/Kubernetes. In the end, I list the problems I encountered during the installation of Kubeflow Pipelines.

The Programming Trade-Off: Purpose, Productivity, Performance

As programmers, we are continuously looking for languages that are performant, productive, and general purpose. Is there any programming language that currently satisfies these properties? Can we ever create one?

In this article, I present a fundamental trade-off that affects the design of programming languages and the success of software projects.

Creating better ground truth to further understand Android malware: A large scale mining approach based on antivirus labels and malicious artifacts

Mobile applications are essential for interacting with technology and other people. With more than 2 billion devices deployed all over the world, Android offers a thriving ecosystem by making accessible the work of thousands of developers on digital marketplaces such as Google Play. Nevertheless, the success of Android also exposes millions of users to malware authors who seek to siphon private information and hijack mobile devices for their own benefit.

To fight against the proliferation of Android malware, the security community embraced machine learning, a branch of artificial intelligence that powers a new generation of detection systems. Machine learning algorithms, however, require a substantial number of qualified samples to learn the classification rules enforced by security experts. Unfortunately, malware ground truths are notoriously hard to construct due to the inherent complexity of Android applications and the global lack of public information about malware. In a context where both information and human resources are limited, the security community needs new approaches to help practitioners accurately define Android malware, automate classification decisions, and improve the comprehension of Android malware.

This dissertation proposes three solutions to assist with the creation of malware ground truths.

Euphony: Harmonious Unification of Cacophonous Anti-Virus Vendor Labels for Android Malware

Android malware is now pervasive and evolving rapidly. Thousands of malware samples are discovered every day with new models of attacks. The growth of these threats has come hand in hand with the proliferation of collective repositories sharing the latest specimens. Having access to a large number of samples opens new research directions aiming at efficiently vetting apps. However, automatically inferring a reference ground-truth from those repositories is not straightforward and can inadvertently lead to unforeseen misconceptions. On the one hand, samples are often mislabeled as different parties use distinct naming schemes for the same sample. On the other hand, samples are frequently misclassified due to conceptual errors made during labeling processes.

In this paper, we analyze the associations between all labels given by different vendors and we propose a system called EUPHONY to systematically unify common samples into family groups. The key novelty of our approach is that no prior knowledge of malware families is needed. We evaluate our approach using reference datasets and more than 0.4 million additional samples outside of these datasets. Results show that EUPHONY provides competitive performance against the state-of-the-art.

On the Lack of Consensus in Anti-Virus Decisions: Metrics and Insights on Building Ground Truths of Android Malware

There is generally a lack of consensus in Antivirus (AV) engines' decisions on a given sample. This challenges the building of authoritative ground-truth datasets. Instead, researchers and practitioners may rely on unvalidated approaches to build their ground truth, e.g., by considering decisions from a selected set of Antivirus vendors or by setting up a threshold number of positive detections before classifying a sample. Both approaches are biased as they implicitly either decide on ranking AV products, or they consider that all AV decisions have equal weights. In this paper, we extensively investigate the lack of agreement among AV engines.

To that end, we propose a set of metrics that quantitatively describe the different dimensions of this lack of consensus. We show how our metrics can bring important insights by using the detection results of 66 AV products on 2 million Android apps as a case study. Our analysis focuses not only on AVs' binary decisions but also on the notoriously hard problem of the labels that AVs associate with suspicious files, and it highlights biases hidden in the collection of a malware ground truth, a foundation stone of any machine learning-based malware detection approach.

Projects

Keep It Simple Scribe

MLOps Python Package

Fixing the MLOps Survey with ChatGPT

Kubeflow Demo

MLflow Demo

onet

  • Date: August 2020 - September 2020
  • Description: Train and predict procedures of DNN for binary image classification
  • Link: https://github.com/fmind/onet

fincrawl

invest

parsoc

Bigdata Tutorials

STASE: A set of statistical metrics to better understand and qualify malware datasets

  • Date: April 2016 - July 2019
  • Description: A handful of statistical metrics to better understand and qualify malware datasets
  • Link: https://github.com/fmind/STASE

apkworkers

servalx

Euphony: Harmonious Unification of Cacophonous Anti-Virus Vendor Labels for Android Malware

  • Date: March 2017 - March 2019
  • Description: Harmonious Unification of Cacophonous Anti-Virus Vendor Labels for Android Malware
  • Link: https://github.com/fmind/euphony

Automatic Speech Recognition with TensorFlow

Dog Recognition with TensorFlow

genius

Alexa History Skill

Air Cargo Planning System

Sign Language Recognition System

AI Agent for the Isolation Game

Sudoku Solver

lkml

Master 2 School Projects

chattail

Master 1 School Projects

Bachelor School Projects

Professional Bachelor School Project

Skills

Artificial Intelligence / Machine Learning

  • Artificial Intelligence (AI)
  • Machine Learning
  • MLOps
  • Deep Learning
  • Data Science
  • Statistics
  • Scikit-Learn
  • TensorFlow
  • Kubeflow
  • MLflow
  • Jupyter
  • Pandas
  • Keras
  • Transformers
  • Natural Language Processing (NLP)

Generative AI

  • Retrieval-Augmented Generation (RAG)
  • Large Language Model (LLM)
  • AWS Bedrock
  • ChatGPT

Software Engineering

  • Python
  • Functional Programming (FP)
  • Object-Oriented Programming (OOP)
  • REST API
  • Android
  • Docker
  • Flask
  • JSON
  • Git

Cloud Platforms

  • AWS SageMaker
  • Vertex AI (GCP)
  • Google Cloud Platform (GCP)
  • Microsoft Azure Machine Learning
  • Apache Airflow
  • Kubernetes
  • Databricks
  • Terraform
  • Ansible
  • Linux

Computer Security

  • ISO 27001
  • Cybersecurity

Data Management

  • Neo4j
  • NoSQL
  • MongoDB
  • Big Data
  • PostgreSQL
  • Apache Spark
  • Elasticsearch
  • Data Engineering

Project Management

  • Project Management
  • Agile Methodology
  • Jira
  • UML

Languages

French

  • Proficiency: Native or bilingual proficiency

English

  • Proficiency: Full professional proficiency