Portfolio

Silent Sabotage: Backdooring Code-Executing LLM Agents

Investigated backdoor vulnerabilities specific to CodeAct LLM agents, demonstrating that fine-tuning poisoning attacks are highly effective even with minimal poisoned data, and highlighting critical security risks in autonomous systems.

NLP for Patent Search & Generation at DeepIP (Kili Technology)

Developed and evaluated patent similarity search using embeddings and LLMs. Specialized LLMs for patent generation through fine-tuning experiments and advanced instruction design, including chain-of-thought (CoT) prompting. Integrated style transfer through architectural refactoring.

Data Processing Platform Development at Lingua Custodia

Developed core Python microservice components for the Datomatic data platform. Created NLP tools, a web scraping framework, and a client library, enhancing financial translation data pipelines.

Ethical AI Recommendations: Benchmarking LLM Bias in Cold-Start Scenarios

Developed and applied a novel benchmark to evaluate ethical biases (gender, nationality, etc.) in LLM-based recommender systems, especially for new users (cold-start), revealing significant stereotype replication and providing tools for fairer AI.

Advanced Machine Learning Toolkit

A deep dive into supervised, unsupervised, randomized optimization, and reinforcement learning algorithms using Scikit-learn, Matplotlib, Gymnasium, and custom libraries.

ModernBERT for Patents: Faster Insights, Smarter Classification

Applied ModernBERT to complex patent classification, demonstrating >2x faster inference than traditional BERT with state-of-the-art accuracy using a hierarchical loss. Introduced USPTO-3M, a large public dataset of 3 million patents.

Deep Learning Mastery: From Foundations to Advanced Generative Models with PyTorch

Implemented, trained, and evaluated diverse deep learning models (MLPs, CNNs, Transformers, GANs, Diffusion Models) using PyTorch and NumPy for tasks like image classification/generation, sequence modeling, and robotic control.

Gauthier Roy
