
Hi there 👋

I'm an ML Research Engineer* at Mila, the Quebec Artificial Intelligence Institute. I have a Master's in Machine Learning from Université de Montréal / Mila, where I wrote my thesis on Continual Learning under the supervision of Irina Rish. Before that, I did a Bachelor's in Computer Engineering at McGill, during which I completed four Software Engineering internships. You can see my list of publications here.

My job is to help make AI researchers more productive, as well as to help them use compute resources (GPUs) efficiently. I do this in several ways:

  • I build software tools and libraries that optimize different parts of the ML research workflow. For example: a Research Project Template, simpleparsing, milatools, Sequoia, torch-jax-interop, tensor_regression, and many more.
    • milatools is actively used by over 250 researchers at Mila!
  • I prepare and present interactive tutorials to teach good research and software engineering practices to researchers. For example, I give tutorials on writing GPU-friendly training scripts, debugging distributed training jobs, profiling GPU jobs and interpreting profiler traces, writing clean code, proper testing, and more.
  • I created and hold the IDT Office Hours at Mila, where researchers walk in with their laptops and I help them optimize their code, sort out issues, or scale their training jobs up to multiple GPUs or nodes. This has led me to meet a significant portion of the 1000+ researchers here at Mila, and to gain experience optimizing, scaling, debugging, and profiling a very wide range of ML workflows.

My research experience is in Continual Learning, Self-Supervised Learning, and more recently Reinforcement Learning. I'm quite experienced in large-scale distributed training, as well as massively parallel RL training in Jax. For example, I:

  • Launched LLM pre-training jobs using up to 100 nodes and 400 A100 GPUs while monitoring the scaling behaviour, performance bottlenecks and resource utilization. (Wandb report link here)
  • Scaled up RL training (the PQN algorithm) to up to ~200,000 parallel environments, using 16 nodes and 64 H100 GPUs. (Project link)

I might eventually do a PhD in Reinforcement Learning to get closer to research in RL and robotics, unless I first find a research engineering role in that space where I can have an impact on very interesting problems.

On a more personal note, here are some other things I enjoy:

  • Sharing something neat and wonderful with others (cute bits of code, new songs, food, movies, etc.)
  • Fixing/repairing broken things in an elegant way
  • Philosophical discussions
  • Getting destroyed in code reviews (no, really, I love it!)
  • Chess / Videogames / Videogame development (playing around in Unity)
  • Classical / Electronic music

*: My official job title is "Software Developer in Machine Learning", since the Ordre des ingénieurs du Québec regulates the use of the title "engineer" in Quebec.
