
Hi there 👋

I'm an ML Research Engineer* at Mila, the Quebec Artificial Intelligence Institute. I have a Master's in Machine Learning from Université de Montréal / Mila, where I wrote my thesis on Continual Learning under the supervision of Irina Rish. Before that, I did a Bachelor's in Computer Engineering at McGill, during which I completed four Software Engineering internships. You can see my list of publications here.

My job is to help make AI researchers more productive, as well as to help them use compute resources (GPUs) efficiently. I do this in several ways:

  • I build software tools and libraries that optimize different parts of the ML research workflow. For example: a Research Project Template, simpleparsing, milatools, Sequoia, torch-jax-interop, tensor_regression, and many more.
    • milatools is actively used by over 250 researchers at Mila!
  • I prepare and present interactive tutorials to teach good research and software engineering practices to researchers. For example, I give tutorials on writing GPU-friendly training scripts, debugging distributed training jobs, profiling GPU jobs and interpreting profiler traces, writing clean code, proper testing, and more.
  • I created and hold the IDT Office Hours at Mila, where researchers walk in with their laptops and I help them optimize their code, sort out issues, or scale their training jobs up to multiple GPUs or nodes. This has led me to meet a significant portion of the 1000+ researchers here at Mila, and to gain experience optimizing, scaling, debugging, and profiling a very wide range of ML workflows.

My research experience is in Continual Learning, Self-Supervised Learning, and more recently Reinforcement Learning. I'm quite experienced in large-scale distributed training, as well as massively parallel RL training in Jax. For example, I:

  • Launched LLM pre-training jobs using up to 100 nodes and 400 A100 GPUs while monitoring the scaling behaviour, performance bottlenecks and resource utilization. (Wandb report link here)
  • Scaled up RL training (the PQN algorithm) to up to ~200,000 parallel environments, using 16 nodes and 64 H100 GPUs. (Project link)

I might eventually do a PhD in Reinforcement Learning to get closer to research in RL and robotics, unless I first find a research engineering role in that space where I can have an impact on very interesting problems.

On a more personal note, here are some other things I enjoy:

  • Sharing something neat and wonderful with others (cute bits of code, new songs, food, movies, etc.)
  • Fixing/repairing broken things in an elegant way
  • Philosophical discussions
  • Getting destroyed in code reviews (no, really, I love it!)
  • Chess / Videogames / Videogame development (playing around in Unity)
  • Classical / Electronic music

*: My official job title is "Software Developer in Machine Learning", since the Ordre des ingénieurs du Québec regulates the use of the title "engineer" in Quebec.
