Skip to content

alexpalms/alexpalms

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 

Repository files navigation

Alessandro (Alex) Palmas

Senior AI / ML Research Engineer
Specializing in LLMs / VLMs / VLAs, Reinforcement Learning, and Embodied / Physical AI


About

I’m a senior AI/ML research engineer with 15+ years of experience building intelligent agents and AI systems that understand and reason about the world. My work spans foundation models, reinforcement learning, and physics-based simulation, continuously expanding into VLMs, VLAs, and embodied multimodal AI.

Currently part of the core research and engineering team at LawZero, a non-profit AI lab in Montreal led by Yoshua Bengio, I focus on advancing truthful, transparent, and safe-by-design AI through next-generation foundational models, and reasoning architectures.

My passion is right at the intersection of AI, simulation, and decision-making. I focus on:

  • Foundation Models & Alignment: LLM / VLM / VLA fine-tuning and alignment (SFT, DPO, RLVR/RLHF/RLAIF) with a focus on interpretable, safe, and truthful reasoning.
  • Reinforcement Learning: Online/offline, adversarial, multi-agent, imitation learning, and human-in-the-loop optimization.
  • Embodied AI & High-Fidelity Simulation: Isaac Lab, Genesis, and physics-based virtual environments for grounded RL and world modeling.

What I’m Working On Now

  • Experimenting with large-scale, distributed RL finetuning, using verifiable rewards for code-focused LLMs
  • Building knowledge and hands-on experience on RL applied to foundation models
  • Exploring the landscape around Vision-Language-Action models: SOTA architectures, training playbooks and recipes, benchmarks, data pipelines, and tooling
  • Standardizing approaches and pipelines for DeepRL-based optimal decision making and control in real-world applications

Let's Connect

I’m always happy to chat with people who share my interests and passions. Feel free to reach out!


Featured Projects

Here are a few of my public repositories that reflect my work and interests:

ContaineRL

🐋 ContaineRL - Containerize your RL Environments and Agents

Toolkit to package and deploy reinforcement learning environments and agents inside reproducible containers.

RL-Drone-Swarm

🛡️ RL Drone Swarm Defense

Reinforcement learning framework for decision-level interception prioritization of drone swarms.

SFIII Gym

👊🏼 SFIII Gym

Gymnasium environment for Street Fighter III: 3rd Strike using the MAME emulator.

DIAMBRA Arena

🕹️ DIAMBRA Arena

A platform to train reinforcement learning agents in classic retro fighting games.

DIAMBRA Agents

🤖 DIAMBRA Agents

A library of reinforcement learning algorithms tailored for DIAMBRA Arena environments.

RSL-RL

🔀 Extending GPU-Native RSL-RL Library

Customized fork of RSL-RL library that supports multi-discrete action spaces.


About

The special repo for GitHub

Topics

Resources

Stars

Watchers

Forks

Contributors