Senior ML Researcher/Engineer (World Models & RL) for the Delivery Robot Team

Yandex delivery robots are not just bold R&D but a real, functioning business. Our robots deliver thousands of orders daily, navigating a complex, unstructured urban environment. We are growing fast and plan to scale the fleet to 20,000 robots by 2028.

We are now transitioning from a classic modular pipeline (perception, prediction, and planning modules tightly coupled to HD maps) to a full-fledged End-to-End (E2E) architecture based on World Models.

Our goal is to build strong Embodied AI. For an RL agent to plan complex maneuvers reliably in the real world, it needs a deep understanding of physics, causality, and object permanence. Training a policy directly from raw pixels is extremely sample-inefficient. We are therefore building a system in which a 3D/video tokenizer compresses the world into a latent representation and a large-scale World Model learns to predict its latent dynamics. Within this generated simulation, we will train our planning policy using RL.

We are looking for a Senior ML Researcher/Engineer to join the core World Model (WM) + E2E team and focus on building a fast, interactive world model and training model-based RL (MBRL) agents at scale. Your research ideas will guide thousands of physical agents on city streets every day. If you are ready to solve fundamental robotics problems at the intersection of generative video models and RL, join us!

What tasks await you

Development and Scaling of World Models: You will design and train massive 3D/video tokenizers and backbones based on Diffusion Transformers (DiT), Flow Matching, and similar approaches. The goal is to accurately predict how the physical world evolves in latent space in response to the agent's actions.

Distributed Training: You will build pipelines for distributed training of heavy foundation models on our computing cluster. You will work with data, tensor, and pipeline parallelism, orchestrate multi-node training, and squeeze the absolute maximum out of the hardware.

Model-Based RL (MBRL) & Planning: You will train pure RL and imitation learning (IL) + RL policies within the frozen latent simulation of the World Model, using dense self-supervised representations to train a reward model with high sample efficiency.

Representation Shaping: You will integrate auxiliary losses for perception tasks such as 3D detection, segmentation, and tracking for explicit semantic grounding of important scene objects.

Safety & Inference: You will build a reliable safety framework on top of the model outputs and prepare the entire stack for real-time inference directly on the robot's edge devices.

More about ML at Yandex — in the Yandex for ML channel

We expect you to

Have expert-level proficiency in JAX and PyTorch and deep practical experience with modern frameworks; we place a strong emphasis on JAX (SPMD, multi-host JAX, XLA compilation)
Have large-scale distributed training skills and solid experience training heavy models on multi-node clusters (FSDP, principles of Megatron-LM, 3D parallelism)
Have a deep mathematical and ML foundation: an excellent understanding of continuous generative models (Diffusion, Flow Matching, Diffusion Forcing) and Deep RL (Actor-Critic architectures, RL in imagination, Model-Based RL)
Are capable of writing, generating, and verifying fast, optimized code and bringing hardcore research to production under strict real-time constraints

It will be a plus if you

Have worked with Vision Foundation Models, generative video and image models, or LiDAR point cloud synthesis
Have experience with Reinforcement Learning for LLMs or, better yet, beyond them
Have experience with advanced quantization of heavy transformers or diffusion models for edge devices: FP8, W4A8, INT4 (PTQ/QAT)
Have optimized onboard robot inference using C++, TensorRT, ONNX, CUDA
Have a background in Autonomous Driving, Motion Planning, or Robotics

Similar vacancies

Senior Developer for the Delivery Robot ML Planner Team (RL)

Senior ML Engineer (Motion Planning)

ML Researcher for the Early-binding Architectures Team

Senior Research Engineer (Multimodal Diffusion & RLHF)

RL Engineer for the Humanoid Robot Locomotion Team

Senior DL Developer for Neuro Team

Senior Deep Learning Research Engineer (Diffusion Models)

LLM RL Training Infrastructure Developer

Senior DL/GenAI Research Engineer (Diffusion Video Generation & World Model Development)

ML Developer for the Reinforcement Learning (RL) Group

DL Developer for the YandexGPT Architecture Research Team

ML Tech Lead (Motion Planning)
