Senior DL Developer for the YandexGPT Agents and Functions Development Team

Modern LLMs can handle a variety of tasks, from helping with homework to acting as a psychologist or financial advisor. The key factor in a model's usefulness is its ability to interact with the outside world. Our Agents and Functions development team is working to improve these skills in the YandexGPT family of models. We teach LLMs to use both popular tools (e.g., publicly available MCP servers) and internally built ones, and we train them to find effective solutions under varied conditions, including when operating a browser. We also aim to adapt models for multi-agent scenarios and to develop their ability to reason when solving problems.

Are you passionate about agentic systems? Become part of our team and help us create the technologies of the future!

What tasks await you

New Data and Training Environments

A model capable of performing complex agentic tasks needs a range of skills: making parallel function calls, judging which tools are relevant to the task at hand, building an execution plan, and more. Training these skills requires suitable data, either instruction-answer pairs or interactive environments tailored to specific abilities. Your task will be to collect such datasets and evaluate their impact on model quality.

Training Agentic Models

It is important to us that LLMs can be applied in a wide range of scenarios, from a personal assistant to a coding assistant. This requires models to have good domain knowledge and the ability to work in diverse conditions. While the former is typically addressed during pretraining, the latter is a skill that can only be developed by solving problems in complex environments. We expect you to train agentic models in complex setups with a large number of environments used concurrently.

Enhancing Models with Reasoning

Reasoning has shown high potential for improving model quality on complex problems such as mathematics and code. We are confident that basic reasoning patterns, such as verification, reflection, and backtracking, are also useful in agentic scenarios. Your challenge comes with a hard constraint: significantly improve the agent's quality of work while keeping the increase in response time reasonable.

Read more about ML at Yandex in the Yandex for ML channel

We expect that you

Have excellent knowledge of mathematics, classical algorithms, and data structures
Can program in Python
Understand Reinforcement Learning and are not intimidated by terms like GAE, PPO, and GRPO, or by other variants of policy optimization
Have practical experience in distributed training of large models based on the Transformer architecture
Understand the structure of the alignment stage of modern LLMs

It will be a plus if you

Have trained LLMs to use external tools (tool calling, function calling)
Have practical experience with RL training infrastructure: vLLM, SGLang, VERL, etc.


