Description

We are the core team responsible for machine learning for audio understanding across Sber. Last year we open-sourced GigaAM (https://arxiv.org/abs/2506.01192), a SOTA model for Russian speech recognition, and this spring we were the first in Russia to launch native audio understanding in an LLM: GigaChat Audio (https://habr.com/ru/companies/sberdevices/articles/904894/). We are now actively developing GigaChat's multimodal capabilities: improving quality on complex audio and image contexts, and understanding video through its frames as well as its audio stream.

Responsibilities

creating a pipeline for generating synthetic Audio+Vision+Text data from internal and open models
creating benchmarks: LLM-as-a-judge, automatic metrics
conducting experiments on LLM training: testing data and training stages, modality mixing methods

Requirements

Python: modular code, OOP, concurrency, PEP 8, tests
understanding of training stages and modern LLM architectures
understanding of methods for evaluating the quality of ML systems
deep theoretical knowledge in DL
experience with debugging/training in multi-GPU mode

Would be a plus

experience in Computer Vision / Audio

Conditions

comfortable modern office near Kutuzovskaya metro station
opportunity to choose a convenient schedule – office/hybrid (Moscow / Saint Petersburg offices)
annual salary review and annual bonus
corporate gym and recreation areas
more than 400 educational programs from SberUniversity for professional and career development
extended voluntary health insurance, preferential insurance for family members
flexible mortgage discount equal to 1/3 of the Central Bank key rate
free subscription to SberPrime+, discounts on products from partner companies
referral bonus for recommending friends to the Sber team
