Description

We are the core team responsible for machine learning for audio understanding across Sber. Last year we open-sourced GigaAM (https://arxiv.org/abs/2506.01192), a SOTA model for Russian speech recognition, and this spring we were the first in Russia to launch native audio understanding in an LLM: GigaChat Audio (https://habr.com/ru/companies/sberdevices/articles/904894/). We are now actively improving GigaChat's multimodal capabilities: raising quality on complex audio and image contexts, and understanding video through its frames as well as its audio stream.

Responsibilities

building a pipeline that generates synthetic Audio+Vision+Text data using internal and open models
building benchmarks: LLM-as-a-judge, automatic metrics
running LLM training experiments: testing different data, training stages, and modality mixing methods

Requirements

Python: modular code, OOP, concurrency, PEP, tests
understanding of LLM training stages and modern architectures
understanding of ML system quality assessment methods
deep theoretical knowledge of DL
experience debugging and training in multi-GPU mode

Nice to have

experience in Computer Vision / Audio

Conditions

comfortable modern office near Kutuzovskaya metro station
choice of a convenient work format: office or hybrid (offices in Moscow and St. Petersburg)
annual salary review and annual bonus
corporate gym and recreation areas
more than 400 educational programs from SberUniversity for professional and career development
extended voluntary health insurance, discounted insurance for family members
flexible mortgage discount equal to 1/3 of the Central Bank's key rate
free SberPrime+ subscription, discounts on products from partner companies
referral bonus for recommending friends to join the Sber team


