We are the speech synthesis team responsible for high-quality automatic voice-overs in Yandex products, including Alice, video translation in Yandex Browser, and the virtual narrator in Bookmate. Speech synthesis is entering a new era, moving from low-resource training (even for major languages) to big data and pre-training. New models can sing famous songs in another voice or utter any phrase (even in another language) based on just a few seconds of your recorded speech. For large-scale work we have:

numerous DL research projects in PyTorch;
hundreds of modern GPUs for our experiments;
the power of Toloka and assessors for data labeling;
Yandex's large language models trained on massive text corpora;
high-performance production on GPU and C++;
an actively developing product and a strong team.

What tasks await you

Training TTS models for video translation. The quality of dubbing in video translation significantly affects the user experience. Imagine if your favorite series were dubbed by the original actors, with their expressive intonations. To get closer to that future, you will train SOTA models for multilingual zero-shot speech synthesis, intonation and emotion transfer, and voice conversion.

Working with data. You will experiment with approaches to collecting data from open sources, filtering it, and using the collected data properly to pre-train voice models.

Working with the runtime. Video translation in Yandex Browser is a service regularly used by millions of people, so our runtime must handle high loads. You will work on making it faster: optimizing neural network inference and writing efficient backend code.

We expect that you

Understand the principles of machine learning
Have trained neural network models in industry or studied them in research
Are well acquainted with Python

It will be a plus if you

Have worked with ML in voice technologies: ASR, voice biometrics, text-to-speech, or voice conversion
Have worked with NLP or computer vision
Know cuDNN, cuBLAS, CUDA, TensorRT