Experience: 3 years
Employment: Full-time
Grade: Middle
Specialization: Data Science & ML
Industry: IT & Tech
Company Type: Corporation
We are the speech synthesis team responsible for high-quality automatic voice-over in Yandex products, including Alice, video translation in Yandex Browser, and the virtual narrator in Bookmate. Speech synthesis is entering a new era, shifting from low-resource setups (even for major languages) to big data and pre-training. New models can sing famous songs in a different voice or say any phrase, even in another language, based on just a few seconds of your recorded speech. For large-scale work we have:
Training TTS models for video translation
The quality of dubbing in video translation significantly affects the user experience. Imagine your favorite series dubbed in the voices of the original actors, with all their expressive intonation. To bring that future closer, you will train SOTA models for multilingual zero-shot speech synthesis, intonation or emotion transfer, and voice conversion.
Working with data
You will experiment with approaches to collecting data from open sources, filtering it, and making the best use of the collected data to pre-train a voice model.
Working with the runtime
Video translation in Yandex Browser is a service used regularly by millions of people, so our runtime must handle high loads. You will work on making it faster: optimizing neural network inference and writing efficient backend code.