NLP Developer for Keyboard
Yandex Keyboard is one of the fastest-growing mobile applications at Yandex. The main goal of the Keyboard is to make text input easier, faster, and more convenient. We are looking for a strong ML developer to help us advance ML in the Keyboard. You will have the opportunity to solve complex problems and influence every stage of the model lifecycle, from data collection to optimizing inference on mobile devices. The results of your work will be used billions of times a day and will have a colossal impact on Keyboard users: in aggregate, any improvement can save them up to several thousand years of their lives each year.
If you have long wanted to work on a product with millions of daily active users, run modern NLP approaches directly on devices with quality and speed on par with server-side equivalents, and quickly see the payoff of your launches in online metrics, join our small team that builds the product end-to-end.
On Habr, you can read about our neural language model and our tap model.
Moving most of the stack to an LLM
Currently, next-letter and next-word prediction and typo correction are handled by a sprawling pipeline with many dependencies between models. Moving to a single LLM for all tasks will let us benefit, with relatively little effort, from architectural advances and from the pre-training stage provided by the YaGPT team.
Training Infrastructure
We face many interesting infrastructure challenges. For example:
* Speed up the training pipeline. Our goal is "one day from the start of an experiment to a ready release candidate."
* Optimize the data collection pipeline for training and quality measurement, so that "offline measurements strongly correlate with online results."
Deeper Context Understanding
If the model uses the entire context of the current and adjacent input sessions and works with it effectively, suggestions and corrections become more relevant. Solving this task requires properly setting up both model training and the way the models are run on the user's device.
More about ML at Yandex — in the Yandex for ML channel
Experience: 3-5 years
Employment: Full-time
Work Format: Hybrid, Onsite
Grade: Middle
Specialization: Data Science & ML
Industry: IT & Tech
Company Type: Corporation