ML Researcher in Generative Personalization Models
We work on foundation, semantic, and generative models for recommendations. We use LLM-like architectures for behavioral modeling and develop a unified semantic representation for multimodal and multi-domain data. Our models are applied in Yandex's core recommendation systems.
Unified Semantic Representation for Multi-Domain Multimodal Data
We embed heterogeneous entities from Yandex products (ad banners, Market goods, search queries, etc.) in a unified embedding space by training models jointly on content features and user behavior patterns. This work relies heavily on multimodal LLMs and on aligning their representations with collaborative signals.
Semantic Tokenization
We experiment with various embedding quantization algorithms. Our goal is to represent any element in a user's behavior sequence as a set of discrete tokens. Because behavioral tokens carry a strict hierarchy and meaning, we can use them in generative models, both in those we train from scratch and when fine-tuning LLMs.
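As an illustration only: one common family of embedding quantization algorithms is residual quantization, where each level quantizes what the previous level left unexplained, so the resulting tokens form a coarse-to-fine hierarchy. The sketch below is a minimal pure-Python version under that assumption; the posting does not say which algorithms the team uses, and the names `residual_quantize`, `nearest_index`, and the toy `CODEBOOKS` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def nearest_index(vec: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to vec (squared L2)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vec, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def residual_quantize(embedding: Vector,
                      codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Map an embedding to one discrete token per codebook level.

    Level 0 picks the coarsest match; each later level quantizes the
    residual left over by the levels before it.
    """
    residual = list(embedding)
    tokens: List[int] = []
    for codebook in codebooks:
        idx = nearest_index(residual, codebook)
        tokens.append(idx)
        # Subtract the chosen code so the next level sees only the remainder.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens


# Hypothetical toy codebooks (in practice these would be learned).
CODEBOOKS = [
    [[1.0, 0.0], [0.0, 1.0]],                     # level 0: coarse
    [[0.25, 0.0], [0.0, 0.25], [-0.25, 0.0]],     # level 1: refinement
]

tokens = residual_quantize([0.8, 0.1], CODEBOOKS)
print(tokens)
```

In this toy setup, items that share the first token fall in the same coarse region of the embedding space, which is the kind of hierarchy a generative model can exploit when decoding token by token.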
Pretraining and Fine-tuning of Generative Behavioral Models
For training models from scratch, we implement an encoder-decoder architecture where the encoder encodes a representation of the user's action history within the ecosystem, and the decoder generates a sequence of semantic tokens to predict the user's next action.
Foundation Models of User Behavior in the Yandex Ecosystem
Our goal is to train a unified upstream model of user behavior on ecosystem logs and apply it to multiple downstream tasks at once. We experiment with multi-domain, multi-task models on heterogeneous action sequences, developing dedicated architectures and training recipes as well as a common semantics for different events.
Learn more about ML at Yandex in the Yandex for ML channel.
Experience: 3-5 years
Employment: Full-time
Work Format: Hybrid, Remote, Onsite
Grade: Senior
Specialization: Data Science & ML
Industry: IT & Tech
Company Type: Corporation