Alice in Search Implementation Engineer (C++)
Alice in Search is a key Yandex product: tens of millions of users see generative responses at the top of search results every day. Under the hood, it's not just one LLM, but a family of models with different sizes and properties.
We are looking for an experienced developer to work at the intersection of product development, ML teams, and infrastructure. The primary task is to build and maintain complete model configurations for releases: from forming requirements to integrating services into production, taking into account infrastructure limitations and product scenarios.
Individual teams create parts of the future product:
But without a linking layer that connects them into a single working product, it's impossible to roll out models to production stably and predictably, control the impact of changes on metrics and resource consumption, or quickly respond to changes in product requirements.
Your task will be to assemble a working, scalable configuration: select models for the product scenario, adapt the runtime computation pipelines, account for infrastructure limitations, and bring the configuration to a production-ready state. You will need to understand how experimental ML pipelines are structured, modify them for production scenarios, deploy services, and resolve integration issues between teams.
This role brings consistency and technical reliability to the delivery of ML results to users. Your decisions will determine the speed of releases and the quality of the final product.
Working with response pipelines
You will modify and maintain computational pipelines (C++, Jinja), deploy generative, DSSM, and BERT models in the existing infrastructure and integrate them into those pipelines, and help the development side diagnose issues based on experiment results.
Working with the offline generative response database
You will support functionality for collecting, updating, and re-running data, and implement product improvements on top of the existing storage.
Conducting A/B experiments
You will need to form correct samples and slices for A/B experiments, taking into account developments from the frontend, slice, baseline model quality, generative service infrastructure, and other teams. You will also maintain related non-generative services, which involves C++, database work, and Python for automating service deployment.
Experience: 3-5 years
Employment: Full-time
Work Format: Hybrid, Onsite
Grade: Middle
Specialization: AI Engineering
Industry: IT & Tech
Company Type: Corporation