Experience: 3-5 years
Employment: Full-time
Work Format: Hybrid, Remote, Onsite
Grade: Middle
Specialization: Data Analytics
Industry: AI
Company Type: Corporation
Analyst Developer at Neuro
We are developing a new Neuro technology for Alice. In a minute, it solves search tasks that typically take users several hours. Our goal is to confidently beat our competitors in the Russian market, which includes very strong international players.
The technology is based on an LLM neural network. It communicates with the user in dialogue mode, analyzes data from the internet, and writes accessible, informative, clear, and reliable answers. The first product release was launched in 2024 under the name Neuro and brought significant profit to Yandex Search. This technology has now become part of Alice and continues to evolve actively.
Our analytics team is key to the development of the new Alice. Every six months, together with ML and product teams, we prepare the next technology release. Each release is based on offline quality labeling of answers, which we develop. Using labeling, we assess how well Alice's answers meet our product goals and user expectations:
* We compare answer quality Side-by-Side
* We measure the amount of false information and hallucinations in answers through fact-checking
* We use crowd labeling (human task completion), train LLM-as-a-judge, and assemble LLM-based assistants to help performers in Yandex Crowd.
Labeling serves both as the main tool for measuring quality and as a target for RLHF when training neural networks. Our projects are among the most complex and large-scale offline quality labeling projects at Yandex. We have unique expertise in this area.
Our team is involved in the entire technology creation cycle, from formulating general product requirements to debugging and accepting finished models. To successfully develop our projects, we need to solve various types of tasks: from technical and analytical to product-related.
Formalizing product requirements
You will delve into ambiguous product requirements and help the product team turn them into logical principles and rules that become clear instructions for performers in Yandex Crowd and for neural networks. You will figure out how to reason properly about answer quality, how to find all factual errors in an answer, and how to assess their significance for the product.
Finding and fixing LLM bugs through training
You will quantify and improve the quality of training for RLHF. Together with ML developers, you will continually analyze problems in current LLM versions and fix them using our labeling. You will also develop and maintain convenient tools for collecting training datasets and measuring metrics in experiments.
Deep analysis of labeling quality
You will collect and update reference labeling sets (gold sets) that perfectly match product goals and serve as benchmarks for project development. You will work with Yandex's AI trainers team, analyze complex task examples with them, and formulate and refine product principles based on data.
Developing large projects in Yandex Crowd
You will recruit, train, and test performers for labeling; develop processes for ongoing quality control, banning, and rehabilitation; and create convenient dashboards that quickly answer questions about the performance, cost, and quality of our labeling.
More about analytics at Yandex is available in the Yandex for Analytics channel.