We are building Neuro, a new way of searching for information that may soon replace the familiar blue-link results. Read more in the Habr article "Yandex launched Neuro. We tell you how it works."
In short: a generative neural network (LLM) based on YandexGPT analyzes the content of web pages and answers user questions in a chat. To train the neural networks and measure the quality of their work, we run answer labeling projects using Yandex Crowd — an internal Yandex service. The development of Neuro directly depends on the success of these projects. To better understand how we work with the product, you can watch the talk about labeling for Search at the DataDriven conference.
If you know analytics, write code well, and draw accurate conclusions based on data — come build the search of the future with us!
Formalizing product quality requirements
Our main task is to turn a poorly formalized and contradictory product definition into a set of clear rules and principles that let us mark a specific answer as good (suitable for the product) or bad (an error in the product) and justify that decision. First we learn to do this ourselves (collecting and discussing examples, writing instructions), and then we teach AI trainers and assessors to do it.
Creating complex projects in Yandex Crowd
To collect labels at scale, we create hierarchical projects with several levels of labelers. Each group of performers has its own training method and its own quality requirements. In 2023 we launched several projects, which we continue to develop, that differ significantly from typical ones in their complexity and task volume.
Verifying answer groundedness
One of our most important tasks is verifying the groundedness of Neuro's answers. An answer is grounded when its meaning follows from the content of its sources (the web pages on which it is based): it conveys facts correctly, does not contradict them, does not lie, and does not mislead the user. In practice, this is a complex text analysis task, and we will actively work on its quality over the next six months.
Why it's great to work with us:
* We work with Neuro, a new LLM-based Yandex product, and focus primarily on results in production
* Our tasks are closely tied to both the product design itself and ML
* We offer the opportunity to develop technical, communication, and managerial skills
* Your work will directly influence what Neuro becomes in six months
* We create crowdsourcing projects that are unique in their complexity, scale, and architecture
* Our close-knit team of Search quality analysts and ML engineers constantly discusses tasks and shares experience
Experience: 3-5 years
Employment: Full-time
Work Format: Hybrid
Grade: Middle
Specialization: Data Analytics
Industry: AI
Company Type: Corporation