Experience: 3-5 years
Employment: Full-time
Work Format: Hybrid
Grade: Middle
Specialization: Data Analytics
Industry: AI
Company Type: Corporation
Verification Analyst-Developer for Search with Alisa
Our analytics team helps develop Neuro, a service based on large language models (LLMs) that is poised to replace traditional internet search. A key success factor for LLM-based products is high-quality labeling of the input data used to train the models. Our team helps collect this data and carefully draws the line between a good answer and a bad one. Every day we collect huge volumes of "query, answer" pairs, pass them through human annotators and algorithms, and produce the final labels. Our goal is to analyze and improve this process, making it faster, cheaper, and higher in quality.
Reducing the "gray area"
When people make statements, only a small share of them can be unambiguously classified as true or false. For the rest, that classification is conditional and often depends on context. Our task as verification analysts is to reduce this uncertainty by turning it into a set of rules. We have made significant progress in reducing the "gray area" in the good/bad distinction. Next, we plan to go deeper into specialized topics (e.g., law or taxes) and reduce the number of hallucinations and factual errors in them.
Reducing labeling costs and increasing labeling volume
Another area of work is reducing the cost of labeling while increasing its volume. The company currently spends very large sums on human labeling. We need to find ways to collect more diverse data without degrading the resulting quality, which will let us take the product to a higher level.
Promptization
A third direction, actively developing since 2024, is promptization. It is one of Yandex's key focus areas and could become a real game-changer in the development of search algorithms and the training of language models.