Generative Response C++ Developer in Search (Neuro)
Our team develops the infrastructure for running generative neural networks. We are responsible for implementing and supporting the business logic of the generative response in Yandex Search. Search with Alisa provides detailed answers with illustrations and videos, analyzes complex queries and offers solutions, and generates images and text. Our team keeps all of these generative response scenarios running.
Various teams within Yandex use our infrastructure to create new generative response scenarios, run experiments with new models, and refine existing pipelines.
The team's responsibilities include:
Developing the inference server. We have an internal Yandex library that handles the calculations for generative models. You will need to develop/refine the server around this library, which must handle a wide variety of client requests: preprocessing, gRPC, WebSocket, sending data to external storage, batching.
Organizing release processes for delivering the server to production. Within Yandex, there are already about 200 installations for different products. With this scale and diverse requirements, you will need to organize horizontally scalable deployment processes. This includes no-diff testing of all available endpoints for each client, the actual deployment, and traffic management during the rollout.
Creating architecture to support a large number of pipelines. Our system includes many pipelines for generating responses: creative scenarios, detailed answers, generative responses for other countries. Your task is to design and improve the system architecture to ensure reliable operation of existing pipelines and efficient routing between them (selecting the relevant pipeline for a request). You will also develop new pipelines to implement future scenarios.
Integration with internal Yandex services to improve the user experience. You will integrate various Yandex services into our pipelines to improve response quality (processing user queries, selecting the most relevant documents, etc.), as well as refine the main generative response backend to provide users with additional capabilities for working with their query data.
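To give a feel for the routing task mentioned above (selecting the relevant pipeline for a request), here is a minimal C++17 sketch. It is an illustration only and not the team's actual design: the `Request` fields, the `Router` class, the pipeline names, and the matching rules (a "draw " prefix, a country code other than "ru") are all hypothetical.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical request shape; the field names are assumptions for illustration.
struct Request {
    std::string text;
    std::string country;
};

// A pipeline pairs a name with a predicate that says whether it can serve a request.
struct Pipeline {
    std::string name;
    std::function<bool(const Request&)> accepts;
};

class Router {
public:
    // Pipelines are checked in registration order, so register specific ones first.
    void Register(Pipeline pipeline) { pipelines_.push_back(std::move(pipeline)); }

    // Returns the name of the first pipeline that accepts the request, if any.
    std::optional<std::string> Select(const Request& request) const {
        for (const auto& pipeline : pipelines_) {
            if (pipeline.accepts(request)) {
                return pipeline.name;
            }
        }
        return std::nullopt;
    }

private:
    std::vector<Pipeline> pipelines_;
};

// Builds a router with three made-up pipelines and made-up matching rules.
Router MakeExampleRouter() {
    Router router;
    router.Register({"creative", [](const Request& r) {
        return r.text.rfind("draw ", 0) == 0;  // true when text starts with "draw "
    }});
    router.Register({"international", [](const Request& r) {
        return r.country != "ru";
    }});
    router.Register({"detailed_answer", [](const Request&) {
        return true;  // fallback that accepts everything
    }});
    return router;
}
```

In this sketch the first matching pipeline wins, so the catch-all is registered last. A production router would more likely rely on classifiers or experiment flags than on string checks.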
Experience: 3-5 years
Employment: Full-time
Work Format: Hybrid, Onsite
Specialization: Backend
Industry: IT & Tech
Company Type: Corporation