Advertising is one of Yandex's most heavily loaded services. Every second, we help thousands of companies find clients and develop new technologies. We handle genuinely high load, maintain four-nines (99.99%) reliability across the entire stack, and rely on clear, transparent metrics to make decisions.
We are looking for Lead Developers in C++, Java, Python, and Go, as well as Technical Leaders for various areas: the Recommender Systems Infrastructure Department, the ML Infrastructure Department, the Advertising Stability and Labelling Department, and other divisions.
Fast Advertising Data
For businesses, it is extremely important that any event, whether a change in product price or a user click, is reflected in the final ranking as quickly as possible. Our task is to reduce these delays to minutes and seconds. We have already built fast profiles for all major advertising entities. Now we are working on incremental updates for all advertising databases and indexes.
Real-Time Machine Learning
Beyond delivering updated profiles to the runtime, we also need to retrain neural network models on fresh data. We are taking the first steps towards moving dataset construction from MapReduce to real time. We need to build a system capable of processing over 10 GB/s of input data with a delay of about a minute while performing windowed joins across several logs. To do this, we are actively developing our own data streaming framework.
A Unified Company-wide Neural Model Inference Framework (or Inference Server)
Most high-quality solutions at Yandex are built on neural models. We are developing a company-wide framework that allows quick and easy deployment of inference for all major neural network architectures, provides all the surrounding infrastructure (graphs, logs, performance tests), and maximizes GPU or CPU efficiency. The framework is currently in beta, and there is still a long way to go.
ML-DWH
Advertising involves hundreds of product ML tasks, each requiring data, sometimes many tens of petabytes of it. Moreover, the data is needed not just once but continuously, so that models can be retrained on it. To make this efficient and simple, and to save time for hundreds of researchers, we are developing a framework from scratch. The product is currently at the PoC stage with its first early adopters, and now it needs to scale.
Developing Data Processing Functionality
In the advertising technology market, the ORD (Operator of Advertising Data) service is growing rapidly, which makes infrastructure investments particularly important. In ORD, you will have the opportunity to work with Yandex's cutting-edge data streaming solutions. We plan to add new processing loops to the event-driven service architecture.
Developing ORD as a Product
At the start of the project, the priority for Yandex ORD was labeling data for the Yandex Direct advertising system, which generates over 200 million requests per day for creative labeling alone. Now we aim to make ORD more accessible and convenient for a wide range of users.
Apply if you are interested in contributing to the development of one of Yandex's key services! If you have further questions, message Nikita, a consultant on the recruitment team, on Telegram: @nikitakv.
Experience: 5 years
Employment: Full-time
Work Format: Hybrid, Onsite
Grade: Lead
Specialization: Backend
Industry: IT & Tech
Company Type: Corporation