Description
IT B2C is the largest ecosystem in Sber. We are more than 8,000 people in 18 cities across Russia. We are engaged in the development and enhancement of retail solutions, helping to make the Bank's services more accessible, secure, and convenient.
We are waiting for you!
We are a team of experts united by a common passion for artificial intelligence and recommender systems (RecSys).
Our main task is to create a modern, scalable recommendation platform that anticipates user expectations and offers personalized recommendations at every stage of users' interaction with the Sber ecosystem. Our solutions cover a wide range of industries: from finance and e-commerce to entertainment and healthcare.
The development of our platform centers on implementing new SOTA models. We monitor global trends, experiment with new approaches, integrate them into the platform, and apply them to specific business cases. We are looking for a Machine Learning Engineer to join the team and bring the Bank's recommender models into production. We work with huge volumes of data and high-load services, which makes our work not only important but also technically interesting. We also directly shape the Bank's recommendation platform as a product, since we determine its key growth points. If you want to be a pioneer at the forefront of a new technology, join us!
Responsibilities
- Development and improvement of end-to-end ML pipelines;
- Development of production data processing pipelines;
- Working with Sber's massive data volumes (petabytes) in PySpark, researching ways to use this data in models;
- Writing efficient and scalable code for model training and inference in PyTorch, conducting experiments on a GPU cluster;
- Optimizing the performance of code that processes large datasets or powers high-load online recommendation services;
- Mentoring junior team members, sharing knowledge and expertise.
Requirements
- Mathematical background;
- Good knowledge of Python and key data processing frameworks (PySpark, PyArrow, Pandas);
- Experience writing quality production code;
- Experience writing production-grade data processing pipelines with many steps, dependencies, and complex logic;
- Experience using Airflow (or other industry-standard pipeline orchestrators, such as Luigi, Dagster, etc.);
- Good understanding of SQL / NoSQL databases.
Nice to have:
- Experience with Kubernetes;
- Experience with MLflow (or similar tools);
- Experience with distributed training of large models on a GPU cluster;
- Experience or education in finance or banking;
- Experience implementing online inference under high load;
- Experience optimizing data preprocessing pipelines for high-load systems.
Tech Stack:
- Python, PySpark, Airflow, Kubernetes, FastAPI, S3, PyTorch, MLflow, Jira, Confluence, Git.
Conditions
- Hybrid or office work format (your choice)
- Annual bonus and annual review
- Extended corporate health insurance from day one, plus dental care and preferential insurance rates for family members
- Sber Corporate University, internal educational platform, participation in IT conferences
- Office at Kutuzovsky Prospekt with relaxation areas and a gym
- 90 days of remote work from any region of the Russian Federation (not applicable to support roles)
- Preferential mortgage terms at Sber, corporate pension program, SberPrime+ subscription, discounts from partners and on services of the group's companies