Description
We are building the foundation for the safe and efficient use of AI at the Bank. Our team is developing a multi-agent system that autonomously monitors all of the Bank's AI agents running in production. This is more than dashboards and alerts: it is an intelligent platform that must understand how and why agents make decisions, predict failures before they occur, and automatically localize root causes.
We offer the opportunity to play a key role in building, from scratch, a unique system on which the reliability of all the Bank's AI services for millions of people depends.
Responsibilities
- Development, testing, and implementation of AI agents (including LLM-as-a-Judge) and classical ML models for quality assessment, anomaly detection, degradation prediction, and automatic root cause localization of failures.
- Research and implementation of new approaches in the field of LLM fine-tuning, LLM-as-a-Judge, and RAG to make monitoring more accurate, stable, and understandable.
- Managing a project through the full cycle: from idea and prototype to a working solution in production, its testing, and support.
- Designing pipelines for working with data (agent traces) and training models, integrating them into our MLOps ecosystem.
- Collaborating with AI agent development and validation teams and with MLOps teams to integrate solutions and establish best practices.
Requirements
- Deep knowledge of mathematical statistics, classical ML algorithms, and neural network architectures.
- 5+ years of experience in Data Science / Machine Learning covering the full development cycle, from research and prototyping to production deployment and monitoring.
- Experience conducting analytical research (R&D): the ability to independently explore a problem domain, formulate and test hypotheses, and select and adapt state-of-the-art methods for project tasks.
- Confident command of the technology stack for analysis, experimentation, and development: NumPy, Pandas, Polars, Scikit-learn, XGBoost, LightGBM, CatBoost, Optuna / Hyperopt.
- Ability to write clean, modular Python code; understanding of SOLID principles; experience with Git.
Would be an advantage:
- Experience in developing AI agents / agentic systems and an understanding of how they operate, communicate, and are orchestrated.
- Experience working with AI agent traces and metadata (OpenAI, Arize Phoenix, LangSmith).
- Experience with vector databases.
- Knowledge of the observability stack (OpenTelemetry, Prometheus, Grafana) for monitoring ML systems.
- Publications or significant contributions to open-source in the field of ML/NLP/LLM.
Personal qualities:
- Proactivity: the ability to independently identify problems and propose effective solutions.
- Systems thinking: the ability to see the project as a holistic system, understand interconnections and the long-term consequences of decisions.
- Effectiveness under uncertainty: the ability to deliver results with incomplete data and in changing conditions.
- Responsibility: an understanding of the importance of production systems and SLAs, and readiness to take responsibility for one's decisions.
Conditions
- Comfortable modern office: Moscow, Kutuzovsky Prospekt metro station.
- Work format: office-based; by agreement with your manager, some tasks can be performed in a hybrid format.
- Annual salary review, annual bonus.
- Corporate gym and relaxation areas.
- More than 400 educational programs from SberUniversity for professional and career development.
- Onboarding program and supervisor support at the start.
- Extended voluntary health insurance (VHI), with insurance for family members on preferential terms.
- Mortgage program for employees.
- Free SberPrime+ subscription, discounts on products from partner companies.
- Referral bonus for recommending friends to Sber teams.