Description
We are protecting the future: building a cybersecurity system for generative AI at Sber. Our product SOC4AI (Security Operations Center for AI) is a component within the Sberbank SOC (Security Operations Center) structure, responsible for monitoring, analyzing, and countering cyber threats targeting GenAI models and AI agents. We are at the forefront of defending the digital tomorrow and are looking for talented developers ready to tackle the most complex challenges.
Responsibilities
Your tasks will include:
- developing, training, and optimizing ML models for detecting anomalies and cyber threats in the behavior of GenAI models and AI agents
- designing and implementing data preparation and processing pipelines (feature engineering, preprocessing, vectorization) for machine learning tasks
- integrating ML models into the high-load services of the SOC4AI product
- participating in building and supporting real-time data streaming pipelines for rapid cyber threat detection
- collaborating with related teams (cybersecurity platforms, backend developers, analysts) to align data requirements and integrate models
- writing high-quality, efficient, testable, and documented code
- participating in all stages of the ML model lifecycle, from data collection and experiments to drift monitoring and retraining in the production environment
- demonstrating implemented functionality to stakeholders and teams
- analyzing and helping resolve incidents involving the system's ML components
- working in an Agile team, participating in sprint planning and task estimation.
Requirements
What is important for us:
- 3+ years of commercial experience developing and deploying ML solutions
- strong proficiency in Python and key machine learning libraries (scikit-learn, pandas, NumPy, PyTorch / TensorFlow, LLM libraries)
- experience in developing and optimizing models for NLP tasks and/or sequence analysis
- experience working with LLMs and Transformers
- experience with Apache Flink (including Apache Flink SQL) for stream data processing
- experience with the Big Data stack: Apache Spark (for batch processing) and/or Hadoop HDFS (for distributed storage)
- experience with relational/non-relational databases (PostgreSQL, NoSQL)
- experience with vector databases (PostgreSQL, Qdrant, Milvus, ChromaDB)
- understanding of MLOps principles: CI/CD for models; versioning of experiments, data, and models (Data Version Control, MLflow); containerization (Docker) and orchestration (Kubernetes)
- experience working in a team that uses Agile development methodologies (Scrum/Kanban).
A significant advantage will be:
- experience in projects with a microservice architecture focused on processing large volumes of data in real time
- experience with tracing systems (OpenTelemetry, Jaeger) and monitoring systems (Prometheus, Grafana)
- interest in generative AI: understanding of the basic principles of how LLMs work, prompt engineering, familiarity with frameworks for building AI agents.
Conditions
We offer:
- comfortable, modern office near the Leninsky Prospekt metro station
- office-based work format
- annual salary review, annual bonus
- corporate gym and recreation areas
- more than 400 educational programs from SberUniversity for professional and career development
- onboarding program and manager support at the start
- extended VHI (Voluntary Health Insurance), discounted insurance for family members, and a corporate pension program
- flexible discount on mortgage loans, equal to 1/3 of the Central Bank's key rate
- free SberPrime+ subscription, discounts on products from partner companies
- referral bonus for recommending friends to join the Sber team.