Description
The Center for Practical Artificial Intelligence develops and deploys high-tech AI tools for tasks drawn from everyday business practice.
Responsibilities
- develop, optimize, and maintain NLP/multimodal pipelines, including RAG systems and assistants for business tasks
- create and develop AI agents and multi-agent systems (workflow orchestration, planning, tools, memory modules, integrations with bank services)
- participate in formulating and testing hypotheses to improve the quality of models and pipelines
- integrate agent pipelines into high-load bank services, ensuring stability, performance, and monitoring
- adapt research results and turn them into applied solutions
- develop services around models: API layers, microservices, inference scripts, CI/CD for ML
- ensure code quality and oversee engineering practices (testing, logging, monitoring)
- participate in selecting and configuring infrastructure for inference and training.
Requirements
- strong technical skills
- deep knowledge of NLP and a solid foundation in classical ML
- experience developing RAG systems and ML assistants, and working with vector stores and the retrieval stack
- 5+ years of experience developing and productionizing ML services
- excellent knowledge of Python; experience writing production-grade, maintainable, and testable code and working with concurrency and asynchronous programming
- experience with multi-agent frameworks (LangGraph, LlamaIndex, or others)
- confident command of development and infrastructure tools: bash, Docker/OpenShift/Kubernetes, Git
- experience packaging models into services and interfaces (FastAPI, Flask, Tornado; UI frameworks like Streamlit/Chainlit are a plus)
- understanding of inference and training technologies for large models (vLLM, DeepSpeed, Accelerate)
- experience integrating generative models into real business processes
- knowledge of CI/CD for ML/infra (GitLab CI/GitHub Actions/ArgoCD)
- skills in profiling, optimizing, and monitoring production systems (Prometheus/Grafana/OpenTelemetry)
- understanding of MLOps patterns: feature store, model registry, rollout/rollback strategies.
Nice to have:
- experience working with multimodal models (Vision/Audio LLMs)
- experience with distributed training and optimization of large models.
Conditions
- comfortable modern office in Moscow, near Kutuzovskaya metro station
- on-site work in the office (a hybrid arrangement can be discussed after the probation period)
- annual salary review and yearly bonus
- corporate gym and recreation areas
- more than 400 educational programs from SberUniversity for professional and career development
- voluntary health insurance (VHI), discounted insurance for family members, and a corporate pension program
- flexible mortgage discount equal to 1/3 of the Central Bank's key rate
- free SberPrime+ subscription, discounts on products from partner companies.