Description
We develop and implement advanced optimization methods for the training and inference of extremely large neural networks (tens of billions of parameters) used in multimodal generative models. The focus is on compilation, quantization, distillation, sparsity, and other acceleration techniques that do not compromise quality.
Responsibilities
- research and implementation of training optimization methods (gradient checkpointing, activation recomputation, mixed precision, computational graph optimization)
- development and integration of inference acceleration techniques: quantization (INT8, FP8), pruning, structured sparsity, knowledge distillation
- use and modification of ML compilers (TorchDynamo, TorchInductor, TensorRT, and others) for optimizing computational graphs
- collaboration with the CUDA operators and Distributed Learning teams to maximize GPU performance
- design and execution of model compression experiments and comparative analysis of speed/quality trade-offs
Requirements
- expert-level Python and PyTorch
- experience with ML compilers and optimization of inference and training
- deep understanding of quantization, distillation, and sparsification methods
- skills in performance profiling and optimization (PyTorch Profiler, Nsight Systems, perf)
- understanding of modern LLM and Diffusion model architectures
Bonus: experience optimizing for CPU/ASIC/FPGA, publications at NeurIPS/ICML/MLSys, knowledge of C++.
Conditions
- comfortable modern office near Kutuzovskaya metro station
- hybrid work format
- annual salary review, quarterly and annual bonuses
- corporate gym and recreation areas
- more than 400 educational programs from SberUniversity for professional and career development
- onboarding program and support from your manager when you start
- extended private health insurance, discounted insurance for family members, and a corporate pension program
- preferential mortgage rates for every employee, lower by up to 7%
- free SberPrime+ subscription, discounts on partner company products
- referral bonus for recommending friends to the Sber team