Description
We are developing Russia's first World Model, built on the Kandinsky video generation framework. The input can be a description, a frame, a script, a set of actions, or instructions; the output is an interactive scene. The focus is on developing new architectures and techniques, training large models (tens to hundreds of billions of parameters), and optimizing inference.
Responsibilities
- research and development in video generation and world models (studying existing architectures and developing new ones)
- working across the full project cycle: reading papers and forming hypotheses, planning and designing experiments (including data collection and preparation where needed), processing and analyzing results, defending proofs of concept, presenting results at review meetings, and writing papers and technical literature
- working on topics such as improving the physical realism of generation, measuring and improving spatiotemporal consistency, incorporating control context into generation, optimizations that improve autoregressive stability and speed up generation, and other world model modules
- collaborating within the World Model team and with other teams responsible for data collection and preparation, pre-training, post-training, optimization, open source, production, and more
- working with partner teams to adapt and deploy the model (autonomous vehicles, robotics, the video game industry).
Requirements
- expert-level Python and PyTorch
- deep understanding of ML/DL/CV and visual GenAI
- experience with classical image and video processing tasks
- experience processing large video datasets
- experience with diffusion models
- understanding of training and distributed training methods
- understanding of modern LLM and diffusion model architectures
- understanding of video quality assessment metrics.
Bonus
- understanding of 3D and its connection to images/video (point clouds, depth maps, voxels, meshes, NeRF, Gaussian splats, etc.)
- understanding of RL principles
- understanding of digital signal processing, compression, and enhancement methods
- understanding of autonomous driving or robotics concepts
- ability to explain complex concepts in simple terms.
Conditions
- comfortable modern office near Kutuzovskaya metro station
- hybrid work format
- annual salary review, quarterly and annual bonuses
- corporate gym and relaxation areas
- access to over 400 educational programs from SberUniversity for professional and career development
- onboarding program and manager support at the start
- extended voluntary health insurance, preferential family insurance, and corporate pension program
- mortgage terms up to 7% more favorable, available to every employee
- free SberPrime+ subscription, discounts on products from partner companies
- referral bonus for recommending friends to the Sber team