Multimodal VLM Developer (Visual Language Models)

Multimodal models are one of the major trends in deep learning. Our computer vision team builds visual language models (VLMs), which adapt large language models to work with images as well as text.

We are looking for developers who will work on next-generation neural networks and take their solutions all the way to a finished product.

What tasks await you

Train large language models to work with visual information (images and videos)

You will work at the intersection of two fields: computer vision and natural language processing. Building VLMs calls for non-standard technical and architectural solutions.

Create large data pipelines that process internet-scale data

Training VLMs requires a huge amount of data. We are building full-scale pipelines for collecting, processing, and filtering multimodal data.

Optimize large-scale model training and speed up inference

Training a VLM involves many subtleties. To keep it efficient, we spend a lot of time profiling bottlenecks. After training, we also need to make inference for these models fast.

Adapt models to product requirements

Our goal is to integrate VLMs into every Yandex service. That means accounting for the specifics of each task and, most importantly, adapting the model (both architecturally and functionally) to specific requirements.

More about Alice AI

More about ML at Yandex: the Yandex for ML channel

We expect that you

Understand how modern neural network architectures work
Are familiar with large language models
Have worked with large volumes of data
Have trained deep learning models and deployed them to production
Keep up with the latest advances in computer vision and natural language processing (for example, you understand the difference between ViT and ConvNeXt)

Similar vacancies

ML Developer for the VLM Foundations Team

ML Developer for the Visual Yandex Market Quality Team

ML Developer for the Detector Pretraining Subgroup in Autonomous Transport

C++ ML Developer for the Yandex Visual Search Team

ML/C++ Developer for the Yandex Visual Search Team

ML Engineer / Research Engineer (R&D) in Autonomous Transport

Senior ML Engineer (Multimodal LLM; Video Understanding)

ML Developer for the International Search Ranking Team

Computer Vision Developer for the Generative Models Team

Machine Learning Engineer (Robotics/Humanoid AI)

AI Products Developer at Plus Fantech

Senior DL Developer for Neuro Team

Multimodal VLM Developer (visual language models)