Description

We develop LegalTech products based on state-of-the-art NLP models. Every day, our solutions analyze hundreds of types of legal documents, extract knowledge from them, and help people make responsible decisions based on that knowledge. This streamlines the bank's handling of legal risk in credit transactions while keeping human involvement to a minimum.

Responsibilities

— Solving complex problems in building a range of legal AI services on top of the GigaChat LLM;

— Researching and selecting advanced solutions, quickly assessing their effectiveness, estimating the resources they require (data, compute), testing hypotheses, and drawing up an implementation plan;

— Formulating annotation requirements (question-answer pairs and chats) for fine-tuning LLMs on the specialized legal domain, and working with lawyers and annotation specialists;

— Improving LLM generation quality with advanced prompting techniques (CoT, ToT, ReAct, Planning, etc.);

— Creating custom AI agents that solve legal tasks step by step;

— Participating in the creation of a specialized legal benchmark for assessing LLM capabilities;

— Taking the time to understand the nuances of our domain.

Requirements

— At least 3 years of experience developing NLP models;

— Knowledge of advanced approaches and the ability to explain them to the team;

— Excellent knowledge of Data Science fundamentals, from linear algebra and probability theory to DNNs and RLHF;

— Understanding of the architecture, operating principles, and training of large language models (LLMs) and transformer models such as GPT and BERT;

— Understanding of the principles of training and applying Reinforcement Learning models;

— Understanding of the main Machine Learning methods (regression, clustering, decision trees, etc.) and a confident sense of when to apply them and when not to;

— Ability to assess the computational complexity of the entire pipeline and apply classical algorithms to reduce it;

— Willingness to take on non-standard, complex tasks;

— Ability to test hypotheses quickly with limited resources and scale successful solutions;

— Ability to track a solution's progress with metrics;

— Willingness to work in a team and use Git, Jira, Confluence, and other teamwork tools;

— A high degree of self-organization.

Conditions

— Highly engaging NLP tasks in one of the most demanding subject areas (GPT + legal domain);

— Opportunities for learning and development, participation in Sber conferences;

— A cozy office with cookies and other amenities;

— Social benefits package (voluntary health insurance, fitness, discounted insurance).
