Experience: 1-5 years
Employment: Full-time
Work Format: Hybrid, Onsite
Grade: Middle
Specialization: Data Science & ML
Industry: AI
Company Type: Corporation
NLP Developer for Alice
Alice Pro is the B2B sibling of the well-known AI assistant Alice. It brings the best of Alice AI into workflows: it enables searching and analyzing data, creating content, and connecting corporate systems.
We aim to become a leading player in the market by providing companies with a comprehensive solution for digitizing workflows. The focus is on implementing on-premise and cloud solutions for corporate customers: the system works with tens of thousands of documents and integrates into existing IT infrastructure.
Our team keeps product quality high by using advanced NLP technologies. Key areas include:
Connecting new sources
Currently, we are good at answering questions based on user-uploaded documents. But let's be honest: for users' everyday work tasks, this is not the most attractive solution. In our own work we use messengers, Confluence, Jira, or their internal analogues, and now we want Alice to learn to work with them.
You will help scale the architecture and the overall approach so that Alice can answer based on any large external B2B database, especially those from existing Yandex 360 solutions.
Improving Excel
RAG systems in general handle tables poorly because of non-obvious retrieval, delimiters, and inputs overflowing with numbers. But we, as ML specialists, have a special place in our hearts for Pandas. It's time to combine the two and solve table tasks automatically through code generation. Why remember import pandas as pd or Excel when you can ask the AI to build a chart right in the chat?
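To make the idea concrete, here is a minimal sketch of answering a table question by running generated pandas code instead of retrieving table text. Everything in it is an illustrative assumption, not a description of how Alice Pro works: the helper name run_generated_code, the convention that generated code stores its answer in a variable called result, and the hard-coded string standing in for model output are all hypothetical.

```python
import pandas as pd


def run_generated_code(df: pd.DataFrame, code: str):
    """Run generated pandas code against a copy of df and return `result`.

    Hypothetical helper: the convention that the code assigns its answer
    to a variable named `result` is an assumption made for this sketch.
    """
    # Work on a copy so the generated code cannot mutate the caller's table.
    namespace = {"pd": pd, "df": df.copy()}
    # NOTE: exec() is NOT a sandbox. A real system would need proper
    # isolation before running model-generated code.
    exec(code, namespace)
    if "result" not in namespace:
        raise ValueError("generated code must assign its answer to `result`")
    return namespace["result"]


# A small table standing in for an uploaded spreadsheet.
sales = pd.DataFrame(
    {
        "region": ["North", "South", "North", "South"],
        "revenue": [100, 150, 200, 50],
    }
)

# Stand-in for model output to "What is the total revenue per region?"
generated = "result = df.groupby('region')['revenue'].sum().to_dict()"

totals = run_generated_code(sales, generated)
print(totals)
```

The point of the design is that the model only has to see the column names and the question, while the numbers themselves stay in the DataFrame, which avoids flooding the prompt with cell values.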
Baseline quality
Broadly speaking, all the projects above are interesting product applications of our technology. But all of them are the result of meticulous and very important work on the baseline quality of answers from documents. We cannot stop developing the core technology for a second.
Our assistant doesn't always compare product specifications well, cannot make a diagnosis in place of a doctor, and may not understand thread designations on a drawing. But which specific model or technology in the OCR/VLM/IR/LLM stack is letting us down? Are you ready to treat this area?