Analyst-Developer for Autonomous Vehicles
We are the data mining group for autonomous vehicles: we measure the quality of our technologies. Our main task is to collect datasets for every vehicle component, from localization to motion planning. So why isn't this just a simple train_test_split?
Two main reasons:
The vast majority of driving scenarios are "repetitive" and "boring." But as the technology improves, we need to distinguish ever finer changes, because each next step is harder to achieve yet carries more weight. One way to solve this is to artificially increase the proportion of rare scenarios without losing the ability to look at random intervals. The basic idea is similar to Active Learning, which we also plan to adopt. Incidentally, defining "repetitiveness" and "interestingness" is another of our tasks.
An autonomous vehicle has five components responsible for autonomy. There are also many adjacent tasks that do not serve autonomy at runtime but whose quality still needs to be measured, for example the quality of driving simulation. To determine the right direction of development and help teams adjust it, we collect datasets separately for each major task.
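The rare-scenario idea from the first reason can be illustrated with a minimal sketch. It is not the team's actual pipeline: the function name, the `is_rare` predicate, and the segment format are all hypothetical. Rare segments are oversampled to a chosen share, and each draw carries an importance weight so that weighted averages still estimate metrics over the original, mostly "boring" distribution.

```python
import random


def sample_with_rare_boost(segments, is_rare, rare_share, k, seed=0):
    """Draw k segments with replacement so that about `rare_share` are rare.

    Returns (segment, weight) pairs. The weight is the natural probability
    of the segment divided by its sampling probability (importance weight).
    """
    rng = random.Random(seed)
    rare = [s for s in segments if is_rare(s)]
    common = [s for s in segments if not is_rare(s)]
    if not rare or not common:
        raise ValueError("need at least one rare and one common segment")
    n = len(segments)
    picked = []
    for _ in range(k):
        if rng.random() < rare_share:
            pool, share = rare, rare_share
        else:
            pool, share = common, 1.0 - rare_share
        seg = rng.choice(pool)
        # Natural probability is 1/n; sampling probability is share/len(pool).
        weight = (1.0 / n) / (share / len(pool))
        picked.append((seg, weight))
    return picked


# Hypothetical data: 95 ordinary segments and 5 rare ones.
segments = [("cruise", i) for i in range(95)] + [("cut_in", i) for i in range(5)]
sample = sample_with_rare_boost(segments, lambda s: s[0] == "cut_in", 0.5, 1000)

# Weighted estimate of how often the rare scenario occurs in real driving.
natural_rate = sum(w for s, w in sample if s[0] == "cut_in") / len(sample)
```

About half of the sample is rare, which makes small changes on rare scenarios easier to see, while `natural_rate` stays close to the true 5% share because of the weights.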
Data Mining
Our group's main task is to help teams improve technology quality. We collect datasets for training and quality measurement, find growth points and devise scaling methods, actively use LLMs, VLMs, open-source and proprietary models. This is a creative task that requires a love for digging into data and good communication skills: to understand what adjacent teams want and to explain what they actually need.
Determining "interestingness" and "repetitiveness" of segments from real (and simulated) drives
Not all segments are equally useful, and running simulations on identical trips is a complete waste of resources. For similar inputs, we get similar outputs and no useful information. By correctly deduplicating and weighting trips, we can save computational and time resources, meaning we can simulate more data!
An important nuance: "interestingness" and "repetitiveness" depend on the component. For example, localization pays attention to the location and objects around the autonomous vehicle, while perception focuses on obstacles and agents, trying to predict their behavior and impact on the vehicle.
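Deduplication and weighting that depend on the component can be sketched as below. This is an illustration under assumptions, not the group's method: the greedy cosine-similarity rule, the 0.95 threshold, and the segment fields `scene` and `agents` are all hypothetical. Each component supplies its own feature function, so the same drives collapse differently for localization and for perception.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def deduplicate(segments, featurize, threshold=0.95):
    """Greedy dedup: keep a segment unless it is too similar to a kept one.

    Returns (representative, weight) pairs, where weight counts how many
    original segments the representative stands for.
    """
    kept = []  # entries are [segment, feature_vector, weight]
    for seg in segments:
        vec = featurize(seg)
        for entry in kept:
            if cosine(vec, entry[1]) >= threshold:
                entry[2] += 1  # near-duplicate: raise the representative's weight
                break
        else:
            kept.append([seg, vec, 1])
    return [(seg, weight) for seg, _, weight in kept]


# Hypothetical segments with separate feature vectors per component.
drives = [
    {"id": "a", "scene": [1.0, 0.0], "agents": [1.0, 0.0]},
    {"id": "b", "scene": [1.0, 0.01], "agents": [0.0, 1.0]},
    {"id": "c", "scene": [0.0, 1.0], "agents": [0.0, 1.0]},
]

for_localization = deduplicate(drives, lambda s: s["scene"])
for_perception = deduplicate(drives, lambda s: s["agents"])
```

Here drives "a" and "b" are duplicates for localization but not for perception, where "b" and "c" collapse instead. Simulating only the representatives and reusing the weights in the metric saves compute without discarding information about how frequent each situation is.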
More about analytics at Yandex — in the channel Yandex for Analytics
Experience: 3-5 years
Employment: Full-time
Work Format: Hybrid, Onsite
Specialization: Data Analytics
Industry: Robotics
Company Type: Corporation