ML Developer for the Machine Learning Quality Group of the e-com Content System
Search technologies are the DNA of the Search business group. Already, every fifth query in Search is about choosing a product, and this scenario generates 40% of the profit. We are building a tool that searches for information across all possible online stores (more than 60,000 of them). The plan is to build a convenient AI consultant into the tool: it will help users compare products by their characteristics or decide where best to buy.
We are looking for an ML developer to develop and support product data processing: matching, deduplication, and the creation of new product cards.
Setting up the product matching process
Our team prepares the data that makes it possible to compare product prices across the entire Russian e-com catalog. This task is called matching: you take two product cards from different sellers and determine whether they describe the same product or different ones. The difficulty is that solving it requires taking all product data into account (images, descriptions, attributes) and training models stable enough to work correctly both on popular devices such as the iPhone 16 Pro Max and on niche items such as plumbing pipes.
Supporting the product deduplication process
Matching involves two types of data: the product (SKU) and the offer. An SKU is the internal representation of a product, the polished card that the user sees in the interface. An offer is a specific seller's proposal to sell a product. Good matching requires a high-quality SKU database without duplicates. The difficulty lies in the higher requirements for model quality: if you mistakenly declare two SKUs to be duplicates, you risk merging offers for products with different prices into a single set. Your job will be to prevent this.
Creating SKU cards
The most complex task is the automatic creation of SKU cards. You will need to create new SKUs from offers on various e-com platforms, with cards that contain the most detailed product information and attractive, relevant pictures. The difficulty is twofold: first, not creating a new SKU when a suitable one already exists in the database; second, filling the SKU card with information combined from several offers, which is sometimes contradictory.
Read more about ML at Yandex in the Yandex for ML channel.
Experience: 3-5 years
Employment: Full-time
Work Format: Hybrid, Onsite
Grade: Senior
Specialization: Data Science & ML
Industry: Ecommerce
Company Type: Corporation