Storage Format Developer for YTsaurus Dynamic Tables

YTsaurus is a software product for building large data lakes in which data can be processed under different paradigms: both MapReduce (background processing) and NewSQL (real-time). YTsaurus has its own data storage layer and its own storage format implementations, which are efficient on Yandex's real-world data and volumes.

You will work on the data storage layer and adapt it for fast analytics tasks.

What tasks await you

Hierarchical data format

One of the important tasks is to develop a compressed format for hierarchical data that supports both efficient reads of large ranges and fast retrieval of a single document or a part of one.

Such a task requires working with various compression mechanisms as well as low-level engineering work at the CPU and memory access level. You will need both SIMD instructions and code adaptation to the processor's memory hierarchy. We expect that you love algorithms and efficient C++ programming!

Historical data format for analytics tasks

Dynamic tables (our name for the NewSQL component of YTsaurus) traditionally use a data format tailored to transaction processing. History in this format is stored together with timestamps, which makes it possible to provide the snapshot isolation level. For analytics tasks this data is redundant, and simpler formats suit them better. You will need to find a compromise and adapt history storage in dynamic tables so that they are suitable for mixed transactional-analytical workloads.
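To illustrate why timestamped history enables snapshot isolation, here is a minimal sketch in C++. The names `Timestamp`, `VersionedValue`, `History`, and `ReadAtSnapshot` are hypothetical and chosen for this example only; they are not the actual YTsaurus structures. The sketch assumes that the versions of one cell are kept sorted by commit timestamp.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <optional>
#include <string>
#include <vector>

// Hypothetical types for illustration; not the real YTsaurus format.
using Timestamp = std::uint64_t;

struct VersionedValue {
    Timestamp commitTs;  // commit timestamp of the transaction that wrote the value
    std::string data;    // the value written at that timestamp
};

// All versions of one cell, assumed sorted by commitTs in ascending order.
using History = std::vector<VersionedValue>;

// Returns the newest version with commitTs <= readTs,
// or std::nullopt if the cell had no value at that moment.
std::optional<std::string> ReadAtSnapshot(const History& history, Timestamp readTs) {
    // Find the first version written strictly after the snapshot timestamp.
    auto it = std::upper_bound(
        history.begin(), history.end(), readTs,
        [](Timestamp ts, const VersionedValue& v) { return ts < v.commitTs; });
    if (it == history.begin()) {
        return std::nullopt;  // every version is newer than the snapshot
    }
    return std::prev(it)->data;
}
```

A reader that fixes `readTs` once sees a consistent state no matter how many newer versions are written later. An analytical scan that only needs the latest state still has to step over all older versions, which is the redundancy described above.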

Analytical indexes

Analytics uses its own kinds of indexes: SMA (small materialized aggregates) and star-tree. You will need to add them to the data formats, implement their construction, and use them in queries. This task will require diving into the entire SQL query processing cycle.
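As a rough illustration of the SMA idea only (star-tree is not covered), the sketch below keeps a minimum, maximum, and row count per block of a column and uses them to skip blocks during a range query. The names `BlockSma`, `BuildSma`, and `CountInRange`, as well as the fixed block size, are assumptions made for this example and do not describe the YTsaurus implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-block summary; field names are illustrative only.
struct BlockSma {
    std::int64_t min;
    std::int64_t max;
    std::size_t count;
};

// Builds one summary per block of blockSize consecutive values.
std::vector<BlockSma> BuildSma(const std::vector<std::int64_t>& column, std::size_t blockSize) {
    std::vector<BlockSma> result;
    if (blockSize == 0) {
        return result;
    }
    for (std::size_t begin = 0; begin < column.size(); begin += blockSize) {
        const std::size_t end = std::min(begin + blockSize, column.size());
        BlockSma sma{column[begin], column[begin], end - begin};
        for (std::size_t i = begin + 1; i < end; ++i) {
            sma.min = std::min(sma.min, column[i]);
            sma.max = std::max(sma.max, column[i]);
        }
        result.push_back(sma);
    }
    return result;
}

// Counts values in [lo, hi]. Blocks outside the range are skipped, blocks
// fully inside it are answered from the summary, and only the remaining
// blocks are scanned. blocksScanned reports how many blocks were read.
std::size_t CountInRange(const std::vector<std::int64_t>& column,
                         const std::vector<BlockSma>& sma,
                         std::size_t blockSize,
                         std::int64_t lo, std::int64_t hi,
                         std::size_t& blocksScanned) {
    std::size_t matched = 0;
    blocksScanned = 0;
    for (std::size_t b = 0; b < sma.size(); ++b) {
        if (sma[b].max < lo || sma[b].min > hi) {
            continue;  // no value in this block can match
        }
        if (sma[b].min >= lo && sma[b].max <= hi) {
            matched += sma[b].count;  // every value in this block matches
            continue;
        }
        ++blocksScanned;
        const std::size_t begin = b * blockSize;
        for (std::size_t i = begin; i < begin + sma[b].count; ++i) {
            if (column[i] >= lo && column[i] <= hi) {
                ++matched;
            }
        }
    }
    return matched;
}
```

The summaries pay off most when the column is sorted or clustered, because then most blocks fall entirely inside or entirely outside the queried range.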

Read more about backend development at Yandex in the Yandex for Backend channel.

We expect that you

Can program in C++
Know and continue to learn new algorithms
Love diving into the specifics of hardware operation
Want to build a reliable service for users

It will be a plus if you

Have done low-level optimization work, tailoring algorithms to the memory hierarchy and the processor pipeline
Understand compression algorithms, especially fast ones
Are familiar with general principles of DBMS construction and SQL query processing

Similar vacancies

C++ Developer for the YDB String Tables Team

Cloud Solutions Developer on the ClickHouse Platform

C++ Developer for the YQL over YT development group

Developer for the Directory Infrastructure Development Team

C++ Developer for the YDB Distributed Storage Team

Backend Developer for Feature Store

C++ Developer for the YDB Distributed System Infrastructure Team

C++ Developer for YT Flow

Developer for the Automatic Ad Generation Team

C++ Developer at YDB

C++ Developer for the Tablets Team

C++ Developer for the Market Product Card Group
