Experience: 5 years
Employment: Full-time
Work Format: Hybrid, Onsite
Grade: Middle
Specialization: Backend
Industry: IT & Tech
Company Type: Corporation
C++ Developer for YDB OLAP
YDB is a distributed, scalable, fault-tolerant relational database. It is also a platform: in addition to reliable transactions on tables, we offer users ready-made solutions built on the engine itself, including persistent queues, federated queries, network disks for virtual machines (Yandex Network Blockstore), and more.
YDB is used in Market, Taxi, Bank, Alice, and other Yandex services. It already handles millions of queries per second and reliably stores petabytes of data. The solutions that deliver payment data to Yandex billing and store logs in Yandex Cloud are built on YDB.
One of the platform's development directions is analytical tables, or OLAP (Online Analytical Processing), comparable to Microsoft Azure Data Explorer, Google BigQuery, ClickHouse, and Greenplum. We analyze competitors' solutions and build a product for large and medium-sized companies. Our goal is a tool that helps partners easily, quickly, and reliably keep the metrics that matter to them under control.
We have no shortage of ideas, and we are looking for ambitious colleagues.
Development of YDB analytical tables in general
You will work on efficient data storage and processing, in particular on components such as index building, compaction, predicate pushdown, compression algorithms, and data encodings.
Development of backup infrastructure for columnar tables
Backing up petabytes of data efficiently requires approaches that survive failures and resume from the last saved point. You will also need to solve the problem of data consistency.
Horizontal scaling tasks
Computational efficiency requires horizontal scaling. This includes splitting the tablets of columnar tables so that data ranges can be processed independently and in a distributed manner, without affecting processing that is already running.
Join us, and Scale It Easy!