Technical Manager, Observability Platform
Yandex's Observability platform provides a clear and instant answer about the state of systems at any given moment. The platform includes quantitative monitoring, alerting, a notification system, logs, and traces. Almost all Yandex teams use the platform to track the state of their services—both external and internal. In addition, Yandex Monitoring and Yandex Cloud Logging services are available to Yandex Cloud users.
The challenges we face:
Creating a unified platform where services integrate easily and according to common principles, so that users can quickly get an answer about the state of their systems based on all platform data
Developing platform services
Lowering the entry barrier and solving popular user scenarios out of the box: for example, automatically delivering metrics, dashboards, and alerts from services, and building a community and tools for sharing popular solutions
Supporting open-source solutions: Prometheus, Kubernetes, Grafana, and others
We are looking for a technical product or project manager who, together with us, will be responsible for the development of the platform as a whole and its individual areas.
We work in a matrix structure. We have several dedicated teams that jointly work on platform development: backend, frontend, managers, designers, and system administrators.
Providing aggregated insights
The Observability team works with a large volume of raw data. It's important to extract useful information from the entire flow and present it to the user in a convenient format. For example, showing an aggregated picture of the number and nature of errors in the system, as well as their correlation with other data.
Getting telemetry "out of the box"
Our users often rely on standard infrastructure components in their services: databases, agents, load balancers, and more. We make sure telemetry for these components works "out of the box," so no user has to set up monitoring for them from scratch.
Developing client libraries
Ready-made libraries are used to send telemetry to the Observability platform, simplifying connection for users. You will work on developing these libraries, lowering the entry barrier, and expanding the provided functionality.
Experience: 5 years
Employment: Full-time
Work Format: Hybrid, Remote, Onsite
Grade: Middle
Specialization: Product Management
Industry: IT & Tech
Company Type: Corporation