Description

We are looking for an experienced and motivated team lead to head a team of administrators (Infrastructure Engineers) responsible for the reliability, performance, and development of Sber's business-critical HR platform services. You will be both a technical leader and a mentor for the team, responsible for process quality and the strategic development of the infrastructure.

Responsibilities

Building, developing, and motivating a team of administrators (Infrastructure Engineers).
Setting goals (OKRs), planning and distributing tasks, conducting regular 1:1 meetings and performance reviews.
Continuously improving support, monitoring, and automation processes.
Troubleshooting complex problems in distributed high-load systems.
Analyzing incidents and developing recommendations to improve the fault tolerance, scalability, and performance of the HR platform.
Developing proactive and reactive monitoring and creating effective SLO-based alerts.
Participating in the architecture design of new services, taking reliability and operational requirements into account.
Participating in developing and implementing the reliability and performance strategy for key services.
Collaborating closely with development, testing, and product teams throughout the service lifecycle.
Working with the HR platform's support and maintenance teams (SRE, DBA, DevOps).
Working with the bank's infrastructure support teams.

Requirements

At least 3 years of experience managing a Dev/DevOps/SRE/Infrastructure team (task setting, motivation, development, hiring).
At least 5 years of hands-on experience as an Infrastructure/DevOps Engineer or SRE.
Deep understanding and practical application of SRE (Site Reliability Engineering) philosophy and practices.
Expert troubleshooting skills in complex distributed systems.
Experience building, scaling, and supporting high-load, fault-tolerant systems.
Proficiency with key automation tools: Ansible, Terraform.
Deep knowledge of containerization and orchestration: Docker, Kubernetes (OpenShift).
Solid knowledge of at least one automation language: Python, Go, Ruby, or Bash.
Experience with monitoring and visualization systems: Prometheus, Grafana, Zabbix, Dynatrace.

Tech Stack

Linux: RHEL
Docker, Kubernetes, OpenShift (CRI, CNI, CSI)
Nginx, Envoy, OpenResty
Kafka
PostgreSQL, Redis, ClickHouse
Vault, Consul SD
ELK, Fluentd, Fluent Bit
Prometheus, Grafana, Zabbix, Dynatrace
Jenkins, GitLab (Drone, Gitea, Bitbucket)
Python, Ruby, Bash, Groovy, Go
Ansible, Terraform

Conditions

A well-equipped office (AgileHome) near Kutuzovskaya metro station: cafeterias, numerous cafes, kitchens with refrigerators and coffee machines, a free gym, free underground parking, and recreation areas with table tennis, several PlayStation consoles, foosball, and billiards
A competitive salary (base pay + bonuses)
An opportunity to work with a modern technology stack
Benefits package (voluntary health insurance)
A huge catalog of educational programs, opportunities for training and certification at the company's expense
A preferential lending program at Sber
Discount programs from numerous partner companies
