We are the platform services team at Kaspersky Lab. We are looking for an engineer to join us and take responsibility for supporting and developing our Ceph-based storage systems.
The main consumer of the storage service is a private cloud based on OpenStack.
What you will be doing:
- Administering existing clusters, deploying new ones;
- Participating in the development of a Ceph-based data storage service accessed via the RBD and S3 protocols;
- Automating your tasks using Juju/Ansible;
- Conducting load testing and debugging of Ceph-based storage systems;
- Ensuring regular patch management;
- Debugging running clusters, identifying and resolving potential issues;
- Proposing solutions to current problems and infrastructure bottlenecks, researching and implementing new tools.
What we expect from the candidate:
- Extensive experience in administering and troubleshooting Linux systems;
- Experience in deploying and administering large-scale Ceph clusters;
- Ability to install and administer the Ceph Object Gateway with access via the S3 and Swift protocols;
- Ability and desire to automate your work using Ansible;
- Good understanding of networking principles and protocols;
- Knowledge of how replication, scaling, and fault tolerance are organized;
- Experience administering Ceph in conjunction with OpenStack is desirable;
- Broad IT knowledge;
- Communication skills, independence, and a high level of organization.