❇️ Data Platform Engineer ❇️ | Top Selection Company
🔥 We are looking for a Data Platform Engineer / Big Data SRE (Linux) for a project-based engagement
Grade: Middle+ / Senior
Rate: from 256K to 280K
Citizenship/Location: Russian Federation (RF)
Workload: full-time
Term: long-term
Employment type: Sole Proprietor only 📌
📌 Project:
Operation and development of a large corporate data platform in an industrial company.
✅ Mandatory:
- 3+ years of experience administering Data Platform / Big Data / DWH systems
- Strong knowledge of Linux (RHEL/CentOS/Ubuntu) at the system administration level
- Practical experience operating production clusters: Arenadata DB / Greenplum (or similar MPP systems), Apache Kafka (administration experience is mandatory, not just usage), ClickHouse
- Understanding of distributed systems: replication, partitioning, fault tolerance, network interactions (TCP/IP), storage
- Experience with PostgreSQL / Greenplum architecture and SQL query optimization
- Practical experience with Kafka: configuring topics and retention policies, working with replication and partitioning, tuning producer/consumer performance
- Automation skills: Bash and/or Python, Ansible and/or Terraform
- Monitoring experience: Prometheus, Grafana, ELK / OpenSearch
- Incident management experience: L2/L3 support, root cause analysis
➕ Desirable:
- Docker / Kubernetes
- Hadoop ecosystem
- Spark / Flink
- Airflow
- Experience with high-load enterprise DWH
- Experience in the industrial sector, telecom, or fintech
📝 Tasks:
Platform Operation and Development
- Administration and support of Data Platform (Arenadata DB / Greenplum, Kafka, ClickHouse)
- Management of data storage and processing clusters
- Configuration and maintenance of high availability (HA), replication, load balancing
- Management of updates, patches, and releases
Kafka and Streaming Data
- Administration of Kafka clusters (topology, partitioning, replication, retention)
- Tuning of producers/consumers, managing consumer lag and performance
- Support of streaming ETL and ingestion pipelines
Integrations and Data Handling
- Integration of the platform with DWH, BI, and ML systems
- Support and development of data transfer and processing pipelines between systems
- Helping ensure the stability of data pipelines
Performance and Reliability
- Platform monitoring (Prometheus, Grafana, ELK/OpenSearch)
- Performance analysis (SQL, storage, network)
- Optimization of queries and cluster operations
- Incident response (L2/L3), root cause analysis
Infrastructure and Automation
- Automation of operations (Bash / Python / Ansible / Terraform)
- Development of CI/CD for data infrastructure
- Configuration management (Infrastructure as Code)
Access and Security
- RBAC configuration and access control