Job Description
DevOps Engineer (Observability Platform)

Key Responsibilities

  • Job Description

    • Plan and execute infrastructure projects that improve observability tools and platforms, including metrics, logging, distributed tracing, dashboarding, alerting, and application performance management.
    • Build and enhance tools, frameworks, and pipelines for monitoring, logging, and tracing across VPBank's systems.
    • Develop and maintain systems that enable our engineering colleagues to run their services confidently and with high quality, and to quickly understand and mitigate technical issues.
    • Work with engineering colleagues to contribute to our observability strategy, with a strong emphasis on open-source libraries, data formats, and query languages.
    • Work with other stakeholders to integrate observability best practices into the development lifecycle and improve overall system resilience.
    • Participate in on-call rotations.
    • Work with the team to define and implement observability standards (e.g., SLIs, SLOs, dashboards).
    • Partner with our observability vendors.

Education

University degree in Information Technology or Computer Science

Requirements

  • Candidate Requirements

    • 3+ years of software engineering experience focused on platform engineering, observability, or site reliability engineering (SRE).
    • Experience building and operating infrastructure at scale.
    • Proficiency in programming languages such as Go, Python, Java, or similar.
    • Hands-on experience with observability and monitoring tools such as OpenText, SolarWinds, Prometheus, Grafana, Loki, Jaeger, or OpenTelemetry.
    • Strong knowledge of distributed systems, microservices architecture, and cloud platforms (e.g., AWS, GCP, Azure).
    • Proficiency in CI/CD tools (Jenkins, GitLab, etc.).
    • Experience administering Linux systems and writing scripts to automate operations.
    • Experience with Infrastructure-as-Code tools such as Terraform and Helm, and with configuration management and orchestration tools such as Ansible.
    • Experience with data tools such as Kafka, ClickHouse, SQL, and Redis.
    • Ability to debug complex systems and design solutions that scale with high traffic and data volumes.