You will play a critical role in setting observability standards and driving automation within our engineering teams. Your responsibilities will include managing and configuring the Datadog observability platform using Infrastructure-as-Code (IaC) practices. 

This is a hands-on role centered on end-to-end visibility into Java applications, Kubernetes workloads, and containerized infrastructure, with an emphasis on scaling observability efficiently and cost-effectively.

You will collaborate closely with Site Reliability Engineering (SRE), DevOps, and Software Engineering teams to standardize monitoring and logging practices, contributing to the development of scalable and reliable observability solutions.

Key Responsibilities:

  • Define and implement observability standards for Java applications, Kubernetes workloads, and cloud infrastructure.
  • Configure and manage Datadog using Terraform and IaC best practices.
  • Lead the adoption of structured JSON logging, distributed tracing, and custom metrics across Java and Python services.
  • Optimize Datadog usage with cost governance strategies, log filtering, sampling, and automated reporting.
  • Collaborate with Java developers and platform engineers to standardize instrumentation and alerting.
  • Troubleshoot issues related to logs, metrics, and traces, ensuring proper instrumentation and data flow into Datadog.
  • Participate in incident response activities, providing insights for actionable alerting, root cause analysis (RCA), and reliability improvements.
  • Serve as the main point of contact for Datadog-related inquiries and internal support.
  • Continuously audit and improve monitor configurations, reducing false positives and improving alert quality.
  • Maintain clear documentation on Datadog usage, standards, integrations, and IaC workflows.
  • Evaluate and suggest improvements to the observability stack, including new Datadog features and OpenTelemetry adoption.
  • Mentor engineers and develop training programs on Datadog, observability-as-code, and log pipeline architecture.

Qualifications:

  • Bachelor’s degree in Computer Science, Engineering, Mathematics, Physics, or a related technical field.
  • 5+ years of experience in DevOps, Site Reliability Engineering, or related roles, with a strong focus on observability and Infrastructure-as-Code.
  • Extensive experience managing and scaling Datadog programmatically through Terraform, APIs, and CI/CD workflows.
  • Deep knowledge of Datadog features including APM, logs, metrics, tracing, dashboards, and audit trails.
  • Experience integrating Datadog observability into CI/CD pipelines (e.g., GitLab CI, AWS CodePipeline, GitHub Actions).
  • Strong understanding of AWS services and best practices for monitoring Kubernetes infrastructure.
  • A background in Java application development is preferred.

Why Join Us?

  • Join a dynamic team where you’ll shape the future of observability and make a real impact on system reliability and performance. 
  • You’ll work with cutting-edge technologies, lead key initiatives, and mentor engineers. 
  • We offer a competitive salary of up to USD 1,000 per month.
  • Enjoy a collaborative, growth-focused environment where your contributions truly matter.

Job Details

Job Channel:
Total Positions: 2 Posts
Job Shift: First Shift (Day)
Job Type:
Job Location: Johar Town, Lahore, Pakistan
Gender: Female
Minimum Education: Bachelors
Career Level: Experienced Professional
Minimum Experience: 5 Years
Apply Before: Jun 07, 2025
Posting Date: May 06, 2025

Staff Link

Information Technology · 1-10 employees - Lahore
