Job Summary
We are looking for a strategic and hands-on Operations Lead to manage the resilience, performance, and cost-efficiency of our Azure-based data platform. This role requires a unique combination of platform reliability management, incident response, SLA ownership, and financial operations (FinOps), alongside deployment oversight and CI/CD pipeline management. As the main point of contact for operational issues, you will coordinate outage responses, drive platform optimizations, and lead FinOps discussions with internal stakeholders to ensure smooth and efficient data platform operations.
Key Responsibilities
Manage the daily operations of our data platform, which includes Azure Synapse, Azure Databricks, Azure Data Factory (ADF), and Power BI. You will lead incident triage and root cause analysis for data quality issues, delays, and outages, ensuring timely communication with all relevant stakeholders.
Continuously monitor and optimize pipeline performance, compute workloads, Power BI refresh cycles, and overall platform resource usage. Define, implement, and enforce Service Level Agreements (SLAs) for critical datasets, pipelines, and reporting assets to maintain high operational standards.
Facilitate FinOps forums with business teams to review platform usage, identify cost-saving opportunities, and enhance financial accountability. Track, report, and optimize expenditures related to Databricks clusters, Synapse pools, ADF activity runs, and Power BI consumption.
Oversee and improve CI/CD pipelines for deploying data pipelines (ADF), Databricks notebooks, and Power BI assets. Collaborate closely with engineering teams to ensure safe, automated, and compliant releases of data workflows and platform updates. You will own version control hygiene, release management standards, and rollback procedures for critical platform changes.
Implement observability solutions using Azure Monitor, Log Analytics, or other native tools to proactively detect and resolve issues. Develop operational dashboards and alerts, and automate routine failure recovery tasks wherever possible.
Maintain comprehensive runbooks, escalation protocols, and incident management playbooks to streamline operational responses. Work in partnership with data engineering and analytics teams to align operational strategies with business priorities and the overall platform roadmap.
Required Qualifications
- Minimum of 8 years’ experience in data platform or data operations roles, with at least 2 years in a leadership or strategic capacity supporting modern cloud-based data platforms.
- Strong hands-on expertise with Azure Synapse, Databricks, Azure Data Factory, and Power BI.
- Proven experience managing CI/CD workflows for data deployments using Azure DevOps, GitHub Actions, or similar tools.
- Familiarity with Infrastructure-as-Code, release automation, and rollback planning.
- Solid understanding of cloud-native monitoring and incident management best practices.
- Excellent communication skills, especially during high-pressure operational incidents.
- Strategic mindset with the ability to link technical performance to business value and cost considerations.
- Strong documentation skills, process ownership, and problem-solving abilities.
Preferred Qualifications
- Experience with Azure Purview, data governance tools, or automated data quality checks.
- Knowledge of ITIL or formal service management frameworks.
- Certification or practical experience in FinOps or Azure cost management.
- Awareness of GDPR, HIPAA, or other relevant data compliance standards.
Benefits
Our team consists of over 700 professionals working on innovative enterprise projects and products, serving a diverse customer base including Fortune 100 retail and CPG companies, leading store chains, fast-growing fintech firms, and Silicon Valley startups.
Confiz stands out through its commitment to robust processes and a vibrant culture. We hold certifications in ISO 9001:2015 (Quality Management), ISO 27001:2022 (Information Security), ISO 20000-1:2018 (IT Service Management), and ISO 14001:2015 (Environmental Management). We foster a collaborative learning environment and prioritize making the workplace enjoyable.
Joining Confiz means working with cutting-edge technologies while contributing to both company success and your personal growth.