
Data Platform SRE

 

 

DATA PLATFORM SRE

 

We are looking for a Data Platform SRE to be part of our Nestlé Nespresso Digital and Tech Team. At Nespresso, our Digital & Tech teams are at the heart of our innovation journey, a space where we continue to invest, evolve, and grow.



 

Position Snapshot:

 

  • Location: Bengaluru, Karnataka, India
  • Type of Contract: Permanent
  • Grade: Band 2
  • Type of work: Hybrid
  • Work Language: Fluent Business English

 

 

The Role:

 

As a Data Platform Site Reliability Engineer (SRE), you will be responsible for ensuring the operational observability, reliability, and performance of Nespresso’s enterprise Data Platform. You will design and implement monitoring, alerting, reporting, and operational controls across Azure, Snowflake, Airflow, and Kubernetes environments, enabling proactive management of platform health, pipeline performance, data quality, and cost efficiency.

 

Working closely with Data Engineering, Platform Engineering, Analytics, and Operations teams, you will establish actionable observability frameworks, operational standards, and automated response mechanisms that ensure the platform remains reliable, scalable, and cost-effective. You will play a critical role in driving operational excellence by transforming platform telemetry into meaningful insights, enabling faster issue detection, resolution, and continuous improvement across the data ecosystem.

                                                                                     

 

In This Role, You Will:

 

  • Design, implement, and maintain monitoring, reporting, and alerting capabilities across Azure, Snowflake, Airflow, and Kubernetes environments.
  • Build and manage observability dashboards that provide actionable visibility into platform health, data pipeline performance, SLA compliance, resource utilization, and operational trends.
  • Develop monitoring frameworks and operational controls for Snowflake warehouse utilization, query performance, and cost optimization initiatives.
  • Establish and maintain alerting mechanisms for infrastructure risks, service degradation, pipeline failures, data quality issues, and cost anomalies.
  • Monitor and improve Kubernetes cluster health by implementing proactive monitoring patterns for node, pod, and resource utilization metrics.
  • Define, implement, and operationalize technical data quality monitoring aligned to key dimensions such as accuracy, completeness, consistency, timeliness, and uniqueness.
  • Collaborate with engineering teams to integrate alerts and operational events into incident management and operational tooling, including automated routing and escalation workflows.
  • Support observability and monitoring requirements for ML and AI workloads, including model performance monitoring, drift detection, and operational maintenance triggers where applicable.    

 

 

What We’re Looking For:

 

  • Proven experience in Site Reliability Engineering (SRE), Platform Operations, Data Operations, DevOps, or Data Platform Engineering roles.
  • Strong hands-on experience with monitoring and observability platforms such as Power BI, Grafana, Azure Monitor, Log Analytics, or similar technologies.
  • Experience designing and implementing monitoring, reporting, and alerting frameworks within Azure cloud environments.
  • Strong knowledge of Snowflake platform monitoring, warehouse utilization tracking, performance analysis, and cost optimization practices.
  • Experience monitoring data pipelines, managing SLA/SLO reporting, and implementing operational controls for data platforms.
  • Practical experience implementing technical data quality frameworks and monitoring across key data quality dimensions.
  • Experience monitoring Kubernetes environments, including cluster health, node performance, pod health, resource utilization, and operational alerting.
  • Hands-on experience monitoring workflow orchestration platforms such as Apache Airflow, including workflow health, failures, latency, and throughput analysis.
  • Proficiency in Python scripting and automation for monitoring, reporting, operational tooling, and process improvement initiatives.
  • Strong analytical and problem-solving skills with the ability to investigate incidents, identify root causes, and drive operational improvements.
  • Experience integrating alerts into incident management, operational processes, and automated remediation workflows.
  • Excellent communication, documentation, and stakeholder management skills, with the ability to create operational standards, runbooks, and monitoring best practices.

 

Extra Skills That Set You Apart:

 

  • Experience monitoring MLOps environments, including model drift detection, model performance monitoring, and operational lifecycle management.
  • Knowledge of FinOps practices, cloud cost management, and chargeback/showback models across Azure and Snowflake environments.
  • Experience building enterprise-wide observability frameworks, operational dashboards, and self-service monitoring capabilities for data platform users and engineering teams.

 

We Offer You:

 

We offer more than just a job. We put people first and inspire you to become the best version of yourself.

  • Flexible work policies including core hours and options for working from home. Discuss with us during the recruitment process to understand what flexibility could look like for you!
  • Genuine opportunities for career and personal development through ongoing training and continuous career opportunities, reflecting our conviction that people are our most important asset.
  • Modern "smart office" locations providing agile workspaces. Our state-of-the-art campus is equipped with areas to co-create, network, and chill!
  • International, dynamic & inclusive working environment with attractive additional benefits.
  • The pride of working for a B Corp certified company and one of the world’s most trusted brands.

 

 

The Hiring Process:

 

  • Your Application: Submit your application, and we'll review it carefully (make sure your CV is in English as the hiring team is international).
  • Initial Screening: Relevant candidates will be contacted by our Talent Acquisition team for an initial interview. 
  • Hiring Manager Interview: Selected candidates will then meet with the hiring manager to discuss the role and their experience in more detail. 
  • Stakeholder Interview: Candidates will engage with potential team members to assess fit and collaboration. 
  • Leadership & HRBP Interaction: Candidates will have a discussion with our leadership team & HRBP. 
  • Feedback: After interviews, we provide feedback to all candidates. 
  • Job Offer: Successful candidates will receive a formal offer. 
  • First Working Day: Once the offer is accepted, we’ll welcome you on your first day!

 

 

About Nespresso: 

 

The Nespresso story began with a simple but revolutionary idea: enable anyone to create the perfect cup of espresso coffee.

 

Since 1986, Nespresso has redefined and revolutionized the way millions of people enjoy their coffee.

 

We are a company committed to tackling climate change, and we aim to achieve carbon neutrality as soon as possible and net-zero GHG emissions by 2050 at the latest.

In 2019 we created the digital hub in Barcelona to offer the best customer experience and innovation to B2C and B2B channels.

 

We encourage diversity among applicants across gender, age, ethnicity, nationality, sexual orientation, social background, religion or belief, and disability.

People are at the heart of our success – all 14,000 of them. We actively cultivate diversity, inclusion and belonging in the workplace. We celebrate individuality, believing that your authenticity and uniqueness can help us to grow and thrive together.

Step outside your comfort zone; share your ideas, way of thinking and working to make a difference to the world, every single day. You own a piece of the action – make it count.

Join Nestlé #beaforceforgood

 

 

 


 

Bangalore, IN, 560103
