
Senior Customer Reliability Engineer - Infrastructure
3 days ago · Astronomer
IE · Full-time · €120,000 – €160,000
About this role
Astronomer empowers data teams to bring mission-critical software, analytics, and AI to life and is the company behind Astro, the industry-leading unified DataOps platform powered by Apache Airflow®. The Customer Reliability Engineering (CRE) team is responsible for the success of our customers' usage of our managed Airflow service.
As a senior infrastructure specialist, you will focus on the reliability of the underlying cloud infrastructure and Kubernetes clusters. This entails responding to incidents raised by customers or from our monitoring system, then taking further steps to ensure problems are permanently resolved or monitored.
This role is directly customer-facing and gives exposure to very diverse problems and requirements across different cloud providers and industries. You will own the customer experience, working directly with customers to prioritize and solve issues, meet SLAs, and provide white-glove guidance on the path to production.
As owners of the observability platform, the CRE team is well placed to improve the reliability of the product and deliver the best possible outcome for our customers. Your contributions will directly impact customers' success with Astronomer products, and you will help make meaningful improvements to the customer experience.
Requirements
- 6 years of experience, preferably with large, complex cloud infrastructures operating at scale.
- 4 years of experience with Kubernetes.
- Experience managing a production distributed system on at least one major cloud provider (AWS, GCP, or Azure).
- Strong Linux experience.
- Knowledge of how to operate distributed systems and monitor them for issues.
- Previous experience in handling customer issues (internal or external).
- DevOps or CI/CD experience.
- Python scripting.
Responsibilities
- Provide solutions to customers to make them successful using our products.
- Troubleshoot customer environments and engage in active triaging with customers.
- Participate in the on-call rotation for weekend coverage.
- Build out our monitoring and alerting systems.
- Build and maintain automation to ensure daily operational tasks are handled as efficiently as possible.
- Help direct the architecture of the products and contribute where possible.
- Own the customer experience, working directly with customers to prioritize and solve issues, meet SLAs, and provide white-glove guidance on the path to production.
- Enhance and enrich customer documentation.
Benefits
- Work with the latest technology and multi-cloud implementations.
- Participate remotely within a fully distributed team.
- Directly impact customer success and product reliability.
- Exposure to diverse problems and requirements across different cloud providers and industries.