We're Celonis, the global leader in process mining technology, and we’ve delivered enormous value to the world’s largest and most esteemed companies. To continue to help organizations uncover hidden business value opportunities, reduce carbon emissions, and radically improve customer service, we need you to join us.
As part of our scaling Actions Platform team, you'll have a huge impact on helping teams and engineers build and operate resilient, reliable and scalable systems. You'll have ownership over our product's health, ensuring end-to-end availability and peak performance.
We are on the path to providing a first-class service, so we want our product to be super healthy and reliable at all times!
Collaboration is a huge part of our Celonis culture! In this role, you’ll help teams catch issues before they affect customers, and tie reliability to business outcomes.
By helping our product teams understand the reliability of their services and how to improve it, you’ll enable our teams and engineers to build, deliver, and operate resilient, reliable systems.
The work you’ll do
- Take ownership of complex issues related to performance, reliability, and scalability, from idea inception to production, including all required technical and organizational improvements.
- Help our engineering teams gain full control over the stability and performance of their services.
- Support and drive the investigation and resolution of incidents and issues in production.
- Monitor and maintain object and data storage solutions.
- Lead postmortems and root cause analysis to facilitate continuous improvement.
- Design, write, and deliver software that enhances the availability, scalability, and efficiency of our products.
- Proactively identify, plan, and execute improvement opportunities to minimize risks, address recurrent issues, automate manual processes, improve quality, and streamline our software deliveries.
- Provide technical leadership on reliability to engineers, managers, and product managers.
- Improve our monitoring, metrics, and KPIs, as well as define and implement missing SLOs.
- Implement processes and automation to prevent problem recurrence.
- Share acquired knowledge and document accordingly while implementing SRE best practices.
- Guide a technical roadmap for reliability to enable the planning and building of reliable solutions using our infrastructure and developer productivity platform.
The qualifications you need
- Typically 5+ years of experience in software engineering roles.
- Master’s degree in Computer Science or equivalent experience and skill set.
- Experience in developing and running large-scale production services with Docker and Kubernetes.
- Experience working with in-memory data stores (e.g., Redis), RDBMS (e.g., Postgres), AMQP message brokers (e.g., RabbitMQ), and NoSQL stores (e.g., Elasticsearch).
- Experience working with public cloud providers (AWS, Azure, or GCP) and modern cloud monitoring and observability platforms (e.g., Datadog).
- Solid knowledge of scripting languages (e.g., Bash, Python, Ruby…).
- Proven problem-solving skills and the ability to troubleshoot complex technical issues.
- Deep commitment to maintaining high system reliability and availability.
- Experience in supporting or mentoring other developers in running services reliably in production.
- Excellent communication and collaboration skills to work effectively with cross-functional teams.
What Celonis can offer you:
- The unique opportunity to work within a new category of technology, Process Intelligence
- Investment in your personal growth and skill development (clear career paths, internal mobility opportunities, L&D platform, mentorships, and more)
- Great compensation and benefits packages (equity (restricted stock units), life insurance, time off, generous leave for new parents from day one, and more)
- Physical and mental well-being support (subsidized gym membership, access to counselling, virtual events on well-being topics, and more)
- A global and growing team of Celonauts from diverse backgrounds to learn from and work with
- An open-minded culture with innovative, autonomous teams
- Business Resource Groups to help you feel connected, valued and seen (Black@Celonis, Women@Celonis, Parents@Celonis, Pride@Celonis, Resilience@Celonis, Asians@Celonis, Latinx@Celonis, Veterans@Celonis, and more)
- A clear set of company values that guide everything we do: Live for Customer Value, The Best Team Wins, We Own It, Earth Is Our Future
Since 2011, Celonis has helped thousands of the world’s largest and most esteemed companies yield immediate cash impact, radically improve customer experience, and reduce carbon emissions. Its Process Intelligence platform uses industry-leading process mining technology and AI to present companies with a living digital twin of their end-to-end processes. For the first time, everyone in an organization has a common language for how the business runs, visibility into where value is hiding, and the ability to capture it. Celonis is headquartered in Munich, Germany, and New York City, USA, with more than 20 offices worldwide.
Celonis is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment and equal opportunity in all aspects of employment. We will not tolerate any unlawful discrimination or harassment of any kind. We make all employment decisions without regard to race/ethnicity, color, sex, pregnancy, age, sexual orientation, gender identity or expression, transgender status, national origin, citizenship status, religion, physical or mental disability, veteran status, or any other factor protected by applicable anti-discrimination laws. As a US federal contractor, we are committed to the principles of affirmative action in accordance with applicable laws and regulations. Different makes us better.