Site Reliability Engineer

Dunelm

Multiple locations
Remote or hybrid

Description

About the Role
We’re looking for a Senior Site Reliability Engineer to help scale our observability practices, improve system reliability, and drive high availability across our technology platform.
You’ll work closely with engineering and platform teams, combining software engineering skills with systems thinking to build reliable, observable, and efficient services.
What You’ll Do
Lead Observability: Evolve how we collect and process telemetry using OpenTelemetry. Focus on actionable, cost-effective metrics and traces.
Define Reliability Standards: Help teams adopt SLIs/SLOs and build a culture of service ownership.
Incident Response: Take charge during incidents, lead post-incident reviews, and drive long-term fixes.
Infrastructure Automation: Use tools like Pulumi, Terraform, and CDK to manage AWS infrastructure and CI/CD pipelines.
Write Code: Build tooling and automation using TypeScript (and optionally Rust). Contribute to shared libraries and platforms.
Mentor and Collaborate: Guide other engineers, share knowledge, and help grow the SRE mindset.
Drive Innovation: Explore new tools and methods in observability and reliability. Lead proof-of-concepts and continuous improvement.
What You’ll Bring
Essential:
Strong experience in TypeScript or similar languages.
Solid understanding of OpenTelemetry and modern observability tools (e.g. Datadog, Grafana).
Deep knowledge of SRE principles: SLIs/SLOs, automation, monitoring, and incident response.
Hands-on experience with AWS (e.g. Lambda, ECS, S3, DynamoDB).
Comfortable with Linux, the command line, and system debugging.
Experience with infra-as-code tools like Pulumi or Terraform.
Familiar with Kubernetes, CI/CD pipelines, and automated deployments.
Strong problem-solving skills and attention to detail.
Desirable:
Experience with Rust or Go.
Deep understanding of trace sampling and OpenTelemetry at scale.
Experience reducing observability/cloud costs.
Exposure to Google Cloud Platform.
Familiarity with retail tech challenges is a bonus.
How You Work
You align with our core values:
Take ownership and build trust.
Communicate openly and support your team.
Stay curious and always look to improve.
Embrace change and challenges.
Think long-term and work collaboratively.

Role tech stack

opentelemetry
aws
typescript

Culture overview

We're here to help our customers create the joy of feeling truly at home. Join us and you'll find our caring and inclusive culture makes this a place you'll feel right at home too.

Learn
Wherever you work with us and in whatever role, you'll have every opportunity to keep on learning and keep on growing.

Thrive
We'll take care of you, and make sure your everyday needs are met, so you can focus on doing a great job and being the best version of you.

Belong
We embrace diversity in all its forms. We'll celebrate the individual you are and value the unique contribution you bring.

Colleague Networks
All of our colleagues have the opportunity to be part of our four colleague networks. These are Disability & Neurodiversity, LGBTQ+, Gender Equality and Ethnicity & Race. Each network has co-chairs and an exec sponsor who work closely with us to ensure that we are a workplace where everyone feels supported, celebrated, valued and heard.

A chance to give something back
We're serious about our role in society. Each of our stores is partnered with a local charity and has its own community Facebook page. And we offer our Pausa Cafés for free to local community groups. We're also proud partners of the mental health charities, Mind (UK and Wales), SAMH (Scotland) and Inspire (Northern Ireland). And each year, we'll give you a day's paid leave to support a charity that matters to you.

Work your way
We have adapted our ways of working to make sure everyone can feel at home wherever they work. For many colleagues at our Head Office in Leicester and our Central London hub that now includes working on a hybrid basis, combining days in the office with time spent working at home or elsewhere across the business.

Employee benefits

Bonus Scheme
Childcare Vouchers
Flexible Working
Free Parking
Laptop
Learning Allowance
Life Insurance
Pension
Private Healthcare
Share Options
Wellbeing Programme

Office vibe

Birthday Off
City Centre
Hackathons
Office Dog
Open Plan
Social Events

Leadership

John Gahagan
Chief Technology and Information Officer

Tech overview

Our Tech, Digital and Data teams are transforming literally every aspect of our business – from the way we manage and make use of our data, to the relationships we share with our customers. Already, their impact has been felt across the business, and indeed by our customers. But this is just the start and we know there are bigger opportunities ahead.

Check out our tech blog for tales behind our talented teams: https://engineering.dunelm.com/

Keep on growing
Join us on the tech side and you'll have access to a huge array of learning and development opportunities, including a variety of internally created workshops and externally accredited courses. We also have a substantial tech-specific budget to fund e-Learning licenses, conference visits, resources, and qualifications, plus dedicated mentors, well-being buddies and a wide range of network groups to support you as you progress.

Engineering principles

Agile Process
Code Reviews
Communication and collaboration
Continuous delivery
Continuous Development
Continuous integration
Infrastructure as code
Mentoring
Microservices
Pair programming
Scrum
Test Driven Development
Unit testing

Company tech stack

javascript
aws-lambda
graphql
react
typescript
jest
nodejs
sql
python
java
Leicester