Site Reliability Engineer

This role has expired

Dunelm

Multiple locations

Hybrid

Description

Hybrid requirements: This role has flexible working patterns.

This role can be based out of our London or Leicester offices but will be hybrid (i.e. work from home and office).

We’re searching for an Engineer to join our Site Reliability Engineering (SRE) team. The team is agile, data-driven and motivated, and is made up of software and systems engineers. We’re dedicated to creating meaningful observability and monitoring solutions, automating manual tasks, ensuring product quality and forging operational excellence. A blame-free DevOps culture and collaboration are at the core of our approach.

What you’ll be doing

As a Site Reliability Engineer at Dunelm, you will become a key member of the SRE team. You are motivated and enthusiastic, and you can use your operational and engineering knowledge to help develop effective tools, observability solutions, pipelines and more, so that the wider engineering and platform teams at Dunelm can create, update and release with confidence.

Responsibilities:

Observability Development: Designing, building, deploying, running and – ultimately – owning observability tooling, such as dashboards, monitoring and alerting.

Embedded Consultancy: Working with other teams throughout the Dunelm technical space to help increase their SRE maturity level – mainly by helping to define Service Level Objectives (SLOs) and Service Level Indicators (SLIs), plus working on the integrations that help them produce the required telemetry.

DevOps Best-Practice Advocacy: Promoting a DevOps culture through ‘shift-left’ testing, non-functional testing (security, performance etc.), continuous integration and deployment, and working with other teams to share responsibility for the software that is built.

Incident Response: Being available to help investigate incidents in real time – sometimes outside normal working hours as part of our on-call rota. Helping to ascertain impact and find observability gaps during these investigations. Taking part in ‘blameless post-mortems’, focusing on collaboration and knowledge-sharing.

Workflow: Helping to create and refine work tickets, breaking down larger pieces of work into actionable pieces. Ensuring relevant knowledge is shared with the rest of the team while working on these tickets and clearly articulating any blocking circumstances.

Code Quality and Risk Mitigation: Reviewing the team’s output to ensure all code is highly maintainable, supportable, and minimises operational risk.

Mentorship and Coaching: Mentoring and guiding other team members, including less experienced engineers, providing feedback and coaching to help them reach their full potential.

Research and Learning: Researching new technologies and architectural patterns by conducting technical proofs of concept (PoCs), proposing improvements to existing platforms and developing new solutions. You will also be given the opportunity to do team-based and independent learning on a regular basis, to improve your own and the team’s knowledge.

What we’ll look for in you

Essential skills

Amazon Web Services: We run most of our back-end software on AWS Lambda, with some containerised software (ECS / Fargate) and some running on cloud servers (EC2). You will need experience and good knowledge of all of these and of general AWS networking principles (VPC, security groups etc.), plus other AWS services including, but not limited to, S3, EventBridge, SQS / SNS, RDS and DynamoDB.

Development Experience: You will be a solid developer, experienced in building high-quality, testable applications and tools. You will know how to create effective tests (unit and integration) and be familiar and comfortable with different ways of tackling a problem – for example pair and mob programming. Our stack is mainly TypeScript and Python, so experience with both would be distinctly advantageous.

Observability Knowledge: You can explain the fundamental aspects of observability, including telemetry and real user monitoring (RUM), in detail, and you know how to use them to observe running software effectively. You also understand sampling and how to use it most effectively.

Problem-Solving Prowess: You possess exceptional problem-solving skills, capable of addressing intricate challenges that may arise within our technology landscape. You are also a ‘detective’, using your skills and knowledge to collect evidence that eliminates what a problem is not, leading you to the most likely cause(s).

Pipeline Expertise: You will understand how to create, deploy and troubleshoot CI/CD pipelines (we use GitLab) to run tests and checks, create builds and ultimately deploy software to various environments.

Technology Proficiency: You have solid knowledge of various technologies, tools, frameworks and patterns related to the previous five points, including, but not limited to: IaC (e.g. Terraform, Pulumi, CDK), Lambda runtime programming languages (e.g. TypeScript, Python), containerised applications (Docker), event-driven architecture and POSIX-based shells (e.g. Bash, zsh).

Tech Enthusiasm: Your passion for technology drives you to explore and embrace the latest innovations continuously. This dedication to growth and learning is essential in staying ahead in our ever-evolving tech landscape.

Desirable skills

OpenTelemetry: Previous experience or demonstrated knowledge of working with OpenTelemetry solidifies your observability expertise, enhancing our monitoring and diagnostic capabilities.

Google Cloud Platform (GCP): Although Dunelm’s distributed systems are overwhelmingly deployed on AWS, we do have strategic deployments in GCP, so any working knowledge of this platform would demonstrate your breadth of cloud knowledge.

Life at Dunelm

Culture overview

We're here to help our customers create the joy of feeling truly at home. Join us and you'll find our caring and inclusive culture makes this a place you'll feel right at home too.

Learn

Wherever you work with us and in whatever role, you'll have every opportunity to keep on learning and keep on growing.

Thrive

We'll take care of you, and make sure your everyday needs are met, so you can focus on doing a great job and being the best version of you.

Belong

We embrace diversity in all its forms. We'll celebrate the individual you are and value the unique contribution you bring.

Colleague Networks

All of our colleagues have the opportunity to be part of our four colleague networks. These are Disability & Neurodiversity, LGBTQ+, Gender Equality and Ethnicity & Race. Each network has co-chairs and an exec sponsor who work closely with us to ensure that we are a workplace where everyone feels supported, celebrated, valued and heard.

A chance to give something back

We're serious about our role in society. Each of our stores is partnered with a local charity and has its own community Facebook page. And we offer our Pausa Cafés for free to local community groups. We're also proud partners of the mental health charities Mind (England and Wales), SAMH (Scotland) and Inspire (Northern Ireland). And each year, we'll give you a day's paid leave to support a charity that matters to you.

Work your way

We have adapted our ways of working to make sure everyone can feel at home wherever they work. For many colleagues at our Head Office in Leicester and our Central London hub, that now includes working on a hybrid basis, combining days in the office with time spent working at home or elsewhere across the business.

Employee benefits

Bonus Scheme

Childcare Vouchers

Flexible Working

Free Parking

Laptop

Learning Allowance

Life Insurance

Pension

Private Healthcare

Share Options

Wellbeing Programme

Office vibe

Birthday Off

City Centre

Hackathons

Office Dog

Open Plan

Social Events

Tech at Dunelm

Leadership

John Gahagan

Chief Technology and Information Officer

Tech overview

Our Tech, Digital and Data teams are transforming every aspect of our business, from the way we manage and make use of our data to the relationships we share with our customers. Their impact has already been felt across the business and by our customers. But this is just the start, and we know there are bigger opportunities ahead. Check out our tech blog for tales behind our talented teams: https://engineering.dunelm.com/

Keep on growing

Join us on the tech side and you'll have access to a huge array of learning and development opportunities, including a variety of internally created workshops and externally accredited courses. We also have a substantial tech-specific budget to fund e-learning licences, conference visits, resources and qualifications, plus dedicated mentors, well-being buddies and a wide range of network groups to support you as you progress.