Site Reliability Engineer
This role has expired
Haystack
Site Reliability Engineer
£50,000 - £70,000
Multiple locations
Remote or hybrid
Description

About the Role

Our client is seeking a Senior Site Reliability Engineer to enhance observability practices, boost system reliability, and support high availability across its platforms. This role involves close collaboration with engineering and infrastructure teams, combining software development with systems expertise to deliver dependable, observable, and efficient services.

Key Responsibilities

- Observability Leadership: Enhance telemetry collection and processing using OpenTelemetry, prioritizing actionable and cost-efficient metrics and traces.
- Reliability Standards: Guide teams in defining and adopting SLIs/SLOs and foster a culture of service ownership.
- Incident Management: Lead incident response efforts, facilitate post-incident reviews, and drive implementation of long-term solutions.
- Infrastructure Automation: Use tools such as Pulumi, Terraform, or AWS CDK to manage cloud infrastructure and CI/CD pipelines.
- Software Development: Create tools and automation in TypeScript (with optional Rust). Contribute to shared libraries and internal platforms.
- Mentorship & Collaboration: Support and mentor other engineers, promoting a reliability-focused mindset across teams.
- Continuous Improvement: Explore innovative tools and practices in observability and reliability; lead proof-of-concepts and improvement initiatives.

Required Skills & Experience

- Proficiency in TypeScript or similar programming languages.
- Strong knowledge of OpenTelemetry and observability tools (e.g., Datadog, Grafana).
- Solid grasp of SRE principles: SLIs/SLOs, automation, monitoring, and incident response.
- Hands-on experience with AWS services (e.g., Lambda, ECS, S3, DynamoDB).
- Proficient with Linux, command-line tools, and system-level debugging.
- Experience using infrastructure-as-code tools such as Pulumi or Terraform.
- Familiarity with Kubernetes, CI/CD pipelines, and automated deployment strategies.
- Strong analytical and problem-solving abilities with attention to detail.

Nice to Have

- Experience with Rust or Go.
- Deep understanding of trace sampling and scaling OpenTelemetry.
- Track record of reducing observability or cloud infrastructure costs.
- Familiarity with Google Cloud Platform.
- Background in retail technology environments.

Working Style & Values

This role suits someone who:
- Takes ownership and builds trust.
- Communicates openly and supports teammates.
- Is curious and continuously looks for ways to improve.
- Embraces change and approaches challenges with flexibility.
- Thinks long-term and works collaboratively across teams.

Culture overview
At Haystack we work like the product we build – fast, transparent and signal-driven. We trust people to own their work and give them the space to ship without red tape. Ideas win on merit, not job titles. We’re remote-friendly but tightly connected, using daily async stand-ups and quick calls when it matters. Feedback flows openly, experiments happen daily, and learning is built into the job. Everyone here cares about craft – whether that’s clean code, crisp design or clear communication – and we back each other to keep raising the bar. We celebrate wins, own mistakes and move on quickly. If you like autonomy, impact and a team that genuinely roots for each other, you’ll feel at home at Haystack.
Employee benefits
Flexible Hours
Flexible Working
Free Parking
Laptop
Learning Allowance
Learning/Development days
Office vibe
Beer Fridge
City Centre
Free Coffee
Social Events
Team Building Days
Leadership
Mike Davies, Co-founder
Tech overview
Haystack is a real-time marketplace connecting 150k+ tech professionals with teams they actually want to join. Our AI-driven matching crunches millions of data points to surface the right people in seconds, while Verified Candidates come pre-screened and interview-ready with 80% shortlist rates. Behind the scenes we run a live, event-driven platform built with TypeScript/Node, React, GraphQL, MongoDB and AWS—shipping fast with vector search, embeddings and microservices powering instant alerts and 24–48h interview booking. If you love distributed systems, data pipelines and building products people genuinely want to use, this is the place to do it.
Engineering principles
Agile Process
Code Reviews
Continuous delivery
Continuous Development
Mentoring
Rapid release cycles
Scrum
Unit testing