Senior Site Reliability Engineer (Observability)

Location: London/UK (Remote)
Contract: 12 months initial
Rate: £55 - £62 per hour, Inside IR35

Job Overview

We are looking for a Senior Site Reliability Engineer with strong experience in observability, monitoring and distributed systems to support large-scale cloud infrastructure serving millions of devices globally. The role focuses on building and scaling monitoring, logging and alerting platforms to ensure high availability and performance of cloud services.

Responsibilities

- Design, deploy and scale observability platforms
- Manage and scale Prometheus monitoring systems
- Deploy and maintain large Elasticsearch clusters
- Build and maintain data pipelines using Kafka
- Develop alerting and monitoring frameworks
- Automate infrastructure using Terraform and Ansible
- Develop tools and scripts using Python, Go, Ruby or Bash
- Work with Linux systems (Debian/Ubuntu)
- Participate in the on-call rotation
- Improve system reliability, performance and scalability

Required Skills

- 5+ years' experience in Site Reliability Engineering / DevOps
- Strong Linux systems experience
- Observability and monitoring tools experience
- Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana)
- Kafka
- Terraform / Infrastructure as Code
- Ansible / Configuration Management
- Programming experience (Python, Go, Ruby or Bash)
- Distributed systems and cloud infrastructure experience

This is an urgent vacancy and the hiring manager is shortlisting for interview immediately. Please apply with a copy of your CV or send it to khushboo.pandey@randstad.co.uk.

Randstad Technologies is acting as an Employment Business in relation to this vacancy.