OpenTelemetry Jobs
Lead Integration Engineer & Developer
Ashdown Group
Liverpool
Hybrid
Senior
£100,000

A fast-growing Legal and Financial Services company based in Liverpool requires a hands-on Lead Integration Engineer & Developer to take ownership of their growing integration platform connecting core internal systems with external partners and services. This is a high-impact role combining deep technical delivery with architectural leadership. You’ll spend a significant portion of your time building production systems, while also shaping the future of their integration ecosystem.

The platform is built around HubSpot and a modern event-driven architecture in AWS, and you’ll play a key role in defining how they design APIs, process events, and scale integrations across the business.

The role pays £90,000-£100,000 plus good benefits and is hybrid (3 days in the office, 2 working from home), with attendance in the central Liverpool office encouraged given the nature of the role and its team management aspect.

Technology Environment

Core Stack

  • AWS (Lambda, API Gateway, EventBridge, SQS, SNS)
  • Node.js / TypeScript and Python

Data & Infrastructure

  • DynamoDB, RDS
  • Infrastructure as Code (Terraform, CDK, CloudFormation)
  • CloudWatch and observability tooling

Integrations

  • HubSpot (CRM)
  • Internal microservices and external APIs

Required Experience

  • Significant experience in backend or platform engineering
  • Strong hands-on AWS experience (serverless preferred)
  • Proven experience with distributed, event-driven systems
  • Experience integrating with third-party APIs
  • End-to-end ownership of systems (design/build/operate)

Technical Expertise

  • Event-driven architecture (EventBridge, SQS, SNS, Kafka)
  • Reliability patterns (retries, idempotency, DLQs)
  • Observability and debugging in distributed systems
  • Data modelling and schema evolution

Leadership & Collaboration

  • Ability to lead technical design and influence architecture
  • Experience mentoring engineers
  • Strong communication across technical and non-technical teams
  • Comfortable in a fast-paced, evolving environment

Desirable

  • Experience with HubSpot or CRM integrations
  • Ownership of internal integration platforms
  • High-volume event ingestion or real-time pipelines
  • Containerisation (Docker, ECS, Kubernetes)
  • Observability tools (Datadog, OpenTelemetry)
Lead DevSecOps Engineer
Transunion
Leeds
Hybrid
Senior
Private salary

What We’ll Bring:

We Are TransUnion: TransUnion is a major credit reference agency, and we offer specialist services in fraud, identity and risk management, automated decisioning and demographics. We support organisations across a variety of sectors including finance, retail, telecommunications, utilities, gaming, government and insurance.

What You’ll Bring:

We’re looking for a Lead DevSecOps Engineer to join our growing team. The role is responsible for ensuring the operational integrity, release governance, and stakeholder alignment for the global platforms deployed in the UK regions. Acting as a bridge between regional and global teams, this role drives platform reliability, compliance, and continuous improvement through technical leadership and process excellence.

Day to Day You’ll Be:

Release & Deployment Management

Lead the regional execution of all release types (major, minor, emergency) through automated CI/CD pipelines, ensuring timely and high-quality deployments.
Coordinate with Global Platform testers to oversee post-deployment validation, ensuring issues are identified, documented, and resolved efficiently.
Support and deputise for the Operations and Technical Release Manager in managing the full CAB ticket lifecycle, maintaining robust deployment traceability and compliance.
Collaborate with the Operations and Technical Release Manager to lead, manage, and prioritise Kanban workflows, driving effective tracking and continuous improvement of release processes.
Facilitate cross-functional coordination of development and change activities across the UK region, ensuring alignment with business objectives and minimising disruption.
Champion continuous improvement by analysing release outcomes, gathering feedback, and implementing process enhancements.
Communicate release status and risks to stakeholders, ensuring transparency and proactive issue resolution.

Testing & Change Management

Coordinate regression testing of changes to Global Business Platforms deployed in the UK region in collaboration with Global Platform testers, ensuring all updates impacting UK stakeholders are thoroughly validated and documented.
Act as a key liaison to the Operations and Technical Release Manager, leading the coordination of future development, maintenance, and change activities between global and UK teams to ensure alignment, minimise risk, and support successful delivery.
Facilitate effective communication between global and UK teams, ensuring that testing outcomes, risks, and dependencies are clearly documented and addressed.
Drive continuous improvement in testing and coordination processes by gathering feedback, identifying bottlenecks, and implementing best practices.

Platform Operations & Monitoring

Provide technical leadership for the operational management of Global Business Platforms deployed in the UK region, ensuring compliance with SLAs, SLOs, RTOs, and other service commitments.
Oversee incident management, monitoring, and escalation processes to maintain platform stability and minimise service disruptions, including participation in a 24x7 out-of-hours support rota.
Collaborate with the Operations and Technical Release Manager to produce and analyse reports on platform availability, performance, and capacity, supporting data-driven decision-making and proactive capacity planning.
Drive continuous improvement in operational processes by identifying risks, implementing best practices, and optimising platform reliability.

Incident & Problem Management

Provide technical leadership in troubleshooting and resolving regional issues with the Global Business Platforms deployed in the UK region, ensuring rapid restoration of service and minimal business disruption.
Technically lead critical and high-severity incident response, acting as the primary technical representative on incident bridges and ensuring effective communication and resolution.
Coordinate and participate in post-incident reviews (post-mortems/PIRs) with global and regional teams, driving the identification and implementation of corrective actions.
Analyse incident data to identify root causes, predict potential future issues, and implement preventative controls, fostering a culture of continuous improvement and operational resilience.
Document incident outcomes and share lessons learned to enhance team knowledge and prevent recurrence.

Risk, Compliance & Governance

Lead the identification, assessment, documentation, and management of risks and issues related to Global Business Platforms deployed in the UK region, ensuring proactive mitigation and platform stability.
Oversee and support compliance and audit activities for the Global Business Platforms deployed in the UK region, ensuring adherence to regulatory requirements and audit readiness.
Collaborate with risk, compliance, and audit stakeholders to address findings, implement corrective actions, and continuously improve risk and compliance processes.
Monitor evolving risks and regulatory changes, adapting controls and processes to maintain ongoing compliance and operational resilience.

Essential Skills & Experience:

Years of experience in technical leadership roles within cloud‑native, containerized environments, with a proven track record of delivering complex solutions at scale.
Deep expertise in Kubernetes, specifically managing and supporting GKE (Google Kubernetes Engine) clusters and containers in production environments, including advanced proficiency with Helm for designing, installing, and managing applications using Helm charts.
Strong background in GCP cloud networking, load balancing, traffic management, and Google Cloud Storage (GCS), with the ability to architect and troubleshoot complex network topologies.
Comprehensive knowledge of GCP compute services and experience administering, managing, and maintaining Dataproc clusters.
Hands‑on experience with Wiz for Cloud Native Application Protection and container security, including policy definition and incident response.
Good understanding of Cloudflare.
Significant experience with PostgreSQL for database design, optimisation, and management in high‑availability environments, alongside experience working with Redis for session storage management.
Proven ability to design and support Kafka‑based data streaming architectures.
Expertise in implementing and managing Identity and Access Management (IAM) solutions using Ping and/or Keycloak.
In‑depth experience with Kong APIM for API lifecycle management, security, and governance.
Strong monitoring and observability skills, including creating and interpreting dashboard alerts from GCP Native Observability, Prometheus, Grafana, and OpenTelemetry.
Demonstrated leadership in CI/CD pipeline management using Harness, including advanced skills in infrastructure automation with Terraform and integration with Harness IaC pipelines.
Experience managing secrets and sensitive data using HashiCorp Vault.
Solid understanding of application security best practices, including integrating and acting on findings from CheckMarxOne or similar tools.
Ability to translate high‑level architectural blueprints into detailed low‑level designs.
Excellent communication, mentoring, and stakeholder management skills, with a passion for developing team capabilities and fostering a culture of technical excellence.
Ability to collaborate effectively with cross‑functional teams to deliver robust, secure, and scalable solutions aligned with business objectives.
Working knowledge of programming languages such as HTML5, SAS, and iOS.
Preferred qualifications include one of the following certifications:
GCP Cloud Engineer
GCP DevOps Engineer
GCP Security Engineer
GCP Database Engineer
HashiCorp Terraform Associate
Kubernetes and Cloud Native Associate (KCNA)
Certified Kubernetes Administrator (CKA)

Impact You’ll Make:

What’s In It For You?
At TransUnion you will be joining a friendly, forward thinking global business.
As well as an excellent salary and bonus scheme or commission scheme (if joining our sales teams) our benefits package comes with:
26 days’ annual leave + bank holidays (increasing with service)
Global paid wellness days off + a bonus day off to celebrate your birthday
A generous contributory pension scheme + access to the TransUnion Employee Stock Purchase Plan
Private health care + a variety of physical, mental and financial fitness wellbeing programmes such as access to mindfulness tools
Access to our diversity forums and communities so you can get involved in causes close to your heart

TransUnion - a place to grow:

If there’s something on the list of essential / desirable skills that you can’t quite tick off, don’t let that put you off applying. We are open to exploring training and development opportunities for the right candidate to ensure you are successful.
We know imposter syndrome is real; let’s confront it so we can continue to grow and thrive together.

Flexibility at TU:

We recognise that our people need the freedom to balance their day-to-day lives with their work. This is why we’ve set out to create inclusive and flexible policies and practices for you to accommodate all your responsibilities and needs: children, family and beyond. If the role is advertised as full time, don’t let this stop you from applying. Let us know if you’re looking for a part time or flexible working arrangement and we can discuss this with you.

Additional support:

At TransUnion, we’re committed to fostering an inclusive and diverse workplace where all individuals’ talents and perspectives are valued. When you apply for a position with us, you’re not just joining a team, you’re becoming part of a community that celebrates differences and embraces equality. We understand that everyone has different needs, which is why we offer a range of reasonable adjustments to our recruitment process. Please let us know if you require any reasonable adjustments to help you through the application process or to attend an interview with us by contacting (url removed)

Interview & Hiring Process:

Most of our recruitment processes are virtual, so you’ll get to know our hiring managers and teams over the phone and through video. If we need you to attend a physical in person interview your recruiter will inform you of this. We do not accept any unsolicited CVs from recruitment agencies. If you are a recruitment agency on our PSL our talent team will contact you directly should we require any assistance.

Find out more about Life At TU UK: (url removed)

This is a hybrid position and involves regular performance of job responsibilities virtually as well as in-person at an assigned TU office location for a minimum of two days a week.

TransUnion Job Title: Advisor, Software Development

Site Reliability Engineer II
CME Technology Support Services Ltd
Belfast
Hybrid
Junior - Mid
Private salary

Site Reliability Engineer II (Sunday - Thursday with generous shift allowance)

CME Group is seeking an SRE II to help build, operate and scale systems in our Markets portfolio. Markets SREs work on products and applications related to CME’s Globex trading platform. Our systems deliver an exceptional combination of low-latency performance and rock-solid reliability to seamlessly handle the world’s busiest trading days.

The successful candidate will work alongside senior engineers to learn how we observe, monitor, automate, and improve Production service reliability.

Key Responsibilities:

  • Work alongside product teams and senior engineers to assist with building out observability, monitoring and alerting for key services
  • Collaborate with engineers and product teams to ensure requirements are understood, planned carefully and implemented safely
  • Participate in the on-call rotation and assist in incident response under guidance from senior engineers
  • Write scripts and tools to reduce toil and improve velocity
  • Contribute to DR and systems resiliency testing & improvements
  • Support the migration of markets applications to Google Cloud Platform (GCP)
  • Collaborate with cross-functional teams to improve system performance and efficiency

What We’re Looking for:

  • A keen interest in SRE
  • Experience with Linux-based systems
  • Basic programming/scripting skills (Python, Bash, etc.)
  • Strong problem-solving and analytical abilities
  • Excellent communication and teamwork skills
  • Eagerness to learn and adapt in a fast-paced trading environment

Desirable

  • Experience with Cloud-based platform(s) - Google Cloud Platform, GCE, and/or GKE a bonus
  • Experience with metrics & monitoring, OpenTelemetry, Splunk, Prometheus, Grafana, etc.
  • Experience and knowledge of working with distributed systems
  • Experience with Kubernetes
  • Basic knowledge of networking (HTTP/TCP/UDP/IP).
  • Experience in Financial markets.
  • Experience with message-oriented middleware.
  • Experience working in an agile environment.

Why CME Group:

  • Be part of a global leader in financial services technology.
  • Work on cutting-edge technology in a collaborative and innovative culture.
  • Competitive compensation and benefits package.
  • Opportunity to grow and advance your career in SRE with an organisation that is transforming to this approach

Join CME Group and play a crucial role in ensuring the stability and performance of our Markets applications while contributing to the migration to Google Cloud Platform. Apply now to be a part of our dynamic SRE team!

Company Benefits:

  • Bonus Programme
  • Equity Programme
  • Employee Stock Purchase Plan (ESPP)
  • Private Medical and Dental coverage
  • Mental Health Benefit Programme
  • Group Pension Plan
  • Income Protection
  • Life Assurance
  • Cycle To Work
  • EV Car Benefit Scheme
  • Gym Membership
  • Family Leave
  • Education Assistance - MBA/Advanced Degree/Bachelor Degree
  • Ongoing Employee Development Training/Certification
  • Hybrid Working

CME Group: Where Futures are Made

CME Group is the world’s leading derivatives marketplace. But who we are goes deeper than that. Here, you can impact markets worldwide. Transform industries. And build a career by shaping tomorrow. We invest in your success and you own it - all while working alongside a team of leading experts who inspire you in ways big and small. Problem solvers, difference makers, trailblazers. Those are our people. And we’re looking for more.

At CME Group, we embrace our employees’ unique experiences and skills to ensure that everyone’s perspectives are acknowledged and valued. As an equal-opportunity employer, we consider all potential employees without regard to any protected characteristic.

Important Notice: Recruitment fraud is on the rise, with scammers using misleading promises of job offers and interviews to solicit money and personal information from job seekers. CME Group adheres to established procedures designed to maintain trust, confidence and security throughout our recruitment process.

To be considered for this role you will be redirected to and must complete the application process on our careers page. To start the process click the Continue to Application or Login/Register to apply button below.

Remote Network Monitoring Specialist - Streaming Telemetry
Akkodis
Multiple locations
Fully remote
Mid - Senior
£70,000 - £75,000

Salary: £70,000 - £75,000
Location: Remote
Contract: 6-month FTC

Role Overview:

Our client is looking for an experienced Network Monitoring Specialist to support a major network infrastructure rollout on a 6-month fixed-term basis.

This is a hands-on role focused on designing, implementing and commissioning monitoring capability across newly deployed network and fibre infrastructure. The priority is to ensure the environment is fully visible, measurable and supportable from day one.

The role would suit someone with strong experience across network observability, alerting, telemetry, dashboards, service health, performance baselining and operational handover. The client is open to different monitoring backgrounds, particularly where candidates have worked with tools such as VictoriaMetrics, Prometheus, Grafana, Nagios, Zabbix, InfluxDB, Telegraf, SolarWinds, PRTG, Datadog, Elastic, OpenTelemetry, SNMP, NetFlow/IPFIX or syslog pipelines.

You will work closely with network engineering and operational teams to deliver reliable monitoring at pace within a project-led environment.

Key Responsibilities:

  • Design and deploy monitoring solutions across newly delivered network infrastructure.
  • Build monitoring capability that provides clear visibility of network health, performance and service availability.
  • Work with monitoring and observability platforms such as VictoriaMetrics, Prometheus, Grafana, Nagios, Zabbix, InfluxDB, SolarWinds, PRTG, Datadog, Elastic or similar.
  • Support metrics ingestion, retention, alerting, dashboarding and performance visibility.
  • Build or support streaming telemetry pipelines to provide real-time visibility across the network.
  • Implement and refine alerting workflows for service health, escalation and operational response.
  • Develop dashboards and reporting views to support engineering and operational teams.
  • Commission monitoring across network devices, access infrastructure and Layer 1-3 equipment.
  • Define baseline performance metrics, thresholds and SLA-led alerting.
  • Work closely with network and operational teams to align monitoring with changing infrastructure requirements.
  • Support analytics-led monitoring for anomaly detection and predictive fault identification where relevant.
  • Improve monitoring architecture, tooling, documentation and handover processes.
  • Produce clear runbooks, escalation paths and operational guides.
  • Support knowledge transfer into internal technical teams.

What We’re Looking For:

  • Previous experience in a senior network monitoring, network engineering or observability-focused role.
  • Experience working in a telecoms, ISP, managed network or large-scale infrastructure environment.
  • Strong understanding of network monitoring principles, including alerting, telemetry, dashboards, service health and performance baselining.
  • Hands-on experience with monitoring or observability tools such as VictoriaMetrics, Prometheus, Grafana, Nagios, Zabbix, InfluxDB, Telegraf, SolarWinds, PRTG, Datadog, Elastic, OpenTelemetry or similar.
  • Experience with network data sources and protocols such as streaming telemetry, gNMI, gRPC, SNMP, NetFlow/IPFIX or syslog.
  • Good understanding of time-series monitoring, metrics ingestion, retention and performance visibility.
  • Strong networking fundamentals across TCP/IP, BGP, OSPF, VLANs and optical or fibre environments.
  • Familiarity with dashboarding, alert tuning, service health monitoring and operational reporting.
  • Exposure to AI/ML-led monitoring, anomaly detection or predictive fault identification would be beneficial.
  • Scripting or automation experience, such as Python or Bash, would be advantageous.
  • Comfortable working independently and delivering against defined project milestones.
  • Strong communication, documentation and stakeholder engagement skills.
  • Proactive, detail-focused and comfortable solving problems without heavy direction.

Why Consider This Role?

This is a strong opportunity to join a business delivering a major network infrastructure programme, in a role where monitoring and observability are central to successful delivery.

You will be taking ownership of a critical technical area rather than simply maintaining an existing setup. The focus is on making sure newly deployed infrastructure is properly monitored, operationally ready and reliable from day one.

For someone with strong network monitoring experience, this offers a focused 6-month project where you can make a visible impact across a live network environment, using a range of modern monitoring, telemetry and observability technologies.

Modis International Ltd acts as an employment agency for permanent recruitment and an employment business for the supply of temporary workers in the UK. Modis Europe Ltd provide a variety of international solutions that connect clients to the best talent in the world. For all positions based in Switzerland, Modis Europe Ltd works with its licensed Swiss partner Accurity GmbH to ensure that candidate applications are handled in accordance with Swiss law.

Both Modis International Ltd and Modis Europe Ltd are Equal Opportunities Employers.

By applying for this role your details will be submitted to Modis International Ltd and/ or Modis Europe Ltd. Our Candidate Privacy Information Statement which explains how we will use your information is available on the Modis website.

Go Full Stack Developer
itecopeople
London
Hybrid
Senior
£60,000

Senior Full Stack Developer

12-Month Fixed-Term Contract | Hybrid Working (2 Days per Week in London) Salary £54,000 - £61,000 pa plus generous pension and holidays

A prestigious client is seeking an experienced Senior Full Stack Developer to join a high-performing technology team delivering innovative AI and automation solutions at scale.

This is an exciting opportunity to work on a modern, cloud-native platform using cutting-edge technologies across backend services, integrations, and user-facing applications. You’ll play a key role in shaping engineering standards, influencing technical direction, and delivering high-quality software within a collaborative Agile environment.

The Role

As a Senior Full Stack Developer, you will focus primarily on backend engineering using Go, while also contributing to modern React-based frontend applications. You’ll work closely with developers, architects, and product teams to build scalable, secure, and observable solutions deployed through Kubernetes-based infrastructure.

This is a hands-on senior engineering role ideal for someone who enjoys solving complex technical challenges, mentoring others, and driving best practice across software delivery.

Key Responsibilities

  • Design, develop and maintain scalable backend services in Go
  • Build modern frontend applications using React
  • Develop APIs, integrations and event-driven services
  • Contribute to CI/CD pipelines and cloud-native deployments
  • Review code and champion engineering best practices
  • Improve application performance, observability and reliability
  • Collaborate within Agile delivery teams across multiple projects
  • Support technical decision-making and continuous improvement

Skills & Experience

We are looking for candidates with strong commercial experience in:

  • Go / Golang backend development
  • Full stack software engineering within Agile environments
  • REST APIs, integrations and distributed systems
  • React and modern frontend development
  • CI/CD, Git and cloud-based delivery practices
  • Docker and Kubernetes
  • Code reviews, testing and engineering governance

Experience with any of the following would be highly advantageous:

  • Microsoft Azure
  • Python
  • GitOps tooling (Argo CD / Flux)
  • Observability tooling (Prometheus, Grafana, OpenTelemetry)
  • AI/LLM-enabled applications
  • Event-driven architectures and messaging platforms

What’s on Offer

  • Opportunity to work on cutting-edge AI and cloud-native technologies
  • Hybrid working model with flexibility
  • Collaborative, forward-thinking engineering environment
  • High-profile programme of work with real organisational impact
  • £54,000 - £61,000 salary and benefits package

If you are a passionate senior engineer looking for your next challenge within a modern technology environment, we would love to hear from you.

Please send your CV to Laura at

Services advertised are those of an employment agency.

Principal Engineer
Synergetic Recruitment Group Limited
Chelmsford
In office
Senior
£100,000

Principal Software Engineer

Location: Cambridge

Our client is scaling a large, distributed cloud platform and is looking for a Principal Engineer to act as the Subject Matter Expert (SME) across observability and cloud infrastructure.

You’ll be working at serious scale: managing thousands of Kubernetes nodes, handling tens of terabytes of logs daily, and supporting millions of real-time metrics across a highly distributed environment.

The Role

This is a senior, hands-on role where you will own the technical direction and standards of the observability ecosystem.

As the SME, you’ll define best practice, guide architectural decisions, and act as the go-to expert across engineering teams, ensuring scalable, cost-efficient, and high-performance systems.

Key Responsibilities

  • Act as the SME for observability and cloud infrastructure across the organisation
  • Lead architecture across metrics, logs, and tracing systems
  • Design and optimise high-throughput data pipelines and storage layers
  • Implement strategies such as sampling, aggregation, and down-sampling
  • Extend and enhance open-source observability tools at scale
  • Partner with engineering teams to standardise tooling and improve adoption
  • Drive reliability, scalability, and cost optimisation across the platform
  • Define and promote best practices aligned with OpenTelemetry and modern observability standards
  • Mentor engineers and elevate engineering quality across teams

Tech Environment

  • Kubernetes at scale (thousands of nodes)
  • High-volume telemetry (hundreds of thousands of events per second)
  • Observability stack: Prometheus, Grafana, Loki, Tempo, ELK/OpenSearch, ClickHouse
  • Multi-cloud (AWS, GCP)
  • Infrastructure as code (Terraform), CI/CD pipelines

What We’re Looking For

  • 15+ years building and scaling distributed systems
  • Strong hands-on experience with Golang (plus Python or Shell)
  • Deep expertise in observability at scale
  • Strong Kubernetes and cloud infrastructure experience
  • Proven ability to design systems for performance, scale, and cost efficiency
  • Experience with service mesh technologies (e.g. Istio/Envoy)
  • Ability to operate as a technical authority and trusted advisor across teams

Nice to Have

  • Open-source or CNCF contributions
  • Experience using AI tools to improve engineering efficiency

Why Join

  • Be the go-to expert shaping a large-scale observability platform
  • Work on complex, high-impact infrastructure challenges
  • Strong ownership and influence at Principal level
Telemetry and Observability Engineer
Oscar Associates Limited
London
Hybrid
Mid - Senior
£475/day - £515/day
RECENTLY POSTED

Telemetry & Observability Engineer | Contract | £475-£515 p/d | Inside IR35 | London | 3 days on site | 6 month contract

We are seeking a highly skilled Telemetry & Observability Engineer to join a large-scale enterprise engineering environment focused on improving system reliability, visibility, and operational intelligence across complex distributed platforms. This role is ideal for a hands-on engineer who can go beyond using observability tools and instead design, build, and automate observability and telemetry solutions from the ground up.

You will be responsible for building and evolving telemetry pipelines and observability infrastructure across a distributed, cloud-native environment. Working closely with network, platform, and software engineering teams, you will help embed observability into engineering workflows and CI/CD processes.

Key Responsibilities

  • Design and implement scalable pipelines for metrics, logs, traces, and event data across distributed systems
  • Build and enhance observability tooling, including dashboards, monitoring systems, alerting frameworks, and reliability standards
  • Develop and maintain telemetry solutions using OpenTelemetry and related open-source technologies
  • Automate observability infrastructure using Terraform and infrastructure-as-code practices
  • Integrate monitoring and observability into CI/CD pipelines and SDLC processes
  • Define and support SLIs, SLOs, and alerting strategies in collaboration with engineering teams
  • Promote best practices in instrumentation, monitoring, and incident response
  • Work with network and platform teams to improve visibility across infrastructure and services

Required Skills & Experience

  • Proven experience in observability, SRE, or platform engineering roles within complex distributed environments
  • Strong hands-on experience with OpenTelemetry, including building and managing telemetry pipelines
  • Experience with observability and monitoring tools such as Grafana, Prometheus, Elastic/Splunk, Loki, Jaeger, or similar
  • Strong experience with Terraform or other infrastructure-as-code tools
  • Solid understanding of cloud-native environments (Kubernetes, microservices, distributed systems)
  • Experience working within or alongside network engineering or infrastructure operations teams
  • Programming experience in at least one language such as Python, Go, or Java
  • Strong understanding of SLOs, SLIs, and modern reliability engineering practices
  • Strong analytical and problem-solving skills with a focus on automation and system reliability

Desirable Experience

  • Experience in regulated or large enterprise environments
  • Exposure to AI/ML-driven observability or proactive monitoring initiatives
  • Contribution to or involvement with open-source observability projects

If this sounds like a fit, APPLY NOW!

Oscar Associates (UK) Limited is acting as an Employment Business in relation to this vacancy.

To understand more about what we do with your data please review our privacy policy in the privacy section of the Oscar website.

Senior DevOps Engineer - ID46327
Humand
Oxford
Hybrid
Senior
£80,000 - £110,000

Senior DevOps Engineer | Oxfordshire (Hybrid) | Up to £110,000 base

We're working with a well-funded, fast-growing technology company building high-performance, data-intensive platforms, and they’re looking for a Senior DevOps Engineer to join their team. This is a high-impact role where you’ll play a key part in shaping infrastructure, improving deployment processes, and driving reliability across complex distributed systems operating across both cloud and on-prem environments.

What you’ll be doing:

  • Building and automating infrastructure using Terraform and Ansible
  • Developing and optimising CI/CD pipelines (GitHub Actions, including self-hosted runners)
  • Managing deployments across multiple environments, including restricted or security-sensitive systems
  • Driving observability using OpenTelemetry and the Grafana stack (Loki, Tempo, Mimir)
  • Implementing artifact management, versioning strategies, and release processes
  • Supporting the evolution of a hybrid cloud / on-prem platform architecture
  • Working with containerised services and contributing to platform scalability
  • Collaborating with software and platform engineering teams to improve system performance and reliability

What they’re looking for:

  • Strong DevOps / SRE experience in complex, production environments
  • Deep Linux systems expertise (Ubuntu or similar)
  • Strong experience with infrastructure as code (Terraform, Ansible, etc.)
  • Strong scripting/programming skills (Python or similar)
  • Proven experience building and maintaining CI/CD pipelines
  • Hands-on experience with observability tooling (Grafana, OpenTelemetry, Prometheus)
  • Strong experience with containerised environments (Docker or similar)
  • Experience working across cloud and on-prem / hybrid environments
  • Strong understanding of security, scalability, and system reliability

Why join?

  • Work on complex, high-impact systems at scale
  • Real influence over infrastructure and platform engineering direction
  • Strong engineering culture with cross-functional collaboration
  • Competitive salary + bonus + equity
  • Private healthcare (family included)
  • Flexible hybrid working

We are committed to building inclusive teams. We welcome applications from people of all backgrounds, experiences, and perspectives. Our client values diversity and believes a broad range of ideas and experiences makes for better outcomes.

Dynatrace Expert
BGTS LTD
London
Remote or hybrid
Senior
£65,000 - £80,000

Dynatrace Configuration and Management

The primary responsibility involves the end-to-end implementation, configuration, and continuous optimization of Dynatrace solutions across our infrastructure. This includes deploying OneAgent, configuring ActiveGates, and establishing comprehensive monitoring strategies encompassing tagging, baselines, and alerting mechanisms. The expert will create and maintain customized dashboards and reports to provide actionable insights into system performance for various stakeholders.

Performance Monitoring and Issue Resolution

The expert will continuously monitor system performance, analyzing logs, metrics, and distributed traces to diagnose application issues. A critical aspect of this role is utilizing reverse engineering methodologies to dissect complex system behaviors, identify root causes of performance degradation, and uncover hidden configuration flaws. This involves deep-dive performance analysis using tools like PurePath and Smartscape to ensure optimal operation of Java backends and Angular frontends.

Cloud and Microservices Integration

The candidate will be responsible for integrating Dynatrace monitoring within our AWS cloud infrastructure and microservices ecosystem. This includes ensuring seamless observability across containerized environments (e.g., Kubernetes, Docker) and serverless architectures. The expert will collaborate closely with development and DevOps teams to embed monitoring best practices into CI/CD pipelines, facilitating automated performance validation during deployments.

Required Qualifications and Skills

Technical Expertise

Candidates must possess extensive hands-on experience with the Dynatrace platform, including advanced configuration and administration. A strong foundation in Application Performance Monitoring (APM) concepts is essential. The role requires profound knowledge of AWS services, microservices architectures, and full-stack development technologies, specifically Java and Angular.

Analytical and Problem-Solving Skills

Exceptional analytical skills are required to interpret complex performance metrics, including CPU, memory, latency, and throughput. The candidate must demonstrate proficiency in reverse engineering to troubleshoot intricate system issues and optimize configurations. Experience with scripting languages (e.g., Python, Shell) for automation and custom integrations is highly desirable.

Collaboration and Communication

The successful candidate will exhibit strong communication skills, enabling effective collaboration with cross-functional teams, including software engineers, system administrators, and project managers. The ability to document monitoring strategies, root cause analyses, and best practices clearly is crucial for maintaining a robust observability culture within the organization.

Preferred Qualifications

  • Dynatrace Associate or Professional Certification.
  • Experience with OpenTelemetry (OTEL) implementation.
  • Familiarity with other monitoring and logging tools (e.g., Splunk, Prometheus).
  • Knowledge of DevOps practices and CI/CD toolchains.