You will play a key role in shaping the future of the Systems Engineering function at Zego.
You will be part of the team ultimately responsible for the uptime of the Zego Hosting Platform and services.
You will build and nurture relationships with key stakeholders across Product and Engineering to ensure our platform is aligned with business objectives.
You will champion agile methodologies, metrics and tooling to support the teams in incrementally improving the efficiency of our hosting platform.
Drive improvements to the Zego Hosting Platform at scale, influencing change and securing buy-in across the organisation.
Play a key role in defining technical solutions with a detailed analysis of costs, risks, performance, reliability, scalability, and maintainability.
Approach our processes critically, identify areas for improvement and balance business priorities against technical compromises.
Nurture a culture of continuous learning and knowledge-sharing, leading by example across multiple areas and teams.
Use code reviews as an opportunity to ensure coding styles are followed, and lead the team in defining new standards that promote a clear and concise codebase.
We are looking for engineers who embrace the DevOps culture to deliver continuous improvements to our security posture. You will engage with and empower the teams to drive change by leveraging metrics and championing automation and operational excellence.
Good coding and scripting skills in languages such as Bash and Python with a focus on automating infrastructure tasks, monitoring, and process optimisation.
Experience configuring Infrastructure as Code (IaC) using tools such as Terraform, Crossplane, and Helm, with a focus on building scalable, reliable, and reproducible infrastructure.
In-depth knowledge of AWS cloud services, especially core services like VPC, S3, EC2, Kinesis, RDS, and EKS, with experience in multi-account setups and optimising for performance and cost-efficiency.
Skilled in container management and orchestration using Docker, Kubernetes, Helm, Service Mesh (e.g. Istio) and GitOps (e.g. ArgoCD), with a focus on streamlined deployments and managing complex service-oriented architectures.
Experienced in leveraging observability tools, such as Honeycomb (OpenTelemetry) and Datadog, to support data-driven decisions across the wider engineering team.
Comprehensive understanding of networking in cloud environments, including VPN solutions, efficient network configuration, load balancing, and troubleshooting.
Extensive experience designing, implementing, and optimising CI/CD pipelines to ensure reliable, automated delivery of code across development, testing, and production environments.