Lead a team of engineers and grow the technical backbone of our organization.
Implement software development practices to build observability, alerting, tracing, automation and self-healing capabilities to maintain the highest levels of platform availability.
Tune the performance and enhance the reliability of the infrastructure stack across both public and private cloud.
Contribute hands-on to enterprise solutions, tooling, and initiatives, drawing on your technical experience.
Nurture an environment of innovation and continuous improvement, leading changes that drive efficiencies into existing engineering and delivery processes.
Lead experimentation and proofs of concept with new open-source technologies to solve observability, testing, and resiliency challenges. Influence technology adoption for the Customer Journey organization and broader company platforms.
Implement shift-left automated testing to prevent defects from reaching production.
Ensure all new critical subsystems, microservices, databases, and external calls meet the five-nines (99.999%) availability requirement.
Provide consultation for all significant functionality changes and peer-review critical production hotfixes.
Conduct technical code reviews and drive innovation across the organization to adopt industry best practices.
Be part of a global operations team that supports a 24/7 model, with a willingness to work holidays and weekends.
Experience coding with Java, Python, Node.js, or a similar language, with a strong desire to learn new languages. Experience with ML is a plus.
Excellent coding and scripting skills (Terraform/Ansible), knowledge of CI/CD tools (Jenkins, GitLab, and Artifactory), and experience with monitoring/alerting/logging solutions (Splunk, Datadog, AWS, GCP Stackdriver, etc.).
Experience as an SRE in a public cloud environment, including designing and building cloud-native applications. Must have hands-on experience with Kubernetes.
High degree of technical knowledge across several technologies (e.g., platform enablers including Prometheus, Consul, Vault, and ELK, and infrastructure platforms including cloud, networking, and storage).
Hands-on experience building and enhancing distributed microservice systems. Must have hands-on experience with service mesh products such as Istio.
Awareness of the challenges of distributed systems and of the practices for building highly available platforms.
An affinity for connecting with openness and transparency, and a passion for learning new technologies and using them to their full potential.
Bachelor’s Degree in Computer Science, Computer Engineering, or equivalent work experience.
Experience in a startup is a plus.