SKILLFINDER INTERNATIONAL
Lead PySpark Engineer
This role has expired
£449,000
London
Remote or hybrid
Description

Skill Profile

  • PySpark - Advanced (P3)
  • AWS - Advanced (P3)
  • SAS - Foundational (P1)

Key Responsibilities

Technical Delivery

  • Design, develop, and maintain complex PySpark solutions for ETL/ELT and data mart workloads.
  • Convert and refactor legacy SAS code into optimized PySpark solutions using automated tooling and manual refactoring techniques.
  • Build scalable, maintainable, and production-ready data pipelines.
  • Modernize legacy data workflows into cloud-native architectures.
  • Ensure data accuracy, quality, integrity, and reliability across transformation processes.

Cloud & Data Engineering (AWS-Focused)

  • Develop and deploy data pipelines using AWS services such as EMR, Glue, S3, and Athena.
  • Optimize Spark workloads for performance, scalability, partitioning strategy, and cost efficiency.
  • Implement CI/CD pipelines and Git-based version control for automated deployment.
  • Collaborate with architects, engineers, and business stakeholders to deliver high-quality cloud data solutions.

Core Technical Skills

PySpark & Data Engineering

  • 5+ years of hands-on PySpark experience (Advanced level).

  • Strong ability to write production-grade, maintainable data engineering code.

  • Solid understanding of:

    • ETL/ELT design patterns
    • Data modelling concepts
    • Fact and dimension modelling
    • Data marts
    • Slowly Changing Dimensions (SCDs)

Spark Performance & Optimization

  • Expertise in Spark execution planning, partitioning strategies, and performance tuning.
  • Experience troubleshooting distributed data pipelines at scale.

Python & Engineering Quality

  • Strong Python programming skills with emphasis on clean, modular, and maintainable code.

  • Experience applying engineering best practices including:

    • Parameterization
    • Configuration management
    • Structured logging
    • Exception handling
    • Modular design principles

SAS & Legacy Analytics (Foundational)

  • Working knowledge of Base SAS, Macros, and DI Studio.
  • Ability to interpret and analyze legacy SAS code for migration to PySpark.

Data Engineering & Testing

  • Understanding of end-to-end data flows, orchestration frameworks, pipelines, and change data capture (CDC).
  • Experience creating ETL test cases, unit tests, and data comparison/validation frameworks.

Engineering Practices

  • Proficient in Git workflows, branching strategies, pull requests, and code reviews.
  • Ability to document technical decisions, architecture, and data flows.
  • Experience with CI/CD tooling for data engineering pipelines.

AWS & Platform Expertise (Advanced)

Strong hands-on experience with:

  • Amazon S3
  • EMR and AWS Glue
  • Glue Workflows
  • Amazon Athena
  • IAM

  • Solid understanding of distributed computing and big data processing in AWS environments.
  • Experience deploying and operating large-scale data pipelines in the cloud.

Desirable Experience

  • Experience within banking, financial services, or other regulated industries.
  • Background in SAS modernization or cloud migration programs.
  • Familiarity with DevOps practices and infrastructure-as-code tools such as Terraform or CloudFormation.
  • Experience working in Agile or Scrum delivery environments.