PySpark Developer
<b>Requirements:</b>
<ul><li>5 years of hands-on experience with PySpark/Spark SQL</li><li>Strong proficiency with the AWS data stack: EMR, Glue, S3, Athena, and Glue Workflows</li><li>Solid foundation in SAS for understanding and debugging legacy logic</li><li>Expertise in ETL/ELT, dimensions, facts, SCDs, and data mart architecture</li><li>Experience with parameterisation, exception handling, and modular Python design</li></ul>
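<p>To illustrate the kind of SCD expertise the role calls for, here is a minimal, pure-Python sketch of SCD Type 2 versioning (expire the changed row, append a new current row). The schema (<code>id</code>, <code>segment</code>, <code>valid_from</code>/<code>valid_to</code>/<code>is_current</code>) and the function name are hypothetical; a production version would express the same merge over PySpark DataFrames.</p>

```python
from datetime import date

# Hypothetical SCD Type 2 sketch over plain dicts. Assumed schema:
# business key "id", tracked attribute "segment", validity columns
# "valid_from" / "valid_to" / "is_current".
HIGH_DATE = date(9999, 12, 31)  # open-ended "valid_to" sentinel

def apply_scd2(dimension, updates, as_of):
    """Expire changed current rows and append new versions."""
    result = [r for r in dimension if not r["is_current"]]  # keep history as-is
    current = {r["id"]: r for r in dimension if r["is_current"]}
    for key, row in current.items():
        new_val = updates.get(key)
        if new_val is not None and new_val != row["segment"]:
            # Attribute changed: close out the old version...
            result.append(dict(row, valid_to=as_of, is_current=False))
            # ...and open a new current version effective "as_of".
            result.append({"id": key, "segment": new_val,
                           "valid_from": as_of, "valid_to": HIGH_DATE,
                           "is_current": True})
        else:
            result.append(row)  # unchanged: carry forward
    # Brand-new business keys arrive as fresh current rows.
    for key, seg in updates.items():
        if key not in current:
            result.append({"id": key, "segment": seg,
                           "valid_from": as_of, "valid_to": HIGH_DATE,
                           "is_current": True})
    return result

dim = [{"id": 1, "segment": "retail",
        "valid_from": date(2024, 1, 1), "valid_to": HIGH_DATE,
        "is_current": True}]
new_dim = apply_scd2(dim, {1: "corporate", 2: "retail"}, date(2026, 2, 20))
```

<p>In PySpark the same effect is usually achieved with a join on the business key plus a union of expired, new, and unchanged rows (or a <code>MERGE</code> on a lakehouse table format).</p>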
<b>Responsibilities:</b>
<ul><li>Lead the end-to-end migration of SAS code to PySpark using automated tools and manual refactoring</li><li>Design, build, and troubleshoot complex ETL/ELT workflows and data marts on AWS</li><li>Optimise Spark workloads for execution efficiency, partitioning, and cost-effectiveness</li><li>Implement clean coding principles, modular design, and robust unit/comparative testing</li><li>Maintain Git-based workflows, CI/CD integration, and comprehensive technical documentation</li></ul>
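<p>The comparative testing mentioned above can be sketched as a keyed, column-by-column diff of legacy SAS output against migrated PySpark output, with a float tolerance to absorb numeric-precision differences. The helper name, column names, and tolerance are illustrative assumptions, not part of any specific toolchain.</p>

```python
import math

def compare_outputs(legacy_rows, migrated_rows, key, tol=1e-9):
    """Return a list of mismatch descriptions; an empty list means the
    migrated output matches the legacy output within tolerance."""
    legacy = {r[key]: r for r in legacy_rows}
    migrated = {r[key]: r for r in migrated_rows}
    mismatches = []
    for k in sorted(legacy.keys() | migrated.keys(), key=repr):
        if k not in migrated:
            mismatches.append(f"key {k!r}: missing from migrated output")
            continue
        if k not in legacy:
            mismatches.append(f"key {k!r}: unexpected in migrated output")
            continue
        for col, a in legacy[k].items():
            b = migrated[k].get(col)
            if isinstance(a, float) and isinstance(b, float):
                # Floats: tolerate tiny precision drift between engines.
                if not math.isclose(a, b, rel_tol=tol, abs_tol=tol):
                    mismatches.append(f"key {k!r}, column {col!r}: {a} != {b}")
            elif a != b:
                mismatches.append(f"key {k!r}, column {col!r}: {a!r} != {b!r}")
    return mismatches

# Illustrative data: one row differs only by float noise below tolerance.
sas_out = [{"acct": "A1", "balance": 100.0}, {"acct": "A2", "balance": 55.5}]
spark_out = [{"acct": "A1", "balance": 100.0},
             {"acct": "A2", "balance": 55.5000000001}]
diffs = compare_outputs(sas_out, spark_out, key="acct", tol=1e-6)  # → []
```

<p>At scale the same check is typically done in Spark itself (join on the key, filter where columns disagree) rather than by collecting rows to the driver.</p>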
<b>Technologies:</b>
<ul><li>AWS</li><li>CI/CD</li><li>Cloud</li><li>ETL</li><li>Git</li><li>Python</li><li>PySpark</li><li>SAS</li><li>SQL</li><li>Spark</li><li>DevOps</li></ul>
<p><b>More:</b></p>
<p>We are seeking a Lead PySpark Engineer for a 5-month contract focused on a large-scale data modernisation project transitioning legacy data workflows to a high-performance AWS cloud environment. This is a fully remote role for UK-based candidates, offering 33 days of holiday entitlement pro rata. You will collaborate with our internal team to drive engineering excellence in the financial services industry.</p>
<p>Last updated: week 8 of 2026</p>