Senior Data Engineer - Python, Hadoop, Spark
Harvey Nash
Senior Data Engineer
Python/Hadoop/Spark – sought by a leading investment bank based in London – Hybrid – Contract (Inside IR35, Umbrella)
Key Responsibilities:
- Design and implement scalable data pipelines that extract, transform and load data from various sources into the data lakehouse (an illustrative sketch follows this list).
- Help teams push the boundaries of analytical insights, creating new product features using data.
- Develop and automate large-scale, high-performance data processing systems (batch and real-time) to drive growth and improve product experience.
- Develop and maintain infrastructure tooling for our data systems.
- Collaborate with software teams and business analysts to understand their data requirements and deliver quality, fit-for-purpose data solutions.
- Ensure data quality and accuracy by implementing data quality checks, data contracts and data governance processes.
- Contribute to the ongoing development of our data architecture and data governance capabilities.
- Develop and maintain data models and data dictionaries.
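For illustration only, a minimal sketch of the kind of batch ETL pipeline described above, assuming PySpark with Delta Lake; the source path, schema and table location are all hypothetical:

```python
# Minimal sketch of a batch ETL job landing data in a Delta Lake table.
# Assumes a Spark build with the Delta Lake extensions available;
# paths, column names and the partitioning scheme are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders_etl")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Extract: read raw CSV drops from a landing zone (hypothetical path).
raw = spark.read.option("header", True).csv("s3://landing/orders/")

# Transform: basic typing plus a simple data-quality filter.
clean = (
    raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .filter(F.col("order_id").isNotNull())
)

# Load: append into the curated Delta table, partitioned by date.
(
    clean
    .withColumn("order_date", F.to_date("order_ts"))
    .write.format("delta")
    .mode("append")
    .partitionBy("order_date")
    .save("s3://lakehouse/curated/orders")
)
```

In practice a job like this would also emit data-quality metrics and register the output table in the governance catalogue, per the responsibilities above.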
Skills & Qualifications:
- Significant experience with data modelling, ETL processes, and data warehousing.
- Significant exposure to, and hands-on experience with, at least two of the following programming languages: Python, Java, Scala, Go.
- Significant experience with Hadoop, Spark and other distributed processing platforms and frameworks.
- Experience working with open table/storage formats such as Delta Lake, Apache Iceberg or Apache Hudi.
- Experience developing and managing real-time data streaming pipelines using change data capture (CDC), Kafka and Apache Spark (an illustrative sketch follows this list).
- Experience with SQL and database management systems such as Oracle, MySQL or PostgreSQL.
- Strong understanding of data governance, data quality, data contracts, and data security best practices.
- Exposure to data governance, cataloguing and lineage tools.
- Experience setting up SLAs and data contracts with interfacing teams.
- Experience working with and configuring data visualisation tools such as Tableau.
- Ability to work independently and as part of a team in a fast-paced environment.
- Experience working in a DevOps culture and willingness to drive it: comfortable with CI/CD tools (ideally IBM UrbanCode Deploy, TeamCity or Jenkins), monitoring tools and log-aggregation tools. Ideally, you will have worked with VMs and/or Docker, and with orchestration systems such as Kubernetes/OpenShift.
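For illustration only, a minimal sketch of the kind of real-time CDC pipeline listed above, assuming Spark Structured Streaming reading Debezium-style change events from Kafka; the topic, payload schema, broker address and output paths are all hypothetical:

```python
# Minimal sketch of a streaming pipeline reading CDC events from Kafka
# with Spark Structured Streaming. Assumes the spark-sql-kafka connector
# is on the classpath; all names below are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.appName("orders_cdc_stream").getOrCreate()

# Assumed shape of the Debezium-style CDC payload.
cdc_schema = StructType([
    StructField("op", StringType()),        # c = create, u = update, d = delete
    StructField("order_id", StringType()),
    StructField("amount", StringType()),
])

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "cdc.orders")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers bytes; parse the JSON value into typed columns.
parsed = (
    events
    .select(F.from_json(F.col("value").cast("string"), cdc_schema).alias("e"))
    .select("e.*")
)

# Write the parsed change stream to a Delta table. A production job
# would merge by key to apply updates and deletes rather than append.
query = (
    parsed.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://lakehouse/_checkpoints/orders_cdc")
    .outputMode("append")
    .start("s3://lakehouse/raw/orders_cdc")
)
query.awaitTermination()
```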
Please apply within for further details – Matt Holmes – Harvey Nash