Data Engineer - ETL, Python, Spark, Hadoop

Job Details

Data Engineer - ETL with Python, Spark & Hadoop

Location: Pittsburgh only (hybrid, 2-3 days a week).

Work authorization: Citizens or L2S.

Responsibilities:

Translate business needs into ETL/ELT logical models, and design data structures flexible enough to support scalable business solutions.
Design and implement data pipelines using Spark and Python.
Define and deliver reusable components for the ETL/ELT framework.
Define optimal data flow for system integration and data migration.
Integrate new data management technologies and software engineering tools into existing structures.
Design, build, and maintain CI/CD pipelines in multiple integration and test environments.
Install, configure, and manage automated testing tools in the environment.

Qualifications:

Experience in the design, development, and implementation of large-scale projects in the financial industry using data warehousing ETL tools (Spark).
Experience creating ETL transformations and jobs using PySpark, and automating workflows with orchestration tools such as Airflow and Control-M.
Strong knowledge of and experience with SQL, Python, and Spark.
Experience with big data and distributed-computing technologies such as Spark, Kubernetes, Hadoop, and Hive.
Ability to design ETL/ELT solutions based on user reporting and archival requirements.
Strong sense of customer service to consistently and effectively address client needs.
Self-motivated; comfortable working independently under general direction.
Hands-on experience building and managing CI/CD pipelines.
Basic knowledge of Azure cloud components.