Data Engineer - ETL, Python, Spark, Hadoop
InfoGravity LLC.
2 months 1 week ago
Job Details
Data Engineer - ETL with Python, Spark & Hadoop
Location: Pittsburgh only (Hybrid 2-3 days a week).
Work authorization: Citizens or L2S
Responsibilities:
- Translate business needs into ETL/ELT logical models and design data structures flexible enough to support scalable business solutions.
- Design and implement data pipelines using Spark and Python.
- Define and deliver reusable components for the ETL/ELT framework.
- Define optimal data flow for system integration and data migration.
- Integrate new data management technologies and software engineering tools into existing structures.
- Design, build, and maintain CI/CD pipelines in multiple integration and test environments.
- Install, configure, and manage automated testing tools in the environment.
Qualifications:
- Experience in the design, development, and implementation of large-scale projects in the financial industry using data warehousing ETL tools (Spark)
- Experience in creating ETL transformations and jobs using PySpark and automating workflows using orchestration tools such as Airflow and Control-M
- Strong knowledge and experience in SQL, Python, and Spark
- Experience with Big Data/distributed frameworks such as Spark, Kubernetes, Hadoop, and Hive
- Ability to design ETL/ELT solutions based on user reporting and archival requirements
- Strong sense of customer service to consistently and effectively address client needs
- Self-motivated; comfortable working independently under general direction
- Hands-on experience in building and managing CI/CD pipelines
- Basic knowledge of Azure Cloud Components