
Data Engineer, Python/AWS

Credera
remote
4 months 3 weeks ago

Job Details

Remote role for a contractor based anywhere in North America. 12-month contract with possible extension. Rate: $65 to $80/hr USD, based on experience.

Description

Data Engineers at TA Digital work closely with Subject Matter Experts (SMEs) to design the ontology (data model), develop data pipelines, and integrate Foundry with the external systems that hold the data. Data Engineers also provide guidance and support on how to access and use the data foundation to create new workflows or analyze data.

Responsibilities Include

  • Leverage Generative AI on AWS data
  • Integrate new data sources into Foundry using Data Connection
  • Implement two-way integrations between Foundry and external systems
  • Develop pipelines transforming tabular or unstructured data
  • Implement data transformations in PySpark / PySpark SQL to derive new datasets or create ontology objects
  • Set up support structures for pipelines running in production
  • Monitor and debug critical issues such as data staleness or data quality
  • Improve performance of data pipelines (latency, resource usage)
  • Design and implement an ontology based on business requirements and available data
  • Provide data engineering context for application development

Requirements

  • Generative AI on AWS – experience with services such as Amazon Bedrock, Amazon SageMaker, Amazon EC2, Amazon EC2 UltraClusters, AWS Trainium, or AWS Inferentia
  • Python – complete language proficiency
  • SQL – proficiency in the query language (join types, filtering, aggregation) and in data modeling (relationship types, constraints)
  • PySpark – basic familiarity (DataFrame operations, PySpark SQL functions) and an understanding of how it differs from other DataFrame implementations (e.g., Pandas)
  • Distributed compute – conceptual knowledge of Hadoop and Spark (driver, executors, partitions)
  • Databases – general familiarity with common relational database models and proprietary implementations, such as SAP and Salesforce
  • Git – knowledge of version control / collaboration workflows and best practices
  • Iterative working – familiarity with agile, iterative working methods and with gathering rapid user feedback
  • Data quality – knowledge of best practices
