Data Engineer
Vision It US
Remote

Responsibilities

  • Develop data pipelines to ingest, load, and transform data from multiple sources
  • Leverage the Data Platform, running on Google Cloud, to design, optimize, deploy, and deliver data solutions in support of scientific discovery
  • Use programming languages such as Java, Scala, and Python, as well as open-source RDBMS and NoSQL databases and cloud-based data store services such as MongoDB, DynamoDB, ElastiCache, and Snowflake
  • Continuously deliver technology solutions from product roadmaps, adopting Agile and DevOps principles
  • Collaborate with digital product managers and deliver robust cloud-based solutions that drive powerful experiences
  • Design and develop data pipelines, including Extract, Transform, Load (ETL) programs that extract data from various sources and transform it to fit the target model
  • Test and deploy data pipelines to ensure compliance with data governance and security policies
  • Move from implementation to ownership of real-time and batch processing, as well as data governance and policies
  • Maintain and enforce the business contracts that define how data is represented and stored
  • Ensure that technical delivery is fully compliant with Security, Quality, and Regulatory standards
  • Keep relevant technical documentation up to date in support of the lifecycle plan for audits and reviews
  • Proactively engage in experimentation and innovation to drive continuous improvement, e.g., evaluating new data engineering tools and frameworks
  • Implement ETL processes that move data between systems, including S3, Snowflake, Kafka, and Spark
  • Work closely with our Data Scientists, SREs, and Product Managers to ensure software is high quality and meets user requirements

Required Qualifications

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field
  • 5+ years of experience as a data engineer building ETL/ELT data pipelines
  • Experience with data engineering best practices across the full software development life cycle, including coding standards, code reviews, source control management (Git), continuous integration, testing, and operations
  • Experience with Python and SQL; Java, C#, C++, Go, Ruby, and Rust are good to have
  • Experience with Agile, DevOps, and automation (of testing, builds, deployment, CI/CD, etc.), as well as Airflow
  • Experience with Docker, Kubernetes, and shell scripting
  • 2+ years of experience with a public cloud (AWS, Microsoft Azure, Google Cloud)
  • 3+ years of experience with distributed data/computing tools (MapReduce, Hadoop, Hive, EMR, Kafka, Spark, Gurobi, or MySQL)
  • 2+ years of experience working on real-time data and streaming applications
  • 2+ years of experience with NoSQL implementations (DynamoDB, MongoDB, Redis, ElastiCache)
  • 2+ years of data warehousing experience (Redshift, Snowflake, Databricks, etc.)
  • 2+ years of experience with UNIX/Linux including basic commands and shell scripting
  • Experience with visualization tools such as SSRS, Excel, Power BI, Tableau, Google Looker, and Azure Synapse

Required Skills: Python, SQL

Additional Skills: Python Developer, Data Engineer

Background Check: Yes
