Data Engineer
Vision It US
Responsibilities
- Develop data pipelines to ingest, load, and transform data from multiple sources.
- Leverage the Data Platform, running on Google Cloud, to design, optimize, deploy, and deliver data solutions in support of scientific discovery.
- Use programming languages such as Java, Scala, and Python, along with open-source RDBMS and NoSQL databases and cloud-based data store services such as MongoDB, DynamoDB, ElastiCache, and Snowflake.
- Continuously deliver technology solutions from product roadmaps, adopting Agile and DevOps principles.
- Collaborate with digital product managers to deliver robust cloud-based solutions that drive powerful user experiences.
- Design and develop data pipelines, including Extract, Transform, Load (ETL) programs to extract data from various sources and transform the data to fit the target model
- Test and deploy data pipelines to ensure compliance with data governance and security policies
- Move from implementation to ownership of real-time and batch processing, as well as data governance and policies.
- Maintain and enforce the business contracts on how data should be represented and stored
- Ensure that technical delivery is fully compliant with Security, Quality, and Regulatory standards.
- Keep relevant technical documentation up to date in support of the lifecycle plan for audits/reviews.
- Proactively engage in experimentation and innovation to drive continuous improvement, e.g., evaluating new data engineering tools/frameworks.
- Implement ETL processes that move data between systems, including S3, Snowflake, Kafka, and Spark.
- Work closely with our Data Scientists, SREs, and Product Managers to ensure software is high quality and meets user requirements
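To illustrate the kind of ETL pipeline work described above, here is a minimal, self-contained sketch using only the Python standard library. The source data, field names (`id`, `amount`), and target table (`sales`) are hypothetical, and real pipelines would target systems like S3, Kafka, or Snowflake rather than in-memory SQLite:

```python
# Minimal extract-transform-load (ETL) sketch. All names are illustrative.
import csv
import io
import sqlite3


def extract(raw_csv: str) -> list[dict]:
    """Extract: parse raw CSV rows into dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows: list[dict]) -> list[tuple]:
    """Transform: normalize types and drop rows that fail validation."""
    out = []
    for row in rows:
        try:
            out.append((row["id"].strip(), float(row["amount"])))
        except (KeyError, ValueError):
            continue  # skip malformed rows rather than failing the batch
    return out


def load(rows: list[tuple], conn: sqlite3.Connection) -> int:
    """Load: write transformed rows into the target table, return row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (id TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]


raw = "id,amount\na1,10.5\na2,not-a-number\na3,7\n"
conn = sqlite3.connect(":memory:")
loaded = load(transform(extract(raw)), conn)
print(loaded)  # 2 valid rows loaded; the malformed row is skipped
```

The separation into `extract`, `transform`, and `load` stages mirrors how pipeline steps are typically orchestrated (e.g., as individual Airflow tasks), so each stage can be tested and retried independently.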
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or related field.
- 5+ years of experience as a data engineer building ETL/ELT data pipelines.
- Experience with data engineering best practices across the full software development life cycle, including coding standards, code reviews, source control management (Git), continuous integration, testing, and operations.
- Experience with Python and SQL; Java, C#, C++, Go, Ruby, and Rust are good to have.
- Experience with Agile, DevOps, and automation (testing, build, deployment, CI/CD, etc.), as well as Airflow.
- Experience with Docker, Kubernetes, Shell Scripting
- 2+ years of experience with a public cloud (AWS, Microsoft Azure, Google Cloud)
- 3+ years of experience with distributed data/computing tools (MapReduce, Hadoop, Hive, EMR, Kafka, Spark, Gurobi, or MySQL)
- 2+ years of experience working on real-time data and streaming applications
- 2+ years of experience with NoSQL implementations (DynamoDB, MongoDB, Redis, ElastiCache)
- 2+ years of data warehousing experience (Redshift, Snowflake, Databricks, etc.)
- 2+ years of experience with UNIX/Linux including basic commands and shell scripting
- Experience with visualization tools like SSRS, Excel, Power BI, Tableau, Google Looker, and Azure Synapse
Required Skills: Python, SQL
Additional Skills: Python Developer, Data Engineer
Background Check: Yes