Senior Data Engineer

Pennsylvania - Fort Washington

Veeva Systems

Veeva [NYSE: VEEV] is the leader in cloud-based software for the global life sciences industry. Committed to innovation, product excellence, and customer success, our customers range from the world’s largest pharmaceutical companies to emerging biotechs. Veeva’s software helps our customers bring medicines and therapies to patients faster.
We are the first public company to become a Public Benefit Corporation. As a PBC, we are committed to making the industries we serve more productive, and we are committed to creating high-quality employment opportunities.
Veeva is a Work Anywhere company, which means that you can choose to work in the environment that works best for you on any given day. Whether you work remotely from home or in an office is up to you.
Are you a Data Engineer in the Greater Philadelphia area looking for a challenging opportunity in data engineering, working with some of the best engineers in the area? Veeva OpenData is looking to hire a motivated Sr. Data Engineer who is hands-on with software and data. This is an exciting opportunity to deliver top-quality data to the pharmaceutical industry. This critical Veeva OpenData Engineering position is responsible for many aspects of the data we deliver to our ever-growing customer base. We are looking for a passionate and enthusiastic individual who will contribute to the team by working closely with the people involved in the generation, handling, and consumption of our data.

What You’ll Do

  • Improve and maintain data infrastructure and architecture
  • Identify, design, and implement data delivery mechanisms, including designing pipelines for greater scalability, optimizing data delivery, and automating manual processes using DevOps techniques and tools
  • Build new data pipelines for data ingestion, connection, transformation, and distribution
  • Create and expand existing job monitoring and QA/QC tools
  • Assist with deploying data pipelines to production
  • Work with global teams to improve automation of data pipelines and add new ones

Requirements

  • Degree in Computer Science, Engineering, Math, or another STEM field
  • Hands-on experience in the following:
  • At least 5 years of experience working with various data sources (RDBMS, NoSQL, CSV, etc.)
  • 2+ years of experience with Apache Spark; 2+ years of experience with Apache Airflow
  • 3+ years of experience with AWS tools (EC2, EMR, Lambda, S3, Athena, Glue, etc.)
  • 3+ years of experience coding in Python, Scala, or Java
  • 4+ years of experience with the software development life cycle (Git, Pull Requests, Code Reviews, Testing, etc.)
  • Current experience using CI/CD pipelines
  • Current experience in containerized development using Docker
  • Experience working with large and complex production datasets
  • Delivery of high-quality pull requests, showing strong code standards and unit testing practices 
  • Delivery of high-quality technical documentation 
  • Comfort with self-directed project management: requiring minimal oversight to assess a problem, formulate a solution, deliver code, and document changes.
  • Positive interactions with stakeholders in other departments

Nice to Have

  • Experience working in the healthcare industry or with similar reference data
  • Experience with database modeling 
  • Experience with Machine Learning tools and techniques, understanding how models make decisions from data 
  • Experience with key and emerging technologies like Data Lake and Snowflake
  • Experience working on projects aimed at preemptively improving data governance and quality control in production 

Perks & Benefits

  • Allocations for continuous learning & development
  • Excellent Medical and Dental Benefits
  • Short-term and Long-term Disability
  • Generous PTO
  • 401(k)
  • Free onsite gym
  • Free, healthy lunches provided on Tuesdays
Veeva’s headquarters is located in the San Francisco Bay Area with offices in more than 15 countries around the world.
Veeva is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, sex, sexual orientation, gender identity or expression, religion, national origin or ancestry, age, disability, marital status, pregnancy, protected veteran status, protected genetic information, political affiliation, or any other characteristics protected by local laws, regulations, or ordinances. If you need assistance or accommodation due to a disability or special need when applying for a role or in our recruitment process, please contact us at talent_accommodations@veeva.com.
