Senior Data Engineer

McLean, Virginia, United States

Applications have closed

TheIncLab

Who We Are

TheIncLab is the first human-centered artificial intelligence experience (AI+X) lab. TheIncLab’s award-winning, multi-disciplinary team is focused on designing and developing AI-enabled systems that learn and collaborate with humans. The company offers its clients comprehensive capabilities for rapid ideation, software development and building of smart systems and hardware solutions. Its open, scalable AI architecture approach, combined with years of experience in interactive engineering and emerging technology innovation, allows for rapid prototyping and deployment of transformational concepts, products and solutions designed to work with meaningful human interaction, effectively bridging the gap between humans and intelligent systems.

Job Description

The Data Engineer will design, develop, and launch reliable data lakes and large datasets that fuse multiple data types, including geography, telemetry, text, image, and video. The ideal candidate has a strong systematic mindset and the ability to communicate clearly in multiple technical contexts. Candidates should be passionate about finding insights in large datasets while maintaining attention to database architecture, data reliability, efficiency, and quality.

Requirements

  • Work with business teams to collect and transform critical data into information and knowledge.
  • Effectively apply business knowledge and technical expertise to design, develop and implement databases, data warehouses, data marts, interfaces, custom programming, complex reports, analysis, and web-related applications for large datasets.
  • Identify and implement solutions for data requirements, including building pipelines to collect data from disparate external sources and implementing rules to validate that expected data is received, cleansed, transformed, and delivered in an optimized format for the data store.
  • Work with business teams to ensure end-to-end design and delivered solution meets business data requirements.
  • Perform validation and analytics in support of client requirements, and evolve solutions through automation, optimizing performance with minimal human involvement.
  • Monitor pipeline status and performance, troubleshoot issues, and work on improvements to ensure the solution best addresses the customer need.
  • Focus specifically on the development and maintenance of scalable data stores that supply big data in forms needed for business analysis.
  • Apply advanced consulting skills, extensive technical expertise, and full industry knowledge to develop innovative solutions to complex problems.
  • Create and maintain technical documentation.
  • Protect assets and the integrity, security and privacy of information entrusted to or maintained by the organization.
  • Produce high-quality, scalable code and deliver features on time.
  • Assist in the ongoing development of technical best practices for data movement, data quality, data cleansing and other related activities.
  • Ability to travel (up to 25%).
  • Other duties as assigned.

Qualifications

  • Master's or Bachelor's degree in a quantitative discipline such as Engineering, Computer Science, Economics, Statistics, or Mathematics, or equivalent work experience.
  • 7+ years of experience with programming languages for data manipulation, including Python, JavaScript, R, Scala, C++, C#, Java, or equivalent.
  • Experience working with structured, unstructured and/or semi-structured data and conducting analysis.
  • Experience creating mathematical systems to support modeling and simulation applications.
  • Experience creating models for Machine Learning (ML) and creating data for training ML models.
  • Experience with both relational and distributed database design and management in cloud-based environments.
  • Experience with cloud services such as AWS, Azure or Google Cloud.
  • Experience with enterprise DataOps, DevSecOps and MLOps processes to operationalize and monitor data science models.
  • Experience with Agile development.
  • Experience with CI/CD, including Git, Jenkins, and Docker.

Eligibility Requirements

  • Applicants must be US citizens and able to obtain a security clearance due to the nature of the role.

Benefits

  • Medical, Dental, and Vision Insurance
  • 100% company-paid Short-Term and Long-Term Disability
  • 100% company-paid Basic Life Insurance
  • Paid Time Off
  • Paid Holidays
  • 401(k) with employer matching and immediate vesting


No relocation assistance is available.

This is a direct hire position. We do not accept indirect resumes or submissions from recruiters or third parties.


