Lead Data Engineer

New York, NY or Remote (US only)

Applications have closed
Truveris

Posted 1 month ago

Truveris is a digital health company that partners with employers, brokers, and pharmaceutical companies to dramatically improve people’s ability to afford and access prescription drugs. With expertise and technology solutions that span the prescription drug ecosystem, we deliver the outcomes people and businesses need to thrive.
We are on a mission to transform the pharmacy benefit industry and are backed by leading venture capital firms including Canaan Partners, First Round, New Atlantic Ventures, New Leaf Venture Partners, Tribeca Venture Partners and McKesson Ventures. In 2018, Truveris was ranked as one of the fastest growing technology companies in the U.S. by Deloitte, Crain's, and Inc.
POSITION SUMMARY
As a Lead Data Engineer, you are a critical resource, working across teams to build and enhance our Unified Data Platform and Innovation Agenda. You will have the opportunity to build creative solutions to business and technical challenges in collaboration with those teams. In this role, you will act as an expert in your domain area and serve as a partner to Product, while promoting best practices and mentoring other engineers on your team.

RESPONSIBILITIES

  • Work as part of multifunctional teams to own the design and architecture of the logical entity model of the client's suite of products.
  • Convert logical models into physical data models employing sound database normalization techniques.
  • Create physical database objects like tables and views with appropriate data types, foreign keys, constraints, and upfront design and maintenance of proper indexes.
  • Create and maintain easy-to-follow technical documentation of data models.
  • Serve as a go-to resource for questions related to existing and proposed logical and physical data models.
  • Perform SQL code reviews and ensure that new database code meets company standards for readability, reliability, and performance.
  • Assist with diagnosing and resolving performance problems in slow stored procedures and queries.
  • Support team initiatives by developing tools and identifying opportunities for process automation; assist in evaluation and selection of standard tools for the department.
  • Support building and deploying the infrastructure for ingesting high-volume data from various sources.
  • Support developing and maintaining the data-related scripting for build/test/deployment automation.
  • Research individually and in collaboration with other teams on how to solve problems.
  • Partner with the team to research, design, test, and evaluate new technologies and services as they apply to data warehousing and architecture.
  • Partner with the team to maintain an organization-wide view of current and future strategy and approach as they apply to data warehousing and architecture.
  • Identify and resolve bottlenecks and bugs.

REQUIRED

  • 5+ years of industry experience in data engineering, with a track record of manipulating, processing, and working with large datasets
  • Architecting, building, and maintaining end-to-end, high-throughput data systems and their supporting services
  • Designing data systems that are secure, testable, and modular, particularly in Python, as well as their support infrastructure and services (shell scripts, job schedulers, message queues, etc.)
  • Strong experience with SQL, preferably in Postgres and Redshift, and implementation best practices (e.g., index management, constraints, performance tuning techniques)
  • Working with distributed systems as they pertain to data storage and computing
  • Experience with AWS-based database systems (RDS/Aurora, Redshift, etc.)
  • Architecting scalable data pipelines that handle large data volumes in a cost-effective manner using AWS and related technologies, including Airflow, Spark, EMR, MSK, Kinesis, and Redshift
  • Using profiling tools, debugging logs, performance metrics, and other data sources to make code- and application-level improvements
  • Developing for continuous integration and automated deployments
  • Familiarity with Git and release engineering strategies
  • Track record of working in Scrum / Agile software teams
  • BS/MS in Computer Science or a related technical field
  • Proficient in spoken and written English

NICE TO HAVE

  • Bachelor’s degree or higher in a technical field of study
  • Experience with NoSQL databases, operational data sources, and data lakes
  • Familiarity with continuous delivery and DevOps
  • Flexibility and creativity in solution design, including leveraging emerging technologies
  • Ability to clearly explain and justify ideas when faced with competing alternatives
  • Ability to design, communicate, and apply effective architectural design patterns across a wide range of technical problems
Truveris provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
This policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation and training.
Job tags: Airflow AWS Data Warehousing Distributed Systems Engineering Postgres Python Redshift Research Scrum Spark SQL
Job region(s): North America Remote/Anywhere