Senior Data Engineer
$200k – $250k/yr | US remote | Full time | Senior | Mar 20, 2026
About this role
ABOUT LEAP
Leap is one of the fastest-growing benefits solutions and a category-defining pioneer in employer specialty pharmacy. We are reshaping how life-changing therapies are delivered and financed, ensuring patients get the treatment they need while employers finally get a fair deal.
Specialty drugs and infusions represent nearly 10% of all healthcare spend and are the fastest-growing cost category for employers. Leap tackles this challenge with a novel approach: eliminating hidden markups, expanding access to high-quality infusion providers, and bringing clarity and fairness to how therapies are priced and paid for.
We’re proud to partner with numerous Fortune 500 companies and leading TPAs. Each patient we serve creates immediate ROI: lower costs, improved access, and better care. Join us as we redefine what’s possible in specialty care.
ABOUT THE ROLE
The Senior Data Engineer owns Leap's data infrastructure end to end, from ingestion pipelines and warehouse architecture to the reporting layer that drives business decisions. This role partners closely with clinical operations, business operations, and leadership to ensure that data is reliable, traceable, and ready to serve both human users and AI workloads. You will own the design decisions about how the data stack is built and evolved, operating with high autonomy in a small, fast-moving engineering team.
KEY RESPONSIBILITIES
Pipelines and Warehouse
- Build and own data pipelines and ETL processes for claims ingestion, drug pricing, and CRM sync using BigQuery and Python
- Design production pipelines for batch and streaming workloads, with a particular focus on high-volume claims data and new large-scale data sources on the roadmap
- Architect warehouse schemas and transformations with clear separation between raw, staging, and modeled layers
- Maintain data quality and reliability across systems that feed both human users and AI workloads, including row-count checks, schema drift detection, anomaly alerting, and detection of silent upstream changes
Data Governance
- Design pipelines to be idempotent and replayable, with raw data always preserved to enable reprocessing when logic changes
- Track data lineage across the full lifecycle — origin, transformation, and downstream dependencies
- Validate data at every stage before it reaches a dashboard or AI system
Reporting Infrastructure
- Build reporting systems that give sales, clinical, and leadership teams live visibility into business performance
- Create automated alerting that surfaces meaningful changes in data so the team acts on insights rather than requesting them
AI-Ready Data Infrastructure
- Build PHI-safe pipelines that support LLM workloads, agent systems, and automation
- Design a unified data architecture that connects claims, drug pricing, patient records, CRM activity, and clinical workflows into a coherent whole
- Own ingestion of external data from non-standard formats and sources across a diverse and growing provider base
QUALIFICATIONS
Required
- 5+ years of experience with Python, SQL, and dbt, including hands-on expertise in BigQuery, Snowflake, or a comparable cloud data warehouse, and proficiency with an orchestration tool such as Airflow, Dagster, or Prefect
- Demonstrated experience architecting data platforms, including decisions around batch vs. streaming, incremental vs. full-refresh, and warehouse structure
- Proven ability to build monitoring, lineage tracking, and governance systems that trace data from source to report
- Experience using AI tools in day-to-day work and building data infrastructure that AI systems can rely on in production
- Background as an early employee or founding data engineer responsible for building a data stack from the ground up
Preferred
- Healthcare or HIPAA experience; familiarity with ingestion tools such as Fivetran; experience with CRM integrations (Salesforce, HubSpot); or prior experience building data infrastructure for LLM or AI workloads
- Experience with streaming frameworks such as Kafka, Pub/Sub, or Flink, or designing systems that handle both batch and real-time data flows
- Comfort with cloud infrastructure (GCP, AWS) and Linux/sysadmin fundamentals, including VM debugging, log management, and service administration
- A bias toward simple, cost-effective solutions — defaulting to open-source and applying sound judgment about when managed services justify their cost and lock-in
At Leap, we’re building an outlier company with real impact — and that takes focus, energy, and commitment. If that excites you, we’d love to hear from you.
Leap is an equal opportunity employer and welcomes applicants from all backgrounds. We’re committed to building a team that reflects a diversity of perspectives, experiences, and identities.