Lead Analytics Engineer - Data Modeling & Quality

Salary: $160,000 - $185,000

Location: Remote

Posted: Today

Arcadia's data platform powers population health analytics for health plans, ACOs, and provider groups across the country. As a Lead Analytics Engineer — Data Modeling & Quality, you sit at the intersection of data quality ownership and analytical data modeling. You'll own the SQL and DBT layer that transforms raw clinical and claims data into trusted, production-grade datasets, while also serving as the quality authority for the data those models produce.

This is a hybrid role — deeper SQL and DBT expertise than a traditional Data Health Professional, with a more analytical and model-focused scope than a Data Engineering role. You're less focused on pipeline infrastructure and more on the logic, shape, and trustworthiness of the data itself.

What success looks like

In 3 months

  • Independently triage and resolve pipeline data quality issues
  • Author at least one new DBT model or refactor an existing one to meet current modeling standards
  • Design a DBT test suite for a set of models lacking coverage
  • Understand the end-to-end pipeline from ingress through silver and gold, and be able to trace a data quality issue to its root layer

In 6 months

Data modeling & dbt development

  • Author, review, and maintain DBT models using Spark/Hudi from ingest through bronze and silver
  • Help clients understand their data model, assumptions, and limitations through intentional validation
  • Troubleshoot and fix model issues, then write DBT tests that catch recurrences before they reach downstream consumers
  • Optimize SQL performance for slow-running jobs
  • Partner with Data Engineering on Hudi table design, partition strategy, and incremental patterns

Data quality ownership

  • Triage and classify data quality alerts, distinguishing source-level issues from transform-layer failures
  • Design and maintain volume monitors and DQ monitors (null rate, distribution, future-date checks)
  • Author and apply clinical DQ rules (entity volume, field coverage, LOINC coverage, referential integrity) and claims validation rules across silver and gold layers
  • Conduct quality reviews for connector promotions — evaluating silver entity coverage, validation rule pass rates, and bronze-to-silver transformation correctness
  • Own the ticket queue for DQ, attribution, hierarchy, and customer-specific data quality issues, writing clear customer-facing findings

Cross-functional quality collaboration

  • Lead data quality reviews during connector installation and promotion (UAT → PRD), including claims validation playbooks and null analysis
  • Partner with Data Engineering on root-cause triage for errors, ingress anomalies, and silver table issues surfaced through data quality monitoring
  • Coordinate with the Measure Implementation Team (MIT) when data quality issues affect quality measure scores
  • Contribute to and enforce data modeling standards across teams

Technologies

  • Data modeling: DBT-Spark, SQL, Claude
  • Warehousing: Amazon Redshift, Apache Hudi, AWS Athena
  • Data quality: volume/DQ monitors, DBT tests
  • Orchestration: Argo Workflows, Airflow
  • Source control: Git / GitHub, PR-based review workflows
  • Observability: Grafana, Loki, Jira
  • Healthcare data: Claims (plan/professional/pharmacy), EHR (clinical entities), MPI

What you'll bring

Education:

  • Bachelor's or Master's degree in Computer Science, Statistics, Business, Economics, or a related field

Experience:

  • Advanced SQL: window functions, complex CTEs, aggregation patterns, performance tuning on columnar databases
  • DBT: hands-on experience authoring models, tests, macros, and YAML documentation; familiarity with incremental strategies
  • Healthcare data literacy: working knowledge of claims data (professional, institutional, pharmacy), clinical data (EHR entities), and common quality dimensions (member months, coverage rates, null patterns)
  • Data quality mindset: ability to differentiate source data issues from transform issues, design systematic validation checks, and communicate data quality findings clearly

Skills:

  • Clear communicator — able to translate technical findings for clients and non-technical stakeholders
  • Strong analytical judgment — you can look at a distribution and know when something is wrong
  • Ability to manage several projects simultaneously, leveraging AI tooling to stay organized and efficient
  • Genuine desire to learn and apply AI tools for operational efficiency

Would love for you to have

  • Experience with Spark SQL and Hudi table format
  • Familiarity with data quality monitoring tools
  • Comfort operating in an AI-first environment, using Claude to build and verify day-to-day workflows
  • Exposure to population health analytics concepts: HEDIS measures, risk adjustment, value-based care metrics
  • Python scripting for data investigation and automation
  • Experience with Argo Workflows or similar orchestration platforms
  • Healthcare data standards: ICD-10, CPT, NDC, LOINC, NPI