Director, Data & Analytics Engineering

Charlie Health

Salary

$250,000 - $325,000

Location

New York, NY

Posted

Today

As the Director, Data & Analytics Engineering, you will lead a team of Data and Analytics Engineers building the data foundation of the Charlie Health Operating Model, an AI-first model where the warehouse is no longer a passive reporting layer, but the substrate of our World Model: a continuously updated, causal representation of every patient, provider, payor, and operational signal in our ecosystem.

You will own the pipelines, schemas, and storage systems that fuel both human and machine consumers, powering our 1,600+ dbt models, our real-time Intelligence Layer, and the agentic systems composing interventions on top of it. You will push us beyond Snowflake batch cycles into event-driven signal graphs that detect drift the moment a cohort's trajectory diverges from baseline, and you will architect the semantic retrieval surfaces that make our long-form clinical and operational artifacts AI-ready.

This is a hands-on, Player-Coach leadership role. You will set the technical vision, run sprint ceremonies and roadmap planning, mentor ICs and a Data Engineering Manager, support DRIs running cross-cutting 90-day missions, and stay close enough to the code to make sharp architectural calls and unblock the team. You will partner with leaders across ML, Data Science, BizOps, Product, and Engineering and own the shape of the data model architecture that the next decade of Charlie Health is built on.

We're a team of passionate, forward-thinking professionals eager to take on the behavioral health crisis and play a formative role in providing life-saving solutions. If you're inspired by our mission and energized by the idea that the distance between identifying an operational friction and shipping a clinical intervention should be measured in days rather than months, apply today.

Responsibilities

  • Own the warehouse and pipeline architecture that powers the Charlie Health World Model, the causal, continuously updated representation of our ecosystem that fuels reporting, product analytics, and the agentic Intelligence Layer
  • Drive the long-term roadmap for AI-ready data: real-time signal graphs beyond batch cycles, schema-validated event flows, and semantic retrieval surfaces for transcripts, notes, and other long-form artifacts
  • Define and execute a vision for scalable data model architecture in Snowflake, evolving 1,600+ existing dbt models toward extensible, well-governed, agent-consumable patterns
  • Partner with Machine Learning, Data Science, Engineering, Product, and BizOps to align data initiatives with company priorities and unlock new opportunities, particularly the agentic capabilities riding on top of the World Model
  • Lead execution end-to-end: sprint ceremonies, roadmap planning, and prioritization, with the discipline to keep a growing team shipping high-quality, reliable solutions on schedule
  • Oversee data integrity, documentation, testing, monitoring, and provenance across systems so that stakeholders, human and agent, can trust and self-serve on our data
  • Guide and contribute to architecture and design decisions across our stack (Dagster, Snowflake, dbt, Fivetran, Hightouch, Hex, Tableau), and resolve critical technical issues as a hands-on technical leader
  • Own the data infrastructure that powers company-wide KPI reporting, ad hoc analysis, product analytics, and dashboard development, ensuring the underlying models, pipelines, and semantic layer enable downstream teams to deliver clear, accurate insights as business questions evolve
  • Drive reliability, scalability, observability, security, and cost-efficiency improvements across the data stack, including the PHI gatekeeping and HITRUST-aligned patterns that govern signal ingestion
  • Identify bottlenecks and implement improvements to team workflows, tools, and development practices
  • Manage and mentor a growing team of Data and Analytics Engineers and one Data Engineering Manager. Operate as a Player-Coach: focus on craft, mentorship, high-leverage code review, and supporting DRIs running 90-day cross-cutting missions
  • Establish metrics that track progress, communicate priorities, and demonstrate business impact
  • Define and maintain data governance standards and proactively manage stakeholder expectations to drive scalable, trusted data use

Requirements

  • 10+ years of data or analytics engineering experience, including 4+ years managing and mentoring data and analytics engineers with a focus on execution and delivery
  • Proven ability to drive team operations, including sprint ceremonies, roadmap planning, and prioritization across multiple workstreams
  • Hands-on background building and maintaining ELT pipelines using SQL, dbt, and OLAP databases like Snowflake
  • Proficiency in Python (preferred) or another language, strong enough to guide and review technical work
  • Experience with workflow orchestration tools like Dagster, Airflow, or Prefect
  • Skilled at building scalable reporting solutions in Tableau or Hex and enabling self-serve analytics across the organization
  • Strong data modeling, provenance, and governance skills to support extensible, trusted, and consistent reporting and patterns
  • Track record of improving team processes, optimizing workflows, and delivering measurable impact
  • Effective cross-functional partner with Machine Learning, Data Science, Product, and Engineering, aligning data work with business goals
  • Clear, concise communicator who can translate complex technical concepts for non-technical stakeholders
  • Comfortable navigating ambiguity, breaking down complex problems, and driving iterative solutions
  • Committed people leader with a history of coaching talent and fostering a high-performance, inclusive team culture

This role requires 4 days per week in our NYC office (Flatiron District).

Nice to haves

  • Familiarity with healthcare data standards (HIPAA, FHIR, HITRUST)
  • Experience with AWS cloud technologies
  • Experience working in a startup environment
  • Exposure to event-driven architectures (EventBridge, Kafka, CloudEvents) and JSON Schema registries
  • Familiarity with vector stores (pgvector, Pinecone) and patterns for semantic retrieval over operational data
  • Experience supporting ML or agentic AI workloads as data consumers: designing schemas, features, or context surfaces that downstream models and agents rely on
  • Awareness of LLMOps tooling and what it takes to keep AI systems observable and PHI-safe