Hi! I am Arushi

I'm a Business Intelligence & Data Engineer at Oxbridge Health, where I own the complete data & reporting stack. I build production-grade data infrastructure, end-to-end reporting platforms, and AI-powered analytics capabilities that transform how teams access and use data.

My work spans the full data lifecycle: from architecting cloud databases and designing data warehouses, through building automated ETL pipelines with CI/CD governance, to delivering self-service BI platforms and conversational analytics that enable non-technical stakeholders to independently access complex healthcare metrics.

Previously, I developed ML models for healthcare equity at Althea.ai and conducted research on synthetic-to-real domain adaptation at the AIEM Lab at Johns Hopkins University, where I earned my Master's in Electrical and Computer Engineering.

Featured Projects

🤖

AI-Powered Conversational Analytics for Healthcare Reporting

Oxbridge Health | March 2025 - July 2025

Designed and implemented a conversational analytics platform integrating MCP servers with Looker, removing the SQL dependency for non-technical stakeholders across Finance, Operations, and Clinical teams. Configured Looker's Explore Assistant with a structured AI agent instruction library to standardize natural-language access, governance, and interpretation of core healthcare metrics.

Tech Stack: MCP Servers, Looker, LookML, AI Agents, Gemini

📊

End-to-End Healthcare Reporting Platform

Oxbridge Health | Aug 2024 - Present

Led end-to-end development of the reporting platform and delivered 55 analytics dashboards across Finance, Operations, Claims, and Provider Network, driving initiatives from requirements gathering through production deployment. Designed a star schema data warehouse with a robust staging architecture, built scalable ETL pipelines using AWS Glue and Step Functions, and developed the LookML semantic layer in Looker that powers the dashboards.

Tech Stack: PostgreSQL, AWS (Glue, Step Functions, RDS, S3, CloudWatch), Looker, LookML, Star Schema, GitHub Actions, CI/CD

🏥

Provider Utilization & Description Generation Pipeline

Oxbridge Health | Jan 2026 - Feb 2026

Automated provider description generation for a member-facing healthcare portal by building an end-to-end pipeline integrating CMS public data, commercial claims, and multi-network provider rosters to compute utilization statistics with peer percentile rankings at scale.

Tech Stack: Python, AWS Glue, PostgreSQL, PySpark, SQL, Jinja

💰

Claims Reconciliation & Payment Reporting Pipeline

Oxbridge Health | June 2025 - Present

Unified claims payment reporting across multiple third-party administrators by building an automated reconciliation pipeline integrating payment authorization, claims data, and TPA files into consolidated financial views supporting payment decisions and audit compliance.

Tech Stack: Python, SQL, AWS (Glue, Step Functions, S3), PostgreSQL

SDoH Risk Attribution & HEDIS Compliance Model

Althea.ai | Feb 2024 - Oct 2024

Built an SDoH risk attribution model that achieved 95% validation accuracy against historical claims data. Developed an XGBoost model combining socioeconomic and claims data features that predicted HEDIS compliance with 86% accuracy, directly informing the client's member outreach strategy.

Tech Stack: Python, XGBoost, Scikit-Learn, AWS SageMaker, SQL, Geocoding APIs

Cardiac Function Prediction using Deep Learning

Johns Hopkins University | Jan 2023 - Apr 2023

Achieved 93% AUC in classifying at-risk cardiac patients by building a video vision transformer that analyzes 10,031 echocardiogram videos to predict ejection fraction, using DeepLabv3 segmentation and the ViViT architecture.

Tech Stack: PyTorch, DeepLabv3, Video Vision Transformers, Hugging Face, Python

Technologies & Tools

Data Engineering & Modeling

SQL, PySpark, PostgreSQL, Amazon Redshift, Star Schema, ETL/ELT Pipelines

Cloud & Infrastructure

AWS (Glue, Lambda, Step Functions, RDS, S3, Redshift, SNS, EventBridge, SageMaker, Athena, CloudWatch, DynamoDB, CDK)

BI & Visualization

Looker, LookML, Tableau, AWS QuickSight, Plotly, Dash

Machine Learning & AI

PyTorch, Scikit-Learn, Hugging Face, TensorFlow, MCP Servers

Tools & Frameworks

GitHub Actions, Apache Airflow, CI/CD, Docker, Git, Jira, Xray, Confluence, Claude Code

About

What I Do

I specialize in building end-to-end data platforms that take organizations from raw data to self-service analytics. My work combines data engineering, cloud architecture, and AI to make complex healthcare data accessible to everyone -- from Finance analysts to Clinical teams.

Core Capabilities

  • End-to-End Reporting: Requirements extraction through productionization -- data warehouses, ETL pipelines, semantic layers, dashboards
  • Cloud Data Infrastructure: Database architecture, multi-environment deployment, CI/CD automation, observability
  • AI-Powered Analytics: Conversational analytics, MCP server integration, natural language querying, AI agents
  • Healthcare Domain: Claims processing, provider networks, episode-based care, HEDIS, financial reporting
  • Data Governance: HIPAA-compliant data handling, automated quality gates, migration protocols, audit trails

Get In Touch

I'm open to opportunities in Data Engineering, Analytics Engineering, BI Engineering, and AI/ML Engineering roles. Whether you're looking to build scalable data infrastructure, simplify reporting, or bring AI into your analytics stack, I'd love to connect.

I'm always open to discussing new opportunities, collaborations, or answering questions about my work. Feel free to reach out!