Data Platform / Lakehouse Engineer Profiles

Platform Engineering & SRE Focused — USA Only · <10 Years Experience

🇺🇸 USA Only · ✓ Confirmed <10 years · Platform / SRE · Iceberg / Delta Lake · Spark / AWS / GCP / Azure · EKS / Kubernetes

Summary — Verified USA Profiles

William Latshaw — Procon Analytics
📍 Riverview, Florida · ~2.5 years
Platform Build · Iceberg Lakehouse · AWS CDK · MWAA Airflow
Pranav Murali — Clover Health
📍 Dallas, Texas · 6+ years
Multi-Cloud Platform · CI/CD · MLOps · 40% latency reduction
Billy Eson — Charles River Development
📍 Mechanicsburg, Pennsylvania · 7 years
Azure Data Factory · Databricks · Terraform IaC · 45% faster integration
Sanjay Narayanan — Magpie Literacy
📍 New York, New York · 7 years
SRE Background · IaC · 2x team velocity · Argo CD workflows
Anup Kumar Joshi — Vanguard
📍 West Chester, Pennsylvania · ~6 years
Lakehouse/Ingestion · Iceberg · AWS Glue · FinTech compliance
Surabhi Chanchal — Crysalis Biosciences
📍 San Francisco Bay Area · ~8 years
Iceberg · Databricks · DBT · CI/CD · ML pipelines
Ranjit Lanzapalli — Capital One (via Compunnel)
📍 Glen Allen, Virginia · 20+ years — OUTSIDE CRITERIA
Staff Platform/SRE · Databricks · Delta Lake · Large-scale platforms
Note: Excluded — 20+ years total experience

Detailed Profiles — Verified & Under 10 Years

Procon Analytics · ~2.5 years · Platform Engineer · 🇺🇸 Florida
William Latshaw
AWS Cloud/Platform Engineer
📍 Riverview, Florida, United States
Architected and implemented a fully private, modular AWS data platform using TypeScript CDK for IaC. Built a real-time serverless telemetry ingestion pipeline processing 100 million events daily. Led modernization of a legacy data warehouse into a serverless Iceberg lakehouse, enabling near-real-time analytics on 100M+ daily records. Automated provisioning and deployment via GitHub Actions OIDC with containerized Airflow images. Implemented dynamic retention and tiering with Glacier, cutting S3 costs below $1K/month while keeping data accessible for analytics.
AWS CDK (TypeScript) · MWAA Airflow 3.0.6 · Iceberg · Kinesis Firehose · Lambda · Glue · Aurora Serverless v2 · GitHub Actions OIDC · Terraform · VPC Mesh · Docker/ECR
Clover Health · 6+ years · Senior Data Platform Engineer · 🇺🇸 Texas
Pranav Murali
Senior Data Platform Engineer
📍 Dallas, Texas, United States (Remote)
Architects and scales enterprise-grade, cloud-native data platforms on AWS, GCP, and Azure, using Snowflake, BigQuery, dbt, Airflow, and Python for real-time streaming and batch ETL/ELT pipelines. Reduced data latency by 40% and improved system availability by 35%. Reduced MTTR by 30% through proactive alerting and monitoring. Automated CI/CD using Azure DevOps, AWS CodePipeline, GitHub, and CloudWatch. Also built MLOps pipelines with Kubernetes and Kubeflow.
Databricks · Snowflake · BigQuery · PySpark · dbt · Delta Lake · Airflow · Kubernetes · Docker · Kubeflow · Azure Data Factory · Kinesis · Lambda · CI/CD
Strong platform/infrastructure focus — CI/CD automation, observability, MLOps. Remote at Clover Health.
Charles River Development · 7 years · Platform Engineer · 🇺🇸 Pennsylvania
Billy Eson
Cloud & Data Platform Engineer
📍 Mechanicsburg, Pennsylvania, United States
Built Azure Data Factory + Synapse pipelines that cut data-integration time by 45%. Re-engineered high-volume Databricks ETL frameworks processing billions of rows daily, reducing runtime by 50% and increasing throughput 100×. Migrated PostgreSQL/Redshift to Synapse + Data Lake, achieving a 35% cost reduction and 200% storage scalability. Implemented Microsoft Fabric-powered monitoring. Automated infrastructure with Terraform, ARM, and Bicep, reducing management cost by 90%. VP at Charles River (Aug 2022–Present).
Azure Data Factory · Azure Synapse · Databricks · Microsoft Fabric · Terraform · ARM & Bicep · DBT · Delta Lake · Snowflake · Kafka · AWS · DevOps · CI/CD
Strong infrastructure-as-code and CI/CD skills: Terraform, ARM, Bicep, Azure DevOps. FinTech domain.
Magpie Literacy · 7 years · Platform Engineer · 🇺🇸 New York
Sanjay Narayanan
Analytics & Data Platform Engineer
📍 New York, New York, United States (Remote)
Built loosely coupled data lake and microservice infrastructure using Glue, Spark, Lambda, Batch, Step Functions, Athena, and DBT, enabling inter-team collaboration. Reworked technical debt into IaC with CI/CD, doubling team velocity. Migrated multiple terabytes of data, improving development cycle speed by ~50%. Previously on the SRE team at Lumin Digital (maintained high security standards; improved workflows with Argo CD and Argo Workflows) before moving into the founding data engineer role.
AWS CDK · CloudFormation · Terraform · Glue · Spark · Lambda · Step Functions · Athena · DBT · Kafka · Kinesis · Docker · Kubernetes · Argo CD · GitHub Actions
Strong SRE/platform background — IaC, CI/CD, Docker, Argo workflows. SRE Team at Lumin Digital before current role. CMU grad.
Vanguard · ~6 years · Senior Data Engineer · 🇺🇸 Pennsylvania
Anup Kumar Joshi
Senior Data Engineer | Lakehouse & Data Platform
📍 West Chester, Pennsylvania (Malvern, PA)
Owns end-to-end lakehouse ingestion for risk, security, and compliance analytics at Vanguard across 20+ enterprise sources, 50+ automated control checks, and 90+ files under daily SLA enforcement. Optimized Apache Iceberg table properties for high-concurrency ingestion. Led migration of Dremio products to Iceberg lakehouse tables. Built standardized DQ framework using AWS Glue and Step Functions. Led migration of 100+ repositories from Bitbucket to GitHub with CI/CD pipelines via CloudFormation.
Apache Iceberg · AWS Glue · Step Functions · EMR · Apache Hudi · Athena · Redshift · Kinesis · Lambda · ECS Fargate · Terraform · Splunk · PySpark · GitHub · CI/CD
Strong platform/IaC focus: CloudFormation and Terraform IaC, GitHub CI/CD, Splunk monitoring, compliance/audit-ready pipelines. FinTech at Vanguard.
Crysalis Biosciences · ~8 years · Senior Data Platform Engineer · 🇺🇸 California
Surabhi Chanchal
Software Engineer | Data Platform Engineer | AI Engineer
📍 San Francisco Bay Area, California
Developed end-to-end ML pipelines on Databricks with PySpark and Delta Lake. Built a Universal Data Layer (UDL) on AWS S3, Apache Iceberg, and AWS Glue Catalog for zero-copy data sharing. Added features to a Self-Serve Data Observability Service that automate Data Quality, Data Recon, Data Drift, and Schema Drift configuration. Set up CI/CD pipelines for DBT models. Prior experience at Cognizant (2017–2021) building Spark streaming pipelines with Kafka and Snowflake.
Apache Iceberg · Databricks · PySpark · Delta Lake · DBT · Airflow · Kafka · Snowflake · AWS Glue · Kubernetes · Python · Java · CI/CD
MS in Computer Science from Georgia State (2024). Strong data platform + ML focus with Iceberg and observability.
Amazon · 10 years · Lead Data Engineer · 🇺🇸 Texas
Aleem M
Lead Data Engineer / Data Platform Engineer
📍 Dallas, Texas, United States
Lead Data Engineer delivering production-grade data platforms on AWS using Databricks/Spark/Delta, Snowflake, dbt, Airflow, and Kafka. Owns platform standards for ingestion, transformation, orchestration, testing, monitoring, and release readiness. Builds and maintains Airflow DAGs on MWAA with reliable retry strategies, idempotent patterns, and safe backfills. Develops Databricks Spark pipelines for batch and near-real-time processing and optimizes Delta tables. Currently at Amazon (Jul 2024–Present). Previously at AT&T and Microsoft.
Databricks · Spark · Delta Lake · Snowflake · dbt · Airflow · MWAA · Kafka · Terraform · Python · CloudWatch
10 years total, at the boundary of the <10-year criterion. Strong platform focus with AWS, Databricks, and operational monitoring. Currently at Amazon. US-based.

Excluded — Outside Experience Criteria

Capital One · 20+ years · Staff Platform Engineer / SRE · 🇺🇸 Virginia
Ranjit Lanzapalli
Staff Platform Engineer | SRE | Ex-Capital One
📍 Glen Allen, Virginia, United States
Staff-level Platform Engineer and SRE with 20+ years of experience designing, scaling, and operating mission-critical cloud and data platforms. Specializes in AWS, Site Reliability Engineering (SLOs, SLIs, incident management), Databricks, Delta Lake, and OneLake. Leads platform engineering and SRE efforts at Capital One. Previously at Wipro for 15 years.
AWS · Databricks · Delta Lake · OneLake · SRE · Kubernetes · Terraform · CI/CD · Python
❌ EXCLUDED: 20+ years total experience (2005–present). Profile is strong but exceeds the <10 year threshold.