Current
1,000+ records
Schema Design
ETL & Reporting
- Designed and maintained backend relational database schemas for tracking 1,000+ student performance records, writing migration scripts and enforcing data integrity constraints across semester transitions.
- Built source-to-target mapping documents and ETL pipelines to consolidate multi-source operational data for a non-profit client (ESCH), delivering stakeholder dashboards that visualize KPIs (completion rates, grade distributions).
ETL (Smartsheet)
SQL Modeling
Budget Reporting
- Built an ETL pipeline extracting financial data from Smartsheet, transforming and loading it into a structured schema, then surfacing budget allocations, vendor payments, and delivery bottlenecks in Power BI dashboards.
- Wrote SQL queries (CTEs, multi-table JOINs) to consolidate fragmented invoice data across departmental systems into a unified reporting layer, eliminating manual reconciliation and reducing budget discrepancies.
600+ users
Python & SQL
90%+ uptime
- Built a Python (Pandas) + SQL log ingestion pipeline processing daily system usage logs for 600+ concurrent users, parsing raw log files, flagging hardware failure patterns, and loading structured records into a PostgreSQL reporting table.
- Delivered weekly pipeline health reports tracking system responsiveness and exam load speeds to department leadership, enabling proactive hardware optimization decisions that maintained 90%+ operational uptime.
500K+ rows
ETL pipelines
Data Quality
- Designed and maintained batch ETL pipelines ingesting 500K+ delivery records, applying incremental loading, deduplication, and data quality checks before surfacing results to downstream reporting tables.
- Built automated data quality validation scripts in Python (Pandas) with rule-based checks (null rates, referential integrity, range bounds) that stopped recurring data errors from reaching downstream reports.
- Wrote optimized SQL (CTEs, window functions, JOINs) to model and query the curated data layer, identifying workflow bottlenecks and SLA breaches that were then flagged in interactive Power BI dashboards used in stakeholder syncs.