About me

I specialize in building and optimizing ‘big data’ pipelines, architectures, and data sets on cloud platforms such as AWS and Azure. My skills blend technical expertise with strategic thinking, allowing me not only to build systems but also to understand the business problems they are designed to solve.

My core competencies include:

  • Cloud Platforms: AWS (Glue, EMR, Redshift, S3, IAM) and Azure (Data Factory, Databricks).
  • Big Data & Streaming: Apache Spark, Apache Kafka, and Databricks.
  • Programming & Databases: Python, SQL, and advanced experience with relational and non-relational databases.
  • DevSecOps & Automation: Designing and deploying automated CI/CD pipelines and performing root cause analysis to ensure data integrity and compliance.


What I Do Best

  • Engineering Scalable Solutions: I design and implement data architectures that can handle terabytes of data daily while ensuring high performance and reliability.
  • Driving Business Value: My projects have consistently led to tangible outcomes, such as improving portfolio yield and providing critical insights to business stakeholders.
  • Problem Solving: I'm adept at performing root cause analysis to fix pipeline failures and ensure data integrity, which is the foundation of trustworthy analytics.

Track record

Education

  1. Arizona State University

    2024 — 2025

    Master's in Business Analytics

    - Relevant Coursework: Machine Learning in Business, Cloud Analytics, Supply Chain Analytics, Analytical Decision Modeling, AI and Data Analytics.

  2. D.Y.Patil University

    2017 — 2021

    Bachelor of Technology in Electronics

Experience

  1. Project Lead

    January 2025 — May 2025

    DigiHeat ASU

    • Designed and built a robust Python data pipeline to automate backtesting on over a decade of stock data, increasing backtest ROI by 14% and enabling rapid strategy validation.

    • Engineered a dynamic rotation algorithm that strategically selected stocks based on market conditions, directly improving the portfolio’s annual yield by 11.3%.

    • Implemented a price normalization method to standardize data across multiple tickers, creating a reliable dataset that boosted predictive model accuracy by 22% (see the sketch below).

    • Developed and deployed an automated ex-dividend calendar that streamlined a critical workflow for 3+ stakeholders and improved trade timing by 32% by eliminating manual tracking.

    • Created and delivered predictive dashboards providing real-time market insights, which improved the quality of over 200 simulated trade decisions.
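
    The price-normalization idea from this project can be illustrated with a minimal pandas sketch. This is not the project's actual code: the tickers, prices, and function name are invented for illustration. It simply rebases each ticker's close to 1.0 on its first trading day so series at very different price levels become comparable.

```python
import pandas as pd

def normalize_prices(prices: pd.DataFrame) -> pd.DataFrame:
    """Rebase each ticker's close to 1.0 on its first trading day,
    making tickers at very different price levels directly comparable."""
    # One column per ticker, indexed by date; NaNs allowed before listing.
    first_close = prices.apply(lambda col: col.loc[col.first_valid_index()])
    return prices.div(first_close, axis="columns")

# Invented example data: two tickers at different price levels.
closes = pd.DataFrame(
    {"AAA": [100.0, 102.0, 101.5], "BBB": [10.0, 10.4, 10.1]},
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
)
print(normalize_prices(closes))  # both series start at 1.0
```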

  2. Data Engineer

    February 2024 — August 2024

    Corsearch

    • Engineered and optimized ETL workflows using AWS Glue and Redshift to process over 1 TB of data daily, implementing IAM-compliant access controls to ensure data security and governance.

    • Implemented a real-time streaming solution with Apache Kafka, accelerating data delivery by 50% for multiple teams and enabling faster, data-driven decision-making.

    • Optimized performance of Spark SQL scripts via Airflow and AWS EMR, boosting pipeline throughput by 3x and significantly reducing processing time for critical datasets.

    • Secured and developed GDPR-compliant distributed pipelines, automating data lineage and access logs to achieve a 35% reduction in manual audit preparation time.

    • Led training for a team of 5 analysts on data engineering tools and best practices, boosting team efficiency and enabling them to handle scalable data processes independently.

    • Integrated automated data validation and quality checks into ETL pipelines, reducing downstream errors by 40% and boosting the reliability of brand protection analytics (see the sketch below).
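
    To give a flavor of the validation referenced in the last bullet, here is a minimal sketch in Python/pandas. The column names, thresholds, and rules are hypothetical stand-ins, not Corsearch's production checks.

```python
import pandas as pd

def validate_batch(df: pd.DataFrame) -> list[str]:
    """Return data-quality failures for one ETL batch.
    All column names and thresholds below are illustrative."""
    failures = []
    if df["record_id"].duplicated().any():
        failures.append("duplicate record_id values")
    if df["brand_name"].isna().mean() > 0.01:  # more than 1% nulls
        failures.append("brand_name null rate above 1%")
    # Assumes detected_at is stored as a timezone-aware UTC timestamp.
    if (df["detected_at"] > pd.Timestamp.now(tz="UTC")).any():
        failures.append("detected_at timestamps in the future")
    return failures

# In a pipeline, a non-empty result would fail the batch before load:
#   failures = validate_batch(batch_df)
#   if failures:
#       raise ValueError(f"quality checks failed: {failures}")
```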

  3. Data Engineer

    March 2023 — August 2023

    Hansa Cequity

    • Designed and deployed a secured Azure data platform (Synapse, Delta Lake, Kubernetes) for financial data, cutting data ingestion time by 40% while ensuring GDPR compliance and enabling faster insights.

    • Automated end-to-end ETL/ELT pipelines using GitHub Actions for CI/CD, which reduced deployment time by 40% for loading over 1 million financial records into Snowflake.

    • Deployed automated SQL checks within Snowflake and Synapse pipelines (see the sketch below), proactively identifying data inconsistencies and improving financial reporting accuracy by 35%.

    • Empowered finance teams with self-service ad-hoc reporting in Snowflake by developing optimized SQL views, delivering faster, tailored insights and reducing reliance on the engineering team.

    • Monitored and debugged data pipelines using Azure Monitor, Snowflake, and GitHub Actions, proactively cutting incident rates and ensuring 99% on-time data delivery.
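
    The automated SQL checks mentioned above might look something like the reconciliation sketch below. It accepts a generic DB-API cursor (such as one from snowflake-connector-python); the schema, table, and column names are invented for illustration.

```python
# Hypothetical reconciliation check: row counts loaded into the curated
# table must match the staging table for the same load date.
RECONCILIATION_SQL = """
SELECT
    (SELECT COUNT(*) FROM staging.transactions WHERE load_date = %(d)s) AS staged,
    (SELECT COUNT(*) FROM curated.transactions WHERE load_date = %(d)s) AS loaded
"""

def run_reconciliation(cursor, load_date: str) -> None:
    """Run the check with any DB-API cursor and fail loudly on a mismatch."""
    cursor.execute(RECONCILIATION_SQL, {"d": load_date})
    staged, loaded = cursor.fetchone()
    if staged != loaded:
        raise RuntimeError(
            f"row count mismatch on {load_date}: staged={staged}, loaded={loaded}"
        )
```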

  4. Data Engineer

    May 2021 — March 2023

    Vagmine Marine

    • Designed and implemented an OLAP data warehouse using a Star Schema (illustrated below), which improved marine analytics speed by 40% and provided a unified source of truth for key operational metrics.

    • Optimized complex MDX queries and calculated members in OLAP cubes, improving fleet and cost analysis efficiency by 35% for business stakeholders.

    • Collaborated with marine and logistics teams to translate complex operational needs into scalable data models, reducing data preparation time by 40% for critical business analyses.

    • Led the documentation and creation of a unified data dictionary and standards, significantly enhancing data literacy and reducing discrepancies in key business metrics.

    • Conducted weekly cross-functional reviews to share OLAP-driven insights, which cut reporting errors by 40% and accelerated data-driven decision-making across the organization.
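
    To give a sense of the Star Schema design mentioned above: analytics queries join one central fact table to small dimension tables, which is what makes rollups fast. The sketch below is illustrative only; the fact and dimension names and columns are invented, not the actual Vagmine Marine model.

```python
# Illustrative star-schema rollup: a voyage fact table joined to date and
# vessel dimensions. All table and column names are invented examples.
VOYAGE_COST_SQL = """
SELECT
    d.calendar_month,
    v.vessel_class,
    SUM(f.fuel_cost_usd) AS total_fuel_cost,
    AVG(f.transit_days)  AS avg_transit_days
FROM fact_voyage AS f
JOIN dim_date    AS d ON f.date_key   = d.date_key
JOIN dim_vessel  AS v ON f.vessel_key = v.vessel_key
GROUP BY d.calendar_month, v.vessel_class
ORDER BY d.calendar_month
"""
```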

My skills

Cloud Platforms: AWS (S3, Glue, Redshift, EMR, EC2, IAM, CloudWatch), Azure (Data Factory, Synapse, Data Lake Storage Gen2, Azure Monitor, Kubernetes)

Big Data: Spark (PySpark, Spark SQL), Databricks, Apache Kafka

Data Warehousing: Snowflake, Amazon Redshift, Synapse Analytics, OLAP Cubes, Star Schema, Delta Lake

Orchestration & CI/CD: Apache Airflow, Azure Data Factory, GitHub Actions

Databases: SQL (Snowflake, Synapse, Redshift)

Languages & Tools: Python, SQL, MDX, dbt, Git, Docker, Jupyter, Excel, Power BI, Tableau

Contact

Contact Form