I'm Swapnil Bhakare

Data Engineer with 3.5+ years of experience across data engineering and technical support. Skilled in building scalable ETL pipelines, automating data workflows, and delivering analytics-ready data using AWS (Glue, Redshift, Lambda, EMR), Airflow, PySpark, and SQL. Passionate about data reliability, automation, and visualization.

A dedicated Data Engineer passionate about designing end-to-end data solutions and workflow automation platforms.

I specialize in developing reliable, scalable, and automated ETL pipelines using PySpark, Python, and AWS services. I enjoy turning raw data into actionable insights, ensuring data quality, and collaborating with cross-functional teams to enable data-driven decisions.

Education & Skills

Education

June 2017 – April 2020

Dr. Babasaheb Ambedkar Marathwada University

Bachelor of Computer Applications (B.C.A.)

Experience

Jan 2024 – Present

Data Engineer – BI Hub Solution

• Designed & implemented ETL pipelines using PySpark & AWS Glue for the GenFlow automation platform.
• Migrated & transformed data into Redshift for analytics.
• Automated workflows using Airflow, Lambda, and Python scripts.
• Improved pipeline efficiency by 20% by optimizing the S3 → Glue → Redshift flow.
• Built validation scripts and monitoring systems to ensure data accuracy.

Jan 2022 – Dec 2023

Programmer – Tata Consultancy Services

• Delivered L1/L2 technical support in compliance with SLAs.
• Diagnosed and resolved issues across Windows/Linux servers & Azure AD.
• Managed user provisioning, access control, and system health checks.
• Documented troubleshooting processes to enhance operational efficiency.

Sep 2021 – Nov 2021

Data Annotation Specialist – Digiyoda

• Labeled & validated large datasets (98% accuracy) for ML/NLP models.

Skills

• Python: 90%
• PySpark: 85%
• SQL / Data Modeling: 88%
• AWS (S3, Glue, Redshift, Lambda, EMR): 80%
• Airflow: 75%
• ETL / Data Pipelines: 90%
• Data Warehousing: 80%
• Power BI / Tableau: 65%

Projects

GenFlow – Workflow Automation & ETL Orchestration

BI Hub Solution, 2024

Developed a workflow automation platform that lets users design and orchestrate ETL pipelines through a drag-and-drop interface. Created PySpark Glue jobs for transformations and Redshift loading, integrated them with Airflow for orchestration, and implemented monitoring, scheduling, and data validation for reliability.
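
To give a feel for the data validation step mentioned above, here is a minimal sketch in plain Python. It is an illustration only and not GenFlow's actual code: the helper name `validate_rows`, the column names, and the sample rows are all hypothetical, and it assumes rows arrive as Python dictionaries rather than Spark DataFrames.

```python
from typing import Any, Iterable


def validate_rows(
    rows: Iterable[dict[str, Any]], required: list[str], key: str
) -> dict[str, int]:
    """Count rows that fail basic checks before loading (hypothetical helper)."""
    seen_keys: set[Any] = set()
    report = {"total": 0, "missing_required": 0, "duplicate_key": 0}
    for row in rows:
        report["total"] += 1
        # A required column counts as missing when absent, None, or an empty string.
        if any(row.get(col) in (None, "") for col in required):
            report["missing_required"] += 1
        # Track the key column so repeated records can be flagged.
        k = row.get(key)
        if k in seen_keys:
            report["duplicate_key"] += 1
        else:
            seen_keys.add(k)
    return report


# Hypothetical sample data: one null amount and one repeated order_id.
sample = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},
    {"order_id": 1, "amount": 5.0},
]
report = validate_rows(sample, required=["order_id", "amount"], key="order_id")
print(report)
```

In a real pipeline, a report like this could be used to stop a load or raise an alert when the failure counts exceed an agreed threshold.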

Awards & Recognition

Languages

English (Professional) | Hindi (Native) | Marathi (Native)

Contact Me

Let’s collaborate on data engineering, workflow automation, or analytics projects.