Full-stack web apps, RAG systems, ML classifiers, ETL pipelines, data science notebooks, and business analytics dashboards. Use the filters below to narrow by discipline — the work spans software engineering, AI/ML, data engineering, data science, analytics, and business analysis.
Engineered the backend architecture for a content-management blogging platform as PHP Programmer at Writin. Built RESTful APIs for user authentication, post CRUD, and commenting systems. Designed the MySQL schema, integrated with the frontend team, and wrote responsive HTML/CSS/JS. Optimized database queries for real-world traffic patterns.
As PHP Web Developer at Notebooknb, built and maintained dynamic web applications contributing to the company's core product. Designed normalized MySQL schemas optimized for storage and retrieval, and implemented responsive, cross-browser UIs using HTML5, CSS3, and JavaScript.
Built REST APIs at Rajlaxmi Solutions that powered the company's BI tooling — serving analytics data to internal dashboards and 40+ client-facing reports. Designed for throughput and pagination, handled auth, rate limiting, and consistent error responses. Contributed to Rajlaxmi winning an Industry Excellence Award for client analytics and insights.
This site — hand-coded from a blank canvas with no frontend framework. Custom design system in vanilla CSS (Fraunces + JetBrains Mono + Instrument Serif), responsive across mobile/tablet/desktop, scroll-reveal animations via IntersectionObserver, interactive SVG illustrations for project cards, client-side form handling. Demonstrates attention to craft: typography, motion, spacing.
Graduate-level algorithm design implementations from CS 600 at Stevens. Heap traversal orders, Huffman coding, Prim-Jarník MST optimization to O(n²), and dynamic programming variants. Implementations in Python with time-complexity analysis, Big-O proofs, and benchmarks comparing naive vs optimized versions.
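The O(n²) Prim-Jarník variant from that coursework can be sketched in a few lines: for dense graphs, a linear scan over an adjacency matrix replaces the heap, so each of the n rounds costs O(n). This is a minimal sketch, not the coursework code itself; the function name and matrix convention (`math.inf` for missing edges) are my own.

```python
import math

def prim_mst_cost(adj):
    """O(n^2) Prim-Jarník on an adjacency matrix.

    adj[i][j] is the weight of edge (i, j), math.inf if absent.
    Returns the total weight of a minimum spanning tree.
    """
    n = len(adj)
    in_tree = [False] * n
    dist = [math.inf] * n  # cheapest edge connecting each vertex to the tree
    dist[0] = 0
    total = 0
    for _ in range(n):
        # O(n) scan replaces a heap's extract-min: a win on dense graphs
        u = min((v for v in range(n) if not in_tree[v]), key=dist.__getitem__)
        in_tree[u] = True
        total += dist[u]
        for v in range(n):
            if not in_tree[v] and adj[u][v] < dist[v]:
                dist[v] = adj[u][v]
    return total
```

For sparse graphs the heap-based O(m log n) version wins; the benchmarks mentioned above compare exactly that trade-off.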
Architected a zero-cost RAG pipeline using LangChain with local HuggingFace embeddings (384-dim) — eliminating cloud API spend while preserving semantic-similarity fidelity on large corpora. Persistent vector storage via ChromaDB + SQLite delivers sub-second retrieval across 400+ page documents. Tuned chunking with RecursiveCharacterTextSplitter.
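The chunking strategy behind RecursiveCharacterTextSplitter can be illustrated with a toy version: try the coarsest separator first (paragraphs), and only fall back to finer ones (lines, words, characters) for pieces still over budget. This sketch omits the chunk overlap and small-piece merging the real LangChain splitter performs; it only shows the recursive fallback idea.

```python
def recursive_split(text, chunk_size=200, separators=("\n\n", "\n", " ", "")):
    """Toy recursive character splitting: coarse separators first,
    finer ones only for pieces that still exceed chunk_size."""
    if len(text) <= chunk_size:
        return [text] if text else []
    sep, *rest = separators
    if sep == "":
        # last resort: hard character cut
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks = []
    for piece in text.split(sep):
        if len(piece) <= chunk_size:
            chunks.append(piece)
        else:
            chunks.extend(recursive_split(piece, chunk_size, tuple(rest)))
    return [c for c in chunks if c]
```

Tuning `chunk_size` against the embedding model's context window is what keeps the 384-dim local embeddings faithful on long documents.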
Convolutional neural network for multi-class image classification built in PyTorch. Data augmentation pipeline (random crops, flips, normalization), training loop with learning-rate scheduling, checkpointing, and TensorBoard logging. Benchmarked against scikit-learn classical baselines — a useful lesson in when deep learning is actually worth the complexity.
NLP pipeline for binary and multi-class sentiment on text data. TF-IDF + Logistic Regression baseline, then fine-tuning a pretrained transformer (DistilBERT) for comparison. Proper train/val/test split, cross-validation, confusion matrices, and calibration analysis. Shows understanding of the whole stack — not just the fashionable bit.
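The TF-IDF baseline boils down to one formula: term frequency scaled by inverse document frequency. A toy version without scikit-learn's smoothing and L2 normalisation (which TfidfVectorizer adds) looks like this; it is an illustration of the weighting, not the project's pipeline.

```python
import math
from collections import Counter

def tfidf(docs):
    """Toy TF-IDF: tf(t, d) * log(N / df(t)) per document.
    Real pipelines (e.g. TfidfVectorizer) add smoothing and normalisation."""
    tokenised = [doc.lower().split() for doc in docs]
    n = len(tokenised)
    df = Counter()  # in how many documents each term appears
    for toks in tokenised:
        df.update(set(toks))
    weights = []
    for toks in tokenised:
        tf = Counter(toks)
        weights.append({t: (c / len(toks)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights
```

Note how a term appearing in every document (IDF = log 1 = 0) contributes nothing — exactly why TF-IDF features discriminate better than raw counts before the logistic regression layer.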
Movie/product recommender system using matrix factorization (SVD, ALS) and item-based collaborative filtering. Evaluated with RMSE, precision@k, and recall@k on held-out test set. Also implemented a content-based hybrid layer using TF-IDF over product descriptions for cold-start handling.
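The ranking metrics above are simple to state precisely: precision@k asks what fraction of the top-k recommendations were relevant, recall@k asks what fraction of the held-out relevant items made the top k. A minimal sketch (names are mine, not the project's):

```python
def precision_recall_at_k(recommended, relevant, k):
    """recommended: ranked list of item ids (best first);
    relevant: set of held-out positives for this user."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Averaging these per-user scores across the held-out test set gives the headline numbers; RMSE meanwhile scores the raw rating predictions from SVD/ALS.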
Built a Medallion (Bronze/Silver/Gold) pipeline using Kafka & AWS Kinesis for real-time ingestion and Spark for distributed processing. Cut compute costs 40% via incremental materialization. A metadata-driven Gold layer in dbt Core uses Jinja to consolidate per-table SQL into a single reusable macro. SCD Type 2 plus dbt tests and freshness checks ensure point-in-time accuracy.
The data platform I built and operated at Rajlaxmi Solutions: ELT/ETL pipelines processing 500k+ daily records across 40+ regional clients, consolidated into Snowflake with dimensional modeling. Parallel extraction and automated scheduling cut pipeline latency by 35%. Schema validation, anomaly detection, and SLA alerting kept company-wide KPIs honest — contributed to an Industry Excellence Award.
Led migration of 200GB+ of legacy databases into modern cloud frameworks at Rajlaxmi. Query performance improved 25% through indexing strategies, query rewriting, and adoption of columnar storage best practices. Built Airflow-scheduled ingestion & transformation that eliminated 15+ hours of manual reporting per month and established KPI definitions still in use today.
Scalable big-data pipeline analyzing millions of NYC taxi trips using Hadoop MapReduce and HBase to generate revenue, operational, and customer-behavior insights. Explored distributed data processing patterns, partitioning strategies, and HBase row-key design for efficient time-series lookups. Coursework project that deepened my understanding of batch big-data systems beyond Spark.
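Row-key design is the crux of efficient HBase time-series lookups: since HBase sorts rows lexicographically, a key like salt|zone|reversed-timestamp spreads sequential writes across regions while keeping each zone's newest trips first in a scan. A hypothetical sketch (the key layout and function are illustrative, not the project's actual schema):

```python
import zlib

def taxi_row_key(zone_id, epoch_ms, buckets=16):
    """Hypothetical HBase row key: salt|zone|reversed-timestamp.
    Salting avoids hotspotting a single region with sequential writes;
    reversing the timestamp makes newest rows sort first per zone."""
    salt = zlib.crc32(str(zone_id).encode()) % buckets  # stable bucket
    reversed_ts = (2**63 - 1) - epoch_ms                # newest sorts first
    return f"{salt:02d}|{zone_id}|{reversed_ts:019d}"
```

A prefix scan on `salt|zone` then returns the latest trips without a full-table filter — the pattern that made the customer-behavior queries cheap.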
Real-time data pipeline monitoring vital health parameters from simulated IoT devices in hospitals. Apache Kafka for streaming ingestion, Spark Structured Streaming for windowed aggregations, and HBase for low-latency lookups of patient history. Threshold-based alerting on anomalous readings — an exercise in building end-to-end event-driven systems with real latency constraints.
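The core of the windowed-aggregation step can be shown without Spark: bucket each (timestamp, patient, reading) event into a fixed-size tumbling window and average per patient per window. This pure-Python sketch ignores event-time watermarks and late data, which Spark Structured Streaming handles for real.

```python
from collections import defaultdict

def tumbling_window_avg(events, window_ms=60_000):
    """Average each patient's readings per fixed (tumbling) time window.
    events: iterable of (timestamp_ms, patient_id, value) tuples."""
    buckets = defaultdict(list)
    for ts, patient, value in events:
        buckets[(ts // window_ms, patient)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}
```

The threshold-based alerting then just compares each windowed average against a per-parameter bound.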
Hands-on repository for learning distributed data processing using Python and Hadoop Streaming. Covers real-world MapReduce patterns (word count, inverted index, joins, top-K, secondary sort), optimization techniques (combiners, custom partitioners), and scale-testing notes. Built as a teaching artifact — the one I wish I'd had when I started with Hadoop.
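The simplest of those patterns — word count — captures the Hadoop Streaming contract: the mapper emits tab-separated key/value lines, the framework sorts by key, and the reducer sums runs of equal keys. A self-contained sketch in that style (in the repo these run over `sys.stdin` as separate scripts; here `sorted()` stands in for the shuffle/sort phase):

```python
from itertools import groupby

def mapper(lines):
    """Streaming-style mapper: emit one 'word<TAB>1' line per token."""
    for line in lines:
        for word in line.strip().split():
            yield f"{word}\t1"

def reducer(sorted_pairs):
    """Reducer: input arrives sorted by key, so consecutive equal keys
    can be summed with groupby."""
    keyed = (pair.split("\t") for pair in sorted_pairs)
    for word, group in groupby(keyed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"
```

A combiner is just this reducer run map-side before the shuffle — the optimization technique the repo covers next.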
Business-driven ETL and analytics project for a Spar Nord Bank case study analyzing refilling frequency of ATMs across Europe. Ingested transactional and demographic data, built a dimensional model for location-over-time analysis, and produced an analytical layer + dashboard supporting operational decisions around cash logistics. End-to-end: business question → data model → insight → recommendation.
Data-driven sales reporting and demand forecasting system deployed at Barnes & Noble College (Hoboken, NJ). Analyzed historical POS data to predict textbook demand, which improved inventory accuracy by 30%, minimized stockouts & overstock, and reduced checkout times by 20% during peak hours. Customer satisfaction: 95%.
Power BI and Tableau dashboards built during my time at Rajlaxmi — exposing operational and customer-facing KPIs across revenue, conversion, churn, and data quality. Defined canonical metrics, built a semantic layer on top of the Snowflake warehouse, created drill-downs for 40+ clients, and drove adoption with department leads. Saved 15+ hours of manual reporting per month.
Rigorous EDA work during my Rajlaxmi internship: distribution analysis, correlation matrices, outlier detection, and hypothesis testing on business data to define critical KPIs. Automated reporting workflows from findings. Solid proof-of-work that applied statistics and careful data-inspection translate directly to business impact.
More experiments, coursework, and in-progress work on GitHub.
view github →