  • Texas
s3achan/README.md

Senior Data Analyst

🛢️ SQL • 🐍 Python • ❄️ Snowflake • ⚙️ dbt • 📈 Tableau

I build data products and AI-powered analytics tools that turn complex datasets into clear business insights.

🧠 About Me

  • 🤖 Building GenAI & NLP systems: trade intent detection, sentiment analysis, and LLM-powered analytics tools
  • 🧾 Developed AI Receipt Tracker: production-style system using LLM vision to parse receipts, categorize spend, and generate analytics dashboards
  • 🎬 Built recommender systems: content-based (TF-IDF + cosine similarity) and collaborative filtering (MovieLens)
  • 💳 Built fraud detection pipelines using Random Forest & XGBoost with revenue optimization analysis
  • 🛒 Performed customer segmentation on 500K+ retail transactions using RFM modeling, K-Means clustering, and behavioral analytics
  • 📊 Strong in tree-based ML models: Decision Trees, Random Forest, XGBoost with hyperparameter tuning and cross-validation
  • 📈 Experienced in ML evaluation: RMSE, MAE, Precision@K, Recall@K, clustering validation & dimensionality reduction (PCA)
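
As an illustration of the ranking metrics named in the last bullet, here is a minimal sketch of Precision@K and Recall@K in plain Python. The function name, item IDs, and sample data are hypothetical and are not taken from any of the repositories.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Return (precision@k, recall@k) for one user's ranked recommendations.

    recommended: ranked list of item IDs, best first.
    relevant: collection of item IDs the user actually liked.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant_set = set(relevant)
    # Count how many of the top-k recommendations are relevant.
    hits = sum(1 for item in recommended[:k] if item in relevant_set)
    precision = hits / k
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical data: five ranked movie IDs and three movies the user liked.
ranked = ["m1", "m7", "m3", "m9", "m4"]
liked = {"m3", "m4", "m8"}
p_at_3, r_at_3 = precision_recall_at_k(ranked, liked, k=3)
```

Precision here divides by K even when fewer than K items are recommended, which is one common convention; some evaluations divide by the number of items actually returned instead.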

Currently exploring LLMs, RAG pipelines, and AI-powered data workflows.

🏅 IBM Certified Data Scientist & Data Analyst
🎓 Machine Learning Specialization, Stanford

☕ Powered by curiosity, coffee, and F1 🏎️


🌟 Highlight Project: AI Receipt Tracker (GitHub • Live Demo)

A production-grade, AI-powered receipt parser and spending tracker, built end to end.

| Feature | Details |
|---|---|
| 🧠 AI Parsing | GPT-4.1-mini vision + GPT-4o-mini text; reads PDFs and images |
| 🗄️ Database | PostgreSQL (Supabase) + SQLAlchemy ORM with migrations |
| ☁️ Cloud Storage | AWS S3 for receipt file storage with presigned URLs |
| 🔁 Retry Logic | Tenacity retry wrapper for rate limits & API timeouts |
| 🧹 Smart Categorization | 20+ categories with sub-categories + category memory from history |
| 🔍 Duplicate Detection | SHA-256 file hashing to prevent double uploads |
| 📊 Analytics | Plotly dashboards: spend by category, store, month + drill-down |
| 🔐 Auth | Password-protected with Streamlit session state |
| 💰 Cost Tracking | Per-call token logging with monthly API cost monitoring |
| 🧾 Multi-store Support | Handles Costco, Target, Walmart with store-specific tax & discount logic |
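
The duplicate-detection feature listed above can be illustrated with a short sketch. It assumes duplicates are found by hashing the raw file bytes and comparing against hashes already seen; the function names and the in-memory set are illustrative assumptions, and the actual project may well keep hashes in its database instead.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_duplicate(path, seen_hashes):
    """Return True if this file's content was seen before; otherwise record it.

    seen_hashes is a hypothetical in-memory set of hex digests.
    """
    file_hash = sha256_of_file(path)
    if file_hash in seen_hashes:
        return True
    seen_hashes.add(file_hash)
    return False
```

Hashing the content rather than the filename means a re-uploaded receipt is caught even if it has been renamed, while two different receipts with the same name are both accepted.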

Tech: Python • Streamlit • OpenAI API • PostgreSQL • SQLAlchemy • AWS S3 • Plotly • pandas • boto3
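
The project is described as using a Tenacity retry wrapper for rate limits and API timeouts. The sketch below is not that wrapper; it is a plain-Python stand-in showing the general retry-with-exponential-backoff idea. `TransientAPIError`, `retry_with_backoff`, and `parse_receipt` are hypothetical names, and the delays are assumed values.

```python
import functools
import time


class TransientAPIError(Exception):
    """Hypothetical stand-in for a rate-limit or timeout error."""


def retry_with_backoff(max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry the wrapped call on TransientAPIError, doubling the wait each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except TransientAPIError:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the error
                    sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


# Usage with a hypothetical flaky call; sleep is stubbed so the demo runs instantly.
delays = []
attempts = {"count": 0}


@retry_with_backoff(max_attempts=3, base_delay=1.0, sleep=delays.append)
def parse_receipt():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientAPIError("rate limited")
    return "parsed"


result = parse_receipt()
```

Only the error type treated as transient is retried, so genuine bugs fail immediately instead of being retried.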


🚀 Other Projects

| Project | Description | Stack |
|---|---|---|
| 🎬 CineMatch | Content-based & collaborative filtering recommender with full ML evaluation | Python, TF-IDF, Cosine Similarity, K-Means |
| 💳 Credit Card Fraud Detection (Live Demo) | Fraud detection with revenue optimization | Python, Random Forest, XGBoost, SHAP |
| 🚢 Titanic Survival Prediction | Feature engineering + 6-model comparison achieving ~87% accuracy | Decision Tree, XGBoost, SVM, KNN, Random Forest |
| 📈 Trade Intent NLP | Buy/Sell intent detection using POS tagging & sentiment analysis | Python, NLP, ML |
| 🏎️ F1 Sentiment Analysis | Sentiment analysis on F1 tweets | Python, NLP, Twitter API |
| 🛒 E-commerce Customer Segmentation | Customer & product analytics on 500K+ UK retail transactions: K-Means clustering, RFM analysis, revenue seasonality, cancellation trends | Python, K-Means, Seaborn, Pandas |
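
To make the RFM analysis mentioned for the segmentation project concrete, here is a minimal sketch of deriving recency, frequency, and monetary values per customer in plain Python. The function name, tuple layout, and sample transactions are made up for illustration and do not come from the project's dataset or code.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, snapshot_date):
    """Compute RFM values per customer.

    transactions: iterable of (customer_id, order_date, amount) tuples.
    snapshot_date: reference date used to measure recency in days.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, order_date, amount in transactions:
        if customer_id not in last_purchase or order_date > last_purchase[customer_id]:
            last_purchase[customer_id] = order_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: {
            "recency_days": (snapshot_date - last_purchase[cid]).days,
            "frequency": frequency[cid],
            "monetary": round(monetary[cid], 2),
        }
        for cid in last_purchase
    }


# Made-up sample transactions, for illustration only.
sample = [
    ("C1", date(2011, 11, 1), 120.0),
    ("C1", date(2011, 12, 1), 80.0),
    ("C2", date(2011, 6, 15), 40.0),
]
rfm = rfm_table(sample, snapshot_date=date(2011, 12, 10))
```

The three values per customer would typically be scaled and passed to a clustering step such as K-Means; that step is omitted here.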

🎓 Certifications

IBM Data Scientist • IBM Data Analyst • Stanford ML

🛠️ Tech Stack

Languages

Python • SQL

ML & Data

Pandas • NumPy • Scikit-Learn • XGBoost • Matplotlib • Seaborn • SHAP

NLP & Gen AI

NLTK • OpenAI • Prompt Engineering

Cloud & Data Engineering

Snowflake • Databricks • dbt • Airflow • AWS • Azure • PostgreSQL • Redshift • Spark • BigQuery • SQLAlchemy

Tools

Jupyter • Git • Shell

Visualization & BI

Tableau • Power BI • Streamlit • Looker • Plotly



📫 Connect

LinkedIn • GitHub

Pinned

  1. ai-receipt-tracker (Public)

    AI-powered receipt parsing and spending tracker built with Streamlit and GPT-4o-mini.

    Python

  2. credit-card-fraud-revenue-optimization (Public)

    A machine learning project to detect fraudulent credit card transactions while optimizing revenue through intelligent decline strategies and risk-based decision making.

    Jupyter Notebook

  3. CineMatch-Recommender-Systems (Public)

    Hybrid movie recommender system using TF-IDF content-based filtering and collaborative filtering on the IMDB dataset. Built with Python, scikit-learn & pandas.

    Jupyter Notebook

  4. trade-intent-nlp-pos (Public)

    NLP-based Buy/Sell intent detection using POS tagging, sentiment analysis, and ML classification on financial text.

    Jupyter Notebook

  5. Titanic-Survival-Eda (Public)

    Exploratory data analysis of the Titanic dataset identifying key factors influencing passenger survival.

    Jupyter Notebook

  6. ecommerce-customer-segmentation (Public)

    Transforms raw e-commerce transactions into customer intelligence using cohort analysis and K-Means segmentation to drive retention and revenue insights.

    Jupyter Notebook