
Jack L. Good

Computer Science Student @ Cornell University. In my free time I enjoy playing tennis, fly fishing, watching the Broncos play, and listening to country/Sierreño music. Always interested in contributing to projects that improve accessibility.

jacklianggood@gmail.com | (720) 668-7343 | LinkedIn | GitHub

Projects

Revenue-Optimizing Auto-Loan Pricing Engine with Random-Forest Uplift Modeling

  • Built an end-to-end pricing-optimization pipeline for an online auto lender: cleaned 40k+ loan records, explored acceptance-vs-APR trade-offs, and used a logistic baseline to quantify current revenue; then iteratively optimized each customer’s rate (loop sketched below), demonstrating a portfolio-wide uplift of ≈$84k in expected revenue for the test cohort.
  • Benchmarked multiple ML models and surfaced actionable insights: engineered features (APR, FICO, competition rate, cost of funds, partner channel), tuned SVM, Random Forest, k-NN, and a small feed-forward ANN via cross-validated grids, selected Random Forest (AUC ≈ 0.89; test accuracy ≈ 0.88), and visualized feature importances to show rate and loan amount drive >70% of acceptance variance.
  • Stack: Python, pandas, scikit-learn, TensorFlow/Keras, seaborn, matplotlib.
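
A minimal sketch of that rate-optimization loop, assuming a fitted scikit-learn acceptance model and illustrative column names (apr, fico, competition_rate, cost_of_funds, loan_amount); the real schema and margin formula may differ:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["apr", "fico", "competition_rate", "cost_of_funds"]

def optimize_rates(model: RandomForestClassifier, customers: pd.DataFrame,
                   rate_grid=np.arange(0.02, 0.25, 0.0025)):
    """Per customer, pick the APR that maximizes P(accept) x margin."""
    best_rates = []
    for _, row in customers.iterrows():
        # Hold the customer's features fixed and sweep only the offered rate.
        candidates = pd.DataFrame([row[FEATURES]] * len(rate_grid))
        candidates["apr"] = rate_grid
        p_accept = model.predict_proba(candidates[FEATURES])[:, 1]
        margin = (rate_grid - row["cost_of_funds"]) * row["loan_amount"]  # simplistic margin
        best_rates.append(rate_grid[np.argmax(p_accept * margin)])
    return np.array(best_rates)
```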

Real-Time Emotion Recognition in Unity Using Deep Convolutional Neural Networks

  • Real-time 8-class emotion inference: webcam frames are fed into a FER+ DCNN through Unity’s Barracuda engine, classifying neutral, happiness, surprise, sadness, anger, disgust, fear, and contempt at interactive frame rates.
  • Precision face alignment: OpenCV’s 68-point landmarks (eyes, nose, mouth, chin) drive the rotation and scaling of each frame to match FER+’s 64×64 grayscale input (see the Python sketch below).
  • Developer-friendly controls & debug UI: Inspector sliders for zoom, padding, and smoothing, plus a live preview of the exact crop sent to the network.
  • Tags: computer vision, deep convolutional neural networks, Unity, OpenCV (C#)
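
An illustrative Python analogue of the C# alignment step (the shipped code runs inside Unity, so this is a sketch of the idea, not the lens itself); the landmark layout follows the common 68-point convention and the crop radius is an assumption:

```python
import cv2
import numpy as np

def align_face(frame_bgr: np.ndarray, landmarks: np.ndarray, out_size: int = 64):
    """Rotate/scale a frame so the eyes are level, then crop a square gray patch."""
    left_eye = landmarks[36:42].mean(axis=0)    # indices per the 68-point convention
    right_eye = landmarks[42:48].mean(axis=0)
    dx, dy = right_eye - left_eye
    angle = np.degrees(np.arctan2(dy, dx))      # in-plane roll to remove
    center = tuple(((left_eye + right_eye) / 2).astype(float))
    rot = cv2.getRotationMatrix2D(center, angle, 1.0)
    upright = cv2.warpAffine(frame_bgr, rot, frame_bgr.shape[1::-1])
    gray = cv2.cvtColor(upright, cv2.COLOR_BGR2GRAY)
    half = int(np.linalg.norm(right_eye - left_eye) * 1.5)  # assumed crop radius
    x, y = int(center[0]), int(center[1])
    crop = gray[max(y - half, 0):y + half, max(x - half, 0):x + half]
    return cv2.resize(crop, (out_size, out_size))           # FER+ expects 64x64
```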

NLP-Driven Product Clustering Uncovers Customer Spend Patterns

  • Turned 500K+ retail descriptions into a 200-term feature matrix: tokenized text, filtered nouns with NLTK POS-tagging, applied stemming, and one-hot-encoded price buckets to create a sparse SKU-feature table for downstream clustering (sketched below).
  • Revealed five high-value product segments with K-Means + word-clouds: holiday gifts, fashion accessories, home-decor, lifestyle, and cozy “gift-set” items—linking each cluster to distinct spending behavior for targeted promos and inventory planning.
  • Stack: Python, pandas, NLTK, scikit-learn, seaborn, matplotlib, WordCloud.
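
A condensed sketch of the feature-matrix construction; the file and column names (retail.csv, Description, UnitPrice) are assumptions, and the standard NLTK models (punkt, averaged_perceptron_tagger) must already be downloaded:

```python
import numpy as np
import pandas as pd
from nltk import pos_tag, word_tokenize
from nltk.stem import PorterStemmer
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer

stemmer = PorterStemmer()

def noun_stems(text: str) -> str:
    """Keep only noun tokens (NN* tags), stemmed."""
    tagged = pos_tag(word_tokenize(text.lower()))
    return " ".join(stemmer.stem(w) for w, tag in tagged if tag.startswith("NN"))

df = pd.read_csv("retail.csv")                          # hypothetical file name
docs = df["Description"].astype(str).map(noun_stems)
terms = CountVectorizer(max_features=200, binary=True).fit_transform(docs)
price = pd.get_dummies(pd.qcut(df["UnitPrice"], 4, labels=False), prefix="price")
X = np.hstack([terms.toarray(), price.to_numpy()])      # 200 terms + price buckets
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
```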

Interactive 3D Weather Visualizer with ChatGPT for AR Wearables

  • AI-driven weather summaries on-device: integrates ChatGPT with real-time OpenWeatherMap data and reverse-geocoded location info to generate on-tap, natural-language weather insights in the Spectacles AR lens (a platform-agnostic sketch follows below).
  • Dynamic, data-responsive AR pipeline: Texture2D mapping and dynamic VFX, driven by live temperature, conditions, and hourly forecasts, render context-aware 3D models and icons that adapt instantly to changing weather.
  • Robust asynchronous & interaction framework: an end-to-end, gesture-driven Lens Studio experience with API caching, error handling, and responsive UI components for smooth, high-performance AR on wearable hardware.
  • Stack: Augmented Reality, OpenAI ChatGPT API, Snap Lens Studio (JavaScript), Spectacles SDK, OpenWeatherMap & OpenCage APIs, WebGL/Texture2D rendering, geolocation & reverse-geocoding services.
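
The lens itself is Lens Studio JavaScript; this is a platform-agnostic Python sketch of the same request flow, with placeholder API keys, a placeholder model name, and prompt wording that is an assumption:

```python
import requests

def weather_summary(lat: float, lon: float, owm_key: str, ocg_key: str, oai_key: str) -> str:
    # Current conditions from OpenWeatherMap.
    wx = requests.get("https://api.openweathermap.org/data/2.5/weather",
                      params={"lat": lat, "lon": lon, "appid": owm_key,
                              "units": "metric"}).json()
    # Human-readable place name via OpenCage reverse geocoding.
    place = requests.get("https://api.opencagedata.com/geocode/v1/json",
                         params={"q": f"{lat},{lon}", "key": ocg_key},
                         ).json()["results"][0]["formatted"]
    prompt = (f"In one friendly sentence, summarize the weather in {place}: "
              f"{wx['weather'][0]['description']}, {wx['main']['temp']} °C.")
    # Natural-language summary from the ChatGPT API.
    resp = requests.post("https://api.openai.com/v1/chat/completions",
                         headers={"Authorization": f"Bearer {oai_key}"},
                         json={"model": "gpt-4o-mini",
                               "messages": [{"role": "user", "content": prompt}]}).json()
    return resp["choices"][0]["message"]["content"]
```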

Fine-tuned Vision Language Model for Generating Accurate Chart Captions

  • Fine-tuned vision–language model: designed and fine-tuned a lightweight BLIP-based model to produce accurate, semantically rich captions for data visualizations, making complex chart content accessible to blind and visually impaired users.
  • Developed a minimalistic training pipeline: distilled chart Q&A annotations into dense, trend-focused supervision signals, enabling rapid fine-tuning (3 epochs, sketched below) on consumer-grade hardware with significant gains in caption clarity and relevance.
  • Stack: Python, PyTorch, Hugging Face Transformers, BLIP, PIL.
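
A minimal version of the fine-tuning loop using Hugging Face's BLIP captioning checkpoint; chart_pairs is a hypothetical iterable of (PIL image, caption) pairs standing in for the distilled chart Q&A supervision, and the learning rate is an assumption:

```python
import torch
from transformers import BlipProcessor, BlipForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base").to(device)
optim = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(3):                        # matches the 3-epoch schedule
    for image, caption in chart_pairs:        # hypothetical (PIL image, str) pairs
        batch = processor(images=image, text=caption, return_tensors="pt").to(device)
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optim.step()
        optim.zero_grad()
```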

Regularized Regression Pipeline for Airbnb Price Prediction

  • Benchmarked linear, Ridge, and LASSO models on 540 Florence listings: engineered 24 host, location, and review features, visualized bias-variance trade-offs, and used 5-fold CV to pick λ values (see the sketch below), cutting validation MSE from 4,953 (one-feature baseline) to 3,729 with Ridge.
  • Delivered interpretable pricing insights for hosts: Ridge shrank noisy coefficients, spotlighting bedrooms (+$58) and neighborhood premium (+$1.6k in Centro Storico) while generalizing best on the unseen test set (MSE ≈ 5,124).
  • Stack: Python, pandas, scikit-learn, statsmodels, seaborn, matplotlib.
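
A sketch of the λ (alpha) selection step; the synthetic X/y stand in for the 540 listings × 24 engineered features, and the alpha grid bounds are assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import GridSearchCV, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(540, 24))                    # stand-in for listing features
y = 150 + 58 * X[:, 0] + rng.normal(scale=40, size=540)  # stand-in for price

alphas = {"alpha": np.logspace(-3, 3, 25)}        # assumed search grid
models = {
    "linear": LinearRegression(),
    "ridge": GridSearchCV(Ridge(), alphas, cv=5, scoring="neg_mean_squared_error"),
    "lasso": GridSearchCV(Lasso(max_iter=10_000), alphas, cv=5,
                          scoring="neg_mean_squared_error"),
}
for name, est in models.items():
    mse = -cross_val_score(est, X, y, cv=5, scoring="neg_mean_squared_error").mean()
    print(f"{name}: 5-fold CV MSE ≈ {mse:,.0f}")
```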

Term-Deposit Propensity Model — Ensemble Learning on Portuguese Bank Data

  • Business impact & pipeline: Built a production-style lead-scoring engine for a Portuguese bank—cleaned and encoded 16 raw fields into 50+ one-hot features with a ColumnTransformer (sketched below), then generated fully reproducible splits to rank 20k clients and cut ~90% of wasted call-center outreach.
  • Ensemble approach (CART → Random Forest → XGBoost): Baseline CART → tuned Random Forest (AUC 0.942) → XGBoost (50 trees) topping ROC-AUC 0.946 on hold-out and slashing false-positive rate by 10 %, shipping a higher-yield call list to marketing ops.
  • Stack: Python, XGBoost, pandas, NumPy, scikit-learn, seaborn.
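
A sketch of the encode-then-boost stage, assuming the UCI bank-marketing schema (bank-full.csv with fields like job, marital, ..., y); the exact feature lists used in the project may differ:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from xgboost import XGBClassifier

bank = pd.read_csv("bank-full.csv", sep=";")     # UCI bank-marketing file
categorical = ["job", "marital", "education", "contact", "month", "poutcome"]
numeric = ["age", "balance", "duration", "campaign"]
X = bank[categorical + numeric]
y = (bank["y"] == "yes").astype(int)             # subscribed to a term deposit

pipe = Pipeline([
    ("encode", ColumnTransformer(
        [("cat", OneHotEncoder(handle_unknown="ignore"), categorical)],
        remainder="passthrough")),
    ("model", XGBClassifier(n_estimators=50, eval_metric="logloss")),
])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
pipe.fit(X_tr, y_tr)
print("hold-out ROC-AUC:", roc_auc_score(y_te, pipe.predict_proba(X_te)[:, 1]))
```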

LSTM-Driven Forecasting of Coca-Cola (KO) Stock Prices

  • Built a full time-series pipeline from Yahoo Finance ingest to production-ready forecasts: pulled 10 years of KO closes, engineered rolling-mean baselines, framed a 90-day sliding window (sketched below), and trained a 2-layer LSTM with early stopping—cutting MAPE from 1.25% (10-day MA) to <0.8% on the 20% hold-out set.
  • Demonstrated sequence-learning gains over classical methods: tuned 64/32-unit LSTM stack, achieved RMSE ≈ 0.42 and tracked regime shifts in post-COVID recovery, illustrating how deep RNNs capture long-horizon patterns that moving averages miss.
  • Stack: Python, pandas, yfinance, TensorFlow/Keras, NumPy, scikit-learn, seaborn, matplotlib.
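
A sketch of the window framing and the 64/32-unit stack; epoch count, patience, and min-max scaling are assumptions beyond what the bullets state:

```python
import numpy as np
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from tensorflow import keras

closes = yf.download("KO", period="10y")["Close"].to_numpy().reshape(-1, 1)
scaled = MinMaxScaler().fit_transform(closes)

WINDOW = 90                                       # 90-day sliding window
X = np.stack([scaled[i:i + WINDOW] for i in range(len(scaled) - WINDOW)])
y = scaled[WINDOW:]
split = int(0.8 * len(X))                         # 20% hold-out, chronological

model = keras.Sequential([
    keras.layers.LSTM(64, return_sequences=True, input_shape=(WINDOW, 1)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:split], y[:split], epochs=50, validation_split=0.1,
          callbacks=[keras.callbacks.EarlyStopping(patience=5,
                                                   restore_best_weights=True)])
```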

High-Accuracy Crop Recommendation Engine with K-NN & Soil-Climate Profiling

  • Built a precision-ag pipeline from raw agronomic data to a deployable recommender: merged rainfall, nutrient, and climate tables into 2k+ farm records, visualized N-P-K vs. humidity/temperature, then tuned a k-NN classifier (k = 3) via 5-fold CV to hit 97% test accuracy on 22 crop classes.
  • Drove explainable, per-farm decisions for growers: standardized features to optimize Euclidean distance, generated similarity diagnostics (e.g., maize vs. peas soil demands), and delivered instant “what-if” predictions (see the sketch below)—recommending rice for a sample N20 P40 K55 / 26 °C / 7.2 pH scenario.
  • Stack: Python, pandas, scikit-learn (StandardScaler, KNeighborsClassifier, GridSearchCV), seaborn, matplotlib.
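
A sketch of the scaled k-NN search plus the what-if query, assuming the common crop-recommendation schema (N, P, K, temperature, humidity, ph, rainfall, label); the humidity and rainfall values in the sample are made up:

```python
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

crops = pd.read_csv("crop_recommendation.csv")   # hypothetical file name
X, y = crops.drop(columns="label"), crops["label"]

pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())
grid = GridSearchCV(pipe, {"kneighborsclassifier__n_neighbors": range(1, 16)}, cv=5)
grid.fit(X, y)                                   # CV reportedly selects k = 3

# What-if scenario from the bullet: N=20, P=40, K=55, 26 °C, pH 7.2.
sample = pd.DataFrame([[20, 40, 55, 26.0, 80.0, 7.2, 200.0]], columns=X.columns)
print(grid.predict(sample))                      # e.g. ['rice']
```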

Wine-Quality Classification with Linear & RBF Support-Vector Machines

  • Modeled Portuguese “Vinho Verde” quality from chemistry: explored 1.6k red-wine samples, visualized alcohol vs. volatile-acidity correlations, and framed a binary “good vs. bad” task; tuned C on linear SVM (grid 0.001 → 0.02) and RBF SVM (C = 0.5 → 20) via 5-fold CV (sketched below), lifting test accuracy from 72% (2-feature model) to 77% (full RBF).
  • Drove margin-vs-fit analysis & baseline benchmarking: plotted hyperplanes for extreme C values to illustrate over/under-fitting, contrasted SVM with logistic regression (75 % test accuracy), and surfaced key drivers—high alcohol and low volatile acidity—as top quality indicators.
  • Stack: Python, pandas, scikit-learn (SVC, GridSearchCV, StandardScaler), seaborn, matplotlib.
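
A sketch of the C-grid comparison between the linear and RBF kernels; the UCI red-wine file uses semicolon separators, and the good-vs-bad binarization threshold (quality ≥ 6) is an assumption:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

df = pd.read_csv("winequality-red.csv", sep=";")       # UCI red-wine data
X, y = df.drop(columns="quality"), (df["quality"] >= 6).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for kernel, Cs in [("linear", np.linspace(0.001, 0.02, 10)),   # bullet's grids
                   ("rbf", np.linspace(0.5, 20, 10))]:
    grid = GridSearchCV(make_pipeline(StandardScaler(), SVC(kernel=kernel)),
                        {"svc__C": Cs}, cv=5)
    grid.fit(X_tr, y_tr)
    print(kernel, grid.best_params_, f"test acc = {grid.score(X_te, y_te):.2f}")
```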

Voter-Turnout Modeling & Cut-Off Tuning with Logistic Regression

  • Predicted 2020 Georgia ballot-casting using 246k voter records: engineered demographic and census-tract features (age, race, gender, income, education, car ownership, density), fit a non-regularized logistic model, and interpreted coefficients to surface key drivers (older, higher-income, female voters most likely to turn out).
  • Optimized decision thresholds & evaluated fairness trade-offs: generated ROC and accuracy-vs-cutoff curves, confusion-matrix analytics, and scenario tests (voters A/B/C) to illustrate how varying income or age shifts predicted probabilities; recommended a 0.5 cutoff (sweep sketched below) for balanced 71% test accuracy with minimal over-fit.
  • Stack: Python, pandas, scikit-learn (LogisticRegression, confusion_matrix, RocCurveDisplay), seaborn, matplotlib.
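
A sketch of the cut-off sweep; the synthetic X/y stand in for the seven demographic/census features and the turnout label:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 7))                   # stand-in for voter features
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)  # stand-in turnout label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = LogisticRegression(penalty=None, max_iter=1000).fit(X_tr, y_tr)  # unregularized

probs = clf.predict_proba(X_te)[:, 1]
for cutoff in np.arange(0.30, 0.71, 0.05):       # sweep thresholds around 0.5
    print(f"cutoff {cutoff:.2f}: accuracy {accuracy_score(y_te, probs >= cutoff):.3f}")
```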

Fashion-MNIST Image Classifier — 300-Neuron ANN in TensorFlow

  • Engineered and trained a lean feed-forward network on 60k grayscale fashion images: flattened 28×28 inputs, used a single 300-ReLU hidden layer (sketched below), and tuned epochs via learning-curve inspection (≈12) to hit 85% train / 82% test accuracy — matching Random Forest performance while cutting inference latency to milliseconds.
  • Benchmarked against five classic ML baselines (LogReg, DT, RF, KNN, XGBoost): measured wall-time vs. accuracy/AUC, showing the ANN delivers comparable accuracy to XGBoost (0.825 vs. 0.894) but trains ≈3× faster than heavy tree ensembles on CPU; visualized misclassified samples to guide future CNN upgrades.
  • Stack: Python, TensorFlow/Keras, NumPy, pandas, scikit-learn, matplotlib.
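
The network as described in the first bullet, in a few lines of Keras (optimizer and validation split are assumptions):

```python
from tensorflow import keras

(x_tr, y_tr), (x_te, y_te) = keras.datasets.fashion_mnist.load_data()
x_tr, x_te = x_tr / 255.0, x_te / 255.0          # scale pixels to [0, 1]

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),  # flatten the 28x28 images
    keras.layers.Dense(300, activation="relu"),  # single 300-neuron hidden layer
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_tr, y_tr, epochs=12, validation_split=0.1)  # ≈12 epochs per the curve
print("test accuracy:", model.evaluate(x_te, y_te, verbose=0)[1])
```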


Professional Experience

Cornell Tech – Undergraduate Researcher (Jun 2025 – Aug 2025)

ATLAS Institute – Software Dev Intern (May 2024 – Aug 2024)

Platt Park Capital – Data Intern (May 2023 – Aug 2023)

University of Colorado – Computational Chemistry Aide (Oct 2021 – Aug 2022)

Contact

Email: jacklianggood@gmail.com

Phone: (720) 668-7343

Other: Mutual Investment Club of Cornell, Phi Chi Theta Professional Business Fraternity.