Analysis of 2017-18 EPL Team Attack Patterns (Fall 2024 / Revised Spring 2025)
STAT 430 - Practice of Applied Statistics (Report) (Code)
• Used publicly available spatio-temporal football event data from Pappalardo et al. (originally from Wyscout) to construct attacking sequences based on event types, spatial coordinates, and timing features
• Applied unsupervised learning using Fréchet and Gower distances with K-medoids clustering to define latent attack patterns, followed by statistical testing to verify team-level associations
• Found that Big 6 teams shared structurally similar attacking clusters with tactical variation, while Leicester City uniquely relied on fast, efficient duel-based sequences and achieved the highest goal output outside the Big 6
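A minimal sketch of the clustering idea above, assuming each attacking sequence is reduced to an ordered list of (x, y) pitch coordinates. It uses only the discrete Fréchet distance and a simple alternating K-medoids loop; the Gower distance over mixed event features and the statistical tests are omitted. The function names and toy sequences are hypothetical illustrations, not the project's actual code.

```python
import numpy as np

def discrete_frechet(p, q):
    """Discrete Frechet distance between two point sequences (dynamic programming)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=2)  # pairwise point distances
    ca = np.empty_like(d)
    ca[0, 0] = d[0, 0]
    for i in range(1, len(p)):
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for j in range(1, len(q)):
        ca[0, j] = max(ca[0, j - 1], d[0, j])
    for i in range(1, len(p)):
        for j in range(1, len(q)):
            ca[i, j] = max(min(ca[i - 1, j], ca[i, j - 1], ca[i - 1, j - 1]), d[i, j])
    return float(ca[-1, -1])

def k_medoids(dist, k, n_iter=100, seed=0):
    """Alternating K-medoids on a precomputed distance matrix (a simple variant, not full PAM)."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(dist), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)  # assign each item to nearest medoid
        new = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # the new medoid minimizes total distance to the other cluster members
                within = dist[np.ix_(members, members)].sum(axis=1)
                new[c] = members[np.argmin(within)]
        if np.array_equal(new, medoids):
            break
        medoids = new
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels

# Hypothetical toy sequences: two attacks along one flank, two along the other.
sequences = [
    [(0, 0.0), (1, 0.0), (2, 0.0)],
    [(0, 0.1), (1, 0.1), (2, 0.1)],
    [(0, 5.0), (1, 5.0), (2, 5.0)],
    [(0, 5.1), (1, 5.1), (2, 5.1)],
]
n = len(sequences)
dist = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        dist[i, j] = dist[j, i] = discrete_frechet(sequences[i], sequences[j])

medoids, labels = k_medoids(dist, k=2)
```

Working from a precomputed distance matrix is what allows K-medoids to pair with non-Euclidean measures such as Fréchet or Gower, since only pairwise distances are needed and no cluster means are computed.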
European Football League Player Stats (Spring 2022 / Re-developing from June 2025)
STAT 385 - Statistical Programming Methods (Streamlit) (Shiny App / Old Version)
• Developing an interactive Streamlit web app in Python to analyze and visualize player performance data from FBref for major European football leagues (Big 5, Eredivisie, Primeira Liga, Jupiler Pro League), building on a prior R Shiny version
• Integrating web scraping (BeautifulSoup), radar chart visualization (Plotly), and cosine-similarity player comparison so users can dynamically select a player and compare similar players across diverse in-game performance stats
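A minimal sketch of the cosine-similarity comparison described above, assuming each player is represented by a numeric vector of per-90 stats. The player names, stat values, and function names are hypothetical; in practice, stats on very different scales would likely need standardizing before comparison.

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    X = np.asarray(X, float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero rows
    unit = X / norms
    return unit @ unit.T

def most_similar(names, X, target, top_n=3):
    """Return the top_n players most similar to `target`, excluding the target itself."""
    sims = cosine_similarity_matrix(X)
    i = names.index(target)
    order = np.argsort(-sims[i])  # highest similarity first
    return [(names[j], float(sims[i, j])) for j in order if j != i][:top_n]

# Hypothetical stat vectors (e.g. shots, tackles, key passes per 90).
names = ["Player A", "Player B", "Player C"]
stats = [
    [1.0, 0.0, 2.0],
    [2.0, 0.0, 4.0],  # same profile as Player A at higher volume
    [0.0, 3.0, 0.0],  # entirely different profile
]

top = most_similar(names, stats, "Player A", top_n=2)
```

Because cosine similarity ignores vector magnitude, it matches players by the shape of their stat profile rather than by raw volume, which is why Player B ranks first here despite having double the totals.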
Pitching Similarity (Spring 2023)
STAT 430 - Baseball Analytics (Shiny App)
• Created an R Shiny app with MLB Statcast data
• Designed pitch-location heatmaps with ggplot2 to visualize and compare selected pitchers' pitch locations alongside stats such as pitch type and speed