apalensky/portfolio_website

Alexander Palensky's Portfolio

Highlight projects

Project 1: National Lacrosse League box scores

Tools: Python (Jupyter), R (RStudio), Markdown

Technical skills: Web Scraping, Data Wrangling

  • Made all National Lacrosse League (NLL) box score data publicly available in one source for the first time, enabling easy online access and empowering community analysis.
  • Scraped and cleaned all publicly available box scores from 1993 through 2020, totaling nearly 55,000 records for floor players and nearly 6,000 records for goalies.
  • Sorted records by season and added key performance metrics expressed per game, per 60 minutes of play, as ratios, and as percentages.
  • This project is ongoing, with data files downloadable from this Kaggle page. I will continue to add insights, applications, and images to this project so that others can learn more about box lacrosse trends.
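As a minimal sketch of the rate metrics described above, the following shows how a counting stat can be scaled to a per-60-minutes rate and how a simple percentage metric can be derived. The function names and the sample goalie line are hypothetical illustrations, not taken from the project's code or data.

```python
def per_60(stat_total: float, minutes_played: float) -> float:
    """Scale a counting stat to a rate per 60 minutes of play."""
    if minutes_played <= 0:
        return 0.0  # avoid dividing by zero for players with no recorded time
    return stat_total * 60.0 / minutes_played


def save_percentage(saves: int, shots_against: int) -> float:
    """Share of shots faced that were saved, as a fraction between 0 and 1."""
    if shots_against == 0:
        return 0.0
    return saves / shots_against


# Hypothetical goalie line: 12 goals against and 38 saves on 50 shots in 45 minutes.
goals_against_per_60 = per_60(12, 45)
save_pct = save_percentage(38, 50)
```

Rates per 60 minutes make players with different amounts of floor time comparable, which raw per-game totals do not.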

Dig Deeper: NLL Github Repository

Project 2: Similarity scoring college pitchers

Tools: R (RStudio), RShiny, SQL, Trackman

Technical skills: Data Wrangling, Similarity Learning, Technical Writing

  • Imported and transformed over 150,000 plate appearance events.
  • Generated novel pitch similarity scoring using R and Euclidean Distance.
  • Applied pitch usage weighting with Earth Mover's Distance to score arsenals.
  • Built a Shiny application for staff to use the model in offseason pitch design.
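The pitch-level step above can be illustrated with a small sketch of Euclidean-distance similarity. The original work was done in R; this Python version is only an illustration, and the feature set (velocity, horizontal break, vertical break), the z-score standardization, and all function names are assumptions rather than the project's actual method. The usage-weighted Earth Mover's Distance step for whole arsenals is not covered here.

```python
import math


def z_scores(rows):
    """Standardize each column to mean 0 and population standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant so the division stays defined.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(target_idx, rows):
    """Index of the row closest to rows[target_idx] after standardization."""
    scaled = z_scores(rows)
    dists = [
        (euclidean(scaled[target_idx], r), i)
        for i, r in enumerate(scaled)
        if i != target_idx
    ]
    return min(dists)[1]


# Hypothetical pitches: [velocity (mph), horizontal break (in), vertical break (in)]
pitches = [
    [92.0, -8.0, 15.0],
    [91.5, -7.5, 14.0],
    [80.0, 5.0, -3.0],
]
closest_to_first = most_similar(0, pitches)
```

Standardizing first keeps a feature measured in larger units, such as velocity, from dominating the distance.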

Dig Deeper: Article published on Medium

Project 3: Product recommendation engine

Tools: R (RStudio), Rattle GUI, PowerPoint

Technical skills: Data Wrangling, Association Rule Mining, Data Visualization, Technical Consulting

  • Consulting project for a beverage distribution company covering 1,500 products, close to 2,500 customers, and more than 8250,000 transactions over the past four years.
  • Removed sparse records and top nationally selling brands from analysis to focus on marketing new and niche products, broken down by beverage category.
  • Used insights from association rule mining to construct a recommendation engine that returns a requested number of top products the customer does not yet carry, drawn from products sold at similar businesses or identified as emerging trends in the beverage industry.
  • Presented the company with the recommendation engine and a report covering commonly recommended products and how these recommendations varied by customer business category and size.
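For readers unfamiliar with association rule mining, the sketch below computes the three standard measures for a single rule: support, confidence, and lift. It is a plain-Python illustration with hypothetical baskets and product names; the project itself used R and Rattle, and its actual thresholds and data are not shown in this portfolio.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions: list of sets of product names (one set per order).
    """
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    n_a = sum(1 for t in transactions if a <= t)         # orders containing the antecedent
    n_c = sum(1 for t in transactions if c <= t)         # orders containing the consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # orders containing both
    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical orders from four customers.
baskets = [
    {"ipa", "seltzer"},
    {"ipa", "seltzer", "cider"},
    {"ipa"},
    {"cider"},
]
support, confidence, lift = rule_metrics(baskets, {"ipa"}, {"seltzer"})
```

A lift above 1 suggests the two products are bought together more often than chance would predict, which is the kind of signal a recommendation engine can rank on.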

Project 4: National Park invasive fish species

Tools: R (RStudio), PowerPoint

Technical skills: Web Scraping, Data Wrangling, Data Visualization

  • Imported nearly 120,000 known National Park wildlife species from the National Park Service dataset on Kaggle and reduced the data to extant fish species.
  • Scraped additional National Park data online, along with lists of fish species non-endemic to each state from the United States Geological Survey (USGS).
  • Cleaned and joined our scraped data to perform feature exploration.
  • Built a choropleth map of invasive fish species as a percentage of all fish species present in states with National Parks.
  • Gave a presentation to undergraduate departmental professors on wildlife resource management effectiveness using research on states identified with high invasive ratios.
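The mapped quantity can be sketched as a per-state percentage. This is an illustration only: the project was done in R, the state codes and species labels below are placeholders, and treating every non-endemic species as invasive is a simplifying assumption made for the example.

```python
def invasive_share(fish_by_state, non_native_by_state):
    """Percentage of each state's park fish species that appear on its non-native list."""
    shares = {}
    for state, species in fish_by_state.items():
        if not species:
            continue  # skip states with no recorded fish species
        flagged = species & non_native_by_state.get(state, set())
        shares[state] = 100.0 * len(flagged) / len(species)
    return shares


# Placeholder data: species observed in parks, and species non-native to each state.
park_fish = {
    "AA": {"species_1", "species_2", "species_3", "species_4"},
    "BB": {"species_1", "species_5"},
}
non_native = {
    "AA": {"species_2", "species_3", "species_9"},
}
shares = invasive_share(park_fish, non_native)
```

The resulting dictionary maps each state to a single number, which is the shape of data a choropleth map needs.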

Dig Deeper: Invasive Fish Species Github Repository

Recent reads

  • A Short History of Nearly Everything, Bill Bryson
  • Freakonomics, Steven Levitt & Stephen Dubner
  • Genghis Khan and the Making of the Modern World, Jack Weatherford
  • Sapiens: A Brief History of Humankind, Yuval Noah Harari

Learning & certifications
