Machine learning engineer, data scientist, and data engineer. I enjoy making data work, whether that means collection, modeling, transformation, or deployment.
Sometimes “work” is really just pointless stuff. I try to find ways to learn new skills by doing something dumb.
Repos / Projects
Here’s a sample of some dumb stuff I work on.
Spotify Smart Playlists - Automated workflows for making configuration-driven Spotify playlists using DVC, DuckDB, and GitHub Actions. I told you this stuff was dumb.
Movies - Letterboxd clone (ish) built with Go, SQLite, DVC, and Obsidian. It tracks the terrible movies I watch in a database of my own.
Movies Analysis - Full end-to-end dbt project built with DuckDB, using the SQLite database from the movies “app” as a source. Working on adding a Streamlit dashboard to it, eventually.
SVL - Haven’t touched this one in a while, but I wanted to learn how to write compilers so I decided to make a SQL-esque language for data visualizations. It was fun but not terribly useful.
Kafka Streams Examples - Really old repo where I taught myself Kafka Streams back when it was super new. Also includes a streaming pipeline inspired by an Onion article (seriously).
Datasets
I maintain several pretty popular datasets on data.world. You might notice a theme.
- 👣 Bigfoot Sightings (repo)
- 🛸 UFO Sightings (repo)
- 👻 Haunted Places (repo)
- 🐺 Dogman Sightings (repo)
All of these were collected from publicly accessible data, either via file downloads or web scrapers of highly questionable code quality.