About

Machine learning engineer, data scientist, and data engineer. I enjoy making data work, whether that means collection, modeling, transformation, or deployment.

Sometimes “work” is really just pointless stuff. I try to find ways to learn new skills by doing something dumb.

Repos / Projects

Here’s a sample of some dumb stuff I work on.

Spotify Smart Playlists - Automated workflows for building configuration-driven Spotify playlists using DVC, DuckDB, and GitHub Actions. I told you this stuff was dumb.

Movies - Letterboxd clone (ish) built with Go, SQLite, DVC, and Obsidian for tracking the terrible movies I watch in my own database.

Movies Analysis - Full end-to-end dbt project built with DuckDB, using the SQLite database from the movies “app” as a source. I'm working on adding a Streamlit dashboard to it, eventually.

SVL - Haven’t touched this one in a while, but I wanted to learn how to write compilers so I decided to make a SQL-esque language for data visualizations. It was fun but not terribly useful.

Kafka Streams Examples - Really old repo where I taught myself Kafka Streams back when it was super new. Also includes a streaming pipeline inspired by an Onion article (seriously).

Datasets

I maintain several pretty popular datasets on data.world. You might notice a theme.

All of these were collected from publicly accessible data, either via file downloads or web scrapers of highly questionable code quality.
