Building a map of Bigfoot sightings with Plotly’s Dash framework.
Not too long ago, Plotly announced a new open source framework for Python-based web applications called Dash (announcement letter). It’s a framework in the vein of R’s Shiny, with the key advantage being that it isn’t R. Considering my recent efforts to make one million dollars by finding Bigfoot had stalled, I figured it was as good a time as any to take this framework for a spin and try to make bank.
Here’s what I want to build: an application that shows various highly critical and important visualizations and statistics about sasquatch sightings, with the added ability to filter the sightings based on words I think might help me narrow down the search to relevant areas. Here’s what the final thing will look like:
Basically it’ll be four plots - a map of the geolocated sightings, a line/scatter plot of sightings over time, a bar plot of sightings by day of week, and a donut chart (totally not a pie chart) of the percentage of sightings for each sighting class. The magic will be in the search bar, which will filter the titles of the sightings based on the text you put into it.
I’ve broken this into three parts: the first part will load the data, make a map, and initialize the server. The second part will be to add the additional plots with a more complex layout, and the third part will be adding the title filter bar and the associated interactions.
It’s highly recommended that you read the awesome Dash user guide before going through this. I’m going to assume you’re familiar with the HTML and core components, specifically the graphs. The purpose of this series of posts is to walk through a real app example rather than introduce the Dash framework itself, which the official docs already do a great job of. I’ll probably spend some time going over the plot semantics because the docs for Dash and Plotly aren’t particularly consistent in that regard, but if it’s obvious I’m not going to explain it.
We’re going to need some prerequisites installed. Generally the best idea is to do this in a Python or Anaconda virtual environment. These are the installations for Dash:
pip install dash
pip install dash_renderer
pip install dash_core_components
pip install dash_html_components
pip install plotly
In addition to those, we’ll also install a couple of things that will help us with the data:
pip install toolz
pip install python-dotenv
Toolz is probably the single most useful Python library I’ve used - once you get used to it, it’s indispensable.
Python-dotenv is a library that makes reading .env files easier - we’ll use it when we make the map, as Plotly’s map plot requires an API key from Mapbox.
I’ll start by creating my directory like so:
app.py
.env
data/
We’ve got the data, so it needs to be loaded into the app. This could be done with a pandas data frame, but I actually like the model of a list of dictionaries a bit better for this application. The semantics are cleaner, and performance really isn’t an issue - most of the time will be spent on the client rendering all of the plots, not filtering a 3000 element list.
The sightings do have some odd times in there, so in a later post I’ll go back and add a conditional to that comprehension. For now, I’ll leave it.
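Here’s a minimal sketch of that loading step using only the standard library. The column names, the timestamp format, and the load_sightings helper are my assumptions for illustration, and the in-memory sample stands in for the real file under data/.

```python
import csv
import datetime as dt
import io

# Assumed timestamp layout; adjust to whatever the real file uses.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def load_sightings(lines):
    """Read CSV rows into a list of dictionaries, parsing the timestamp."""
    return [
        {**row, "timestamp": dt.datetime.strptime(row["timestamp"], TIMESTAMP_FORMAT)}
        for row in csv.DictReader(lines)
    ]


# Hypothetical two-row sample standing in for an open CSV file.
SAMPLE = io.StringIO(
    "title,classification,latitude,longitude,timestamp\n"
    "Tracks found near creek,Class B,47.1,-121.5,2005-07-04T12:00:00\n"
    "Creature seen crossing road,Class A,45.3,-122.0,2010-10-31T21:30:00\n"
)

sightings = load_sightings(SAMPLE)
print(len(sightings), sightings[0]["classification"])
```

Every field other than the timestamp stays a string exactly as it was read. A row with a malformed timestamp would raise a ValueError here, which is where a conditional in the comprehension would come in.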
Initializing the app itself is basically the same as Flask.
The “front-end” components of the app - the ones that get turned into the DOM in the browser - live in the app’s layout.
Because I’m a total amateur at CSS I’m going to use Bootstrap’s grid system.
If you want to know the details of how Bootstrap’s grid system works, the documentation is excellent. Basically the only part that isn’t self-explanatory in this code is that each row is 12 units across, and the different column classes subdivide it.
At the moment, the layout doesn’t actually have anything in it - I’ll need to write a function that makes the map and modify the layout slightly once it’s done.
One of the things that makes Plotly unique is its declarative plot building semantics.
The plotly.graph_objs module has wrapper classes for all of the plots, but honestly I think it’s a lot simpler to build them as dictionaries.
For this app, each plot’s going to be built with a function that returns a dictionary that roughly looks like the following:
Each function is going to take a list of dictionaries and return the dictionary that builds the plot.
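As a rough illustration of that shape, here’s a skeleton with the two top-level keys every Plotly figure has, data and layout. The empty_plot name and the placeholder trace are hypothetical, just to show where things go.

```python
def empty_plot(title):
    """Return the skeleton that each plot function fills in."""
    return {
        # One dictionary per trace (a group of points, bars, and so on).
        "data": [
            {"type": "scatter", "x": [], "y": [], "name": "placeholder trace"}
        ],
        # Plot-wide options: title, axes, margins, and so on.
        "layout": {"title": title},
    }


figure = empty_plot("Sightings over time")
print(sorted(figure))
```

Because it’s all plain dictionaries and lists, the result can be inspected, tested, or tweaked like any other Python data before it’s handed to a graph component.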
Before we build that function we need to add our Mapbox key to the environment.
My favorite way of doing this is to use the python-dotenv module and put it in a .env file. The .env file should not be in version control.
I’ll put the Mapbox key in the .env file as follows:
Then, in the app (preferably near the top before any functions are built):
Now we’re ready to make the plot building function.
It’s a bit long, but it’s pretty simple.
The only real action is basically a groupby - the rest is just populating the dict that will define the plot.
The data is a list of three dictionaries - in Plotly parlance these are called “traces”.
Essentially each one defines a group of points. In our case the grouping is done by the classification - Class A (direct sighting), Class B (indirect sighting), or Class C (hearsay, basically).
Each trace gets its own color assigned to it by Plotly.
The type is also important: "scattermapbox" says this is a set of points projected onto a Mapbox tiling system.
It requires “lat” and “lon” instead of x and y, which we get from using pluck. That function walks over a sequence and applies some accessor (could be an index or a dictionary key) to each element, returning a lazy iterator. I composed list with pluck (pluck first, then turn the iterator into a list) because Plotly needs a list, then extracted the “latitude” and “longitude” fields from the sightings list with it.
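Here’s a small sketch of that composition with toolz. The listpluck name and the two sample sightings are made up for illustration.

```python
from toolz import compose, pluck

# compose applies right to left: pluck runs first, then list() realizes it.
listpluck = compose(list, pluck)

# Hypothetical sightings, shaped like the list of dictionaries the app uses.
sightings = [
    {"title": "Tracks near creek", "latitude": 47.1, "longitude": -121.5},
    {"title": "Howls at night", "latitude": 45.3, "longitude": -122.0},
]

lats = listpluck("latitude", sightings)
lons = listpluck("longitude", sightings)
print(lats, lons)
```

A list comprehension would do the same job; the composed version just reads nicely when the same extraction is repeated for several fields.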
I also added the title to the text field so we can see the titles when we hover over the points.
The Mapbox-specific options are in the layout.
As expected, the data structure described in the Plotly.js docs maps perfectly to the Python dictionary version - no need for objects.
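Putting those pieces together, a function along these lines would do it. This is a sketch rather than the exact code from the app: the bigfoot_map name, the marker settings, the map center and zoom, and passing the Mapbox token as an argument are all my assumptions.

```python
from toolz import compose, groupby, pluck

listpluck = compose(list, pluck)


def bigfoot_map(sightings, mapbox_token):
    """Build the map figure: one scattermapbox trace per classification."""
    # The groupby: {"Class A": [...], "Class B": [...], ...}
    classifications = groupby("classification", sightings)
    return {
        "data": [
            {
                "type": "scattermapbox",
                "lat": listpluck("latitude", class_sightings),
                "lon": listpluck("longitude", class_sightings),
                "text": listpluck("title", class_sightings),
                "mode": "markers",
                "name": classification,
                "marker": {"size": 3, "opacity": 1.0},
            }
            for classification, class_sightings in classifications.items()
        ],
        "layout": {
            "autosize": True,
            "hovermode": "closest",
            # Mapbox-specific options live under layout.mapbox.
            "mapbox": {
                "accesstoken": mapbox_token,
                "center": {"lat": 40, "lon": -98.5},
                "zoom": 2,
                "style": "outdoors",
            },
        },
    }


# Hypothetical sample data and a placeholder token, just to exercise it.
sample = [
    {"title": "Creature on road", "classification": "Class A",
     "latitude": 45.3, "longitude": -122.0},
    {"title": "Tracks near creek", "classification": "Class B",
     "latitude": 47.1, "longitude": -121.5},
    {"title": "Seen at dusk", "classification": "Class A",
     "latitude": 46.0, "longitude": -120.2},
]
figure = bigfoot_map(sample, mapbox_token="placeholder-token")
print([trace["name"] for trace in figure["data"]])
```

Taking the token as a parameter keeps the function free of environment lookups, which makes it easy to call from a test or a notebook.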
Now that we’ve got our function we need to wire that into the actual layout. Going back to the layout, we just need to add the call:
We’ve got everything we need: data, layout and plot. The final piece to get this thing up and running is to start the server.
After that, start it up in the shell:
You should see that it starts a server at
If you go there, you’ll see
One step closer to making that million.
Conclusion, Part 1
In this post I walked through the high level structure of a Dash app. We read in the data, built a layout, created a plot-returning function for the map, and constructed the server. We’ve still got a ways to go - check out the upcoming part 2 for a more sophisticated layout with additional plots and the upcoming part 3 for an interactive search bar.
If you’re impatient, the full source code for everything is already on GitHub. There’s also a running example on Heroku: https://bigfoot-sightings-dash.herokuapp.com/. The app is running on Heroku’s free tier, so there’s no guarantee it will even be up (if my free hours run out). If it is up, it might take a minute to load if the server has to start from a sleep state.
Head on over to part 2 and check out more plots with a more interesting layout.