Back in the day, it was fun coloring blank pictures. When it comes to Python, I didn’t think it would be as fun. After all, building bar graphs and line graphs seems so convenient and easy (see my visualizations article here). Building geographic graphs seemed a little more intimidating, and up until now I did not have a reason to do it. That makes this the perfect time to learn: the stakes are low, so there is little to lose by experimenting.
I like to say that I am one of those people who love to get a new technology and experiment with it first. Afterward, I like to read the documentation.
Before we begin, we need to find some data to plot. I found data from Kaggle that shows the percentage of women in the labor force for about 50 countries. Now, let’s define what we want to do: make a heat map of the world, with each country shaded by its value in our data. To do this, we will be using geopandas, pandas, and matplotlib. We are going to skip the data cleaning part and head right into the fun part. However, you can see the whole code here, specifically in the visuals Jupyter notebook.
Before we get started, we need to tell our map which polygons to draw for each country. Thankfully, geopandas already ships with low-resolution polygons for the world map (in versions before 1.0, which removed the bundled datasets), so I opened the file in the following way.
import geopandas as gpd
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
From here, we need to pair the polygons with our data. We can use the geopandas merge() method to do this.
geospacial_df = world.merge(data_coordinates, left_on='name', right_on='Country')
The first argument is the data frame built from our Kaggle data. left_on is the name of the column to merge on in the world data frame, and right_on is the name of the column to merge on in our Kaggle data frame. By default, merge() performs an inner join, so only countries whose names match in both data frames end up in the result.
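Because the merge matches on country names, a spelling difference between the two sources silently drops a country. The sketch below shows one way to spot such mismatches using plain pandas. The frames `world_names` and `data` are hypothetical stand-ins for the real ones, the country spellings are only illustrative, and the rates are placeholders rather than real figures.

```python
import pandas as pd

# Hypothetical stand-in for the `world` frame: only the name column matters here.
world_names = pd.DataFrame(
    {"name": ["Canada", "United States of America", "Mexico"]}
)

# Hypothetical stand-in for the Kaggle frame; the rates are placeholders.
data = pd.DataFrame(
    {
        "Country": ["Canada", "United States", "Mexico"],
        "Female Labor Force Participation Rate": [10.0, 20.0, 30.0],
    }
)

# The default inner join keeps only names present in both frames.
inner = world_names.merge(data, left_on="name", right_on="Country")

# A left join keeps every map row, and indicator=True adds a `_merge`
# column that says whether each row found a partner.
checked = world_names.merge(
    data, left_on="name", right_on="Country", how="left", indicator=True
)
unmatched = checked.loc[checked["_merge"] == "left_only", "name"].tolist()
print(unmatched)
```

Any name that shows up in `unmatched` can then be renamed in one of the frames before running the real merge.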
Plotting our data is easy. We do the following:
geospacial_df.plot(column="Female Labor Force Participation Rate", cmap="Blues")
We use the plot() method. The column argument specifies which column’s values determine each country’s color, and the cmap argument specifies which matplotlib colormap to use. We’ll get the following plot.
The plot is okay. We can see that we indeed built the heat map for our values, but it’s hard to understand what we’re looking at. We need to make the map bigger, give more context, and clean it up a bit. We can do this with the following:
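import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(20, 20))
world.plot(ax=ax, color="lightgrey")  # background: the whole world map (the grey color is an assumed choice)
geospacial_df.plot(ax=ax, column="Female Labor Force Participation Rate", cmap="Blues")
ax.tick_params(labelbottom=False, labelleft=False)  # remove the tick labels on both axes
ax.set_title("Women in Labor Force")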
First, we create a figure and axes with plt.subplots(). We do this so we can plot the rest of the world map on the same axes, which helps locate the countries we do have values for. We also change the figure size to 20 by 20. Then, we remove the labels on both axes and add a title. This should help give context to our graph. We end up with the following:
That’s it! Congratulations, we just made our first geographical map.
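One optional addition to the plot() call is a legend, so readers can map a shade back to a number. The sketch below is a minimal illustration on toy data, not the map from this article: the three square "countries", the `rate` column, and its values are all made up. It assumes geopandas, shapely, and matplotlib are installed.

```python
import geopandas as gpd
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
from shapely.geometry import box

# Three made-up square "countries" with placeholder rates (not real data).
toy = gpd.GeoDataFrame(
    {"name": ["A", "B", "C"], "rate": [10.0, 20.0, 30.0]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

# legend=True asks geopandas to draw a colorbar next to the map.
ax = toy.plot(column="rate", cmap="Blues", legend=True)
ax.set_title("Toy map with a legend")
```

The same legend=True argument can be passed to the plot() call on the merged data frame.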
I would suggest pairing this map with a bar graph. Humans are bad at telling color shades apart, so a bar graph makes the exact values easier to read. We can show the top 10 values, or as many or as few as we need.
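As a rough sketch of that companion bar graph, the snippet below picks the ten largest values with pandas and draws them as horizontal bars. The `data` frame here is a hypothetical stand-in with placeholder country names and rates, so swap in the real Kaggle frame and its column names.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

rate_col = "Female Labor Force Participation Rate"

# Hypothetical stand-in data: 12 placeholder countries, made-up rates.
data = pd.DataFrame(
    {
        "Country": [f"Country {i}" for i in range(1, 13)],
        rate_col: [float(v) for v in range(5, 65, 5)],
    }
)

# Keep the 10 rows with the highest rate, sorted from largest to smallest.
top10 = data.nlargest(10, rate_col)

fig, ax = plt.subplots(figsize=(8, 6))
ax.barh(top10["Country"], top10[rate_col])
ax.invert_yaxis()  # put the largest value at the top
ax.set_xlabel(rate_col)
ax.set_title("Top 10 countries")
```

Changing the first argument of nlargest() controls how many bars are shown.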
The Wrap Up
Geographical graphs are really nice to look at. They help visualize data that is based on location, but it is nice to have another visual to get the exact values for a location. Keep in mind that we can graph any location this way: regions within a country, regions within a continent, or any combination. Good luck on your graphing journey!