Kernel: Python 3 (Anaconda 2022)

Import bokeh and set up the Jupyter notebook environment.

Drop Alaska and Hawaii.


from os.path import dirname, join, realpath
import sys as sys
import pandas as pd
import math
import partisan_lean as pl
from bokeh.models import ColorBar, ColumnDataSource, Label, LinearColorMapper
from bokeh.palettes import RdBu
#from bokeh.palettes import all_palettes
#import colorcet as cc
from bokeh.plotting import figure, show, output_file
from bokeh.sampledata.us_states import data
from bokeh.io import output_notebook, show, save

output_file("US_map.html")
output_notebook()

states = pd.DataFrame.from_dict(data, orient="index")
states.dropna(inplace=True)  # doesn't help
states.loc['AK',:]
states.drop(['AK','HI'],inplace=True)

Read the U.S. Census data from:

https://www.census.gov/data/tables/time-series/demo/popest/2020s-state-total.html


import numpy as np

fn = "NST-EST2022-POP.csv"
state_pop = pd.read_csv(fn, header=4, skiprows=[0,1,2,3], skipfooter=7, engine='python',
                        usecols=[0,4], names=['name','pop_2022'],)

def drop_dot(row):
    # drop the first character of the name (the leading dot)
    row['name'] = row['name'][1:]
    #row['pop_2022'] = int(row['pop_2022'].replace(",",""))/400000
    return row

print(type(state_pop))
state_pop = state_pop.apply(drop_dot, axis=1)
state_pop.head(5)
<class 'pandas.core.frame.DataFrame'>
         name    pop_2022
0     Alabama   5,074,296
1      Alaska     733,583
2     Arizona   7,359,197
3    Arkansas   3,045,637
4  California  39,029,342
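
The population column arrives as strings with thousands separators, and the names carry a one-character prefix that drop_dot removes. As a minimal sketch, the same cleanup can be done with vectorized string methods. The two-row frame below is a made-up stand-in for the Census file (the leading "." is an assumption based on the function name), and the pop_int column name is my own.

```python
import pandas as pd

# Toy stand-in for the Census table; the leading "." is assumed from drop_dot's name.
raw = pd.DataFrame({"name": [".Alabama", ".Alaska"],
                    "pop_2022": ["5,074,296", "733,583"]})

clean = raw.copy()
clean["name"] = clean["name"].str[1:]  # drop the first character of every name
# Remove the thousands separators, then convert to integers (hypothetical column name).
clean["pop_int"] = clean["pop_2022"].str.replace(",", "", regex=False).astype(int)
print(clean)
```

Keeping the original string column alongside the integer one leaves the tooltip text unchanged while making the number available for arithmetic.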

"states" is the primary dataframe of the program. We're going to put all the data inside this one dataframe because, as we add more data and compute state centers, the Pandas dataframe will keep everything aligned on state names.

Add the center of each state to the dataframe.


def gen_center(row):
    # average the boundary vertices of the state outline
    lats = pd.Series(row['lats']).interpolate(method='linear').to_list()
    lons = pd.Series(row['lons']).interpolate(method='linear').to_list()
    lat = sum(lats)/len(lats)
    lon = sum(lons)/len(lons)
    row['center_lat'] = lat
    row['center_lon'] = lon
    return row

states = states.apply(gen_center,1)
states.head(2)
name region lats lons center_lat center_lon
NV Nevada Southwest [40.68928, 40.4958, 40.30302, 40.09896, 39.999... [-114.04392, -114.04558, -114.04619, -114.0464... 39.481429 -116.100247
AZ Arizona Southwest [34.87057, 35.00186, 35.00332, 35.07971, 35.11... [-114.63332, -114.63349, -114.63423, -114.6089... 33.707470 -112.636278
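
Note that gen_center averages the boundary vertices, so the result is a vertex mean rather than a true area centroid. The sketch below is a different technique from the one above: instead of interpolating, it skips missing values outright, on the assumption that the interpolation is there to fill NaN separators between the pieces of a multi-part outline. The vertex_mean helper and the toy coordinates are hypothetical.

```python
import math

def vertex_mean(values):
    """Average the finite entries of a coordinate list, skipping NaN separators."""
    finite = [v for v in values if not math.isnan(v)]
    return sum(finite) / len(finite)

# Hypothetical outline: a unit square plus a second piece after a NaN separator.
lats = [0.0, 0.0, 1.0, 1.0, float("nan"), 3.0, 3.0]
lons = [0.0, 1.0, 1.0, 0.0, float("nan"), 5.0, 6.0]

center = (vertex_mean(lats), vertex_mean(lons))
print(center)
```

Either way, densely sampled stretches of border pull the center toward themselves, which is good enough for placing a circle but not for precise geography.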

Calculate the partisan lean of each state.

I've downloaded FiveThirtyEight's partisan lean metric from its "How Red Or Blue Is Your State?" page, specifically the 2020 values.

Merge the partisan lean data and the population data into states.

This is where the Pandas dataframe really helps.


lean_df = pl.compute_lean("2020_partisan_lean.csv")
states = pd.merge(left=states, right=lean_df, left_on="name", right_on="state")
states = pd.merge(left=states, right=state_pop, left_on="name", right_on="name")
states.head(5)
name region lats lons center_lat center_lon state lean_2020 Code Party Dot Color pop_2022
0 Nevada Southwest [40.68928, 40.4958, 40.30302, 40.09896, 39.999... [-114.04392, -114.04558, -114.04619, -114.0464... 39.481429 -116.100247 Nevada 1 NV Republican red 3,177,772
1 Arizona Southwest [34.87057, 35.00186, 35.00332, 35.07971, 35.11... [-114.63332, -114.63349, -114.63423, -114.6089... 33.707470 -112.636278 Arizona 8 AZ Republican red 7,359,197
2 Wisconsin Central [42.49273, 42.49433, 42.49562, 42.49561, 42.49... [-87.8156, -87.93137, -88.10268, -88.20645, -8... 44.148969 -90.290940 Wisconsin 3 WI Republican red 5,892,539
3 Georgia Southeast [32.29667, 32.24425, 32.09197, 32.03256, 32.02... [-81.12387, -81.15654, -81.02071, -80.75203, -... 32.456013 -82.620032 Georgia 13 GA Republican red 10,912,876
4 Kansas Central [36.99927, 36.99879, 36.99914, 36.99903, 36.99... [-96.28415, -96.55381, -96.91244, -97.1197, -9... 38.554094 -98.310560 Kansas 21 KS Republican red 2,937,150
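
pd.merge performs an inner join by default, so a state whose name is missing or spelled differently in one table drops out of the result without warning. A minimal way to check for that, sketched on toy frames, is a left join with indicator=True. The missing "District of Columbia" row is made up for illustration.

```python
import pandas as pd

# Toy frames; the absent "District of Columbia" lean row is a made-up example.
left = pd.DataFrame({"name": ["Nevada", "Arizona", "District of Columbia"]})
right = pd.DataFrame({"state": ["Nevada", "Arizona"], "lean_2020": [1, 8]})

# A left join keeps every row of `left`; the _merge column records where each row matched.
checked = pd.merge(left, right, left_on="name", right_on="state",
                   how="left", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "name"].tolist()
print(unmatched)
```

An empty list means every state found a partner, and the plain inner merge loses nothing.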

Bokeh needs its data in the form of a Python dictionary where the data is in lists.

Build it out of "states".

It is called "source_data".

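I also prepare circ_diam, which makes the circle diameter linearly proportional to state population. (The commented-out line is a square-root alternative that would make the circle area proportional instead.)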
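
To make the difference between the two scalings concrete, here is a small sketch. The function names are my own; diam_sqrt mirrors the square-root variant, with math.pi in place of 3.14.

```python
import math

def diam_linear(pop, scale=500_000):
    """Diameter proportional to population (the scaling used for circ_diam)."""
    return pop / scale

def diam_sqrt(pop, scale=50):
    """Diameter proportional to sqrt(population), so circle area tracks population."""
    return math.sqrt(pop / math.pi) / scale

# 2022 populations of Nevada and California from the Census table.
nevada, california = 3_177_772, 39_029_342
print(diam_linear(nevada), diam_linear(california))
print(diam_sqrt(nevada), diam_sqrt(california))
```

With the linear scaling, California's circle is about 12 times as wide as Nevada's, so it covers roughly 150 times the area. With the square-root scaling the areas themselves stand in the 12-to-1 population ratio.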


def str_to_int(row):
    row['circ_diam'] = int(row['pop_2022'].replace(',', ''))/500000
    #row['circ_diam'] = math.sqrt(int(row['pop_2022'].replace(',', ''))/3.14)/50
    return row

states = states.apply(str_to_int, axis=1)

source_data =dict(
    x = states['lons'].to_list(),
    y = states['lats'].to_list(),
    name = states['name'].to_list(),
    index=states['lean_2020'].to_list(),
    x_circ = states['center_lon'].to_list(),
    y_circ = states['center_lat'].to_list(),
    pop = states['pop_2022'].to_list(),
    circ_diam = states['circ_diam'].to_list()
)
#source_data['x']
print(states.head(1))
     name     region                                               lats  \
0  Nevada  Southwest  [40.68928, 40.4958, 40.30302, 40.09896, 39.999...

                                                lons  center_lat  center_lon  \
0  [-114.04392, -114.04558, -114.04619, -114.0464...   39.481429 -116.100247

    state  lean_2020 Code       Party Dot Color   pop_2022  circ_diam
0  Nevada          1   NV  Republican       red  3,177,772   6.355544
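
A dictionary of equal-length lists can also be produced in one step with DataFrame.to_dict(orient="list"). This is a sketch on a two-row toy frame, not a replacement for the cell above. The renaming step is my own choice, so that the keys match the names the glyphs reference.

```python
import pandas as pd

# Two rows copied from the states table, for illustration only.
df = pd.DataFrame({"name": ["Nevada", "Arizona"],
                   "center_lat": [39.481429, 33.707470],
                   "center_lon": [-116.100247, -112.636278]})

# Rename while selecting so the dictionary keys match the glyph arguments.
source = (df.rename(columns={"center_lon": "x_circ", "center_lat": "y_circ"})
            [["name", "x_circ", "y_circ"]]
            .to_dict(orient="list"))
print(source)
```

Every list has one entry per state, in the same order, which is what lets each glyph line up with its tooltip values.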

Build the Bokeh map.

The process of building a Bokeh visualization is creating a figure, then adding items to it. In this case, I add:

  • a Label of arbitrary text,

  • the states as Bokeh Patches, with their fill_color values set from the partisan lean,

  • the circles denoting population size,

  • and the color bar to indicate the color range and meaning on the choropleth.


#print(type(data))
# map lean values from -45 to 45 linearly onto the 11-color RdBu palette
mapper = LinearColorMapper(palette=list(RdBu[11]), low=-45, high=45)

p2 = figure(
    width=1000, height=500,  # better aspect ratio
    background_fill_color="powderblue",  # zone around image
    #x_axis_location=None,  # remove axis
    x_range=(-130, -50),
    #y_axis_location=None,
    tooltips = [("Name", "@name"), ("Bias","@index"), ("Population", "@pop") ])

p2.title.text = "Political Bias and Population Size in U.S. States"
p2.title.align = "center"
p2.title.text_font_size = "21px"
p2.title.text_color = "#333344"
p2.grid.grid_line_color = 'White'

#https://docs.bokeh.org/en/latest/docs/user_guide/basic/annotations.html#labels
citation = Label(x=-128, y=27, x_units='data', y_units='data',
                 text='Circle diameter proportional to state population.\nState color proportional to state political bias.',)
                 #border_line_color='black', background_fill_color='white')
p2.add_layout(citation)

# state outlines, colored by partisan lean
p2.patches('x','y', source=source_data, line_color='white',
           fill_color = dict(field="index", transform=mapper) )

# unfilled circles, sized by population
p2.circle(x='x_circ', y='y_circ', source=source_data, color='black',
          size='circ_diam', fill_alpha=0.0)

color_bar = ColorBar(
    color_mapper=mapper, width = 200, location="bottom_right", orientation="horizontal",
    title="Democratic......Republican",
    title_text_font_size="16px", title_text_font_style="bold", title_text_color="black",
    major_label_text_color="black", background_fill_alpha=0.0)
p2.add_layout(color_bar)

save(p2, 'us_diameter.html')
#show(p2)
'/home/user/python/bokeh/us_diameter.html'
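
To get a feel for how a linear color mapper assigns lean values to the 11 palette entries, here is a pure-Python approximation. It assumes equal-width bins between low and high, with out-of-range values clamped to the end colors. It is a sketch of the idea, not Bokeh's actual implementation, and linear_bin is a hypothetical helper.

```python
import math

def linear_bin(value, low=-45, high=45, n_colors=11):
    """Index of the equal-width bin holding `value`, clamped to the palette ends."""
    frac = (value - low) / (high - low)   # 0.0 at low, 1.0 at high
    idx = math.floor(frac * n_colors)     # equal-width bins (an assumption)
    return max(0, min(n_colors - 1, idx))

# Lean values taken from the merged table above, plus two out-of-range extremes.
for lean in (1, 8, 13, 21, 45, -60):
    print(lean, linear_bin(lean))
```

Under this assumption a lean of 1 falls in the middle bin (index 5), while 21 falls in index 8, three steps toward the high end of the palette.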
