This is part of my research effort to find a home during this Apr/May period. Before that I never know there are 101 cities around San Francisco Bay Area …
I think this median sale price is the most important affordability factor to consider of an area. And here is the median sale price by city - data at subdistrict level may represent a community more precisely than at city level…
Pretty obviously, there are 2 centers in this “universe”, San Francisco and Palo Alto ;)
If you are more interested about exploring the data, you can stop here and there are quite a few websites that have pretty UI. and here are a few of them:
Following is my experience to craft the heat map…
From the map we can see, we need the geography data and the housing price data. Thanks for the free and openness of America society, all these data can be obtained online without cost.
The city boundary data is from census.gov TIGER, there you can find geo map data at all administrative levels, County Subdivision, Elementary School District, …
For census data on top of these map, American Community Survey (ACS) has many interesting facts. I think this is the destination when the survey you filled when applying a Driver License or enrolling your child for the elementary school.
Bay Area is not a administrative area, I find some area map from SF OpenData
OpenStreetMap data can be extracted to work offline with metro extracts
There are housing price data from the government such as fhfa.org But I find zillow data is much more easier to work with - arguably, it is not that accurate.
Libaries I’ve used and recommended,
Fiona reads and writes spatial data files
Performs cartographic transformations and geodetic computations.
Geometric objects, predicates, and operations
Use geometric objects as matplotlib paths and patches
Most of the geo data set either is in shapefile format or can be easily transform to the format. So if you model this process with Input/Process/Output, then you use Fiona to load the shapefile Input into memory; transform the coordination if multiple data source use different ones; process it with Shapely; then present the Output by transforming the geo objects to patches on matplotlib Axis with descartes.
If you don’t need to project the data on a earth globe, you don’t really need Basemap, Although you will find a lot hits with Google search. QGIS is the way to go if you need a full featured GIS system.
When you got a shp file and you want to have a quick view what it looks like, you can load it to QGIS, or use tool like geovis.
But you can also write some code in ipython notebook and here it is.
import fiona
from shapely import geometry
from descartes.patch import PolygonPatch#, PathPatch
from matplotlib import cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt
def plot_map(ax, source):
norm = colors.Normalize(vmin=1, vmax=2*len(source))
sm = cm.ScalarMappable(norm, cmap=cm.Paired)
alpha, cnt = .8, 1
for r in source:
edge_color, color = sm.to_rgba(cnt), sm.to_rgba(cnt+1)
cnt += 2
shape = geometry.shape(r["geometry"])
if isinstance(shape, geometry.Polygon):
poly = shape
ax.add_patch(PolygonPatch(poly, fc=color, ec=edge_color, alpha=alpha, zorder=1))
elif isinstance(shape, geometry.MultiPolygon):
for poly in shape:
ax.add_patch(PolygonPatch(poly, fc=color, ec=edge_color, alpha=alpha, zorder=1))
elif isinstance(shape, geometry.LineString):
ax.plot(*shape.xy, color='gray', linewidth=3, solid_capstyle='round', zorder=1)
elif isinstance(shape, geometry.MultiLineString):
for line in shape:
ax.plot(*line.xy, color='gray', linewidth=3, solid_capstyle='round', zorder=1)
else:
print r["geometry"]["type"]
raise Exception("?")
## main ##
plt.rcParams['figure.figsize'] = (9, 16)
fig = plt.figure()
ax = fig.gca()
source_shapefile = "bayarea.map/city_land.shp"
with fiona.open(source_shapefile) as source:
plot_map(ax, source)
ax.axis('scaled')
plt.show()
And github now rendering ipython notebook, you know? the above code rendering in github