Visualize large GeoJSONs in Databricks notebooks

See also Visualize large GeoJSONs in Google Colab.

You can use Folium, Kepler.gl, or Lonboard on Databricks, but all three run into the same issue when the dataset is too large: the widget won't load in the notebook, and you have to save the output as an HTML file instead. You can then download that file and open it locally. If the file is too large to download from Workspace files, save it to a Unity Catalog volume and download it from there.

# Folium
map.save("folium.html")

# Kepler.gl
map.save_to_html(file_name="kepler.html")

# Lonboard
map.to_html("lonboard.html")
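If the HTML file has to go to a Unity Catalog volume, the save calls above only need a different target path. The following is a minimal sketch of building such a path; the helper name `volume_path` and the catalog, schema, and volume names (`main`, `default`, `maps`) are hypothetical placeholders, so substitute the ones that exist in your workspace.

```python
from pathlib import PurePosixPath


def volume_path(catalog: str, schema: str, volume: str, filename: str) -> str:
    """Build a Unity Catalog volume path: /Volumes/<catalog>/<schema>/<volume>/<file>."""
    return str(PurePosixPath("/Volumes", catalog, schema, volume, filename))


# Hypothetical catalog, schema, and volume names, for illustration only
html_path = volume_path("main", "default", "maps", "folium.html")

# The path can then be passed to whichever save call applies, for example:
# map.save(html_path)                      # Folium
# map.save_to_html(file_name=html_path)    # Kepler.gl
# map.to_html(html_path)                   # Lonboard
```

Keeping the path construction in one place makes it easy to switch between a Workspace location and a volume without touching the individual save calls.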

Pydeck

Pydeck, however, is the one library here that can render datasets of roughly 100 MB or more directly in the notebook:

%pip install pydeck --quiet

# Run the install above first: on Databricks, %pip resets the notebook's Python state
import geopandas as gpd
import pydeck as pdk

# Data via https://catalog.data.gov/dataset/building-footprints-d97ff
LARGE_GEOJSON_URL = "https://opendata.dc.gov/api/download/v1/items/a657b34942564aa8b06f293cb0934cbd/geojson?layers=1"

# Load the dataset once to compute its bounding box
gdf = gpd.read_file(LARGE_GEOJSON_URL)

# total_bounds is ordered (minx, miny, maxx, maxy)
bounds = dict(
    zip(
        ["min_lon", "min_lat", "max_lon", "max_lat"],
        gdf.total_bounds,
    )
)

bbox_corners = [
    [bounds["min_lon"], bounds["min_lat"]],
    [bounds["max_lon"], bounds["max_lat"]],
]

view_state = pdk.data_utils.compute_view(bbox_corners)

geojson_layer = pdk.Layer(
    "GeoJsonLayer",
    LARGE_GEOJSON_URL,
    pickable=True,
    stroked=True,
    auto_highlight=True,
    get_fill_color=[200, 200, 255, 200],
    extruded=True,
    elevation_scale=0.005,
)

pdk.Deck(layers=[geojson_layer], initial_view_state=view_state)
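
The bounding box above comes from loading the whole file into GeoPandas. As an alternative, the same `min_lon` / `min_lat` / `max_lon` / `max_lat` dictionary can be derived from an already parsed GeoJSON `FeatureCollection` with the standard library alone. This is only a sketch: the helper names `iter_positions` and `geojson_bounds` and the sample coordinates are made up for illustration, and `GeometryCollection` geometries are skipped.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_positions(coords: Any) -> Iterator[Tuple[float, float]]:
    """Yield (lon, lat) pairs from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        # A single position: [lon, lat] or [lon, lat, elevation]
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def geojson_bounds(feature_collection: Dict[str, Any]) -> Dict[str, float]:
    """Return the bounding box of all features that have a 'coordinates' member."""
    lons, lats = [], []
    for feature in feature_collection["features"]:
        geometry = feature.get("geometry") or {}
        # GeometryCollection has no 'coordinates' member and is skipped here
        for lon, lat in iter_positions(geometry.get("coordinates", [])):
            lons.append(lon)
            lats.append(lat)
    # min()/max() raise ValueError if the collection contains no positions
    return {
        "min_lon": min(lons),
        "min_lat": min(lats),
        "max_lon": max(lons),
        "max_lat": max(lats),
    }


# Tiny made-up sample: one polygon and one point
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[
                    [-77.05, 38.88], [-77.00, 38.88], [-77.00, 38.92],
                    [-77.05, 38.92], [-77.05, 38.88],
                ]],
            },
        },
        {
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Point", "coordinates": [-76.95, 38.90]},
        },
    ],
}

sample_bounds = geojson_bounds(sample)
```

The resulting dictionary has the same keys as `bounds`, so it can be used to build `bbox_corners` in the same way.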