Detailed Methodology — Depave Fort Lauderdale
This document describes the Fort Lauderdale (FTL) implementation of the Depave pipeline in sufficient detail for a GIS analyst to reproduce or extend the analysis. Source code, configuration files, and the NYC pilot for comparison live in a single repository; paths below are relative to the repository root.
1. Scope & study area
The analysis covers the municipal boundary of the City of Fort Lauderdale, Florida (Broward County). All spatial operations are carried out in EPSG:2236 (NAD83 / Florida East, US survey feet), the Broward County engineering standard, which is a projected CRS suitable for planar measurements (area, distance, buffer) across the study area. Web exports are reprojected to EPSG:4326 for MapLibre / PMTiles consumption.
Tract-level summaries use the 2020 TIGER/Line census tracts filtered to Broward County (COUNTYFP = 011) and clipped to the municipal boundary. Tracts whose clipped area is less than 50% of their original extent are dropped as edge slivers. After filtering, 47 census tracts remain inside the city.
The study-area and location selection is configuration-driven (config/locations/fort_lauderdale.yaml, config/study_areas_ftl.yaml); the orchestrator dispatches on the DEPAVE_LOCATION environment variable so the same CLI runs either NYC or FTL.
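A minimal sketch of how dispatch on DEPAVE_LOCATION could resolve a location config. Only the environment variable and the config/locations/ directory come from the text above; the function name `location_config_path` and the `nyc` default are hypothetical.

```python
import os
from pathlib import Path


def location_config_path(repo_root=".", default="nyc"):
    """Return the location YAML path selected by DEPAVE_LOCATION.

    The default location name is an assumption made for illustration.
    """
    location = os.environ.get("DEPAVE_LOCATION", default)
    return Path(repo_root) / "config" / "locations" / f"{location}.yaml"
```

With DEPAVE_LOCATION=fort_lauderdale set, the call returns config/locations/fort_lauderdale.yaml relative to the repository root.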
2. Deviations from DepaveLA / DepaveNYC
The Depave methodology originates in DepaveLA's parcel-level impervious-cover analysis and was extended in a Jamaica, Queens pilot (DepaveNYC). Fort Lauderdale substitutes several inputs because NYC-specific city data do not exist here.
| Step | NYC (DepaveNYC) | Fort Lauderdale |
|---|---|---|
| Pavement source | NYC Land Cover 2017 (0.5 m, 8-class, expert-labeled) | NAIP 1 m 4-band imagery classified in-house by a self-supervised random forest (fallback: ESA WorldCover 10 m built-up class) |
| Core/non-core cut | Planimetric roadbed + sidewalk + parking polygons (DoITT) | Hybrid FDOT RCI measured widths (state roads) + OSM class-estimated widths (local roads) + OSM sidewalks; parking and service roads are non-core |
| Flood layer | NYC DEP Stormwater Flood Maps (UCI PRIMo 1D/2D hydraulic model) | DEM-derived topographic proxy — depression depth + D8 flow accumulation on 3DEP lidar DTM |
| Heat layer | NYC DOHMH Heat Vulnerability Index (ZCTA) | NOAA/CAPA 2021 afternoon surface-temperature polygons (mobile-traverse campaign) |
| Canopy layer | Land Cover 2017 tree-canopy class (raster) | University of Miami lidar-derived canopy polygons |
| Equity overlay | NYS DAC (state) + EJNYC (city) | CEJST v2.0 (federal Justice40 designation, archived mirror); EJScreen fetched when available |
| Processing CRS | EPSG:2263 (NAD83 / NY Long Island, ftUS) | EPSG:2236 (NAD83 / FL East, ftUS) |
Downstream stages (needs scoring, stacked composite, equity overlay, web export) are shared between locations and unmodified.
3. Data sources
All URLs resolve at acquire time (DEPAVE_LOCATION=fort_lauderdale python3 scripts/run_pipeline.py --stages acquire). Endpoints are taken verbatim from config/data_sources_ftl.yaml.
| Dataset | Resolution / vintage | Native CRS | Source |
|---|---|---|---|
| FTL municipal boundary | Polygon / current | EPSG:4326 | FTL ArcGIS REST, GeneralPurpose/gisdata MapServer layer 44 |
| Census tracts | 2020 TIGER/Line (all FL, filtered to Broward) | EPSG:4269 | Census Bureau (tl_2020_12_tract.zip) |
| Broward parcels (MapPLUTO analog) | BCPA tax roll / current | EPSG:2236 | Broward County GIS (maint.broward.org/gis/Parcels.zip); DOR use code field DOR_UC |
| NAIP imagery | 1 m, 4-band (R,G,B,NIR), latest FL vintage | EPSG:26917 (UTM 17N) | Microsoft Planetary Computer STAC collection naip |
| ESA WorldCover | 10 m, 2021, v200 | EPSG:4326 | Planetary Computer STAC collection esa-worldcover (classes: 10 trees, 50 built, 80 water, …) |
| Microsoft Global Buildings | Polygon / rolling | EPSG:4326 | MS Buildings quadkey-indexed GeoJSON |
| OSM core features | Current snapshot | EPSG:4326 | Overpass API — highway ways, footway=sidewalk, amenity=parking, aeroway=aerodrome, leisure=park |
| FDOT RCI | Weekly updated, polyline | EPSG:26917 | FDOT FTP (ftp.fdot.gov); fields SURWIDTH (ft), LANE_CNT. 5,172 segments for Broward. |
| FTL parks | Polygon, 157 parks | EPSG:4326 | FTL MapServer layer 62 (Park and Recreation Areas) |
| 3DEP lidar DTM | 1 m native, resampled to 10 pixel units | EPSG:26917 | Planetary Computer STAC 3dep-lidar-dtm |
| UMiami tree canopy | Polygon, lidar-derived | EPSG:4326 | AGOL owner:michael.hu_UMiami AND title:"Canopy_FortLauderdale" |
| Surface temperature | Afternoon surface T, 2021 | EPSG:4326 | Broward GeoHub — NOAA/CAPA mobile-traverse campaign |
| CEJST v2.0 | Tract-level, archived mirror | EPSG:4269 | Public Environmental Data Partners mirror; field SN_C |
| EJScreen 2024 | Block-group, optional | — | EPA gaftp.epa.gov (fetched if available) |
| FEMA NFHL | Regulatory zones | EPSG:4269 | FEMA — cross-reference only, not used for needs scoring |
4. Pavement classification (NAIP RF)
The self-supervised random-forest classifier lives in src/depave/process/classify_naip.py. It produces a binary uint8 GeoTIFF (1 = pavement, 0 = other) at the NAIP native resolution that downstream stages consume in place of the NYC land-cover raster.
Training-sample generation
Four classes are used: pavement=0, building=1, vegetation=2, water=3. Raw sample targets are 15,000 / 5,000 / 5,000 / 2,000 respectively; all classes are down-balanced to the smallest class cap (max 5,000 per class) before training.
- Pavement: interpolate points every ~3 m along OSM highway centerlines of class motorway, trunk, primary, secondary, tertiary, residential, unclassified. Service / driveway / footway are excluded from training (they often are not pavement, or are the very things we want to flag as non-core downstream).
- Buildings: random interior points inside Microsoft Building Footprints eroded by 2 m, sampled proportional to building area.
- Vegetation: stratified-random pixels with NDVI > 0.4 in the NAIP itself — self-supervised, guaranteed to be green.
- Water (optional): random pixels where the ESA WorldCover raster equals class 80, reprojected into the NAIP CRS.
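The pavement sampling step can be sketched as fixed-interval interpolation along a centerline. This is an illustrative, dependency-free version rather than the code in classify_naip.py: `points_along` is a hypothetical name, coordinates are assumed to be planar, and `spacing` is expressed in the same units as the coordinates.

```python
import math


def points_along(line, spacing):
    """Return (x, y) points every `spacing` units along a polyline.

    `line` is a list of (x, y) vertices; the first point is the first vertex.
    """
    points = []
    next_at = 0.0    # distance along the line at which the next point is due
    travelled = 0.0  # length of the segments already walked
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue  # skip duplicate vertices
        while next_at <= travelled + seg + 1e-9:
            t = min((next_at - travelled) / seg, 1.0)
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_at += spacing
        travelled += seg
    return points
```

A 10-unit straight segment sampled at spacing 3 yields points at 0, 3, 6, and 9 units from the start.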
Feature set
Each sample pixel is represented by a 7-dimensional feature vector: raw R, G, B, NIR reflectance plus three derived indices:
NDVI = (NIR - R) / (NIR + R + eps)
NDWI = (G - NIR) / (G + NIR + eps)
brightness = (R + G + B) / 3
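The three formulas translate directly into a per-pixel feature builder. The sketch below is scalar for readability (the pipeline presumably vectorizes this over arrays); the function name and the value of `eps` are assumptions.

```python
EPS = 1e-6  # assumed value; the source only names the symbol eps


def pixel_features(r, g, b, nir, eps=EPS):
    """Build the 7-dimensional feature vector for one pixel."""
    ndvi = (nir - r) / (nir + r + eps)
    ndwi = (g - nir) / (g + nir + eps)
    brightness = (r + g + b) / 3
    return [r, g, b, nir, ndvi, ndwi, brightness]
```

The order of the returned features (four bands, then NDVI, NDWI, brightness) is illustrative; any fixed order works as long as training and prediction agree.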
Pavement-sample filtering
OSM centerlines occasionally pass under tree canopy (shaded road), across shadowed bridges, or through edge pixels mixed with vegetation. We apply the following filters in sequence before training:
- Feature-space thresholds: keep samples with NDVI ≤ 0.15, NDWI ≤ 0.10, and brightness ≥ 5th percentile (drops deep shadow).
- Mahalanobis outlier removal: compute the Mahalanobis distance of each remaining sample from the class centroid using the empirical covariance matrix plus a small ridge (+ 1e-4 · I), and drop points with d² > χ²(0.975, 7 dof) ≈ 16.01. An initial implementation with sklearn.covariance.MinCovDet (support_fraction = 0.5) collapsed to a near-singular sub-sample on already-filtered inputs and dropped ~100% of points; empirical covariance was adequate because the previous NDVI/NDWI/shadow filters had already removed gross outliers. A safety clamp prevents the filter from dropping more than 25% of input points; if it would, the threshold falls back to the 90th percentile of the distances.
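The safety clamp on the outlier filter can be sketched as follows. The sketch takes squared Mahalanobis distances as already computed and leaves the chi-square cutoff as a parameter; the function names are hypothetical, and applying the 90th-percentile fallback to the squared distances is an assumption.

```python
def percentile(values, q):
    """Linearly interpolated percentile (q in 0..100) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def outlier_keep_mask(d2, chi2_cutoff, max_drop=0.25, fallback_q=90):
    """Return a keep/drop mask for squared Mahalanobis distances `d2`.

    If the chi-square cutoff would drop more than `max_drop` of the points,
    the cutoff is replaced by the `fallback_q` percentile of `d2`.
    """
    keep = [d <= chi2_cutoff for d in d2]
    if keep.count(False) / len(d2) > max_drop:
        cutoff = percentile(d2, fallback_q)
        keep = [d <= cutoff for d in d2]
    return keep
```

The clamp trades statistical purity for robustness: a mis-estimated covariance can no longer wipe out the training set, which is the failure mode described for the MinCovDet attempt.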
Classifier
sklearn.ensemble.RandomForestClassifier with:
- n_estimators = 100
- max_depth = 16
- class_weight = "balanced"
- n_jobs = -1, random_state = 42
Reported class-weighted train accuracy is approximately 98%; this is optimistic (same samples used for fit and score) but confirms the features separate the classes cleanly.
Prediction
The full NAIP raster is predicted tile-by-tile in 2048-pixel chunks to cap peak RAM. For each chunk, R/G/B/NIR are read, the 7-feature vector is built, valid pixels are predicted, and the binary pavement mask (class == 0) is written into the output array. The final raster is tiled, deflate-compressed, with 512-pixel internal block size.
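Chunked prediction relies on iterating fixed-size windows over the raster, with smaller windows at the right and bottom edges. A minimal sketch of that iteration, with a hypothetical function name and a (col_off, row_off, width, height) tuple layout chosen for illustration:

```python
def chunk_windows(width, height, size=2048):
    """Yield (col_off, row_off, win_width, win_height) windows covering a raster."""
    for row in range(0, height, size):
        for col in range(0, width, size):
            yield col, row, min(size, width - col), min(size, height - row)
```

Each window would be read, converted to features, predicted, and written back independently, so peak memory scales with the chunk size rather than the raster size.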
5. Post-classification refinement
Six refinements are applied (in order) before the classifier output is handed to the extraction stage:
- Morphological opening (3×3 structuring element) to remove single-pixel noise and tighten edges.
- Building subtraction: the MS Buildings layer is rasterized into the NAIP grid and any pavement pixel under a building is zeroed — a safety net against misclassified commercial rooftops (white TPO, concrete).
- Canopy road fill: OSM highway centerlines of a permissive class list (all of PAVEMENT_HWY_CLASSES plus *_link variants, living_street, service, busway) are buffered by per-class half-width and burned in as pavement, restoring continuity where tree canopy had hidden the road. Service roads are included here because they are pavement; they are simply classified as non-core downstream.
- Airport removal: OSM aeroway = aerodrome polygons are rasterized and pavement inside them is zeroed (applies to Fort Lauderdale Executive and any smaller airfields in the study area).
- Park removal: park polygons from both OSM (leisure=park, landuse=recreation_ground) and the FTL official parks layer (MapServer layer 62, 157 polygons) are rasterized and pavement inside them is zeroed (~99 acres removed). Depaving park pavement is outside the scope of this analysis.
- Study-area mask: pavement outside the FTL municipal boundary is zeroed.
Per-class canopy-fill half-widths (in US feet) are:
| OSM class | Half-width (ft) | OSM class | Half-width (ft) |
|---|---|---|---|
| motorway | 35.0 | tertiary | 18.0 |
| motorway_link | 22.0 | tertiary_link | 14.0 |
| trunk | 35.0 | residential | 14.0 |
| trunk_link | 22.0 | unclassified | 12.0 |
| primary | 30.0 | living_street | 12.0 |
| primary_link | 20.0 | service | 10.0 |
| secondary | 25.0 | busway | 18.0 |
| secondary_link | 18.0 | (default fallback) | 10.0 |
6. Core vs non-core classification
Implemented in src/depave/process/pavement_ftl.py. "Core" pavement is the pavement a city needs: travel lanes and sidewalks. "Non-core" is the pool of depave candidates, which in this implementation includes parking lots and service roads.
Core mask construction
The core mask uses a hybrid road-width strategy: FDOT Roadway Characteristics Inventory (RCI) provides surveyed SURWIDTH (surface width in feet) for 5,172 state and federal road segments in Broward County; OSM highway centerlines with class-based width estimates cover local/residential streets not in RCI. Both are rasterized into a unified mask.
- FDOT RCI state roads: each segment is buffered by SURWIDTH / 2 (measured half-width). Mean width 21 ft, range 2–72 ft. Covers arterials, collectors, and federal-aid routes, the roads where class-based estimates are least reliable.
- OSM highways (local roads): buffered symmetrically by a per-feature half-width. When the lanes tag is present, use max(lanes, 1) × 10 ft per side; otherwise use a per-class fallback. Service, track, driveway, footway, path, cycleway, bridleway, steps, and corridor are not part of the core mask; these are candidates for depaving.
- OSM sidewalks (footway=sidewalk) buffered by 4 ft on each side.
Parking (amenity=parking) is classified as non-core — parking lots are depave candidates. This differs from the original DepaveLA methodology where parking was core.
Per-class highway half-widths (ft) when lanes tag is missing:
| Class | Half-width (ft) | Class | Half-width (ft) |
|---|---|---|---|
| motorway | 30.0 | secondary_link | 15.0 |
| motorway_link | 20.0 | tertiary | 15.0 |
| trunk | 30.0 | tertiary_link | 12.0 |
| trunk_link | 20.0 | residential | 12.0 |
| primary | 25.0 | unclassified | 10.0 |
| primary_link | 18.0 | living_street | 10.0 |
| secondary | 20.0 | pedestrian | 8.0 |
| (default) | 10.0 | busway | 15.0 |
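The OSM half-width rule can be sketched as a lookup with a lanes override. The table values and the excluded classes are taken from this section; the function name, the use of None to signal "not core", and the fallback to the class table when the lanes tag cannot be parsed are assumptions.

```python
CORE_HALF_WIDTH_FT = {
    "motorway": 30.0, "motorway_link": 20.0, "trunk": 30.0, "trunk_link": 20.0,
    "primary": 25.0, "primary_link": 18.0, "secondary": 20.0, "secondary_link": 15.0,
    "tertiary": 15.0, "tertiary_link": 12.0, "residential": 12.0, "unclassified": 10.0,
    "living_street": 10.0, "pedestrian": 8.0, "busway": 15.0,
}
DEFAULT_HALF_WIDTH_FT = 10.0
NON_CORE_CLASSES = {
    "service", "track", "driveway", "footway", "path",
    "cycleway", "bridleway", "steps", "corridor",
}


def core_half_width_ft(highway, lanes=None):
    """Return the core-mask buffer half-width in feet, or None if the class is non-core."""
    if highway in NON_CORE_CLASSES:
        return None
    if lanes is not None:
        try:
            return max(int(lanes), 1) * 10.0
        except ValueError:
            pass  # unparseable tag such as "2;3": fall back to the class table
    return CORE_HALF_WIDTH_FT.get(highway, DEFAULT_HALF_WIDTH_FT)
```

Sidewalks (footway=sidewalk) are handled separately with the fixed 4 ft buffer, so excluding the generic footway class here does not remove them from the core mask.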
Raster-based classification
The core/non-core split is performed in raster space: the FDOT + OSM road/sidewalk buffers are rasterized at the NAIP grid resolution (1 m), then combined with the binary pavement mask to produce a 3-class raster (0 = no pavement, 1 = core, 2 = non-core). This avoids the expensive vector polygonization + spatial-cut operation (which took 30+ min on 47k polygons) and produces cleaner, pixel-aligned results. Airport and park exclusion zones are applied by zeroing pixels within those polygons.
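The raster combination reduces to a per-pixel rule on two aligned binary masks. The sketch below uses nested lists to stay dependency-free; the pipeline presumably operates on arrays, and the function name is hypothetical.

```python
def classify_pixels(pavement, core):
    """Combine aligned binary masks into 0 = no pavement, 1 = core, 2 = non-core."""
    return [
        [0 if not p else (1 if c else 2) for p, c in zip(prow, crow)]
        for prow, crow in zip(pavement, core)
    ]
```

Note that a core-mask pixel without pavement stays 0: the road buffers only label pixels the classifier already marked as pavement.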
7. Needs scoring
All four needs layers score census tracts (the clipped set of 47). Each returns a *_raw column and a min-max-normalized *_score column in [0, 1], where higher = more need.
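Min-max normalization maps the lowest raw value to 0 and the highest to 1. A minimal sketch; returning 0.0 for every tract when all raw values are equal is an assumption, since the source does not describe that edge case.

```python
def minmax(values):
    """Rescale values linearly to [0, 1]; constant input maps to 0.0 (assumed)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Because the minimum and maximum anchor the scale, one extreme tract compresses the scores of all the others.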
Heat
The 2021 NOAA/CAPA mobile-traverse dataset provides afternoon surface-temperature polygons. Each tract's score is the area-weighted mean of overlapping temperature polygons. Implemented via heat_layer.compute_heat_scores — the FTL pipeline renames the temperature column to hvi so the shared code path treats it as a heat indicator.
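The area-weighted mean can be written out for a single tract. The sketch assumes the intersection areas between the tract and each temperature polygon have already been computed; the function name and the (area, temperature) pair layout are illustrative.

```python
def area_weighted_mean(overlaps):
    """Area-weighted mean temperature for one tract.

    `overlaps` is a list of (intersection_area, temperature) pairs;
    returns None when no temperature polygon overlaps the tract.
    """
    total_area = sum(area for area, _ in overlaps)
    if total_area == 0:
        return None
    return sum(area * temp for area, temp in overlaps) / total_area
```

For example, a tract covered 75% by a 90-degree polygon and 25% by a 98-degree polygon receives a raw value of 92.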
Stormwater flood risk (DEM proxy)
Implemented in src/depave/process/stormwater_proxy.py. Pipeline:
- Clip the 3DEP DTM to the study area; mask nodata.
- Gaussian pre-smoothing (σ = 1.5 px) to erase tile-boundary micro-cliffs (sub-meter elevation offsets between adjacent lidar tiles would otherwise be interpreted as real sink edges).
- Priority-flood depression fill on the smoothed DEM (via richdem, with a pure-numpy fallback).
- Depression depth = filled_smoothed − original_dem, clipped to ≥ 0.
- D8 flow accumulation on the filled DEM; log-transformed (log1p) because the distribution is heavy-tailed.
- Both fields are percentile-rank-normalized to [0, 1], then combined as score = 0.6 × depression + 0.4 × flow_acc (depression weighted higher because ponding is a more direct pluvial signal than flow concentration on flat terrain).
- Tier breaks at the 50th / 75th / 90th percentile of non-zero composite pixels produce limited / moderate / extreme classes.
- Per-tier pixel masks are polygonized, denoised (buffer(1).buffer(-1).simplify(2)), and dissolved to one MultiPolygon per tier.
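The normalization, weighting, and tiering steps can be sketched on flat lists of pixel values. This is an illustration under stated assumptions, not the code in stormwater_proxy.py: the percentile-rank definition (share of other pixels strictly below), the treatment of pixels under the 50th-percentile break as untiered, and all function names are assumptions.

```python
import math
from bisect import bisect_left


def percentile_rank(values):
    """Rank-normalize to [0, 1]: share of the other values strictly below each value."""
    s = sorted(values)
    n = len(s)
    if n < 2:
        return [0.0 for _ in values]
    return [bisect_left(s, v) / (n - 1) for v in values]


def percentile(values, q):
    """Linearly interpolated percentile (q in 0..100) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def composite(depression_depth, flow_accumulation):
    """Weighted composite: 0.6 x depression rank + 0.4 x flow-accumulation rank."""
    dep = percentile_rank(depression_depth)
    # log1p is monotonic, so it does not change ranks; kept to mirror the steps above.
    acc = percentile_rank([math.log1p(a) for a in flow_accumulation])
    return [0.6 * d + 0.4 * a for d, a in zip(dep, acc)]


def tiers(scores):
    """Label each pixel None / limited / moderate / extreme from non-zero percentiles."""
    nonzero = [s for s in scores if s > 0]
    if not nonzero:
        return [None] * len(scores)
    p50, p75, p90 = (percentile(nonzero, q) for q in (50, 75, 90))
    labels = []
    for s in scores:
        if s <= 0 or s < p50:
            labels.append(None)
        elif s < p75:
            labels.append("limited")
        elif s < p90:
            labels.append("moderate")
        else:
            labels.append("extreme")
    return labels
```

Because the breaks are percentiles of the study area's own pixels, the tiers are relative: roughly the top half of non-zero pixels is always tiered, regardless of absolute depth.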
Per-tract flood score is the weighted sum of tier coverage:
flood_raw = 0.5 · pct_extreme + 0.3 · pct_moderate + 0.2 · pct_limited
flood_score = minmax(flood_raw)
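A worked instance of the weighted sum, assuming each pct_* value is the fraction of tract area covered by that tier (the function name is illustrative):

```python
def flood_raw(pct_extreme, pct_moderate, pct_limited):
    """Weighted tier coverage for one tract; inputs assumed to be area fractions."""
    return 0.5 * pct_extreme + 0.3 * pct_moderate + 0.2 * pct_limited
```

A tract with 10% extreme, 20% moderate, and 30% limited coverage scores 0.05 + 0.06 + 0.06 = 0.17 before normalization.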
Canopy deficit
FTL uses polygon-based canopy data (UMiami lidar-derived) rather than a raster class. The per-tract canopy percentage is intersection_area / tract_area; deficit is 1 − canopy_pct; the normalized score is a min-max rescaling across the tract set.
Pavement burden
Per-tract non-core pavement fraction = (tract ∩ non-core pavement union).area / tract.area, min-max normalized. Implemented in pavement_layer.compute_pavement_scores.
8. Stacked composite
Equal-weight mean of the four normalized scores:
stacked_score = mean(heat_score, flood_score, canopy_score, pavement_score)
Priority tracts are those at or above the 75th percentile of stacked_score (top quartile). A second column, n_high_needs, counts how many of the four dimensions a tract is simultaneously in the top quartile for — useful for context. Implemented in stacked_needs.py.
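The composite and the two derived columns can be sketched on plain dictionaries. The column names follow this document; the function names, the linear-interpolation percentile, and the use of a per-dimension 75th-percentile cutoff with "at or above" semantics for n_high_needs are assumptions.

```python
DIMENSIONS = ("heat_score", "flood_score", "canopy_score", "pavement_score")


def percentile(values, q):
    """Linearly interpolated percentile (q in 0..100) of a non-empty list."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def stack_needs(tracts):
    """Return tract dicts extended with stacked_score, is_priority, n_high_needs."""
    stacked = [sum(t[d] for d in DIMENSIONS) / len(DIMENSIONS) for t in tracts]
    cut = percentile(stacked, 75)
    dim_cuts = {d: percentile([t[d] for t in tracts], 75) for d in DIMENSIONS}
    rows = []
    for tract, score in zip(tracts, stacked):
        row = dict(tract)
        row["stacked_score"] = score
        row["is_priority"] = score >= cut
        row["n_high_needs"] = sum(tract[d] >= dim_cuts[d] for d in DIMENSIONS)
        rows.append(row)
    return rows
```

The two columns answer different questions: is_priority flags tracts with a high average, while n_high_needs shows whether that average comes from one extreme dimension or several elevated ones.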
9. Equity overlay
Implemented in equity_overlay.py, reused from the NYC pipeline. The FTL orchestrator passes the CEJST disadvantaged subset (where field SN_C == 1, from the v2.0 archived mirror) as the dac argument; the ejnyc argument is an empty GeoDataFrame. A tract is flagged is_dac if it overlaps the CEJST disadvantaged union by at least 1% of its area. Priority + DAC intersection is reported as the headline equity overlap, alongside aggregate statistics (non-core acreage in priority tracts, in DAC priority tracts, etc.), and written to data/fort_lauderdale/processed/summary_stats.json.
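The 1% overlap rule can be stated as a small predicate. The sketch assumes the intersection area between a tract and the CEJST disadvantaged union has already been computed; the function and parameter names are hypothetical.

```python
def is_dac(tract_area, overlap_area, min_share=0.01):
    """Flag a tract whose overlap with the disadvantaged union is at least min_share of its area."""
    return tract_area > 0 and overlap_area / tract_area >= min_share
```

A low threshold such as 1% mainly filters out sliver overlaps caused by boundary misalignment between tract vintages.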
10. Known limitations
- Concrete vs asphalt spectral overlap. Both are bright, low-NDVI, low-NDWI. The classifier treats them as the same class (pavement), which is appropriate for depaving scoping but means we cannot distinguish asphalt from concrete programmatically.
- Commercial rooftops. White TPO and bare concrete rooftops can look spectrally similar to pavement. Building-footprint subtraction catches most of these, but buildings missing from MS Buildings (very new construction, small accessory structures) may leak in as pavement.
- OSM completeness. Training-sample generation depends on OSM centerlines; the core mask uses both FDOT RCI (state roads) and OSM (local roads). Poorly mapped private drives or missing sidewalks may be misclassified. FDOT RCI covers only state and federal routes — local residential streets still rely on OSM class-based width estimates.
- Pluvial proxy ≠ hydraulic simulation. The stormwater proxy reflects where water would collect on the bare-earth surface. It does not account for storm drain capacity, pipe networks, tidal tailwater, or rainfall intensity. It is a hazard screen, not a flood model.
- Small sample size. 47 tracts is a small denominator for min-max normalization and percentile tiers. A single outlier tract can move scores noticeably. Results are best interpreted as relative rankings, not absolute magnitudes.
- CEJST archival status. The official geoplatform.gov CEJST endpoint was taken offline in January 2025; we mirror v2.0 from Public Environmental Data Partners. If the mirror structure changes, config/data_sources_ftl.yaml needs updating.
- NAIP vintage drift. NAIP flights are roughly every 2–3 years. Buildings constructed between the NAIP flight and the MS Buildings vintage may be classified as pavement and not subtracted.
11. Reproducibility
Full run from scratch
export DEPAVE_LOCATION=fort_lauderdale
python3 scripts/run_pipeline.py \
--study-area fort_lauderdale \
--stages all
Stage names (FTL): study_area, acquire, classify, pavement, needs, equity, export. Any comma-separated subset is accepted; --force re-downloads raw inputs where supported. The classify stage is gated on use_naip_classification: true in config/locations/fort_lauderdale.yaml; with the flag false, the pipeline falls back to ESA WorldCover class 50.
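A sketch of how a --stages value could be validated against the FTL stage list. The stage names come from the text above; the function name, the error type, and running the selected stages in pipeline order regardless of the order given are assumptions about behavior, not a description of run_pipeline.py.

```python
FTL_STAGES = ["study_area", "acquire", "classify", "pavement", "needs", "equity", "export"]


def parse_stages(arg):
    """Expand 'all' or a comma-separated subset into an ordered list of stage names."""
    if arg.strip() == "all":
        return list(FTL_STAGES)
    requested = [name.strip() for name in arg.split(",") if name.strip()]
    unknown = [name for name in requested if name not in FTL_STAGES]
    if unknown:
        raise ValueError(f"unknown stage(s): {', '.join(unknown)}")
    # Keep pipeline order so that dependencies run before the stages that need them.
    return [name for name in FTL_STAGES if name in requested]
```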
Runtime expectations
- acquire: network-bound, ~30–90 min depending on NAIP tile count and Overpass load. Tiles are cached under data/fort_lauderdale/raw/.
- classify: ~10–20 min on a recent Apple Silicon Mac (parallel RF predict in 2048-px chunks).
- pavement: ~3–8 min (polygonization + core/non-core cut over the full city).
- needs + equity + export: ~1–3 min combined.
Outputs
- data/fort_lauderdale/interim/naip_pavement.tif: binary pavement raster
- data/fort_lauderdale/interim/pavement_classified.gpkg: polygons with pavement_type ∈ {core, non-core}
- data/fort_lauderdale/interim/stormwater_proxy.gpkg: 3-tier polygons
- data/fort_lauderdale/processed/stacked_needs.gpkg: scored tracts
- data/fort_lauderdale/processed/equity_overlay.gpkg: scored tracts + is_dac, is_priority
- data/fort_lauderdale/processed/summary_stats.json: headline numbers
- data/fort_lauderdale/processed/*.geojson: web-ready layers (pavement, non-core, flood tiers, tracts, priority)
12. Credits & citation
Depave Fort Lauderdale is produced by ONE Architecture & Urbanism, Inc. The methodology extends the DepaveLA/DepaveNYC framework with open-data substitutions appropriate to a city without NYC-grade municipal GIS products.
Data providers: USGS 3DEP, USDA NAIP, Microsoft Planetary Computer, OpenStreetMap contributors, Microsoft Global Buildings, ESA WorldCover, NOAA/CAPA, University of Miami, Broward County GIS, City of Fort Lauderdale GIS, U.S. Census Bureau, EPA, and Public Environmental Data Partners (CEJST mirror). OpenStreetMap data are licensed under ODbL.
Suggested citation: Depave Fort Lauderdale: A Screening Analysis of Non-Core Pavement and Environmental Need. ONE Architecture & Urbanism, 2026.