Detailed Methodology — Depave Fort Lauderdale

This document describes the Fort Lauderdale (FTL) implementation of the Depave pipeline in sufficient detail for a GIS analyst to reproduce or extend the analysis. Source code, configuration files, and the NYC pilot for comparison live in a single repository; paths below are relative to the repository root.

Contents

Scope & study area
Deviations from DepaveLA / DepaveNYC
Data sources
Pavement classification (NAIP RF)
Post-classification refinement
Core vs non-core classification
Needs scoring
Stacked composite
Equity overlay
Known limitations
Reproducibility
Credits & citation

1. Scope & study area

The analysis covers the municipal boundary of the City of Fort Lauderdale, Florida (Broward County). All spatial operations are carried out in EPSG:2236 (NAD83 / Florida East, US survey feet), the Broward County engineering standard. As a projected CRS it supports planar measurement operations (area, distance, buffer) across the study area. Web exports are reprojected to EPSG:4326 for MapLibre / PMTiles consumption.

Tract-level summaries use the 2020 TIGER/Line census tracts filtered to Broward County (COUNTYFP = 011) and clipped to the municipal boundary. Tracts whose clipped area is less than 50% of their original extent are dropped as edge slivers. After filtering, 47 census tracts remain inside the city.
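
The sliver rule can be sketched in a few lines. This is a minimal illustration, not the pipeline's code: it assumes the original and clipped areas have already been computed in the same units, and the function name and tract IDs are hypothetical.

```python
def drop_edge_slivers(tracts, min_fraction=0.5):
    """Keep tracts whose clipped area is at least min_fraction of the original.

    tracts: iterable of (tract_id, original_area, clipped_area) tuples, with
    both areas in the same units (e.g. square feet in EPSG:2236).
    """
    kept = []
    for tract_id, original_area, clipped_area in tracts:
        if original_area <= 0:
            continue  # degenerate geometry, nothing to compare against
        if clipped_area / original_area >= min_fraction:
            kept.append(tract_id)
    return kept


# Hypothetical tracts, for illustration only.
example_tracts = [
    ("A", 1000.0, 1000.0),  # fully inside the city, kept
    ("B", 1000.0, 520.0),   # 52% inside, kept
    ("C", 1000.0, 180.0),   # 18% inside, dropped as an edge sliver
]
kept_ids = drop_edge_slivers(example_tracts)
```

A tract exactly at 50% is kept here because the text drops only tracts with "less than 50%" of their original extent.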

The study-area and location selection is configuration-driven (config/locations/fort_lauderdale.yaml, config/study_areas_ftl.yaml); the orchestrator dispatches on the DEPAVE_LOCATION environment variable so the same CLI runs either NYC or FTL.
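
As a rough sketch of environment-driven dispatch (not the orchestrator's actual code), the lookup might resemble the following. Only the fort_lauderdale path comes from this document; the registry name, the NYC path, and the default location are assumptions made for illustration.

```python
import os
from pathlib import Path

# Hypothetical registry. Only the fort_lauderdale path appears in this
# document; the nyc entry is an assumed placeholder.
LOCATION_CONFIGS = {
    "fort_lauderdale": Path("config/locations/fort_lauderdale.yaml"),
    "nyc": Path("config/locations/nyc.yaml"),  # assumed path
}


def resolve_location_config(env=None, default="nyc"):
    """Return (location_name, config_path) selected by DEPAVE_LOCATION."""
    env = os.environ if env is None else env
    name = env.get("DEPAVE_LOCATION", default).strip().lower()
    if name not in LOCATION_CONFIGS:
        known = ", ".join(sorted(LOCATION_CONFIGS))
        raise ValueError(
            f"Unknown DEPAVE_LOCATION {name!r}; expected one of: {known}"
        )
    return name, LOCATION_CONFIGS[name]
```

Failing loudly on an unknown location is a design choice of this sketch: a typo in the variable should not silently run the other city's pipeline.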

2. Deviations from DepaveLA / DepaveNYC

The Depave methodology originates in DepaveLA's parcel-level impervious-cover analysis and was extended in a Jamaica, Queens pilot (DepaveNYC). Fort Lauderdale substitutes several inputs because NYC-specific city data do not exist here.

Step	NYC (DepaveNYC)	Fort Lauderdale
Pavement source	NYC Land Cover 2017 (0.5 m, 8-class, expert-labeled)	NAIP 1 m 4-band imagery classified in-house by a self-supervised random forest (fallback: ESA WorldCover 10 m built-up class)
Core/non-core cut	Planimetric roadbed + sidewalk + parking polygons (DoITT)	Hybrid FDOT RCI measured widths (state roads) + OSM class-estimated widths (local roads) + OSM sidewalks; parking and service roads are non-core
Flood layer	NYC DEP Stormwater Flood Maps (UCI PRIMo 1D/2D hydraulic model)	DEM-derived topographic proxy — depression depth + D8 flow accumulation on 3DEP lidar DTM
Heat layer	NYC DOHMH Heat Vulnerability Index (ZCTA)	NOAA/CAPA 2021 afternoon surface-temperature polygons (mobile-traverse campaign)
Canopy layer	Land Cover 2017 tree-canopy class (raster)	University of Miami lidar-derived canopy polygons
Equity overlay	NYS DAC (state) + EJNYC (city)	CEJST v2.0 (federal Justice40 designation, archived mirror); EJScreen fetched when available
Processing CRS	EPSG:2263 (NAD83 / NY Long Island, ftUS)	EPSG:2236 (NAD83 / FL East, ftUS)

Downstream stages (needs scoring, stacked composite, equity overlay, web export) are shared between locations and unmodified.

3. Data sources

All URLs are resolved at acquire time (DEPAVE_LOCATION=fort_lauderdale python3 scripts/run_pipeline.py --stages acquire). Endpoints are taken verbatim from config/data_sources_ftl.yaml.

Dataset	Resolution / vintage	Native CRS	Source
FTL municipal boundary	Polygon / current	EPSG:4326	FTL ArcGIS REST, GeneralPurpose/gisdata MapServer layer 44
Census tracts	2020 TIGER/Line (all FL, filtered to Broward)	EPSG:4269	Census Bureau (`tl_2020_12_tract.zip`)
Broward parcels (MapPLUTO analog)	BCPA tax roll / current	EPSG:2236	Broward County GIS (`maint.broward.org/gis/Parcels.zip`); DOR use code field `DOR_UC`
NAIP imagery	1 m, 4-band (R,G,B,NIR), latest FL vintage	EPSG:26917 (UTM 17N)	Microsoft Planetary Computer STAC collection `naip`
ESA WorldCover	10 m, 2021, v200	EPSG:4326	Planetary Computer STAC collection `esa-worldcover` (classes: 10 trees, 50 built, 80 water, …)
Microsoft Global Buildings	Polygon / rolling	EPSG:4326	MS Buildings quadkey-indexed GeoJSON
OSM core features	Current snapshot	EPSG:4326	Overpass API — highway ways, `footway=sidewalk`, `amenity=parking`, `aeroway=aerodrome`, `leisure=park`
FDOT RCI	Weekly updated, polyline	EPSG:26917	FDOT FTP (`ftp.fdot.gov`); fields `SURWIDTH` (ft), `LANE_CNT`. 5,172 segments for Broward.
FTL parks	Polygon, 157 parks	EPSG:4326	FTL MapServer layer 62 (Park and Recreation Areas)
3DEP lidar DTM	1 m native, resampled to 10 pixel units	EPSG:26917	Planetary Computer STAC `3dep-lidar-dtm`
UMiami tree canopy	Polygon, lidar-derived	EPSG:4326	AGOL `owner:michael.hu_UMiami AND title:"Canopy_FortLauderdale"`
Surface temperature	Afternoon surface T, 2021	EPSG:4326	Broward GeoHub — NOAA/CAPA mobile-traverse campaign
CEJST v2.0	Tract-level, archived mirror	EPSG:4269	Public Environmental Data Partners mirror; field `SN_C`
EJScreen 2024	Block-group, optional	—	EPA `gaftp.epa.gov` (fetched if available)
FEMA NFHL	Regulatory zones	EPSG:4269	FEMA — cross-reference only, not used for needs scoring

4. Pavement classification (NAIP RF)

The self-supervised random-forest classifier lives in src/depave/process/classify_naip.py. It produces a binary uint8 GeoTIFF (1 = pavement, 0 = other) at the NAIP native resolution that downstream stages consume in place of the NYC land-cover raster.

Training-sample generation

Four classes are used: pavement=0, building=1, vegetation=2, water=3. Raw sample targets are 15,000 / 5,000 / 5,000 / 2,000 respectively. Before training, every class is down-sampled to a common size set by the smallest class and never exceeding 5,000 samples per class.

Pavement: interpolate points every ~3 m along OSM highway centerlines of class motorway, trunk, primary, secondary, tertiary, residential, unclassified. Service / driveway / footway are excluded from training (they often are not pavement, or are the very things we want to flag as non-core downstream).
Buildings: random interior points inside Microsoft Building Footprints eroded by 2 m, sampled proportional to building area.
Vegetation: stratified-random pixels with NDVI > 0.4 in the NAIP itself — self-supervised, guaranteed to be green.
Water (optional): random pixels where the ESA WorldCover raster equals class 80, reprojected into the NAIP CRS.
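
The pavement sampling step (points every ~3 m along a centerline) can be illustrated with a small pure-Python routine. This is a sketch under stated assumptions: coordinates are already in a projected CRS, the function name is hypothetical, and the first sample sits on the first vertex.

```python
import math


def points_along(coords, spacing):
    """Return points every `spacing` units along a polyline.

    coords: list of (x, y) vertices in a projected CRS. `spacing` uses the
    same linear unit (3 m is about 9.84 ft in EPSG:2236).
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    points = []
    next_at = 0.0    # distance along the line of the next sample
    travelled = 0.0  # distance along the line at the start of this segment
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        while seg > 0 and next_at <= travelled + seg:
            t = (next_at - travelled) / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_at += spacing
        travelled += seg
    return points


straight = points_along([(0, 0), (10, 0)], 3)        # samples at 0, 3, 6, 9
bent = points_along([(0, 0), (4, 0), (4, 4)], 3)     # spacing carries over the bend
```

Measuring distance along the whole line, rather than restarting at each vertex, keeps the spacing uniform on curved roads with many short segments.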

Feature set

Each sample pixel is represented by a 7-dimensional feature vector: raw R, G, B, NIR reflectance plus three derived indices:

NDVI       = (NIR - R) / (NIR + R + eps)
NDWI       = (G - NIR) / (G + NIR + eps)
brightness = (R + G + B) / 3
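
A per-pixel version of these formulas is shown below as a minimal sketch. The value of eps and the ordering of the seven features are assumptions; the document specifies neither.

```python
EPS = 1e-6  # assumed value; the document only names the term "eps"


def pixel_features(r, g, b, nir, eps=EPS):
    """Return the 7-element feature vector for one pixel.

    The order (R, G, B, NIR, NDVI, NDWI, brightness) is an assumption.
    """
    ndvi = (nir - r) / (nir + r + eps)
    ndwi = (g - nir) / (g + nir + eps)
    brightness = (r + g + b) / 3
    return [r, g, b, nir, ndvi, ndwi, brightness]


# Hypothetical band values, for illustration only.
asphalt_like = pixel_features(100, 110, 120, 90)   # NDVI slightly negative
vegetation_like = pixel_features(60, 90, 50, 200)  # NDVI well above 0.4
```

The eps term keeps the ratios finite on pixels where both bands are zero, such as nodata edges.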

Pavement-sample filtering

OSM centerlines occasionally pass under tree canopy (shaded road), over shadowed bridges, or through edge pixels mixed with vegetation. We apply two sequential filter stages before training:

Feature-space thresholds: keep samples with NDVI ≤ 0.15, NDWI ≤ 0.10, and brightness ≥ 5th percentile (drops deep shadow).
Mahalanobis outlier removal: compute the Mahalanobis distance of each remaining sample from the class centroid using the empirical covariance matrix plus a small ridge (+ 1e-4 · I), and drop points with d² > 14.07. This cutoff is the χ² quantile at 0.95 for 7 degrees of freedom (χ²_{0.95, 7} ≈ 14.07); the 0.975 quantile would be ≈ 16.01. An initial implementation with sklearn.covariance.MinCovDet (support_fraction 0.5) collapsed to a near-singular sub-sample on already-filtered inputs and dropped ~100% of points; empirical covariance was adequate because the preceding NDVI/NDWI/shadow filters had already removed gross outliers. A safety clamp prevents the filter from dropping more than 25% of input points: if it would, the threshold falls back to the 90th percentile of the observed distances.
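
The safety clamp can be sketched independently of the distance computation. The example below assumes the squared Mahalanobis distances are already available as a list and that the fallback percentile is taken over those distances; the function names are hypothetical.

```python
def percentile(values, pct):
    """Linear-interpolated percentile (pct in 0..100) of a non-empty list."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def outlier_keep_mask(d2, threshold=14.07, max_drop=0.25, fallback_pct=90):
    """Return (keep_mask, threshold_used) for squared Mahalanobis distances.

    If the chi-square cutoff would drop more than max_drop of the samples,
    the cutoff is replaced by the fallback percentile of d2 itself.
    """
    dropped = sum(1 for v in d2 if v > threshold)
    if d2 and dropped / len(d2) > max_drop:
        threshold = percentile(d2, fallback_pct)
    return [v <= threshold for v in d2], threshold


# Well-behaved case: one outlier in ten, the chi-square cutoff stands.
mask_ok, used_ok = outlier_keep_mask([1.0] * 9 + [50.0])
# Degenerate case: every sample exceeds the cutoff, so the clamp engages.
mask_clamped, used_clamped = outlier_keep_mask([20.0 + i for i in range(10)])
```

In the degenerate case the clamp keeps nine of ten samples instead of discarding the whole class, which is the failure the clamp is meant to prevent.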

Classifier

sklearn.ensemble.RandomForestClassifier with:

n_estimators = 100
max_depth = 16
class_weight = "balanced"
n_jobs = -1, random_state = 42

Reported class-weighted train accuracy is approximately 98%; this is optimistic (same samples used for fit and score) but confirms the features separate the classes cleanly.

Prediction

The full NAIP raster is predicted tile-by-tile in 2048-pixel chunks to cap peak RAM. For each chunk, R/G/B/NIR are read, the 7-feature vector is built, valid pixels are predicted, and the binary pavement mask (class == 0) is written into the output array. The final raster is tiled, deflate-compressed, with 512-pixel internal block size.
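
The chunking itself is simple to sketch. The generator below only enumerates 2048-pixel windows; reading bands and predicting are left out, and the function name is hypothetical.

```python
def iter_windows(width, height, chunk=2048):
    """Yield (col_off, row_off, win_width, win_height) tiles covering a raster.

    Edge tiles are truncated so no window extends past the raster bounds.
    """
    for row_off in range(0, height, chunk):
        for col_off in range(0, width, chunk):
            yield (
                col_off,
                row_off,
                min(chunk, width - col_off),
                min(chunk, height - row_off),
            )


# Hypothetical 5000 x 3000 px raster: 3 columns x 2 rows of windows.
windows = list(iter_windows(5000, 3000))
```

The tuple order matches the (col_off, row_off, width, height) argument order of rasterio's Window, so each tuple could be unpacked directly into one if rasterio is the reader in use.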

5. Post-classification refinement

Six refinements are applied (in order) before the classifier output is handed to the extraction stage:

Morphological opening (3×3 structuring element) to remove single-pixel noise and tighten edges.
Building subtraction: the MS Buildings layer is rasterized into the NAIP grid and any pavement pixel under a building is zeroed — a safety net against misclassified commercial rooftops (white TPO, concrete).
Canopy road fill: OSM highway centerlines of a permissive class list (all of PAVEMENT_HWY_CLASSES plus *_link variants, living_street, service, busway) are buffered by per-class half-width and burned in as pavement, restoring continuity where tree canopy had hidden the road. Service roads are included here because they are pavement — they're just classified as non-core downstream.
Airport removal: OSM aeroway = aerodrome polygons are rasterized and pavement inside them is zeroed (applies to Fort Lauderdale Executive and any smaller airfields in the study area).
Park removal: park polygons from both OSM (leisure=park, landuse=recreation_ground) and the FTL official parks layer (MapServer layer 62, 157 polygons) are rasterized and pavement inside them is zeroed (~99 acres removed). Depaving park pavement is outside the scope of this analysis.
Study-area mask: pavement outside the FTL municipal boundary is zeroed.

Per-class canopy-fill half-widths (in US feet) are:

OSM class	Half-width (ft)	OSM class	Half-width (ft)
motorway	35.0	tertiary	18.0
motorway_link	22.0	tertiary_link	14.0
trunk	35.0	residential	14.0
trunk_link	22.0	unclassified	12.0
primary	30.0	living_street	12.0
primary_link	20.0	service	10.0
secondary	25.0	busway	18.0
secondary_link	18.0	(default fallback)	10.0

6. Core vs non-core classification

Implemented in src/depave/process/pavement_ftl.py. "Core" pavement is the pavement a city needs to function: travel lanes and sidewalks. "Non-core" is everything else, including parking lots and service roads, and forms the pool of depave candidates.

Core mask construction

The core mask uses a hybrid road-width strategy: FDOT Roadway Characteristics Inventory (RCI) provides surveyed SURWIDTH (surface width in feet) for 5,172 state and federal road segments in Broward County; OSM highway centerlines with class-based width estimates cover local/residential streets not in RCI. Both are rasterized into a unified mask.

FDOT RCI state roads: each segment is buffered by SURWIDTH / 2 (measured half-width). Mean width 21 ft, range 2–72 ft. Covers arterials, collectors, and federal-aid routes — the roads where class-based estimates are least reliable.
OSM highways (local roads): buffered symmetrically by a per-feature half-width. When the lanes tag is present, use max(lanes, 1) × 10 ft per side; otherwise use a per-class fallback. Service, track, driveway, footway, path, cycleway, bridleway, steps, and corridor are not part of the core mask — these are candidates for depaving.
OSM sidewalks (footway=sidewalk) buffered by 4 ft on each side.

Parking (amenity=parking) is classified as non-core — parking lots are depave candidates. This differs from the original DepaveLA methodology where parking was core.

Per-class highway half-widths (ft) when lanes tag is missing:

Class	Half-width (ft)	Class	Half-width (ft)
motorway	30.0	secondary_link	15.0
motorway_link	20.0	tertiary	15.0
trunk	30.0	tertiary_link	12.0
trunk_link	20.0	residential	12.0
primary	25.0	unclassified	10.0
primary_link	18.0	living_street	10.0
secondary	20.0	pedestrian	8.0
(default)	10.0	busway	15.0
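
The width rule for OSM highways (lanes tag first, class fallback second) might be expressed as follows. This is an illustrative sketch: the dictionary copies the table above, while the function name and the handling of malformed lanes values are assumptions.

```python
CORE_HALF_WIDTH_FT = {
    "motorway": 30.0, "motorway_link": 20.0,
    "trunk": 30.0, "trunk_link": 20.0,
    "primary": 25.0, "primary_link": 18.0,
    "secondary": 20.0, "secondary_link": 15.0,
    "tertiary": 15.0, "tertiary_link": 12.0,
    "residential": 12.0, "unclassified": 10.0,
    "living_street": 10.0, "pedestrian": 8.0, "busway": 15.0,
}
DEFAULT_HALF_WIDTH_FT = 10.0
# Classes excluded from the core mask. Sidewalks (footway=sidewalk) are
# handled separately with their own 4 ft buffer and are not covered here.
NON_CORE_CLASSES = {
    "service", "track", "driveway", "footway", "path",
    "cycleway", "bridleway", "steps", "corridor",
}


def core_half_width_ft(highway, lanes=None):
    """Return the buffer half-width in feet, or None for non-core classes."""
    if highway in NON_CORE_CLASSES:
        return None
    try:
        lane_count = int(lanes)  # OSM tag values arrive as strings
    except (TypeError, ValueError):
        # Missing or malformed lanes tag (assumed behavior): class fallback.
        return CORE_HALF_WIDTH_FT.get(highway, DEFAULT_HALF_WIDTH_FT)
    return max(lane_count, 1) * 10.0
```

For example, a primary road tagged lanes=4 gets a 40 ft half-width, while an untagged residential street falls back to 12 ft.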

Raster-based classification

The core/non-core split is performed in raster space: the FDOT + OSM road/sidewalk buffers are rasterized at the NAIP grid resolution (1 m), then combined with the binary pavement mask to produce a 3-class raster (0 = no pavement, 1 = core, 2 = non-core). This avoids the expensive vector polygonization + spatial-cut operation (which took 30+ min on 47k polygons) and produces cleaner, pixel-aligned results. Airport and park exclusion zones are applied by zeroing pixels within those polygons.
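
The raster combination reduces to a per-pixel rule, sketched here on nested lists so the example has no dependencies. In practice the same rule would be applied to arrays; the function name is hypothetical.

```python
def classify_pavement(pavement, core):
    """Combine two equally shaped 0/1 grids into a 3-class grid.

    Output codes: 0 = no pavement, 1 = core, 2 = non-core.
    """
    if len(pavement) != len(core):
        raise ValueError("grids must have the same number of rows")
    out = []
    for pav_row, core_row in zip(pavement, core):
        if len(pav_row) != len(core_row):
            raise ValueError("grids must have the same number of columns")
        out.append([
            0 if not p else (1 if c else 2)
            for p, c in zip(pav_row, core_row)
        ])
    return out


# Tiny hypothetical grids: the core buffer in the top-right pixel covers
# no pavement, so that pixel stays 0.
pavement_mask = [[1, 1, 0],
                 [1, 0, 0]]
core_mask = [[1, 0, 1],
             [0, 0, 0]]
classified = classify_pavement(pavement_mask, core_mask)
```

The pavement mask takes precedence: a road buffer never creates pavement where the classifier found none.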

7. Needs scoring

All four needs layers score census tracts (the clipped set of 47). Each returns a *_raw column and a min-max-normalized *_score column in [0, 1], where higher = more need.
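
Min-max normalization of a *_raw column can be sketched as below. Mapping a constant column to zeros is an assumption of this sketch; the document does not say how that case is handled.

```python
def minmax(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # assumed handling of the constant case
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw values for three tracts.
scores = minmax([2.0, 4.0, 6.0])
```

Because the scale is anchored on the minimum and maximum, one extreme tract compresses the scores of all the others.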

Heat

The 2021 NOAA/CAPA mobile-traverse dataset provides afternoon surface-temperature polygons. Each tract's score is the area-weighted mean of overlapping temperature polygons. Implemented via heat_layer.compute_heat_scores — the FTL pipeline renames the temperature column to hvi so the shared code path treats it as a heat indicator.

Stormwater flood risk (DEM proxy)

Implemented in src/depave/process/stormwater_proxy.py. Pipeline:

Clip the 3DEP DTM to the study area; mask nodata.
Gaussian pre-smoothing (σ = 1.5 px) to erase tile-boundary micro-cliffs (sub-meter elevation offsets between adjacent lidar tiles would otherwise be interpreted as real sink edges).
Priority-flood depression fill on the smoothed DEM (via richdem, with a pure-numpy fallback).
Depression depth = filled_smoothed − original_dem, clipped to ≥ 0.
D8 flow accumulation on the filled DEM; log-transformed (log1p) because the distribution is heavy-tailed.
Both fields are percentile-rank-normalized to [0, 1], then combined as score = 0.6 × depression + 0.4 × flow_acc (depression weighted higher because ponding is a more direct pluvial signal than flow concentration on flat terrain).
Tier breaks at the 50th / 75th / 90th percentile of non-zero composite pixels produce limited / moderate / extreme classes.
Per-tier pixel masks are polygonized, denoised (buffer(1).buffer(-1).simplify(2)), and dissolved to one MultiPolygon per tier.
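
The rank normalization, weighting, and tiering steps can be sketched on flat lists of pixel values. The rank convention (fraction of values less than or equal to each value) and the inclusive tier boundaries are assumptions; the function names are hypothetical.

```python
from bisect import bisect_right


def percentile_rank(values):
    """Map each value to the fraction of values less than or equal to it."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


def composite_score(depression, log_flow_acc, w_dep=0.6, w_flow=0.4):
    """Weighted sum of the two rank-normalized fields."""
    dep = percentile_rank(depression)
    flow = percentile_rank(log_flow_acc)
    return [w_dep * d + w_flow * f for d, f in zip(dep, flow)]


def tier(score, p50, p75, p90):
    """Label a composite pixel given precomputed tier breaks."""
    if score >= p90:
        return "extreme"
    if score >= p75:
        return "moderate"
    if score >= p50:
        return "limited"
    return None  # below the lowest tier break


# Four hypothetical pixels.
ranks = percentile_rank([10, 30, 20, 40])
composite = composite_score([0.0, 2.0, 1.0, 3.0], [5.0, 1.0, 3.0, 7.0])
```

Rank normalization makes the two fields comparable regardless of their units, which is why depth (in elevation units) and log flow accumulation (unitless) can be blended with fixed weights.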

Per-tract flood score is the weighted sum of tier coverage:

flood_raw = 0.5 · pct_extreme + 0.3 · pct_moderate + 0.2 · pct_limited
flood_score = minmax(flood_raw)
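
A worked example of the weighted sum, assuming the pct_* inputs are area fractions in [0, 1] and that the three tiers do not overlap. The tract values are hypothetical.

```python
def flood_raw(pct_extreme, pct_moderate, pct_limited):
    """Weighted tier coverage for one tract (inputs are area fractions)."""
    return 0.5 * pct_extreme + 0.3 * pct_moderate + 0.2 * pct_limited


# Hypothetical tract: 4% extreme, 10% moderate, 20% limited coverage.
# 0.5 * 0.04 + 0.3 * 0.10 + 0.2 * 0.20 = 0.02 + 0.03 + 0.04 = 0.09
example_raw = flood_raw(0.04, 0.10, 0.20)
```

The weights sum to 1, so a tract entirely covered by the extreme tier would reach a raw value of 0.5, the maximum possible.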

Canopy deficit

FTL uses polygon-based canopy data (UMiami lidar-derived) rather than a raster class. The per-tract canopy percentage is intersection_area / tract_area; deficit is 1 − canopy_pct; the normalized score is a min-max rescaling across the tract set.

Pavement burden

Per-tract non-core pavement fraction = (tract ∩ non-core pavement union).area / tract.area, min-max normalized. Implemented in pavement_layer.compute_pavement_scores.

8. Stacked composite

Equal-weight mean of the four normalized scores:

stacked_score = mean(heat_score, flood_score, canopy_score, pavement_score)

Priority tracts are those at or above the 75th percentile of stacked_score (top quartile). A second column, n_high_needs, counts the dimensions (out of four) in which a tract falls in the top quartile; it distinguishes tracts driven by a single extreme need from tracts with several compounding needs. Implemented in stacked_needs.py.
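
The composite and the two derived columns can be sketched with the standard library. The percentile method (linear interpolation, via statistics.quantiles with method="inclusive") and the is_priority column name on this structure are assumptions; stacked_needs.py may compute them differently.

```python
from statistics import quantiles

DIMS = ("heat_score", "flood_score", "canopy_score", "pavement_score")


def p75(values):
    """75th percentile with linear interpolation (needs at least 2 values)."""
    return quantiles(values, n=4, method="inclusive")[2]


def stack_scores(tracts, dims=DIMS):
    """Add stacked_score, n_high_needs and is_priority to each tract dict."""
    cuts = {d: p75([t[d] for t in tracts]) for d in dims}
    for t in tracts:
        t["stacked_score"] = sum(t[d] for d in dims) / len(dims)
        t["n_high_needs"] = sum(t[d] >= cuts[d] for d in dims)
    stacked_cut = p75([t["stacked_score"] for t in tracts])
    for t in tracts:
        t["is_priority"] = t["stacked_score"] >= stacked_cut
    return tracts


# Four hypothetical tracts with already-normalized scores.
demo = stack_scores([
    {"id": "t1", "heat_score": 0.1, "flood_score": 0.2, "canopy_score": 0.3, "pavement_score": 0.4},
    {"id": "t2", "heat_score": 0.9, "flood_score": 0.8, "canopy_score": 0.7, "pavement_score": 1.0},
    {"id": "t3", "heat_score": 0.5, "flood_score": 0.5, "canopy_score": 0.5, "pavement_score": 0.5},
    {"id": "t4", "heat_score": 0.0, "flood_score": 1.0, "canopy_score": 0.0, "pavement_score": 0.0},
])
```

In this toy set, t4 is in the top quartile for flood alone yet has a low stacked score, which is the situation n_high_needs is meant to surface.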

9. Equity overlay

Implemented in equity_overlay.py, reused from the NYC pipeline. The FTL orchestrator passes the CEJST disadvantaged subset (where field SN_C == 1, from the v2.0 archived mirror) as the dac argument; the ejnyc argument is an empty GeoDataFrame. A tract is flagged is_dac if it overlaps the CEJST disadvantaged union by at least 1% of its area. Priority + DAC intersection is reported as the headline equity overlap, alongside aggregate statistics (non-core acreage in priority tracts, in DAC priority tracts, etc.), and written to data/fort_lauderdale/processed/summary_stats.json.

10. Known limitations

Concrete vs asphalt spectral overlap. Both are bright, low-NDVI, low-NDWI. The classifier treats them as the same class (pavement), which is appropriate for depaving scoping but means we cannot distinguish asphalt from concrete programmatically.
Commercial rooftops. White TPO and bare concrete rooftops can look spectrally similar to pavement. Building-footprint subtraction catches most of these, but buildings missing from MS Buildings (very new construction, small accessory structures) may leak in as pavement.
OSM completeness. Training-sample generation depends on OSM centerlines; the core mask uses both FDOT RCI (state roads) and OSM (local roads). Poorly mapped private drives or missing sidewalks may be misclassified. FDOT RCI covers only state and federal routes — local residential streets still rely on OSM class-based width estimates.
Pluvial proxy ≠ hydraulic simulation. The stormwater proxy reflects where water would collect on the bare-earth surface. It does not account for storm drain capacity, pipe networks, tidal tailwater, or rainfall intensity. It is a hazard screen, not a flood model.
Small sample size. 47 tracts is a small denominator for min-max normalization and percentile tiers. A single outlier tract can move scores noticeably. Results are best interpreted as relative rankings, not absolute magnitudes.
CEJST archival status. The official geoplatform.gov CEJST endpoint was taken offline in January 2025; we mirror v2.0 from Public Environmental Data Partners. If the mirror structure changes, config/data_sources_ftl.yaml needs updating.
NAIP vintage drift. NAIP flights recur roughly every 2–3 years, and the MS Buildings layer has its own vintage. A building that appears in the NAIP imagery but postdates the MS Buildings footprints may have its rooftop classified as pavement and not subtracted.

11. Reproducibility

Full run from scratch

export DEPAVE_LOCATION=fort_lauderdale
python3 scripts/run_pipeline.py \
    --study-area fort_lauderdale \
    --stages all

Stage names (FTL): study_area, acquire, classify, pavement, needs, equity, export. Any comma-separated subset is accepted; --force re-downloads raw inputs where supported. The classify stage is gated on use_naip_classification: true in config/locations/fort_lauderdale.yaml; with the flag false, the pipeline falls back to ESA WorldCover class 50.

Runtime expectations

acquire: network-bound, ~30–90 min depending on NAIP tile count and Overpass load. Tiles are cached under data/fort_lauderdale/raw/.
classify: ~10–20 min on a recent Apple Silicon Mac (parallel RF predict in 2048-px chunks).
pavement: ~3–8 min (polygonization + core/non-core cut over the full city).
needs + equity + export: ~1–3 min combined.

Outputs

data/fort_lauderdale/interim/naip_pavement.tif — binary pavement raster
data/fort_lauderdale/interim/pavement_classified.gpkg — polygons w/ pavement_type ∈ {core, non-core}
data/fort_lauderdale/interim/stormwater_proxy.gpkg — 3-tier polygons
data/fort_lauderdale/processed/stacked_needs.gpkg — scored tracts
data/fort_lauderdale/processed/equity_overlay.gpkg — scored tracts + is_dac, is_priority
data/fort_lauderdale/processed/summary_stats.json — headline numbers
data/fort_lauderdale/processed/*.geojson — web-ready layers (pavement, non-core, flood tiers, tracts, priority)

12. Credits & citation

Depave Fort Lauderdale is produced by ONE Architecture & Urbanism, Inc. The methodology extends the DepaveLA/DepaveNYC framework with open-data substitutions appropriate to a city without NYC-grade municipal GIS products.

Data providers: USGS 3DEP, USDA NAIP, Microsoft Planetary Computer, OpenStreetMap contributors, Microsoft Global Buildings, ESA WorldCover, NOAA/CAPA, University of Miami, Broward County GIS, City of Fort Lauderdale GIS, U.S. Census Bureau, EPA, and Public Environmental Data Partners (CEJST mirror). OpenStreetMap data are licensed under ODbL.

Suggested citation: Depave Fort Lauderdale: A Screening Analysis of Non-Core Pavement and Environmental Need. ONE Architecture & Urbanism, 2026.