How We Built a National Trash Schedule API from 8 Municipal Data Sources
There are 19,000+ municipalities in the US, each with different trash collection schedules and virtually no standardized APIs. Here's how we reverse-engineered ArcGIS, Socrata, and Algolia endpoints to build address-level lookup for 5M+ addresses.
The Problem: 19,000 Municipalities, Zero Standards
Ask any AI assistant what day your trash gets picked up, and you'll get a confident, wrong answer. ChatGPT told me Wednesday. Claude said Thursday. Google Gemini offered Tuesday with a disclaimer. My actual trash day is Friday.
This isn't the AI's fault; the data simply doesn't exist in any centralized, machine-readable format. Every city handles waste collection differently: different haulers, different routes, different schedules, different data systems. The City of Boston uses ArcGIS with fields called TRASHDAY and RECOLLECT. Philadelphia uses collday. Houston splits it across three separate map layers. Denver uses a coordinate system measured in feet.
We built TrashAlert to solve this by creating a single API that normalizes all of these different municipal data sources into a consistent, address-level response.
Architecture: Census Geocoder → Spatial Query → Normalize
The core pipeline is simple in concept, complex in execution:
```
Address → Census Geocoder → lat/lng
        → City ArcGIS/Socrata spatial query → raw schedule data
        → Day code normalization → { collectionDay: 'Friday' }
```
Step 1: The Census Geocoder (Free, No Auth)
The US Census Bureau provides a free geocoding API at geocoding.geo.census.gov. It takes a one-line address and returns standardized coordinates. No API key required, no rate limit concerns for our use case.
```javascript
// Geocode any US address to lat/lng
const url = new URL('https://geocoding.geo.census.gov/geocoder/locations/onelineaddress')
url.searchParams.set('address', '7450 Northrup Dr, San Diego, CA 92126')
url.searchParams.set('benchmark', 'Public_AR_Current')
url.searchParams.set('format', 'json')
const { result } = await fetch(url).then(r => r.json())
// → { lat: 32.8684, lng: -117.1464, matchedAddress: "7450 NORTHRUP DR, SAN DIEGO, CA, 92126" }
```

The Census Geocoder also handles address normalization ("Ave" vs "Avenue", "Blvd" vs "Boulevard"), which saves us from building our own normalization layer for every city's naming conventions.
Step 2: Spatial Queries Against Municipal ArcGIS Servers
Most US cities publish geographic data through Esri ArcGIS. Collection zones are typically stored as polygon layers on MapServer or FeatureServer endpoints. A spatial "point in polygon" query with our geocoded coordinates tells us which collection zone an address falls in.
```javascript
// Standard ArcGIS spatial query
const url = new URL(`${serviceUrl}/query`)
url.searchParams.set('geometry', `${lng},${lat}`)
url.searchParams.set('geometryType', 'esriGeometryPoint')
url.searchParams.set('inSR', '4326') // WGS84
url.searchParams.set('spatialRel', 'esriSpatialRelIntersects')
url.searchParams.set('outFields', 'TRASHDAY,RECOLLECT')
url.searchParams.set('returnGeometry', 'false')
url.searchParams.set('f', 'json')
```

This works beautifully for Boston, Philadelphia, Los Angeles, Phoenix, and Houston. But not every city makes it easy.
City-by-City: Where It Gets Weird
Denver: Coordinate Projection in Feet
Denver's ArcGIS server uses WKID 2877, the NAD83 Colorado Central coordinate system, measured in feet. You can't send WGS84 coordinates (the standard lat/lng you get from any geocoder) directly. We have to project them first using Esri's Geometry Service:
```javascript
// Project WGS84 → WKID 2877 (NAD83 Colorado Central, feet)
const geometries = JSON.stringify({
  geometryType: 'esriGeometryPoint',
  geometries: [{ x: lng, y: lat }],
})
const projUrl = new URL(`${ESRI_GEOMETRY_SERVICE}/project`)
projUrl.searchParams.set('inSR', '4326') // WGS84
projUrl.searchParams.set('outSR', '2877') // Colorado feet
projUrl.searchParams.set('geometries', geometries)

// Then use an envelope buffer because Denver's polygons have gaps
const buffer = 50 // 50 feet
const envelope = {
  xmin: projected.x - buffer,
  ymin: projected.y - buffer,
  xmax: projected.x + buffer,
  ymax: projected.y + buffer,
}
```

That 50-foot envelope buffer was discovered after hours of debugging "no results" for addresses that clearly fell within collection zones. Denver's polygon boundaries have tiny gaps that cause exact point-in-polygon queries to miss.
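To round out the Denver flow, here is roughly what the envelope query looks like. This is a sketch: the service URL and the projected coordinates below are placeholder values for illustration, not Denver's real endpoint.

```javascript
// Placeholder service URL and illustrative projected point in WKID 2877 (feet)
const DENVER_TRASH_URL = 'https://example.com/arcgis/rest/services/trash/MapServer/0'
const projected = { x: 3140000, y: 1690000 } // illustrative Denver-area coords
const buffer = 50 // 50 feet

const envelope = {
  xmin: projected.x - buffer,
  ymin: projected.y - buffer,
  xmax: projected.x + buffer,
  ymax: projected.y + buffer,
  spatialReference: { wkid: 2877 },
}

// Query with an envelope instead of a bare point to tolerate polygon gaps
const url = new URL(`${DENVER_TRASH_URL}/query`)
url.searchParams.set('geometry', JSON.stringify(envelope))
url.searchParams.set('geometryType', 'esriGeometryEnvelope')
url.searchParams.set('inSR', '2877')
url.searchParams.set('spatialRel', 'esriSpatialRelIntersects')
url.searchParams.set('returnGeometry', 'false')
url.searchParams.set('f', 'json')
```

Any polygon that intersects the 100-foot-square envelope matches, so small boundary gaps no longer produce empty results.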
Houston: Three Layers, Three Schemas
Houston's Solid Waste Management department publishes separate ArcGIS layers for trash (Layer 1), recycling (Layer 2), and bulk waste (Layer 3). We query all three in parallel:
```javascript
const [trashAttrs, recycleAttrs, bulkAttrs] = await Promise.all([
  arcgisSpatialQuery(HOUSTON_TRASH_URL, lat, lng, 'DAY,QUAD,SCHEDULE'),
  arcgisSpatialQuery(HOUSTON_RECYCLING_URL, lat, lng, 'SERVICE_DAY,SCHEDULE'),
  arcgisSpatialQuery(HOUSTON_BULK_URL, lat, lng, 'DAY,SCHEDULE'),
])

// Recycling SERVICE_DAY format: "MONDAY-A" → day + alternating week
const parts = recycleAttrs.SERVICE_DAY.split('-')
// → { day: "Monday", week: "A" }
```

NYC: Socrata Instead of ArcGIS
New York City publishes DSNY (Department of Sanitation) data through Socrata Open Data, not ArcGIS. Socrata supports spatial queries via an intersects() function that accepts WKT geometry:
```javascript
const url = new URL('https://data.cityofnewyork.us/resource/rv63-53db.json')
url.searchParams.set('$where',
  `intersects(multipolygon, 'POINT(${lng} ${lat})')`)
url.searchParams.set('$select',
  'district,freq_refuse,freq_recycling,freq_organics,freq_bulk')
// No API key required: public dataset
```

Austin: Two Socrata Datasets, Different Schemas
Austin uses Socrata for both garbage (brxe-dmqm) and recycling (ytb7-vtcd) data, but the field names differ between datasets: service_day + route_name for garbage, and service_day + service_week for recycling.
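A sketch of what the two parallel Austin queries might look like. The dataset IDs and field names come from above, but the Socrata domain and the geometry column name (the_geom) are assumptions to verify against the live datasets, and the lookupAustin helper is illustrative:

```javascript
// Illustrative Austin lookup: two Socrata datasets, different schemas
async function lookupAustin(lat, lng) {
  const base = 'https://data.austintexas.gov/resource' // assumed domain
  const where = `intersects(the_geom, 'POINT(${lng} ${lat})')` // assumed geometry column
  const [garbage, recycling] = await Promise.all([
    fetch(`${base}/brxe-dmqm.json?$where=${encodeURIComponent(where)}&$select=service_day,route_name`)
      .then(r => r.json()),
    fetch(`${base}/ytb7-vtcd.json?$where=${encodeURIComponent(where)}&$select=service_day,service_week`)
      .then(r => r.json()),
  ])
  // Socrata returns an array of matching rows; take the first for each
  return {
    trash: garbage[0]?.service_day,
    recycling: recycling[0]?.service_day,
    recyclingWeek: recycling[0]?.service_week,
  }
}
```

The schema mismatch stays contained inside the one city-specific function, so the normalizer downstream never has to know about it.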
The Day Code Normalization Problem
Every city uses a different format to represent collection days:
- Boston: TF = "Tuesday, Friday" (compound multi-day code)
- Denver: MO, TU, WE (two-letter day codes)
- Denver recycling: MA = "Monday A-week", WB = "Wednesday B-week"
- Houston: MONDAY (full day name); recycling: MONDAY-A
- Phoenix: TUESDAY (all-caps full name)
- Philadelphia: MON, TUE (three-letter abbreviations)
Our normalizer uses a greedy matching algorithm that tries 4-character, 3-character, 2-character, then 1-character prefixes against a lookup table, handling compound codes like Boston's "TF" by consuming each day code sequentially.
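A minimal sketch of that greedy matcher. The lookup table here is illustrative and smaller than our real one, and week suffixes like Denver's "A"/"B" are handled separately:

```javascript
// Illustrative day-code table; longer codes exist so greedy matching can try them first
const DAY_CODES = {
  THUR: 'Thursday', TUES: 'Tuesday',
  MON: 'Monday', TUE: 'Tuesday', WED: 'Wednesday', THU: 'Thursday',
  FRI: 'Friday', SAT: 'Saturday', SUN: 'Sunday',
  MO: 'Monday', TU: 'Tuesday', WE: 'Wednesday', TH: 'Thursday', FR: 'Friday',
  M: 'Monday', T: 'Tuesday', W: 'Wednesday', F: 'Friday',
}

// Greedily consume codes from a compound string like "TF" or "MON,TUE":
// try 4-, 3-, 2-, then 1-character prefixes at each position
function normalizeDays(raw) {
  const input = raw.toUpperCase().replace(/[^A-Z]/g, '') // drop separators
  const days = []
  let i = 0
  while (i < input.length) {
    let matched = false
    for (const len of [4, 3, 2, 1]) {
      const code = input.slice(i, i + len)
      const name = DAY_CODES[code]
      if (name) {
        days.push(name)
        i += code.length
        matched = true
        break
      }
    }
    if (!matched) i += 1 // skip characters we can't interpret (e.g. "DAY" in "MONDAY")
  }
  return [...new Set(days)]
}
```

Trying longer prefixes first is what keeps "TH" from being misread as "Tuesday" followed by junk, while still letting Boston's "TF" decompose into Tuesday and Friday.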
Scale: 5M+ Addresses, Zero Bulk Imports
For our API-integrated cities (Boston, Philadelphia, NYC, LA, Houston, Denver, Austin, Phoenix), we store zero schedule data. Every lookup hits the city's live data in real time. When Denver changes a route, we reflect it immediately, without any data pipeline or ETL job.
The tradeoff is latency: a typical lookup requires 2-3 sequential network requests (Census Geocoder → ArcGIS → normalize). We keep it under 2 seconds with:
- Parallel queries where possible (Houston's three layers, Austin's two datasets)
- AbortSignal timeouts (5-15 seconds per request)
- Retry logic for flaky ArcGIS servers (2 attempts before failing)
- Vercel edge functions for low-latency deployment
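The timeout-and-retry pattern can be sketched with AbortSignal.timeout (available in modern browsers and Node 18+); the helper name and defaults here are illustrative, not our exact implementation:

```javascript
// Illustrative helper: fetch JSON with a per-request timeout and retry
async function fetchWithRetry(url, { timeoutMs = 10000, attempts = 2 } = {}) {
  let lastError
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // AbortSignal.timeout aborts the request after timeoutMs milliseconds
      const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) })
      if (!res.ok) throw new Error(`HTTP ${res.status}`)
      return await res.json()
    } catch (err) {
      lastError = err // flaky ArcGIS server or timeout: retry
    }
  }
  throw lastError
}
```

Wrapping every upstream call this way means a single slow or flaky municipal server degrades one lookup rather than hanging the whole request.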
What We Learned
Municipal APIs are more public than you think
Every ArcGIS endpoint we use is public: no authentication required. Cities publish this data for their own internal tools and web maps, but the REST endpoints are discoverable and queryable.
The Census Geocoder is underrated
Free, no API key, handles address normalization, and returns Census block/tract data. It's the backbone of our pipeline and we've never hit a rate limit.
Coordinate systems will ruin your afternoon
When Denver queries returned nothing for valid addresses, we spent hours debugging before discovering WKID 2877. Always check the service's spatial reference before writing queries.
Every city is a special snowflake
There is no standard for how cities name their fields, structure their layers, or encode day-of-week values. Budget ~2-4 hours per city for discovery, testing, and edge cases.
How to Add a New City
Adding a new city to the pipeline typically takes 2-4 hours:
- Find the ArcGIS endpoint: search "[city] arcgis rest services" or check their GIS portal
- Identify the collection layer: look for "Solid Waste", "Collection", "Sanitation", or "Refuse" in service names
- Examine the fields: add ?f=json to the layer URL to see field definitions
- Test a spatial query: use a known address, geocode it, query the layer
- Write the lookup function: handle the city's specific day codes and field names
- Add to the router: register the city in the lookup map
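The router itself can be as simple as a map from a city key to its lookup function. A sketch (the function names are illustrative, and the Denver lookup body is stubbed):

```javascript
// Illustrative city router: one registered lookup function per city
const cityLookups = new Map()

function registerCity(name, lookupFn) {
  cityLookups.set(name.toLowerCase(), lookupFn)
}

async function lookupSchedule(city, lat, lng) {
  const lookup = cityLookups.get(city.toLowerCase())
  if (!lookup) throw new Error(`Unsupported city: ${city}`)
  return lookup(lat, lng)
}

// Registering a city is one line once its lookup function exists
registerCity('denver', async (lat, lng) => {
  // project → envelope buffer → ArcGIS query → normalize (stubbed here)
  return { collectionDay: 'Friday' }
})
```

Every city-specific quirk stays inside its own lookup function; the router and the normalizer never change when a new city is added.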
What's Next
We're working on adding more cities to the real-time pipeline. The biggest challenge isn't the technical integration; it's finding the endpoints. Some cities use private ArcGIS servers that require authentication. Others use Salesforce or custom systems that are harder to reverse-engineer.
If you know of a city's collection schedule API endpoint, let us know. And if you want to try it out, look up your address on TrashAlert; it's free.