When someone submits a service request to NYC311, one of the pieces of information they provide is the location of their issue. Typically, this is an address or an intersection. If a call is placed, the operator requests the information from the caller and types it into their system. The website prompts the user to type in the information. The mobile app uses the phone's location services to determine the location for some requests, and the user can accept or change the provided address. For other requests, the app prompts the user to type in an address. Once the location has been entered through any of these methods, it is passed to a geocoder, which derives additional location information that is added to the data available on Open Data.
The NYC Department of Information Technology and Telecommunications (DoITT) has built multiple geocoders for NYC311 over the years. Today, different parts of NYC311 use one or another of these geocoders. The backbone of all geocoding processes in the city, though, is GeoSupport, the original tool built for and maintained by the Department of City Planning (DCP). The latest version of the tool is available for anyone to download from DCP's website. Updates are released quarterly, and older versions are archived on BYTES of the Big Apple. There is also a User Guide for those who wish to build their own programs using GeoSupport.
GeoSupport provides multiple geocoding functions. A user can input an address, an intersection, cross streets, a Borough-Block-Lot (BBL) code, or a Building Identification Number (BIN), and get a wide range of data back as output. While geocoding refers to the process of transforming a location description, such as an address, into a coordinate, GeoSupport provides information beyond the coordinate. Each of the possible input types is also returned as part of the output, as are location attributes such as community district, school district, census tract, and more. Note that while GeoSupport can provide school district and census tract information, these particular outputs are not included in the data set available for download on Open Data. They are available, however, in the augmented data set we provide with our tool.
One-stop Shop geocoding
About 10% of the service requests do not have coordinate information. This could be because the person submitting the request did not provide a location, or because geocoding, like all things, is not foolproof. Information has to be input accurately, and even when checks are put in place to allow for discrepancies (for example, GeoSupport accepts both "Street" and "St"), there is a limit to how many variations are tolerated. Furthermore, geographic data changes over time, which can also create errors.
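To illustrate the kind of tolerance described above, here is a minimal sketch of street-suffix normalization. It is not GeoSupport's own logic. The `SUFFIXES` table and the `normalize_street` function are hypothetical names introduced only for this example, and the abbreviation list is deliberately tiny.

```python
import re

# Hypothetical, deliberately small abbreviation table (an assumption for
# illustration; a real geocoder recognizes far more variants).
SUFFIXES = {
    "STREET": "ST",
    "AVENUE": "AVE",
    "BOULEVARD": "BLVD",
    "ROAD": "RD",
    "PLACE": "PL",
}


def normalize_street(name: str) -> str:
    """Uppercase a street name, drop periods and commas, collapse
    whitespace, and abbreviate the final token if it is a known suffix."""
    tokens = re.sub(r"[.,]", " ", name.upper()).split()
    if tokens:
        tokens[-1] = SUFFIXES.get(tokens[-1], tokens[-1])
    return " ".join(tokens)
```

With this sketch, "West 42nd Street" and "west 42nd st." reduce to the same string, while a misspelling such as "Stret" would pass through unchanged and still fail to match.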
Of all of the requests without a specified coordinate value, nearly 503,000 (about a third) do have a value for the BBL. Using GeoSupport's BBL function, we were able to geocode all but 2,600 of these. The resulting coordinate is the centroid of the BBL. We were then able to use this coordinate to assign the service request additional spatial values.
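Before passing BBL values to a geocoder, it can help to check that they are well formed. A BBL is conventionally written as a 10-digit code: one digit for the borough (1 through 5), five for the tax block, and four for the lot. The sketch below splits and validates such a value. The `parse_bbl` function and `BBL` tuple are hypothetical helpers, not part of GeoSupport, and the handling of a trailing ".0" assumes the values may have been read from a file as floating-point numbers.

```python
from typing import NamedTuple, Optional

BOROUGHS = {1: "Manhattan", 2: "Bronx", 3: "Brooklyn", 4: "Queens", 5: "Staten Island"}


class BBL(NamedTuple):
    borough: int
    block: int
    lot: int


def parse_bbl(raw) -> Optional[BBL]:
    """Split a 10-digit BBL into (borough, block, lot).

    Returns None when the value is missing or malformed.
    """
    if raw is None:
        return None
    text = str(raw).strip()
    # Assumption: numeric columns read as floats may look like "1000010001.0".
    if text.endswith(".0"):
        text = text[:-2]
    if len(text) != 10 or not (text.isascii() and text.isdigit()):
        return None
    borough = int(text[0])
    if borough not in BOROUGHS:
        return None
    return BBL(borough, int(text[1:6]), int(text[6:]))
```

Rows for which `parse_bbl` returns `None` can be set aside before geocoding, which keeps obviously malformed values from being counted as geocoder failures.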
Of the requests that have neither a coordinate nor a BBL, almost 150,000 have an address and two cross streets specified. Many of these rows, however, have values that are clearly invalid. For example, many rows have just a single digit or letter in the second cross street column. Nonetheless, we attempted to geocode what we could using GeoSupport's Function 3. This function takes as its input a segment of a street, that is, an "on street" and two consecutive streets that cross it. We were unable to geocode any of these values using GeoSupport, but we are continuing to explore other geocoding methods.
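A simple pre-filter can separate rows that are worth sending to a segment geocoder from those that are clearly invalid. The sketch below treats any street value shorter than two characters as invalid, matching the single-digit and single-letter cases mentioned above. The function names and the column names `street_name`, `cross_street_1`, and `cross_street_2` are assumptions made for illustration, and the rule is intentionally crude.

```python
from typing import Optional, Tuple


def is_plausible_street(value) -> bool:
    """Reject missing values and single-character entries such as "1" or "A"."""
    if value is None:
        return False
    return len(str(value).strip()) > 1


def segment_inputs(row: dict) -> Optional[Tuple[str, str, str]]:
    """Return (on_street, cross_street_1, cross_street_2) if all three
    values look usable, otherwise None.

    The column names are assumed for this example.
    """
    fields = (
        row.get("street_name"),
        row.get("cross_street_1"),
        row.get("cross_street_2"),
    )
    if all(is_plausible_street(f) for f in fields):
        return tuple(str(f).strip() for f in fields)
    return None
```

A row such as `{"street_name": "BROADWAY", "cross_street_1": "W 42 ST", "cross_street_2": "B"}` would be screened out, since its second cross street is a single letter.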