Raleigh, NC – Last night the Triangle Code for America Brigades came together to help the City of Raleigh with its Open Data Portal. The task was to clean up the tagging and metadata for all the data files that have been added since the version 1.0 portal was launched in March 2013. At the beginning of 2014, Raleigh will migrate the portal to version 2.0 and wants the data to be as clean as possible, and therefore as useful as possible to citizens and entrepreneurs.
Representatives from the Raleigh, Durham, Cary and (brand new) Morrisville Brigades met at the North Carolina Innovation Center for a mini-hackathon to clean up the Raleigh data sets.
Data Identification Problem
Jason Hare, the Open Data Manager for Raleigh, gave us direction on what Raleigh needs. Each data set was to be reviewed for data set naming, assignment to the proper data category, tag cleanup, and correct attribution of the data's source and license (Public Domain).
Having a common nomenclature for data descriptions and metadata is important. Currently, there is no standard for data set naming, field naming or metadata. This becomes a problem when open data sets from multiple municipalities are combined; what data in City A corresponds to the data from City B? Until there are uniform definitions and structure, Open Data will have problems with data integration across regions.
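To make the integration problem concrete, here is a minimal sketch in Python of what combining two cities' data sets involves when no shared naming exists. Every field name, mapping, and record below is hypothetical and invented for illustration; none of it comes from Raleigh's portal or any other city's.

```python
# Hypothetical field names: the article does not show either city's real schema.
CITY_A_TO_COMMON = {"incident_dt": "reported_at", "offense": "incident_type"}
CITY_B_TO_COMMON = {"DateReported": "reported_at", "IncidentCategory": "incident_type"}


def to_common(record, mapping):
    """Rename a record's keys to the shared names, dropping unmapped fields."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


# Invented sample rows, one per city.
city_a_rows = [{"incident_dt": "2013-11-01", "offense": "Larceny"}]
city_b_rows = [{"DateReported": "2013-11-02", "IncidentCategory": "Vandalism", "Beat": "4"}]

# Only after both sources share one vocabulary can they be combined.
combined = (
    [to_common(row, CITY_A_TO_COMMON) for row in city_a_rows]
    + [to_common(row, CITY_B_TO_COMMON) for row in city_b_rows]
)
```

Without an agreed standard, someone has to write and maintain a mapping like this by hand for every pair of cities, which is the burden a shared nomenclature would remove.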
Naming data sets accurately also matters. An example given by Jason is that police “crime” data sets are more accurately described as police “incident” data sets. As Jason sees it, an incident can be reported immediately, but there might be a lengthy process to properly identify a crime.
Data Definition Solution
There is light at the end of this open data tunnel: Jason has been invited to attend a meeting of the top US cities with Open Data programs to review data categorization and nomenclature. The meeting has been called by the Technology Office of the White House to start developing a data nomenclature. Solving this issue is a high priority, before it becomes overly burdensome for all of the Open Data programs.
It is good to see Raleigh included with the top US cities at the forefront of Open Data.