
3.1.2: Data Set Selection and the DOT Data Inventory

Develop a more mature data inventory and consider ways to break down barriers that make data inaccessible.

Other Information:

As of April 7, 2010, the DOT has released 13 data sets that are accessible on Data.gov. The DOT is developing a more mature data inventory and considering ways to break down barriers that make data inaccessible. As such, the DOT is exploring tools that can read structured data sets and produce data and the associated metadata in open formats such as eXtensible Markup Language (XML). (XML is a set of rules about the structure of data or documents. It makes information interchange among a wide variety of systems easier.)
The current inventory identifies approximately 50 data sets across the Department, including some that have never been released. A large proportion of the data sets in the current inventory are publicly available, but not in an open format. It is important to note that not every data set in the inventory may be suitable for release. However, the presumption shall be in favor of openness, to the extent permitted by law and subject to valid privacy, confidentiality, security, or other restrictions.
Some examples of high-value data sets that are publicly available now, but not in open formats, include data on safety defects, car recalls, transit ridership, selected air carrier data, and selected transportation fatality data. These data sets will be considered in the data inventory prioritization process that is currently underway.
DOT’s data inventory could contain many types of structured and unstructured information, including, but not limited to, XML data sets and comprehensive reports to external stakeholders (e.g., Congress). The DOT releases numerous comprehensive reports on its websites at varying intervals. These reports assemble internal DOT data as well as data from state and local departments of transportation or other public entities. Consistent with completing our data inventory, DOT will identify cases in which we provide public information in electronic format but do not expose the underlying data.
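
As an illustration only, the kind of tool described above (one that reads a structured data set and produces the data and its associated metadata as XML) might look like the following minimal Python sketch. It does not represent any tool the DOT has selected. The function name `csv_to_xml`, the element names (`dataset`, `metadata`, `record`, `field`), and the sample rows are all hypothetical and chosen for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, title, source):
    """Convert CSV text to an XML string holding metadata and one <record> per row.

    The XML layout used here is an assumption made for illustration,
    not a DOT or Data.gov schema.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    root = ET.Element("dataset")

    # Metadata block: descriptive information about the data set itself.
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "title").text = title
    ET.SubElement(meta, "source").text = source
    ET.SubElement(meta, "recordCount").text = str(len(rows))

    # Data block: column names are stored as attributes so that headers
    # containing spaces or punctuation cannot produce invalid element names.
    records = ET.SubElement(root, "records")
    for row in rows:
        rec = ET.SubElement(records, "record")
        for column, value in row.items():
            ET.SubElement(rec, "field", name=column).text = value

    return ET.tostring(root, encoding="unicode")


# Made-up sample rows, not real DOT data.
sample = "year,mode,riders\n2008,bus,100\n2009,rail,250\n"
xml_text = csv_to_xml(sample, "Sample transit ridership", "hypothetical example data")
print(xml_text)
```

Storing column names as attributes rather than as element names is a design choice made for this sketch; a production tool would more likely follow a published schema.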
DOT will work with its Operating Administrations, where the data, subject matter expertise, and analytical capability reside, to develop a process whereby the raw data contained in these reports is released to Data.gov concurrently with the reports themselves.
The DOT will complete a comprehensive Department-wide data inventory, to support the data set selection and release process, by September 30, 2010. After completing this inventory, the DOT will establish timelines for publication of appropriate information not yet available for download in open formats and set specific target dates for release. Once those target dates are formalized, they will be included in the next iteration of the DOT Open Government Plan.
In addition to creating a process for releasing data from the Department and developing a data inventory, the Department will also develop a method to prioritize data sets for release. We are considering enhancing usability by also indicating whether a high-value data set was previously unavailable, available only with a FOIA request, available only for purchase, or available but in a less user-friendly format.
It is also important to designate what DOT data is considered high-value. The Open Government Directive defines high-value data as data that can be used to:
• Increase agency accountability and responsiveness;
• Improve public knowledge of the agency and its operations;
• Further the core mission of the agency;
• Create economic opportunity; or
• Respond to need and demand as identified through public consultation.

Indicator(s):