[2015-04-30] DCO Data Type Registry Implementation Project Update

DCO ID 11121/2588-5037-8242-8988-CC

Date Submitted

  • 2015-04-30

Update Text

  • Background: The Research Data Alliance (RDA) Data Type Registry (DTR) Working Group has addressed a core problem of research data management: making it easier to parse, understand, and potentially reuse data assets produced by other research teams. This issue has traditionally been addressed at the syntactic level (e.g., file types and MIME types) but not at the semantic level: the data in a table column may be known to be integers, but what do they represent? RDA DTR developed a framework that aims to enable data creators to record and make explicit the implicit assumptions of a dataset. In related work, the RDA Persistent Identifier Information Types (PIT) Working Group addressed the essential types of information associated with persistent identifiers. RDA PIT developed a conceptual model for structuring typed information, an application programming interface (API) for accessing typed information, and a demonstrator implementing the interface.

    Relevance to DCO: The Deep Carbon Observatory (DCO) Data Portal is built around a centrally managed service for digital object identification, object registration, and metadata management. The DCO Portal provides the digital object registration process for DCO Community members, which includes two key components: (1) DCO-ID generation based on the global Handle System and (2) metadata collection for each registered DCO object, which may be any piece of DCO knowledge but importantly includes datasets. Datasets created by the DCO science community span a wide range of formats and topics in Earth and space sciences. DCO researchers are expected to generate a large number of digital object registrations and therefore need an efficient, domain-appropriate means for curating and reusing their data based on the registered information.

    DCO-DS contributions: The DCO Data Science Team (DCO-DS) deployed data type registration capabilities complying with the RDA DTR and PIT recommendations in the DCO Data Portal, both to facilitate data curation in the DCO community and to provide feedback to the RDA community. In particular, DCO-DS developed a general information model that captures various properties of DCO datasets; instances of this model (i.e., specific named collections of values of these properties that characterize DCO datasets) represent DCO data types. This information model was then used as the basis for an expansion of the DCO Dataset Browser, a data discovery tool for DCO data, and for the DCO Data Type browser, a tool for exploring the DCO data type space and for discovering datasets based on specific properties of the data. DCO-DS participated in the RDA Plenary (Winter 2015) and presented this work. A conference report/blog was prepared by Stephan Zednik: http://tw.rpi.edu/weblog/2015/04/30/dco-ds-participation-at-research-data-alliance-plenary-5-meeting/
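
    Illustrative sketch: The idea of a data type as a named collection of property values, and of discovering datasets by those properties, can be pictured roughly as follows. This is only a minimal sketch under stated assumptions and not the actual DCO information model or portal code: the class names, the property names (format, quantity, unit), the dataset titles, and the identifiers are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataType:
    """A named collection of property values (hypothetical structure)."""
    name: str
    properties: dict


@dataclass
class Dataset:
    """A registered dataset with a Handle-style identifier ("prefix/suffix")."""
    dco_id: str
    title: str
    data_type: DataType


def find_datasets(datasets, **wanted):
    """Return the datasets whose data type matches every requested property value."""
    return [
        d for d in datasets
        if all(d.data_type.properties.get(key) == value
               for key, value in wanted.items())
    ]


# Hypothetical data types; the property names and values are assumptions.
gas_composition = DataType(
    "gas-composition-table",
    {"format": "csv", "quantity": "CO2 concentration", "unit": "ppm"},
)
isotope_ratio = DataType(
    "isotope-ratio-table",
    {"format": "csv", "quantity": "carbon isotope ratio", "unit": "permil"},
)

# Hypothetical registrations; these identifiers are placeholders, not real DCO-IDs.
datasets = [
    Dataset("11121/EXAMPLE-0001", "Hypothetical vent gas samples", gas_composition),
    Dataset("11121/EXAMPLE-0002", "Hypothetical inclusion isotopes", isotope_ratio),
]

# Discover datasets by a property of the data rather than by file format alone.
for match in find_datasets(datasets, unit="ppm"):
    print(match.dco_id, match.title)
```

    In this sketch, two datasets that share the same syntactic format (csv) remain distinguishable by what their values represent, which is the semantic gap the data type approach is meant to close.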