HydroShare is a collaborative environment being developed for open sharing of hydrologic data and models (Tarboton et al., 2014a; 2014b). The goal is to enable scientists to load data and models into HydroShare, easily discover and access hydrologic data and models, and either retrieve them to their desktop or perform analyses in a distributed computing environment that may include grid, cloud, or high-performance computing model instances. Ultimately, scientists will be able to publish data and models as permanent digital objects that support reproducible research.
Collaborative data analysis and publication is one use case driving the development of HydroShare (Figure 1). This use case extends the existing data sharing functionality of the Consortium of Universities for the Advancement of Hydrologic Science Inc. (CUAHSI) Hydrologic Information System (HIS) (Tarboton et al., 2009) into a dynamic collaborative environment leading to the eventual archival publication of data.
Figure 1. Collaborative data analysis and publication use case.
At (1) data are observed and then loaded (2). In the current CUAHSI HIS, data are loaded into an observations data model relational database on a server that publishes them using web services (Horsburgh et al., 2008; 2010). The HIS Central catalog harvests the metadata and supports geographic and context-based data discovery. A desktop client user (3) discovers, downloads, and analyzes the data, or uses them in a model. Steps 1 to 3 are supported by the existing CUAHSI HIS. HydroShare picks up from here, allowing the user to post the results (data and model) to HydroShare as resources while retaining provenance information on the original data source (4). This will be done through sharing features being added to the CUAHSI desktop client, HydroDesktop (Ames et al., 2012). HydroShare will also support direct entry of new resources. Upon ingestion, background actions parse metadata and enable analysis based on rules and policies. The user shares posted resources with colleagues (5), designating who has permission to access them. A group then collaborates on refining the analysis, model, or result, while HydroShare tracks provenance to support reproducibility and transparency. After iteration, the result is finalized and submitted for publication (6). At this point the resources produced (data, model, workflow, paper) are made immutable, access is opened, and persistent identifiers (e.g., DOIs) are assigned. The data may be moved to a permanent repository under the auspices of the CUAHSI Water Data Center (7) or an alternative digital library or archive.
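The resource lifecycle in steps 4 to 6 (post with provenance, share with designated colleagues, then publish as an immutable, openly accessible object with a persistent identifier) can be illustrated with a minimal Python sketch. This is not the HydroShare implementation or API. The class `Resource`, its fields, and its methods are hypothetical names introduced here for illustration only, and the identifier is passed in by the caller rather than minted by a real DOI service.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Resource:
    """Hypothetical model of a shared resource (not the HydroShare API)."""
    title: str
    owner: str
    source: str                       # provenance: the original data source (step 4)
    content: str = ""
    shared_with: Set[str] = field(default_factory=set)
    published: bool = False
    identifier: Optional[str] = None  # persistent identifier, set on publication

    def share(self, requester: str, user: str) -> None:
        """Step 5: the owner designates who may access the resource."""
        if requester != self.owner:
            raise PermissionError("only the owner can share a resource")
        self.shared_with.add(user)

    def can_read(self, user: str) -> bool:
        """Published resources are open; otherwise access is by permission."""
        return self.published or user == self.owner or user in self.shared_with

    def edit(self, user: str, new_content: str) -> None:
        """Collaborators refine the resource until it is published."""
        if self.published:
            raise ValueError("published resources are immutable")
        if user != self.owner and user not in self.shared_with:
            raise PermissionError(f"{user} has no access to this resource")
        self.content = new_content

    def publish(self, requester: str, identifier: str) -> None:
        """Step 6: freeze the resource, open access, record the identifier."""
        if requester != self.owner:
            raise PermissionError("only the owner can publish a resource")
        if self.published:
            raise ValueError("resource is already published")
        self.published = True
        self.identifier = identifier
```

In this sketch a resource starts private to its owner, becomes editable by named collaborators after `share`, and after `publish` is readable by anyone but rejects further edits. Keeping `source` as a required field is one simple way to ensure that provenance travels with the resource from the moment it is posted.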