Figure 1 depicts the main modules of ODN. The typical workflow for data publishing and data consumption is as follows:
- Data publisher creates new dataset record in ODN/InternalCatalog.
- Data publisher specifies metadata of the dataset record, such as its name, description, license, topic, etc.
- Data publisher associates the dataset record with a data publishing pipeline (created either with the wizard or directly in ODN/UnifiedViews, an ETL tool for RDF data).
- Data publisher specifies sources of the dataset (Excel spreadsheets, tables in relational database, REST services, etc.).
- Data publisher specifies how the data sources should be transformed (certain rows/columns removed or adjusted in tabular data sources; data sources anonymised, integrated with other data sources, cleansed, etc.).
- Data publisher specifies how the data should be published (in which formats it will be available to data consumers: CSV/RDF dumps, REST API, SPARQL Endpoint API).
- Data publisher clicks "Publish" button to publish data sources for the given dataset as defined.
- Publishing pipeline is run in ODN/UnifiedViews, which ensures extraction of the data sources, their transformation, and creation of the published output data.
- ODN/InternalCatalog is updated with links to newly published data.
- ODN/Publication module prepares REST API and SPARQL Endpoint API for the published data.
- Data publisher verifies the publication process and makes the dataset publicly available.
- As a result, dataset record in ODN/InternalCatalog is pushed to publicly available ODN/PublicCatalog.
- Data consumer may access published data (using ODN/PublicCatalog).
- Data consumer may download dumps of published data.
- Data consumer may access published data via REST API or SPARQL Endpoint API.
- Application developer may access Atom feed to get published data that has changed since the last access.
- Data publisher may update data at any time (by re-running publication process).
- Update may be automated (by scheduling repeated re-runs of the publication process).
- (Optional) Data publisher may also configure search strategy for the published dataset, so that data consumers may search the dataset using tailored search engine.
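
The extract, transform, and publish steps of the workflow above can be illustrated with a minimal Python sketch. It is not ODN or UnifiedViews code: the function names (`extract`, `transform`, `publish_csv`, `run_pipeline`), the sample data, and the choice of a truncated SHA-256 hash for anonymisation are all assumptions made for illustration. It only shows the general shape of a pipeline that reads a tabular source, drops and anonymises columns, and writes a CSV dump.

```python
import csv
import hashlib
import io

# Hypothetical source data standing in for a spreadsheet or database table.
SOURCE_CSV = """id,name,email,city
1,Alice,alice@example.org,Bratislava
2,Bob,bob@example.org,Kosice
"""


def extract(text):
    """Extract: read the tabular source into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, drop_columns, anonymise_columns):
    """Transform: drop unwanted columns and hash values of sensitive ones."""
    result = []
    for row in rows:
        out = {}
        for key, value in row.items():
            if key in drop_columns:
                continue  # column removed from the published data
            if key in anonymise_columns:
                # Illustrative anonymisation only: truncated SHA-256 digest.
                value = hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
            out[key] = value
        result.append(out)
    return result


def publish_csv(rows):
    """Publish: serialise the transformed rows as a CSV dump (a string)."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def run_pipeline(source_text):
    """Run extraction, transformation, and publication in sequence."""
    rows = extract(source_text)
    rows = transform(rows, drop_columns={"email"}, anonymise_columns={"name"})
    return publish_csv(rows)


if __name__ == "__main__":
    print(run_pipeline(SOURCE_CSV))
```

Re-running `run_pipeline` on updated source data corresponds to the publisher re-running the publication process; in ODN itself these steps are configured in the pipeline rather than written by hand.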