
  1. Create new dataset record in ODN/InternalCatalog and associate it with a publishing pipeline  
    1. Specify metadata of the dataset record, such as its name, description, license, topic, etc. - create a new catalog record for a dataset 
    2. Associate the dataset record with a data publishing pipeline (created directly using ODN/UnifiedViews, an ETL tool) - use one of the following approaches: create a new associated pipeline manually, associate an existing non-associated pipeline, or create a new associated pipeline as a modified copy of an existing pipeline; here's a quick guide on how to create a pipeline from scratch
      1. Specify source data for the dataset (data files, e.g. Excel spreadsheets, tables in relational database, data accessible via API etc.)
      2. Specify how the data sources should be transformed (Certain rows/columns removed/adjusted in the tabular data sources, data sources are anonymised, integrated with other data sources, cleansed etc.)
      3. Specify how the data should be published (in which formats they will be available to data consumers - CSV/RDF/... dumps, REST API, SPARQL Endpoint API)
    3. Dissociate a pipeline if it is not needed for producing data resources for a dataset anymore - dissociate a pipeline
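The extract-transform-publish steps a pipeline performs can be sketched in miniature. This is only an illustration of the idea; the function names and sample data below are invented and are not the ODN/UnifiedViews API:

```python
# Minimal sketch of the three pipeline stages: extract a tabular
# source, transform it (here: anonymise by dropping a column), and
# produce the published output dump. All names are illustrative.
import csv
import io

# Hypothetical source data, standing in for an uploaded spreadsheet.
SOURCE_CSV = "name,email,city\nAlice,alice@example.org,Bratislava\nBob,bob@example.org,Kosice\n"

def extract(raw):
    """Extract: parse the tabular source into rows."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Transform: drop the personal 'email' column (anonymisation)."""
    return [{k: v for k, v in row.items() if k != "email"} for row in rows]

def load(rows):
    """Publish: serialise the cleansed rows as a CSV dump."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

published = load(transform(extract(SOURCE_CSV)))
print(published)
```

A real pipeline would chain configurable DPUs instead of hard-coded functions, but the data flow is the same.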

  2. Create and publish resources for the given dataset
    1. Manual execution of the publishing pipeline, which ensures extraction of the data sources, their transformation, and creation of the published output data - create/update dataset resource(s) on demand.
    2. Automated execution of the publishing pipeline according to a set schedule - display/manage pipeline scheduling information
    3. Update of ODN/InternalCatalog by the ODN/Publication module with the resources created for a dataset - create/update dataset resource(s) automatically when data processing is done; the resources contain:
      1. downloadable links to newly published data in case of file dumps 
      2. REST API and SPARQL Endpoint API interfaces, which make the data accessible in a more sophisticated way intended for automated use by 3rd-party applications, etc.
      3. resource specific metadata describing the resource of the dataset

  3. Verify and debug the publication process, make the dataset publicly available
    1. verify the publication status after pipeline execution - display last publication status of a pipeline
    2. tune the configuration of a pipeline - debug pipeline 
    3. make the dataset publicly available in ODN/PublicCatalog - make dataset publicly available
    4. also publish the publicly available dataset to external catalog(s), defined per dataset (both metadata and data are pushed) - add external catalog for update, modify external catalog for update, delete external catalog for update

  4. Create a visualization of a dataset resource accessible via SPARQL endpoint and publish the visualization - add visualization to dataset

  5. Display (using ODN/InternalCatalog):
    1. metadata of the dataset - retrieve metadata about dataset via public catalog API
    2. metadata of dataset resource - list available resources for a dataset

  6. Access published data (using ODN/PublicCatalog)
    1. download file dumps of published data - download latest version of public dataset file dump
    2. access published data via REST API/SPARQL Endpoint API - retrieving public data from a dataset via REST API
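How a 3rd-party application would address the SPARQL Endpoint API can be sketched as follows. The endpoint URL is a placeholder, not a real ODN deployment; only the request is built here, no network call is made:

```python
# Sketch of building a SPARQL query request against a dataset's
# SPARQL Endpoint API. The endpoint URL below is hypothetical.
from urllib.parse import urlencode

ENDPOINT = "https://odn.example.org/sparql"  # placeholder endpoint

# Ask for the first ten triples of the published dataset.
query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

# Per the SPARQL 1.1 Protocol, a GET request carries the query in the
# 'query' parameter; JSON results are requested via the Accept header.
request_url = ENDPOINT + "?" + urlencode({"query": query})
headers = {"Accept": "application/sparql-results+json"}
print(request_url)
```

The same URL works from any HTTP client, which is what makes the SPARQL endpoint suitable for automated consumption.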

  7. Display the list of datasets contained in the catalog using:
    1. GUI of the catalog (list of datasets) - list of datasets contained in public catalog (GUI)
    2. API of the catalog (list of datasets in machine readable form) - list of datasets contained in the public catalog (API)
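Listing datasets in machine-readable form might look like the sketch below, assuming a CKAN-style action API for the public catalog; the base URL is a placeholder, and the response is a canned example rather than a live call:

```python
# Sketch of listing catalogued datasets via a CKAN-style action API.
# The base URL and the sample response are illustrative assumptions.
import json

BASE = "https://data.example.org"  # hypothetical ODN/PublicCatalog instance
list_url = BASE + "/api/3/action/package_list"

# A CKAN-style API wraps results in a JSON envelope like this one:
sample_response = '{"success": true, "result": ["dataset-a", "dataset-b"]}'
payload = json.loads(sample_response)
datasets = payload["result"] if payload.get("success") else []
print(datasets)
```

A client should always check the `success` flag before reading `result`, since errors are reported in the same envelope.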

  8. Display (using ODN/PublicCatalog):
    1. metadata of the dataset - retrieve metadata about dataset via public catalog API
    2. metadata of dataset resource - list available resources for a dataset

  9. Share a dataset or its resource by using social media - social media sharing

  10. When working with public datasets, you can:
    1. browse public data records - publicly available catalog record browser
    2. search them using keywords - publicly available catalog records search using keywords
    3. filter them - publicly available catalog records filtering

  11. And also:
    1. set the GUI language, so you can communicate with other consumers in your native language - GUI language management
    2. display available resources for a public dataset - list of available resources for a public dataset

 
