- Create new dataset record in ODN/InternalCatalog and associate it with a publishing pipeline
- Specify metadata of the dataset record, such as its name, description, license, topic, etc. - create a new catalog record for a dataset
- Associate the dataset record with a data publishing pipeline (created directly using ODN/UnifiedViews, an ETL tool) - use one of the following approaches: create a new associated pipeline manually, associate an existing non-associated pipeline, or create a new associated pipeline as a modified copy of an existing pipeline. Here's a quick guide on how to create a pipeline from scratch; additionally, this descriptive list of DPUs may come in handy while creating a pipeline.
- Specify source data for the dataset (data files, e.g. Excel spreadsheets, tables in a relational database, data accessible via an API, etc.)
- Specify how the data sources should be transformed (certain rows/columns removed/adjusted in the tabular data sources, data sources anonymised, integrated with other data sources, cleansed, etc.)
- Specify how the data should be published (the formats in which it will be available to data consumers - CSV/RDF/... dumps, REST API, SPARQL Endpoint API)
- Dissociate a pipeline when it is no longer needed for producing data resources for a dataset - dissociate a pipeline
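The transformation step above (removing or adjusting rows/columns, anonymising, cleansing) is normally configured inside ODN/UnifiedViews DPUs rather than hand-coded. As a rough illustration of what such a step does to a tabular source, here is a minimal Python sketch; the column names and filter rule are made up:

```python
import csv
import io

def transform(csv_text, drop_columns, keep_row):
    """Drop unwanted columns and filter rows from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = [f for f in reader.fieldnames if f not in drop_columns]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for row in reader:
        if keep_row(row):
            writer.writerow({f: row[f] for f in fields})
    return out.getvalue()

# Hypothetical source: drop a personal-data column (anonymisation),
# keep only rows for one year (row filtering).
source = "name,year,salary\nAlice,2023,100\nBob,2022,90\n"
result = transform(source, drop_columns={"salary"},
                   keep_row=lambda r: r["year"] == "2023")
```

In a real pipeline the equivalent logic lives in a DPU's configuration, not in application code.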
- Create and publish resources for the given dataset
- Manually execute the publishing pipeline, which ensures extraction of the data sources, their transformation, and creation of the output published data - create/update dataset resource(s) on demand
- Execute the publishing pipeline automatically according to the schedule set - display/manage pipeline scheduling information
- Have the ODN/Publication module update ODN/InternalCatalog with the resources created for a dataset - create/update dataset resource(s) automatically when data processing is done. The resources contain:
- downloadable links to newly published data in case of file dumps
- REST API and SPARQL Endpoint API interfaces that make the data accessible in a more sophisticated way, dedicated to automated use of the data by 3rd party applications, etc.
- resource-specific metadata describing each resource of the dataset
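Taken together, a published resource record of the kind listed above might look like the following Python dict. This is only an illustration of the shape of the information; the key names are assumptions, not the actual ODN/InternalCatalog schema:

```python
# Illustrative shape of a dataset resource record; the keys are
# assumptions, not the actual ODN/InternalCatalog schema.
resource = {
    "name": "budget-2023-csv",
    "format": "CSV",                                      # file dump
    "url": "https://example.org/dumps/budget-2023.csv",   # downloadable link
    "api": {
        "rest": "https://example.org/api/budget",         # REST API interface
        "sparql": "https://example.org/sparql",           # SPARQL Endpoint API
    },
    "description": "Municipal budget 2023, anonymised and cleansed",
    "last_modified": "2023-06-01T12:00:00",
}
```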
- Verify and debug the publication process
...
- verify the publication status after pipeline execution - display last publication status of a pipeline
- tune the configuration of a pipeline - debug pipeline
...
- Make the dataset publicly available
- make the dataset publicly available in ODN/PublicCatalog - make dataset publicly available
- publish the publicly available dataset also to external catalog(s), defined per dataset (both metadata and data are pushed) - add external catalog for update, modify external catalog for update
...
- Create a visualization of a dataset resource accessible via SPARQL endpoint and publish the visualization - create visualization, add visualization to dataset
...
- Display or retrieve metadata about dataset and its resource(s) (all datasets - private and public, using ODN/InternalCatalog):
- display metadata about the dataset - browse catalog records
- display metadata about dataset resource(s) - list available resources for a dataset
- retrieve metadata about the dataset - retrieve metadata about the dataset via internal catalog API
...
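Retrieving dataset metadata through the internal catalog API can be sketched as follows. This assumes the catalog exposes a CKAN-style action API (ODN catalogs are CKAN-based); the host name and dataset id are placeholders, and the canned response only illustrates the expected shape:

```python
import json
from urllib.parse import urlencode

def metadata_url(catalog_base, dataset_id):
    """Build a CKAN-style package_show URL (catalog API assumed CKAN-compatible)."""
    return f"{catalog_base}/api/3/action/package_show?{urlencode({'id': dataset_id})}"

url = metadata_url("https://odn.example.org/internalcatalog", "budget-2023")

# An actual call would GET `url`; here we parse a canned response of the
# shape a CKAN-style API returns.
response_text = json.dumps({
    "success": True,
    "result": {"name": "budget-2023", "title": "Municipal budget 2023",
               "resources": [{"format": "CSV"}]},
})
meta = json.loads(response_text)["result"]
```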
- Access published data (all datasets - private and public, using ODN/InternalCatalog)
- download file dumps of published data - download latest version of file dump
...
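Selecting the latest file dump from a dataset's resource list can be sketched like this; the field names (`format`, `last_modified`, `url`) are assumptions about the catalog's resource metadata, and the actual download would be an ordinary HTTP GET of the returned URL:

```python
def latest_dump(resources, fmt="CSV"):
    """Return the URL of the most recently modified dump in the given format."""
    dumps = [r for r in resources if r.get("format") == fmt]
    if not dumps:
        return None
    # ISO 8601 timestamps sort correctly as strings.
    return max(dumps, key=lambda r: r["last_modified"])["url"]

# Hypothetical resource list for one dataset:
resources = [
    {"format": "CSV", "url": "https://example.org/d/v1.csv", "last_modified": "2023-01-01"},
    {"format": "CSV", "url": "https://example.org/d/v2.csv", "last_modified": "2023-06-01"},
    {"format": "RDF", "url": "https://example.org/d/v1.rdf", "last_modified": "2023-07-01"},
]
url = latest_dump(resources)
```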
- retrieve published data via REST API
...
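Retrieving published data via the REST API can be sketched as building a paged query URL. The endpoint and parameter names below are hypothetical, not a documented ODN API; they only illustrate the kind of request a 3rd party application would make:

```python
from urllib.parse import urlencode

def rest_query_url(api_base, limit=100, offset=0, **filters):
    """Build a paged REST query URL (parameter names are hypothetical)."""
    params = {"limit": limit, "offset": offset, **filters}
    return f"{api_base}?{urlencode(sorted(params.items()))}"

# Hypothetical endpoint and filter:
url = rest_query_url("https://odn.example.org/api/budget", limit=50, year=2023)
```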
- retrieve published data via SPARQL endpoint - retrieving data from a dataset via SPARQL endpoint
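Querying the published data through the SPARQL endpoint amounts to POSTing a query and parsing the standard SPARQL 1.1 JSON results format. The sketch below shows only query construction and result parsing with a canned response; the endpoint URL and data are placeholders:

```python
import json

# A real call would POST this query to the dataset's SPARQL endpoint with
# Accept: application/sparql-results+json.
query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s ?label WHERE { ?s rdfs:label ?label } LIMIT 10
"""

# Canned response in the SPARQL 1.1 Query Results JSON Format:
response_text = json.dumps({
    "head": {"vars": ["s", "label"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/d/1"},
         "label": {"type": "literal", "value": "Budget 2023"}},
    ]},
})
data = json.loads(response_text)
rows = [{v: b[v]["value"] for v in b} for b in data["results"]["bindings"]]
```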
- Working with the catalog you can
- search for datasets using keywords - search for catalog records using internal catalog
- filter them - catalog records filtering in internal catalog
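Keyword search and filtering against the internal catalog can be sketched as a CKAN-style `package_search` request; this assumes a CKAN-compatible API, and the filter fields are illustrative:

```python
from urllib.parse import urlencode

def search_url(catalog_base, keywords, **filters):
    """Build a CKAN-style package_search URL; filter fields are illustrative."""
    params = {"q": keywords}
    if filters:
        params["fq"] = " AND ".join(f'{k}:"{v}"' for k, v in sorted(filters.items()))
    return f"{catalog_base}/api/3/action/package_search?{urlencode(params)}"

# Hypothetical search: keyword "budget", filtered by a tag.
url = search_url("https://odn.example.org/internalcatalog", "budget", tags="finance")
```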
- Also you can...
- modify your own user profile - modify user's own user profile
- export pipeline to zip file and import it to another instance of ODN - export pipeline to zip file and import pipeline without DPU JARs
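The export/import round trip for pipelines is handled by ODN itself; as a rough sketch of what moving a pipeline between instances involves, here is an in-memory zip round trip. The file layout and pipeline definition inside the archive are hypothetical, not the real ODN/UnifiedViews export format:

```python
import io
import json
import zipfile

# Hypothetical pipeline definition; the real format is defined by ODN/UnifiedViews.
pipeline = {"name": "budget-publishing", "dpus": ["extractor", "transformer", "loader"]}

# Export: write the definition into a zip (no DPU JARs included).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("pipeline.json", json.dumps(pipeline))

# Import: read it back, as another ODN instance would.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    imported = json.loads(zf.read("pipeline.json"))
```

Since DPU JARs are not part of the export, the importing instance must already have the required DPUs installed.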
- manage the communication language so you can communicate with other consumers in your native language - GUI language management
- display available resources for a public dataset - list of available resources for a public dataset