DPU = Data Processing Unit
DPUs are used for working with datasets in the Unified Views module. Users and administrators use them to build the pipelines that execute tasks in ODN. They are the basic building blocks of a pipeline.
If you would like to create a new DPU, refer to https://github.com/UnifiedViews/Plugin-DevEnv
List of core DPUs to work with in Unified Views
DPUs allow you to perform the requested actions on your data, datasets, files or lists of files. In the overview below, each action is followed by the number of the corresponding DPU.
Basically, DPUs are classified by their initial letter as
E- extracting tools
T- transformation tools
L- loading tools
Briefly:
- upload 4 /download 1, zip 22 /unzip 20, merge 8, 13, filter 6, rename 9
- convert among formats or types of data / files 3, 5, 19
- extract data (with following actions) 2, 10
- transform data 14, 15, 16, 17, 18, 21
- extract metadata 12
- search and replace (string, patterns) 7
- validate XML 11
*DPU - a plugin in a data processing pipeline that performs a certain transformation, cleansing or quality assessment on the processed data. A DPU encapsulates the business logic needed when processing data (e.g., one DPU may extract data from a SPARQL endpoint or apply a SPARQL query). Every DPU must define its required/optional inputs and the outputs it produces.
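The two ideas above (classification by initial letter, and declared inputs and outputs) can be sketched as a small data model. This is only an illustration: the `Port` and `Dpu` classes and the `KINDS` labels are hypothetical names and are not part of the Unified Views API.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the three DPU classes (E, T, L).
KINDS = {"E": "extractor", "T": "transformer", "L": "loader"}


@dataclass(frozen=True)
class Port:
    name: str        # e.g. "filesInput"
    direction: str   # "i" for input, "o" for output
    data_unit: str   # e.g. "FilesDataUnit"


@dataclass
class Dpu:
    name: str
    ports: list = field(default_factory=list)

    @property
    def kind(self) -> str:
        # The letter before the first hyphen gives the class of the DPU.
        return KINDS[self.name.split("-", 1)[0].upper()]

    def inputs(self) -> list:
        return [p for p in self.ports if p.direction == "i"]

    def outputs(self) -> list:
        return [p for p in self.ports if p.direction == "o"]


# Ports copied from the T-UnZipper table in this list.
unzipper = Dpu("T-UnZipper", [
    Port("input", "i", "FilesDataUnit"),
    Port("output", "o", "FilesDataUnit"),
])
print(unzipper.kind, [p.name for p in unzipper.inputs()])
```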
1. E-FilesDownload
Downloads a list of files. Replaces E-FilesFromLocal and E-HttpDownload.
Name | Type | DataUnit | Description |
---|---|---|---|
output | o | FilesDataUnit | Downloaded files. |
https://github.com/UnifiedViews/Plugins/tree/master/e-filesDownload
2. E-RelationalFromSql
Extracts data from external relational database tables into the internal database.
Name | Type | DataUnit | Description |
---|---|---|---|
outputTables | o | RelationalDataUnit | Extracted database tables |
https://github.com/UnifiedViews/Plugins/tree/master/e-relationalFromSql
3. L-FilesToVirtuoso
VirtuosoLoader calls Virtuoso internal functions to load a directory of RDF data.
Name | Type | DataUnit | Description |
---|---|---|---|
TODO: provide Name, Dataunit and Description of input | i |
https://github.com/UnifiedViews/Plugins/tree/master/l-filesToVirtuoso
4. L-FilesUpload
Uploads a list of files. Replaces L-FilesToLocalFS and L-FilesToScp.
Name | Type | DataUnit | Description |
---|---|---|---|
filesInput | i | FilesDataUnit | Files to upload to specified destination. |
https://github.com/UnifiedViews/Plugins/tree/master/l-filesUpload
5. L-RelationalToSql
Loads input internal database tables into an external SQL database (currently only PostgreSQL is supported).
Name | Type | DataUnit | Description |
---|---|---|---|
inTablesData | i | RelationalDataUnit | Input database tables |
https://github.com/UnifiedViews/Plugins/tree/master/l-relationalToSql
6. T-FilesFilter
Filters files.
Name | Type | DataUnit | Description |
---|---|---|---|
input | i | FilesDataUnit | List of files to be filtered. |
output | o | FilesDataUnit | List of files passing the filter. |
https://github.com/UnifiedViews/Plugins/tree/master/t-filesFilter
7. T-FilesFindAndReplace
Finds and replaces strings (patterns) in files.
Name | Type | DataUnit | Description |
---|---|---|---|
filesInput | i | FilesDataUnit | Input files |
filesOutput | o | FilesDataUnit | Output files |
https://github.com/UnifiedViews/Plugins/tree/master/t-filesFindAndReplace
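The find-and-replace step can be pictured with a minimal sketch that assumes the patterns are regular expressions and the files are UTF-8 text. The function `find_and_replace` and its parameters are hypothetical and do not reflect the plugin's actual configuration.

```python
import re
import tempfile
from pathlib import Path


def find_and_replace(in_dir: Path, out_dir: Path, pattern: str, replacement: str) -> list:
    """Write a copy of every file in in_dir to out_dir with all matches replaced."""
    out_dir.mkdir(parents=True, exist_ok=True)
    regex = re.compile(pattern)
    written = []
    for src in sorted(p for p in in_dir.iterdir() if p.is_file()):
        text = src.read_text(encoding="utf-8")
        dst = out_dir / src.name
        dst.write_text(regex.sub(replacement, text), encoding="utf-8")
        written.append(dst)
    return written


# Small demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    in_dir, out_dir = Path(tmp, "in"), Path(tmp, "out")
    in_dir.mkdir()
    (in_dir / "note.txt").write_text("colour and colours", encoding="utf-8")
    outputs = find_and_replace(in_dir, out_dir, r"colou?r", "color")
    result = outputs[0].read_text(encoding="utf-8")
    print(result)
```

The input files are left untouched and the results are written to a separate location, mirroring the separate filesInput and filesOutput data units.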
8. T-FilesMerger
Merges Files inputs.
Name | Type | DataUnit | Description |
---|---|---|---|
filesInput | i | FilesDataUnit | DataUnit to which the user connects all inputs that have to be merged. |
filesOutput | o | FilesDataUnit | DataUnit which outputs all files from input. |
https://github.com/UnifiedViews/Plugins/tree/master/t-filesMerger
9. T-FilesRenamer
Renames files.
Name | Type | DataUnit | Description |
---|---|---|---|
inFilesData | i | FilesDataUnit | File name to be modified. |
outFilesData | o | FilesDataUnit | File name after modification. |
https://github.com/UnifiedViews/Plugins/tree/master/t-filesRenamer
10. T-FilesToRdf
Extracts RDF data from Files (any file format) and adds them to RDF.
Name | Type | DataUnit | Description |
---|---|---|---|
filesInput | i | FilesDataUnit | Input file containing data. |
rdfOutput | o | RDFDataUnit | RDF data extracted. |
https://github.com/UnifiedViews/Plugins/tree/master/t-filesToRdf
11. T-FilterValidXml
Validates XML inputs in 3 ways: checks whether the XML is well formed, checks whether it conforms to a specified XSD schema, and validates it using a specified XSLT template.
Name | Type | DataUnit | Description |
---|---|---|---|
input | i | FilesDataUnit | List of files to be validated. |
outputValid | o | FilesDataUnit | List of files passing the validation. |
outputInvalid | o | FilesDataUnit | List of files that do not pass the validation. |
https://github.com/UnifiedViews/Plugins/tree/master/t-filterValidXml
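The first of the three checks (well-formedness) and the split into a valid and an invalid output can be sketched with the Python standard library. The function name `split_by_well_formedness` is hypothetical; XSD and XSLT validation are left out because they need a third-party library.

```python
import xml.etree.ElementTree as ET


def split_by_well_formedness(documents: dict) -> tuple:
    """Return (valid, invalid) lists of names; documents maps name -> XML text."""
    valid, invalid = [], []
    for name, text in documents.items():
        try:
            ET.fromstring(text)  # raises ParseError if the XML is not well formed
        except ET.ParseError:
            invalid.append(name)
        else:
            valid.append(name)
    return valid, invalid


valid, invalid = split_by_well_formedness({
    "ok.xml": "<root><item/></root>",
    "broken.xml": "<root><item></root>",  # <item> is never closed
})
print(valid, invalid)
```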
12. T-Metadata
Generates metadata on output from input.
Name | Type | DataUnit | Description |
---|---|---|---|
data | i | RDFDataUnit | Data to be described. |
metadata | o | RDFDataUnit | Descriptive data. |
https://github.com/UnifiedViews/Plugins/tree/master/t-metadata
13. T-RdfMerger
Merges RDF data.
Name | Type | DataUnit | Description |
---|---|---|---|
rdfInput | i | RDFDataUnit | DataUnit to which the user connects all inputs that have to be merged. |
rdfOutput | o | RDFDataUnit | DataUnit which outputs all input graphs. |
https://github.com/UnifiedViews/Plugins/tree/master/t-rdfMerger
14. T-RdfToFiles
Transforms RDF graphs into files.
Name | Type | DataUnit | Description |
---|---|---|---|
input | i | RDFDataUnit | RDF graph. |
output | o | FilesDataUnit | File containing RDF triples. |
https://github.com/UnifiedViews/Plugins/tree/master/t-rdfToFiles
15. T-Relational
Transforms N input tables into 1 output table using SELECT SQL queries.
Name | Type | DataUnit | Description |
---|---|---|---|
inputTables | i | RelationalDataUnit | Source database tables |
outputTable | o | RelationalDataUnit | Output (transformed) table |
https://github.com/UnifiedViews/Plugins/tree/master/t-relational
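As an illustration of turning N input tables into 1 output table with a SELECT query, the sketch below uses an in-memory SQLite database. The table names, column names and values are toy data invented for the example, and SQLite stands in for the internal database, which may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two toy input tables (hypothetical names and values).
conn.executescript("""
    CREATE TABLE items (id INTEGER, label TEXT);
    CREATE TABLE amounts (item_id INTEGER, amount INTEGER);
    INSERT INTO items VALUES (1, 'a'), (2, 'b');
    INSERT INTO amounts VALUES (1, 10), (2, 20);
""")

# A single SELECT joins the inputs and defines the one output table.
conn.execute("""
    CREATE TABLE output_table AS
    SELECT i.label, a.amount
    FROM items i JOIN amounts a ON a.item_id = i.id
""")

rows = conn.execute(
    "SELECT label, amount FROM output_table ORDER BY label"
).fetchall()
print(rows)
```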
16. T-SparqlConstruct
Transforms input using SPARQL construct.
Name | Type | DataUnit | Description |
---|---|---|---|
input | i | RDFDataUnit | RDF input |
output | o | RDFDataUnit | RDF output (transformed) |
https://github.com/UnifiedViews/Plugins/tree/master/t-sparqlConstruct
17. T-SparqlUpdate
Transforms input using SPARQL update.
Name | Type | DataUnit | Description |
---|---|---|---|
input | i | RDFDataUnit | RDF input |
output | o | RDFDataUnit | RDF output (transformed) |
https://github.com/UnifiedViews/Plugins/tree/master/t-sparqlUpdate
18. T-Tabular
Converts tabular data into RDF data.
Name | Type | DataUnit | Description |
---|---|---|---|
table | i | FilesDataUnit | Input file containing tabular data. |
triplifiedTable | o | RDFDataUnit | RDF data. |
https://github.com/UnifiedViews/Plugins/tree/master/t-tabular
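One simple way to picture the conversion of tabular data into RDF is to emit one triple per cell. The sketch below does this for CSV input and N-Triples output. The `http://example.org/` namespace, the function names and the row/column IRI scheme are assumptions made for illustration and are not the mapping used by the plugin. It also assumes that key values and column names are safe to place in an IRI unescaped.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace


def escape_literal(value: str) -> str:
    # Escape the characters that are not allowed raw in an N-Triples literal.
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def table_to_ntriples(csv_text: str, key_column: str) -> list:
    """Return one N-Triples line per non-empty cell; key_column names the subject."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}row/{row[key_column]}>"
        for column, value in row.items():
            if column == key_column or value in (None, ""):
                continue
            triples.append(
                f'{subject} <{BASE}column/{column}> "{escape_literal(value)}" .'
            )
    return triples


triples = table_to_ntriples("id,name\n1,alpha\n2,beta\n", "id")
print("\n".join(triples))
```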
19. T-TabularToRelational
Parses tabular file to relational data unit.
Name | Type | DataUnit | Description |
---|---|---|---|
input | i | FilesDataUnit | List of files to parse. |
output | o | RelationalDataUnit | Relational dataunit with parsed data. |
https://github.com/UnifiedViews/Plugins/tree/master/t-tabularToRelational
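Parsing a tabular file into a relational table can be sketched as follows, again with SQLite standing in for the internal database. The function `load_csv_into_table` is hypothetical. It assumes a header row, stores every column as text, and fails if a row has a different number of cells than the header.

```python
import csv
import io
import sqlite3


def load_csv_into_table(conn: sqlite3.Connection, table: str, csv_text: str) -> list:
    """Create `table` from the CSV header and insert all remaining rows."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    return header


conn = sqlite3.connect(":memory:")
header = load_csv_into_table(conn, "parsed", "id,name\n1,alpha\n2,beta\n")
parsed_rows = conn.execute('SELECT id, name FROM parsed ORDER BY id').fetchall()
print(header, parsed_rows)
```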
20. T-UnZipper
Unzips the input file into separate files based on the zip content.
Name | Type | DataUnit | Description |
---|---|---|---|
input | i | FilesDataUnit | File to unzip. |
output | o | FilesDataUnit | List of unzipped files. |
https://github.com/UnifiedViews/Plugins/tree/master/t-unzipper
21. T-Xslt
Performs an XSL transformation over the input files and outputs the transformed files.
Name | Type | DataUnit | Description |
---|---|---|---|
files | i | FilesDataUnit | File to be transformed. |
files | o | FilesDataUnit | Transformed file of given type. |
config | i | RDFDataUnit | Configuration (template parameters). |
https://github.com/UnifiedViews/Plugins/tree/master/t-xslt
22. T-Zipper
Zips input files into a zip file of the given name.
Name | Type | DataUnit | Description |
---|---|---|---|
input | i | FilesDataUnit | List of files to zip. |
output | o | FilesDataUnit | Name of zip file. |
https://github.com/UnifiedViews/Plugins/tree/master/t-zipper
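The zip step, together with its counterpart T-UnZipper, can be illustrated as an in-memory round trip using Python's `zipfile` module. The helpers `zip_files` and `unzip_files` are hypothetical names; the plugins themselves operate on FilesDataUnit inputs and outputs, not on dictionaries.

```python
import io
import zipfile


def zip_files(files: dict) -> bytes:
    """Pack a mapping of file name -> bytes into a zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def unzip_files(data: bytes) -> dict:
    """Unpack a zip archive into a mapping of file name -> bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


original = {"a.txt": b"first", "dir/b.txt": b"second"}
restored = unzip_files(zip_files(original))
print(sorted(restored))
```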