CSV on the Web: A Primer

Created by Hanečák Peter, last modified on Feb 11, 2016

CSV is one of the most popular formats for publishing data on the web. It is concise, easy for both humans and computers to understand, and aligns nicely with the tabular nature of most data.
But CSV is also a poor format for data. There is no mechanism within CSV to indicate the type of data in a particular column, or whether values in a particular column must be unique. It is therefore hard to validate and prone to errors such as missing values or differing data types within a column.
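As a rough illustration of the problem, the following Python sketch parses a small CSV sample with the standard library's `csv` module. The sample data and the `load_rows` helper are made up for this example. The point is only that the parser returns every value as a plain string and accepts a blank value, a non-numeric value and a repeated identifier without complaint.

```python
import csv
import io

# Hypothetical sample data: "population" mixes a number, a blank and free text,
# and the "id" column repeats the value 2.
SAMPLE = """id,name,population
1,Bratislava,475000
2,Košice,
2,Prešov,unknown
"""


def load_rows(text):
    """Parse CSV text into a list of dicts; every value comes back as a str."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(SAMPLE)

# Nothing in the file says that "population" should be an integer or that
# "id" should be unique, so all three rows are accepted as they are.
print([row["population"] for row in rows])
print([row["id"] for row in rows])
```

Any checking of types or uniqueness has to be layered on top by the consumer, which is the gap the metadata standards aim to fill.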
The CSV on the Web Working Group has developed standard ways to express useful metadata about CSV files and other kinds of tabular data. This primer takes you through the ways in which these standards work together, covering:
- What we mean by "tabular data" and "CSV"
- Where files that provide metadata about CSV live
- How to create a schema to validate the content of a CSV file
- How to specify how a CSV file should be converted to RDF or JSON
- How to provide other documentation and metadata about a CSV file
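To give a flavour of what such metadata can look like, here is a minimal sketch in Python. The `METADATA` dictionary mirrors the JSON shape of a metadata description using the `tableSchema`, `columns`, `datatype`, `required` and `primaryKey` properties; the file name `cities.csv`, the column names and the `check` function are assumptions made for this example. The checker is a toy that handles only these few properties and is not a conformant validator.

```python
import csv
import io

# Hypothetical metadata description, written as a Python dict that could be
# serialised to JSON. Column names and the file name are illustrative only.
METADATA = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "cities.csv",
    "tableSchema": {
        "columns": [
            {"name": "id", "datatype": "integer", "required": True},
            {"name": "name", "datatype": "string", "required": True},
            {"name": "population", "datatype": "integer"},
        ],
        "primaryKey": "id",
    },
}

CSV_TEXT = """id,name,population
1,Bratislava,475000
2,Košice,
2,Prešov,unknown
"""


def check(csv_text, metadata):
    """Toy check of required values, integer columns and a single-column key."""
    schema = metadata["tableSchema"]
    errors = []
    seen_keys = set()
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for column in schema["columns"]:
            name = column["name"]
            value = row.get(name) or ""
            if value == "":
                if column.get("required"):
                    errors.append(f"line {line_no}: {name} is required")
                continue
            if column.get("datatype") == "integer":
                try:
                    int(value)
                except ValueError:
                    errors.append(f"line {line_no}: {name} is not an integer")
        key = row.get(schema["primaryKey"])
        if key in seen_keys:
            errors.append(f"line {line_no}: duplicate {schema['primaryKey']}")
        seen_keys.add(key)
    return errors


errors = check(CSV_TEXT, METADATA)
for message in errors:
    print(message)
```

With this sample, the sketch reports the non-integer population and the repeated `id` on the last row, while the blank population on the second data row passes because that column is not marked as required.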
Where possible, this primer links back to the normative definitions of terms and properties in the standards. Nothing in this primer overrides those normative definitions.