Commit 19f8705

Updated README with a few pointers about the philosophy of design of the
CSV Schema language
1 parent 98d5336 commit 19f8705

1 file changed: +20 -0 lines changed

README.md

Lines changed: 20 additions & 0 deletions
@@ -22,3 +22,23 @@ Repository Organisation
the specification version number.

Released under the [Mozilla Public Licence version 2.0](http://www.mozilla.org/MPL/2.0/).

Philosophy
----------
A few points that guide our thinking in the design of the CSV Schema Language:

* Simple CSV Schema Language.
We wanted a DSL (Domain Specific Language) that could be expressed in plain text and would be simple enough for metadata experts to write without knowing a programming language or a data/document modelling language such as XML or RDF. Note that the CSV Schema Language is **NOT** itself expressed in CSV; it is expressed in a simple text format.

* Context is King!
Schema rules are written for each column of the CSV file. Each set of column rules is then asserted against each row of the CSV file in turn. Unless otherwise specified, each rule in the CSV Schema operates on the current context (e.g. the defined Column and the parsed Row). We hope this keeps the rules short and concise.

* Streaming.
The Metadata files that we receive are often very large, as they contain many records about a Collection which can itself be huge. The CSV Schema Language was designed with an eye to being able to write a Validation tool which could read the CSV file as a stream. Few steps require data from the CSV file to be held in memory, and where they do, this is limited and should be easily optimisable to keep memory use to a minimum.

* Sane Defaults.
We try to do the right thing by default. CSV files and their brethren (Tab Separated Values etc.) can come in many shapes and sizes, so by default we parse CSV according to [RFC 4180](http://tools.ietf.org/html/rfc4180 "Common Format and MIME Type for Comma-Separated Values (CSV) Files"). Of course, we allow you to customize this behaviour in the CSV Schema.

* CSV Schema is ***NOT*** a Programming Language.
This is worth stressing, as it was something we had to keep sight of ourselves during development: CSV Schema is a simple data definition and validation language for CSV!
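
To make the "Context is King!" and "Streaming" points above more concrete, here is a minimal Python sketch of the validation model they describe. It is **not** the CSV Schema Language and not the implementation of any real validator; the names `validate`, `not_empty` and `is_integer`, and the sample data, are hypothetical and exist only for illustration. It assumes Python's built-in `csv` module, whose default dialect handles quoted, comma-separated fields in a way that is close to RFC 4180.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple

# Hypothetical rule type: a rule sees the current cell and the whole parsed
# row (its "context") and returns True when the cell is valid.
Rule = Callable[[str, Dict[Optional[str], object]], bool]


def not_empty(cell: str, row: Dict[Optional[str], object]) -> bool:
    """Illustrative rule: the cell must contain something other than whitespace."""
    return cell.strip() != ""


def is_integer(cell: str, row: Dict[Optional[str], object]) -> bool:
    """Illustrative rule: the cell must parse as an integer."""
    try:
        int(cell)
        return True
    except ValueError:
        return False


def validate(
    lines: Iterable[str], column_rules: Dict[str, List[Rule]]
) -> Iterator[Tuple[int, str, str]]:
    """Yield (row_number, column, value) for each cell that breaks a rule.

    Rows are read one at a time, so only the current row is held in memory.
    """
    reader = csv.DictReader(lines)  # first line is treated as the header
    for row_number, row in enumerate(reader, start=1):
        for column, rules in column_rules.items():
            cell = row.get(column) or ""  # missing cells are treated as empty
            for rule in rules:
                if not rule(cell, row):
                    yield (row_number, column, cell)
                    break  # report only the first failing rule per cell


# Usage: rules are declared per column, then asserted against each row in turn.
sample = io.StringIO('name,age\nAlice,34\n"Smith, Bob",\nCarol,abc\n')
rules = {"name": [not_empty], "age": [not_empty, is_integer]}
failures = list(validate(sample, rules))
# failures == [(2, "age", ""), (3, "age", "abc")]
```

Because `validate` is a generator over an iterable of lines, the same code would work on an open file handle of any size without loading the whole file.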
