Commit dd29ef4

Merge pull request #2 from ws-garcia/rfc4180-fully-compliant-csv-parser
[closes #1] RFC-4180 fully compliant CSV parser
2 parents 228d6e7 + c079ead commit dd29ef4

43 files changed: +2957 -1487 lines changed

.editorconfig

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
[*]
2+
trim_trailing_whitespace = true
3+
insert_final_newline = true
4+
charset = utf-8
5+
6+
[*.{bas,cls}]
7+
end_of_line = crlf

.gitattributes

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
* text=auto
2+
*.bas text eol=crlf
3+
*.cls text eol=crlf

LICENSE

Lines changed: 10 additions & 670 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 14 additions & 82 deletions
Original file line numberDiff line numberDiff line change
@@ -1,94 +1,26 @@
1-
# VBA-CSV interface
2-
[![version](https://img.shields.io/static/v1?label=version&message=v1.0.1&color=brightgreen&style=plastic)](https://github.com/ws-garcia/VBA-CSV-interface/releases/tag/v1.0.1)
1+
# ![VBA-CSV interface](/docs/assets/img/CSVinterface.png)
2+
[![version](https://img.shields.io/static/v1?label=version&message=v1.1.0&color=brightgreen&style=plastic)](https://github.com/ws-garcia/VBA-CSV-interface/releases/tag/v1.1.0)
3 3
[![version](https://img.shields.io/static/v1?label=licence&message=GPL&color=informational&style=plastic)](https://www.gnu.org/licenses/)
4-
## Table of contents
5-
* [Intro](https://github.com/ws-garcia/VBA-CSV-interface/blob/master/README.md#intro)
6-
* [Advantages](https://github.com/ws-garcia/VBA-CSV-interface/blob/master/README.md#advantages)
7-
* [Philosophy](https://github.com/ws-garcia/VBA-CSV-interface/blob/master/README.md#philosophy)
8-
* [Rules](https://github.com/ws-garcia/VBA-CSV-interface/blob/master/README.md#rules)
9-
* [Usage](https://github.com/ws-garcia/VBA-CSV-interface/blob/master/README.md#usage)
10-
* [Limitations](https://github.com/ws-garcia/VBA-CSV-interface/blob/master/README.md#Limitations)
11-
* [Benchmark](https://github.com/ws-garcia/VBA-CSV-interface/blob/master/README.md#benchmark)
12-
* [Licence](https://github.com/ws-garcia/VBA-CSV-interface/blob/master/README.md#licence)
13-
## Intro
14-
The CSV, stands from Comma Separated Values, files are special kind of tabulated plain text data widely used in data exchange. There is no globally accepted standard format for that kind of files, however, out there are well formed standards such as [RFC4180](https://www.ietf.org/rfc/rfc4180.txt) proposed by The Internet Society.
15-
Although many solutions has been developed for work with CSV files into VBA, including projects from [@sdkn104](https://github.com/sdkn104/VBA-CSV) and [@Senipah](https://github.com/Senipah/VBA-Better-Array) on Github, the vast majority of these have serious performance lacks. This argumentations conduce to the development of a VBA class module that allows users exchange data between VBA arrays and CSV files at high speed.
16-
### Advantages
17-
* Partialy compliant with RFC4180 CSV standard (there are few differences).
18-
* Exported data is 100% Excel spreadsheet compatible.
19-
* The data is always interpreted as text, excluding any quote mark when imported it.
20-
* Writes and reads files at high speed.
21-
* Minimal CPU overheat.
22-
* User have the option to import only certain range of records from given CSV file.
23-
* Simple code logic that allows you easy modify and enhance it!
24-
## Philosophy
25-
The VBA CSVinterface class module is designed for gain advantage from the well structured CSV files, this means, there isn't automatic syntax check, given the user decide how the class will works. This can be seen as a weakness, but the class get a speed-up on writing and reading procedures at time the user controls how the file is interpreted, keeping in mind that, in fact, VBA is a language with slow code execution speed.
26-
Under this idealization it's easy to develop a solution that implicity complies with the RFC4180 standart for user specified CSV document format. In order to achieve this, the user must to follow the rules specified below.
27-
## Rules
28-
1. Each record is located on a separate line, delimited by a line break (CRLF, CR, LF).
29-
2. The last record in the file may or may not have an ending line break.
30-
3. There maybe an optional header line appearing as the first line of the file with the same format as normal record lines. This header will contain names corresponding to the fields in the file and should contain the same number of fields as the records in the rest of the file.
31-
4. Within the header and each record, there may be one or more fields, separated by the fields separator (Comma, Semicolon, Space, Tab). Each line should contain the same number of fields throughout the file. **_Use the RemoveSpaces method to avoid let spaces betwen fields and records separators_**. The last field in the record must not be followed by a fields separator.
32-
5. Each field may or may not be escaped with the selected escape char. **_The user can choose between escape, coerce, every fields or neither one_**.
33-
6. Fields containing special chars (line breaks, double quotes, apostrophe, and commas) should be escaped using selected escape char.
34-
## Usage
35-
Import whole CSV file into an VBA array
36-
```vbscript
37-
Dim CSVix As CSVinterface
38-
Dim MyArray As variant
39-
Set CSVix = New CSVinterface
40-
Call CSVix.OpenConnection(fileName)
41-
Call CSVix.ImportFromCSV
42-
MyArray = CSVix .CSVdata
43-
Set CSVix = Nothing
44-
```
45-
Import a range of records from CSV file into a VBA array
46-
```vbscript
47-
Dim CSVix As CSVinterface
48-
Dim MyArray As variant
49-
Set CSVix = New CSVinterface
50-
CSVix.StartingRecord = 10
51-
CSVix.EndingRecord = 20
52-
Call CSVix.OpenConnection(fileName)
53-
Call CSVix.ImportFromCSV
54-
MyArray = CSVix .CSVdata
55-
Set CSVix = Nothing
56-
```
57-
Set the char to encapsulate, coerce, fields
58-
```vbscript
59-
CSVix.EscapeChar = NullChar
60-
CSVix.EscapeChar = Apostrophe
61-
CSVix.EscapeChar = DoubleQuotes
62-
```
63-
Set fields and records delimiters
64-
```vbscript
65-
CSVix.FieldsDelimiter = ";"
66-
CSVix.RecordsDelimiter = vbCrLf
67-
```
68-
### Limitations
69-
* __Line breaks support__: the class only allow line breaks adding the escape char to all the records's fields over the whole file lengt. This is a intentionally limitation to ensure the speed over processing the data. Keep in mind that the class doesn't distinguist between number, dates and strings, all data is readed as text and you can put in an Excel sheet to let Microsoft software format it.
70-
## Benchmark
71-
The class was tested against many solutions using the oldest, lowest-processing capacity laptop I could find: Win 7 Starter 32-bit, Intel® Atom™ CPU N2600 @1.60 GHz, 1 GB RAM.
72-
The times showed, seconds, in the bellow table are the average of ten (10) calls to the import procedure (supposed most costly to the CPU). The files used in the test haven twelve fields with variable number of records.
73 4

74-
|*Procedure (Author)*|*1K rec (102 KB)*|*5K rec (511 KB)*|*10K rec (0.99 MB)*|*100K rec (9.95 MB)*|
75-
|:--------------------------|-----------------:|----------------:|----------------:|-----------------:|
76-
|*ImportFromCSV (W. García)*|_0.0352_|_0.1930_|_0.3688_|_3.6172_|
77-
|*ParseCSVToArray/ADO (@sdkn104)*|1.4349|47.3177|202.82|>1,000|
78-
|*ImportCSVinArray (Wester)*|0.1042|0.6484|1.0182|10.250|
79-
|*ArrayFromCSV (Heffernan)*|0.2396|1.7839|2.2057|22.385|
80-
|*FromCSV(@Senipah)*|0.3594|3.8333|16.6172|>1,000|
5+
## Introductory words
6+
7+
VBA CSV interface is a class module for exchanging data between VBA arrays and CSV files at high speed. Projects from [@sdkn104](https://github.com/sdkn104/VBA-CSV) and [@Senipah](https://github.com/Senipah/VBA-Better-Array), both on GitHub, were used as baselines for performance comparison.
81 8

82-
Considering the system specification for the test machine (4 MB/sec. when it writes files to an USB), the above times was stunning!: up to 2.75 MB/sec. for reading operations.
83-
The image below show the performance of the VBA CSVinterface class on a mid updated laptop.
9+
## Getting started
84 10

85-
![BenchMark](Benchmark.png)
11+
To get started with the VBA-CSV Interface class, visit the [documentation site](https://ws-garcia.github.io/VBA-CSV-interface/).
12+
13+
## Benchmark
14+
15+
The benchmark results for VBA-CSV Interface are available at [this site](https://ws-garcia.github.io/VBA-CSV-interface/home/getting_started.html#benchmark).
86 16

87 17
## Licence
88-
Copyright (C) 2020 [W. García](https://github.com/ws-garcia/VBA-CSV-interface/).
18+
19+
Copyright (C) 2020 [W. García](https://github.com/ws-garcia/).
89 20

90 21
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
91 22

92 23
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
93 24

94 25
You should have received a copy of the GNU General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
26+

_config.yml

Lines changed: 0 additions & 1 deletion
This file was deleted.

csv-data/CSVs.zip

10.1 MB
Binary file not shown.

docs/_config.yml

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
#theme: jekyll-theme-architect
2+
remote_theme: ws-garcia/just-the-docs
3+
# Set a path/url to a logo that will be displayed instead of the title
4+
logo: "/assets/img/CSVinterface.png"
5+
# Back to top link
6+
back_to_top: true
7+
back_to_top_text: "Back to top"
8+
# Enable or disable the site search
9+
# Supports true (default) or false
10+
search_enabled: true
11+
# Enable support for hyphenated search words:
12+
search_tokenizer_separator: /[\s/]+/
13+
# Aux links for the upper right navigation
14+
aux_links:
15+
"VBA CSV interface on GitHub":
16+
- "//github.com/ws-garcia/VBA-CSV-interface"
17+
# Heading anchor links appear on hover over h1-h6 tags in page content
18+
# allowing users to deep link to a particular heading on a page.
19+
#
20+
# Supports true (default) or false/nil
21+
heading_anchors: true
22+
# Color scheme currently only supports "dark" or nil (default)
23+
color_scheme: nil
Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,41 @@
1+
---
2+
title: EscapeType
3+
parent: Enumerations
4+
grand_parent: API
5+
nav_order: 1
6+
---
7+
8+
# EscapeType Enum
9+
{: .fs-9 }
10+
11+
Provides a list of constants used to configure the escape character.
12+
{: .fs-6 .fw-300 }
13+
14+
---
15+
16+
## Parts
17+
18+
|**_Constant_**|**_Member name_**|
19+
|:----------|:----------|
20+
|0|*NullChar*|
21+
|1|*Apostrophe*|
22+
|2|*DoubleQuotes*|
23+
24+
---
25+
26+
## Syntax
27+
28+
*variable* = `EscapeType`.*Constant*
29+
30+
---
31+
32+
## Remarks
33+
34+
The `EscapeType.NullChar` value is used with the `QuotationMode.All` setting to indicate that the CSV file does not use an escape character anywhere in the file. With this combination, the CSV file is parsed or written on the assumption that the `FieldsDelimiter` property alone is enough for the import/export operations.
35+
36+
When the `FieldsDelimiter` property alone is not enough to complete the import/export operations, use the `EscapeType.DoubleQuotes` value to parse or write a CSV file whose fields are escaped with double quotes, and the `EscapeType.Apostrophe` value for a CSV file whose fields are escaped with apostrophes.
37+
38+
See also
39+
: [EscapeChar Property](https://ws-garcia.github.io/VBA-CSV-interface/api/properties/escapechar.html), [QuotationMode Enumeration](https://ws-garcia.github.io/VBA-CSV-interface/api/enumerations/quotationmode.html), [FieldsDelimiter Property](https://ws-garcia.github.io/VBA-CSV-interface/api/properties/fieldsdelimiter.html).
40+
41+
[Back to Enumerations overview](https://ws-garcia.github.io/VBA-CSV-interface/api/enumerations/)
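
As a reading aid for the EscapeType page above, the following is a minimal sketch of how the three constants might map to concrete characters. `SketchEscapeChar` is a hypothetical helper and is not part of the CSVinterface class. The numeric values come from the Parts table. The characters chosen for each member are an assumption inferred from the member names.

```vbscript
Option Explicit

' Hypothetical helper, NOT part of the CSVinterface class.
' Maps an EscapeType constant (0, 1 or 2, as listed in the Parts table)
' to the character it is assumed to stand for.
Public Function SketchEscapeChar(ByVal escapeKind As Long) As String
    Select Case escapeKind
        Case 0 ' NullChar: no escape character at all
            SketchEscapeChar = vbNullString
        Case 1 ' Apostrophe
            SketchEscapeChar = "'"
        Case 2 ' DoubleQuotes
            SketchEscapeChar = Chr$(34)
        Case Else
            ' Error 5 is "Invalid procedure call or argument"
            Err.Raise 5, "SketchEscapeChar", "Unknown EscapeType constant"
    End Select
End Function
```

An empty result for constant 0 matches the remark that `EscapeType.NullChar` describes a file with no escape character.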

docs/api/enumerations/index.md

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
---
2+
title: Enumerations
3+
parent: API
4+
has_children: true
5+
nav_order: 2
6+
---
7+
8+
# Enumerations
9+
{: .fs-9 }
Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
---
2+
title: QuotationMode
3+
parent: Enumerations
4+
grand_parent: API
5+
nav_order: 2
6+
---
7+
8+
# QuotationMode Enum
9+
{: .fs-9 }
10+
11+
Provides a list of constants to configure the CSV parsing/writing operation behavior.
12+
{: .fs-6 .fw-300 }
13+
14+
---
15+
16+
## Parts
17+
18+
|**_Constant_**|**_Member name_**|
19+
|:----------|:----------|
20+
|0|*Critical*|
21+
|1|*All*|
22+
23+
---
24+
25+
## Syntax
26+
27+
*variable* = `QuotationMode`.*Constant*
28+
29+
---
30+
31+
## Remarks
32+
33+
The `QuotationMode.Critical` value, the default, indicates that the CSV file uses the escape character only in fields containing special characters. The `QuotationMode.All` value must be used for CSV files in which every field is escaped with the character set through the `EscapeChar` property.
34+
35+
See also
36+
: [EscapeChar Property](https://ws-garcia.github.io/VBA-CSV-interface/api/properties/escapechar.html).
37+
38+
[Back to Enumerations overview](https://ws-garcia.github.io/VBA-CSV-interface/api/enumerations/)
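
The difference between the two modes described on the QuotationMode page above can be sketched with a small standalone function. `SketchQuoteField` and its parameters are hypothetical and do not belong to the CSVinterface class. The sketch assumes that "special" means the field contains the fields delimiter, the escape character, or a line break, that the delimiter is not empty, and that an embedded escape character is doubled as in RFC 4180. The library may handle these cases differently.

```vbscript
Option Explicit

' Hypothetical helper, NOT part of the CSVinterface class.
' quoteAll = False mimics QuotationMode.Critical (escape only when needed);
' quoteAll = True  mimics QuotationMode.All (escape every field).
Public Function SketchQuoteField(ByVal fieldValue As String, _
                                 ByVal fieldsDelimiter As String, _
                                 ByVal escapeChar As String, _
                                 ByVal quoteAll As Boolean) As String
    Dim needsQuotes As Boolean

    ' No escape character configured: write the field untouched.
    If Len(escapeChar) = 0 Then
        SketchQuoteField = fieldValue
        Exit Function
    End If

    ' Assumed definition of a "special" field (delimiter must be non-empty).
    needsQuotes = quoteAll _
        Or InStr(1, fieldValue, fieldsDelimiter, vbBinaryCompare) > 0 _
        Or InStr(1, fieldValue, escapeChar, vbBinaryCompare) > 0 _
        Or InStr(1, fieldValue, vbCr, vbBinaryCompare) > 0 _
        Or InStr(1, fieldValue, vbLf, vbBinaryCompare) > 0

    If needsQuotes Then
        ' Double any embedded escape character, then wrap the field.
        SketchQuoteField = escapeChar & _
            Replace(fieldValue, escapeChar, escapeChar & escapeChar) & escapeChar
    Else
        SketchQuoteField = fieldValue
    End If
End Function
```

With a comma delimiter and `Chr$(34)` as the escape character, the field `abc` is left bare when `quoteAll` is `False` and is wrapped in double quotes when it is `True`. The field `a,b` is wrapped in both cases.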
