Commit 05b7f4f

Merge pull request #30 from ws-garcia/CSV-Interface-v4.2.0
Csv interface v4.2.0
2 parents 1b23247 + c0c684d commit 05b7f4f

35 files changed

+3397
-1114
lines changed

README.md

Lines changed: 22 additions & 17 deletions
Original file line number | Diff line number | Diff line change
@@ -5,7 +5,7 @@
55

66
## Introductory words
77

8-
The most powerful and comprehensive CSV/[TSV](https://www.iana.org/assignments/media-types/text/tab-separated-values)/[DSV](https://www.linuxtopia.org/online_books/programming_books/art_of_unix_programming/ch05s02.html) data management library for VBA, providing parsing/writing capabilities compliant with RFC-4180 specifications and a complete set of tools for manipulating records and fields: [dedupe](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/dedupe.html), [sort](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/sort.html) and [filter](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/filter.html) records; [rearrange](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/rearrangefields.html), [shift](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/shiftfield.html), [merge](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/mergefields.html) and [split](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/splitfield.html) fields; and much more!
8+
The most powerful and comprehensive CSV/[TSV](https://www.iana.org/assignments/media-types/text/tab-separated-values)/[DSV](https://www.linuxtopia.org/online_books/programming_books/art_of_unix_programming/ch05s02.html) data management library for VBA, providing parsing/writing capabilities compliant with RFC-4180 specifications and a complete set of tools for manipulating records and fields: [dedupe](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/dedupe.html), [sort](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/sort.html) and [filter](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/filter.html) records; [rearrange](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/rearrangefields.html), [shift](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/shiftfield.html), [merge](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/mergefields.html) and [split](https://ws-garcia.github.io/VBA-CSV-interface/api/methods/splitfield.html) fields. Is your data spread over two or more CSV files? Don't worry, here you will find [Left, Right and Inner](https://ws-garcia.github.io/VBA-CSV-interface/api/csvarraylist.html) joins, and much more!
99

1010
## Advantages
1111
* __RFC-4180 specs compliant__.
@@ -343,25 +343,30 @@ Sub DelimitersGuessing()
343343
End Sub
344344
```
345345

346-
With a CSV file parser you can do many things, for example, an user can parse the contents of the Windows clipboard and dump it to an Excel Worksheet with a procedure like the following (thanks to @OlimilO1402 [for the `CBGetText` function code](https://github.com/OlimilO1402/XL_ClipboardReader/blob/main/Modules/MClipboard.bas)):
346+
With VBA CSV interface, many things can be done; for example, a user can perform SQL-like joins such as:
347347

348348
```
349-
Sub ImportFromClipBoard()
350-
Dim CSVint As CSVinterface
351-
Dim CSVstring As String
352-
Dim SPACE_CHR As String
349+
Sub JoinTwoTables()
350+
Dim WB As Workbook
351+
Dim WS As Worksheet
352+
Dim t1 As CSVArrayList
353+
Dim t2 As CSVArrayList
354+
Dim arrT1() As Variant
355+
Dim arrT2() As Variant
356+
Dim rTable As CSVArrayList
353357
354-
SPACE_CHR = " "
355-
CSVstring = Join$(Split(CBGetText, SPACE_CHR), vbTab) ' Replace all space char with Tab char
356-
Set CSVint = New CSVinterface
357-
With CSVint.parseConfig
358-
.dialect.fieldsDelimiter = vbTab ' Columns delimiter
359-
.dialect.recordsDelimiter = vbCrLf ' Rows delimiter
360-
End With
361-
With CSVint
362-
.ImportFromCSVString CSVstring, .parseConfig ' Import the CSV to internal object
363-
.DumpToSheet
364-
End With
358+
Set WB = ThisWorkbook
359+
Set WS = WB.Sheets("Orders"): arrT1() = WS.Range("A1:G21").Value2
360+
Set WS = WB.Sheets("Ships and sales"): arrT2() = WS.Range("A1:F27").Value2
361+
Set t1 = New CSVArrayList: t1.items = arrT1
362+
Set t2 = New CSVArrayList: t2.items = arrT2
363+
' Join the 1st, "Region", and 3rd to 5th fields of the left table with the "Total_Revenue" field from the right table,
364+
' on "Order_ID" of both tables, where Total_Revenue, from the right table, is greater than 3000000
365+
' and Region, from the left table, is equal to "Central America and the Caribbean"
366+
Set rTable = t1.LeftJoin(t1, t2, _
367+
"{1,Region,3-5};{Total_Revenue}", _
368+
"Order_ID;Order_ID", _
369+
"t2.Total_Revenue>3000000 & t1.Region='Central America and the Caribbean'")
365370
End Sub
366371
```
367372

docs/api/csvarraylist.md

Lines changed: 72 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -5,12 +5,9 @@ nav_order: 5
55
---
66

77
# CSVArrayList
8-
{: .d-inline-block }
8+
{: .fs-6 }
99

10-
New
11-
{: .label .label-purple }
12-
13-
Class module developed to emulate some functionalities from the `ArrayList` present in some most modern languages. The `CSVArrayList` serve as a container for all the data read from CSV files and can be used to manipulate the stored items, or to store data that does not come from a CSV file, according to the user's request.
10+
Class module developed to emulate some of the functionality of the `ArrayList` found in many modern languages. The `CSVArrayList` serves as a container for all the data read from CSV files and can be used to manipulate the stored items, or to store data that does not come from a CSV file, according to the user's request.
1411
{: .fs-4 .fw-300 }
1512

1613
---
@@ -37,6 +34,11 @@ Class module developed to emulate some functionalities from the `ArrayList` pres
3734
<td style="text-align: left;">Appends a copy of the specified values to the current instance. In contrast to the <code>Add</code> method, the data is operated on before being stored, so if the values to be appended to the current instance are not one-dimensional arrays, they will be properly stored as one-dimensional array. In this way, the user will be able to use the data sorting methods provided by the class as long as no multi-dimensional arrays are stored in the current instance.</td>
3835
</tr>
3936
<tr>
37+
<td style="text-align: left; color:blue;"><em>AddIndexedItem</em></td>
38+
<td style="text-align: left;">Method</td>
39+
<td style="text-align: left;">Appends a copy of the specified values to the current instance using a string-type key. This allows access to the elements by providing an index or a key. If the key exists, the item will be modified only if the <code>UpdateExistingItems</code> parameter is set to <code>True</code>.</td>
40+
</tr>
41+
<tr>
4042
<td style="text-align: left; color:blue;"><em>Clear</em></td>
4143
<td style="text-align: left;">Method</td>
4244
<td style="text-align: left;">Reinitializes the current instance.</td>
@@ -82,6 +84,31 @@ Class module developed to emulate some functionalities from the `ArrayList` pres
8284
<td style="text-align: left;">Returns a filtered array list using the <code>CSVexpressions</code> class module.</td>
8385
</tr>
8486
<tr>
87+
<td style="text-align: left; color:blue;"><em>GetIndexedItem</em></td>
88+
<td style="text-align: left;">Method</td>
89+
<td style="text-align: left;">Gets an indexed Item, by its key, from the current instance.</td>
90+
</tr>
91+
<tr>
92+
<td style="text-align: left; color:blue;"><em>Group</em></td>
93+
<td style="text-align: left;">Method</td>
94+
<td style="text-align: left;">Groups rows having the same values into a summary.</td>
95+
</tr>
96+
<tr>
97+
<td style="text-align: left; color:blue;"><em>IndexedItems</em></td>
98+
<td style="text-align: left;">Property</td>
99+
<td style="text-align: left;">Gets all indexed Items from the current instance.</td>
100+
</tr>
101+
<tr>
102+
<td style="text-align: left; color:blue;"><em>Indexing</em></td>
103+
<td style="text-align: left;">Property</td>
104+
<td style="text-align: left;">Indicates whether the current instance is used to store indexed elements.</td>
105+
</tr>
106+
<tr>
107+
<td style="text-align: left; color:blue;"><em>Inner, Left and Right Join</em></td>
108+
<td style="text-align: left;">Method</td>
109+
<td style="text-align: left;">Runs a SQL-like join on the provided data tables.<br>1) Use a string such as <code>{1-2,5,ID};{1-6}</code> as the columns predicate to indicate the join of columns 1 to 2, 5 and ID of leftTable with columns 1 to 6 of rightTable.<br>2) Use a string such as <code>{*};{1-3}</code> to indicate the union of ALL columns of leftTable with columns 1 to 3 of rightTable.<br>3) The predicate must use the dot syntax <code>[t1.#][t1.fieldName]</code> to indicate the fields of the table, where t1 refers to the leftTable.<br>4) The matchKeys predicate must be given as <code>#/$;#/$</code>.</td>
110+
</tr>
111+
<tr>
85112
<td style="text-align: left; color:blue;"><em>Insert</em></td>
86113
<td style="text-align: left;">Method</td>
87114
<td style="text-align: left;">Inserts an Item, at the given Index, in the current instance of the class.</td>
@@ -97,6 +124,21 @@ Class module developed to emulate some functionalities from the `ArrayList` pres
97124
<td style="text-align: left;">Gets or sets an Item, by its index, from the current instance. This is the default property, so the user can use abbreviated expressions such as <code>expression(i)</code> to access the Item <code>i</code>, where <code>expression</code> represents a <code>CSVArrayList</code> object.</td>
98125
</tr>
99126
<tr>
127+
<td style="text-align: left; color:blue;"><em>ItemExist</em></td>
128+
<td style="text-align: left;">Method</td>
129+
<td style="text-align: left;">Checks if a given field exists in a record of the current instance. Returns <code>False</code> when the key cannot be found.</td>
130+
</tr>
131+
<tr>
132+
<td style="text-align: left; color:blue;"><em>ItemIndex</em></td>
133+
<td style="text-align: left;">Method</td>
134+
<td style="text-align: left;">Performs a search, on a given field, and retrieves the index of the target record. USE ONLY WITH SORTED DATA.</td>
135+
</tr>
136+
<tr>
137+
<td style="text-align: left; color:blue;"><em>ItemKey</em></td>
138+
<td style="text-align: left;">Method</td>
139+
<td style="text-align: left;">Gets the key at given position.</td>
140+
</tr>
141+
<tr>
100142
<td style="text-align: left; color:blue;"><em>items</em></td>
101143
<td style="text-align: left;">Property</td>
102144
<td style="text-align: left;">Gets or sets the collection of elements from or to the current instance. To set the elements, the <code>AValue</code> parameter must be an array.</td>
@@ -107,11 +149,31 @@ Class module developed to emulate some functionalities from the `ArrayList` pres
107149
<td style="text-align: left;">Turns a jagged array into a two dim array. The method will successively deconstruct and delete the jagged array, passing its contents to the specified two-dimensional array.</td>
108150
</tr>
109151
<tr>
152+
<td style="text-align: left; color:blue;"><em>KeyExist</em></td>
153+
<td style="text-align: left;">Method</td>
154+
<td style="text-align: left;">Searches for a key in the internal indexed records and returns <code>True</code> when found.</td>
155+
</tr>
156+
<tr>
157+
<td style="text-align: left; color:blue;"><em>KeyIndex</em></td>
158+
<td style="text-align: left;">Method</td>
159+
<td style="text-align: left;">Searches for an element in the internal indexed records, using a key, in the current instance (ONLY when the data is already sorted in ascending order). Returns the index of the element when found and -1 when the key is not found.</td>
160+
</tr>
161+
<tr>
162+
<td style="text-align: left; color:blue;"><em>Keys</em></td>
163+
<td style="text-align: left;">Property</td>
164+
<td style="text-align: left;">Gets all the keys of the indexed Items from the current instance.</td>
165+
</tr>
166+
<tr>
110167
<td style="text-align: left; color:blue;"><em>MultiDimensional</em></td>
111168
<td style="text-align: left;">Method</td>
112169
<td style="text-align: left;">Checks if an array has more than one dimension and returns <code>True</code> or <code>False</code>.</td>
113170
</tr>
114171
<tr>
172+
<td style="text-align: left; color:blue;"><em>Reduce</em></td>
173+
<td style="text-align: left;">Method</td>
174+
<td style="text-align: left;">Reduces the internal array list to the result obtained by evaluating the <code>ReductionExpression</code> parameter over all items.</td>
175+
</tr>
176+
<tr>
115177
<td style="text-align: left; color:blue;"><em>Reinitialize</em></td>
116178
<td style="text-align: left;">Method</td>
117179
<td style="text-align: left;">Reinitializes the current instance of the class and reserves the storage space desired by the user through the <code>bufferSize</code> parameter.</td>
@@ -122,6 +184,11 @@ Class module developed to emulate some functionalities from the `ArrayList` pres
122184
<td style="text-align: left;">Removes the Item at specified Index.</td>
123185
</tr>
124186
<tr>
187+
<td style="text-align: left; color:blue;"><em>RemoveIndexedItem</em></td>
188+
<td style="text-align: left;">Method</td>
189+
<td style="text-align: left;">Removes an indexed Item using the specified key.</td>
190+
</tr>
191+
<tr>
125192
<td style="text-align: left; color:blue;"><em>RemoveRange</em></td>
126193
<td style="text-align: left;">Method</td>
127194
<td style="text-align: left;">Removes a range of Items starting at the specified Index.</td>

docs/api/csvdialect.md

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -5,10 +5,7 @@ nav_order: 6
55
---
66

77
# CSVdialect
8-
{: .d-inline-block }
9-
10-
New
11-
{: .label .label-purple }
8+
{: .fs-6 }
129

1310
Class module developed to share CSV dialects, that is, groups of specific and related configuration settings that instruct the parser on how to interpret the character set read from a CSV file. This container travels through the parsing and sniffer methods.
1411
{: .fs-4 .fw-300 }

docs/api/csvsniffer.md

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -5,10 +5,7 @@ nav_order: 7
55
---
66

77
# CSVSniffer
8-
{: .d-inline-block }
9-
10-
New
11-
{: .label .label-purple }
8+
{: .fs-6 }
129

1310
Class module developed as an attempt to sniff/guess CSV dialects without user intervention. In some preliminary tests, the sniffer was 100% accurate, but there is always the risk of facing ambiguous cases that can only be solved with human intervention. This class is inspired by the [work of scientist Till Roman Döhmen](https://homepages.cwi.nl/~boncz/msc/2016-Doehmen.pdf), with some improvements to disambiguate the most complicated cases.
1411
{: .fs-4 .fw-300 }

docs/api/csvtextstream.md

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -5,10 +5,7 @@ nav_order: 8
55
---
66

77
# CSVTextStream
8-
{: .d-inline-block }
9-
10-
New
11-
{: .label .label-purple }
8+
{: .fs-6 }
129

1310
Easy-to-use class module developed to enable I/O operations over "big" text files, at high speed, from VBA. The module does not reference any external API library and can read and write UTF-8 encoded files.
1411
{: .fs-4 .fw-300 }

docs/api/enumerations/sortingalgorithms.md

Lines changed: 5 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -20,11 +20,10 @@ Provides a list of constants to configure the sorting algorithm used when sortin
2020

2121
|**_Constant_**|**_Member name_**|
2222
|:----------|:----------|
23-
|0|*SA_IntroSort*|
24-
|1|*SA_Quicksort*|
25-
|2|*SA_TimSort*|
26-
|3|*SA_HeapSort*|
27-
|4|*SA_MergeSort*|
23+
|0|*SA_Quicksort*|
24+
|1|*SA_TimSort*|
25+
|2|*SA_HeapSort*|
26+
|3|*SA_MergeSort*|
2827

2928
---
3029

@@ -34,7 +33,7 @@ Provides a list of constants to configure the sorting algorithm used when sortin
3433

3534
>📝**Note**
3635
>{: .text-grey-lt-000 .bg-green-000 }
37-
>The default value for the `SortingAlgorithms` enumeration is `SA_IntroSort` which is a variant of Quicksort.
36+
>The default value for the `SortingAlgorithms` enumeration is `SA_Quicksort`, which is a variant of the classic Quicksort.
3837
{: .text-grey-dk-300 .bg-grey-lt-000 }
3938

4039
See also

docs/api/methods/csvsubsetsplit.md

Lines changed: 13 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ Splits the CSV data into a set of files in which each piece has a related portio
1515

1616
## Syntax
1717

18-
*expression*.`CSVsubsetSplit`*(filePath, \[subsetColumn:= 1\], \[headers:= True\])*
18+
*expression*.`CSVsubsetSplit`*(filePath, \[subsetColumns:= 1\], \[headers:= True\], \[repeatHeaders:= True\], \[streamSize:= 20\])*
1919

2020
### Parameters
2121

@@ -32,13 +32,21 @@ Splits the CSV data into a set of files in which each piece has a related portio
3232
<td style="text-align: left;">Required. Identifier specifying a <code>String</code> Type variable representing the full path to the target CSV file.</td>
3333
</tr>
3434
<tr>
35-
<td style="text-align: left;"><em>subsetColumn</em></td>
36-
<td style="text-align: left;">Optional. Identifier specifying a <code>Long</code> Type variable representing the index of the field on which the creation of the data groups will take place.</td>
35+
<td style="text-align: left;"><em>subsetColumns</em></td>
36+
<td style="text-align: left;">Optional. Identifier specifying a <code>Variant</code> Type variable representing the indexes of the fields on which the data groups will be created.</td>
3737
</tr>
3838
<tr>
3939
<td style="text-align: left;"><em>headers</em></td>
4040
<td style="text-align: left;">Optional. Identifier specifying a <code>Boolean</code> Type variable indicating whether the target CSV file has a header record.</td>
4141
</tr>
42+
<tr>
43+
<td style="text-align: left;"><em>repeatHeaders</em></td>
44+
<td style="text-align: left;">Optional. Identifier specifying a <code>Boolean</code> Type variable indicating whether the header record, from the target CSV file, will be copied to all created files.</td>
45+
</tr>
46+
<tr>
47+
<td style="text-align: left;"><em>streamSize</em></td>
48+
<td style="text-align: left;">Optional. Identifier specifying a <code>Long</code> Type variable representing the buffer size factor used to read the target CSV file.</td>
49+
</tr>
4250
</tbody>
4351
</table>
4452

@@ -50,11 +58,11 @@ Splits the CSV data into a set of files in which each piece has a related portio
5058

5159
## Behavior
5260

53-
The `CSVsubsetSplit` method will create a file for each different value (data grouping) in the field at the *subsetColumn* position, then all related data is appended to the respective file. Use the *headers* parameter to include a header record in each new CSV file. When the CSV file has a header record and the user sets the *header* parameter to `False`, the header row is saved in a separate file and the rest of CSV files will have no header record.
61+
The `CSVsubsetSplit` method will create a file for each different value (data grouping) in the fields at the *subsetColumns* positions; all related data is then appended to the respective file. The *subsetColumns* parameter can be a single value or an array of `Long` values. Use the *headers* parameter to include a header record in each new CSV file. When the CSV file has a header record and the user sets the *headers* parameter to `False`, the header row is saved in a separate file and the rest of the CSV files will have no header record. The user can control whether the header record is copied to every created file by using the *repeatHeaders* parameter.
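
As a rough illustration of the behavior described above, the sketch below splits a file on two fields at once. The parameter names are taken from the syntax line. The `CSVinterface` host object, the file path, the chosen field indexes, and passing them through `Array(...)` are assumptions made for the example. The parser's default dialect is also assumed to match the file.

```vba
Sub SplitBySubsets()
    Dim CSVint As CSVinterface
    Dim sourcePath As String

    ' Hypothetical path: point it at a real CSV file before running.
    sourcePath = Environ$("USERPROFILE") & "\Desktop\Sales.csv"
    Set CSVint = New CSVinterface
    ' Group the records on fields 1 and 3 (assumed indexes). The file is
    ' assumed to have a header record, requested here in every created file.
    CSVint.CSVsubsetSplit filePath:=sourcePath, _
                          subsetColumns:=Array(1&, 3&), _
                          headers:=True, _
                          repeatHeaders:=True
    Set CSVint = Nothing
End Sub
```

The *streamSize* argument is left at its documented default of 20 here.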
5462

5563
>📝**Note**
5664
>{: .text-grey-lt-000 .bg-green-000 }
57-
>The result subsets will be saved in a folder named [\*-subsets], where (\*) denotes the name of the source CSV file.
65+
>The result subsets will be saved in a folder named [\*-WorkDir], where (\*) denotes the name of the source CSV file.
5866
{: .text-grey-dk-300 .bg-grey-lt-000 }
5967

6068
### ☕Example

docs/api/methods/dedupe.md

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -6,10 +6,7 @@ nav_order: 4
66
---
77

88
# Dedupe
9-
{: .d-inline-block }
10-
11-
New
12-
{: .label .label-purple }
9+
{: .fs-6 }
1310

1411
Returns a list of records as a result of the deduplication of the imported CSV data.
1512
{: .fs-4 .fw-300 }

docs/api/methods/dumptosheet.md

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ Dumps data from a source, or from the current instance, to an Excel WorkSheet.
1515

1616
## Syntax
1717

18-
*expression*.`DumpToSheet`*(\[WBookName\], \[SheetName\], \[RngName:= "A1"\], \[DataSource:= `Nothing`\])*
18+
*expression*.`DumpToSheet`*(\[WBookName\], \[SheetName\], \[RngName:= "A1"\], \[DataSource:= `Nothing`\], \[BlockAutoFormat:= `True`\])*
1919

2020
### Parameters
2121

@@ -43,6 +43,10 @@ Dumps data from a source, or from the current instance, to an Excel WorkSheet.
4343
<td style="text-align: left;"><em>DataSource</em></td>
4444
<td style="text-align: left;">Optional. Identifier specifying a <code>CSVArrayList</code> object variable representing the data to copy from.</td>
4545
</tr>
46+
<tr>
47+
<td style="text-align: left;"><em>BlockAutoFormat</em></td>
48+
<td style="text-align: left;">Optional. Identifier specifying a <code>Boolean</code> Type variable.</td>
49+
</tr>
4650
</tbody>
4751
</table>
4852

@@ -62,7 +66,7 @@ See also
6266

6367
## Behavior
6468

65-
When the *WBookName* parameter is omitted the data is dumped into the Workbook that holds the CSV interface's *VBAProject*. Omitting the *SheetName* parameter adds a new Worksheet to the desired Workbook. Also, if the *RngName* parameter is omitted the data will dumped starting on the "A1" named cell in the desired Worksheet.
69+
When the *WBookName* parameter is omitted, the data is dumped into the Workbook that holds the CSV interface's *VBAProject*. Omitting the *SheetName* parameter adds a new Worksheet to the desired Workbook. Also, if the *RngName* parameter is omitted, the data will be dumped starting at cell "A1" of the desired Worksheet. Use the *BlockAutoFormat* parameter if you suspect that the target CSV data may carry some sort of [injection attack against your machine](http://georgemauer.net/2017/10/07/csv-injection.html).
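
A minimal sketch of dumping untrusted data with *BlockAutoFormat* passed explicitly is shown next. The import calls mirror the clipboard example that this commit removes from the README, and the sample string is made up. What the flag does to formula-like fields is not specified in this section, so the comments do not describe it.

```vba
Sub DumpUntrustedData()
    Dim CSVint As CSVinterface
    Dim CSVstring As String

    ' Made-up sample data: the second field of the record looks like a formula.
    CSVstring = "Name,Note" & vbCrLf & "Alice,=1+1"
    Set CSVint = New CSVinterface
    With CSVint.parseConfig
        .dialect.fieldsDelimiter = ","       ' Columns delimiter
        .dialect.recordsDelimiter = vbCrLf   ' Rows delimiter
    End With
    With CSVint
        .ImportFromCSVString CSVstring, .parseConfig ' Import the CSV to internal object
        ' SheetName is omitted, so a new Worksheet is added to this Workbook.
        .DumpToSheet BlockAutoFormat:=True
    End With
    Set CSVint = Nothing
End Sub
```

According to the syntax line, `True` is already the default, so the argument is passed here only to make the intent visible.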
6670

6771
>📝**Note**
6872
>{: .text-grey-lt-000 .bg-green-000 }
