Commit 53de172

committed
Documentation additions
1 parent e756177 commit 53de172

6 files changed

+153
-75
lines changed

README.md

Lines changed: 94 additions & 36 deletions
Original file line number | Diff line number | Diff line change
@@ -4,18 +4,19 @@
44

55
<p align="center">
66
<a href="https://swift.org/about/#swiftorg-and-open-source"><img src="docs/assets/badges/Swift.svg" alt="Swift 5.x"></a>
7-
<a href="https://www.apple.com/macos"><img src="docs/assets/badges/Apple.svg" alt="macOS 10.10+ - iOS 8+ - tvOS 9+ - watchOS 2+"></a>
7+
<a href="https://github.com/dehesa/CodableCSV/wiki/Implicit-dependencies"><img src="docs/assets/badges/Apple.svg" alt="macOS 10.10+ - iOS 8+ - tvOS 9+ - watchOS 2+"></a>
88
<a href="https://ubuntu.com"><img src="docs/assets/badges/Ubuntu.svg" alt="Ubuntu 18.04"></a>
99
<a href="http://doge.mit-license.org"><img src="docs/assets/badges/License.svg" alt="MIT License"></a>
1010
</p>
1111

1212
[CodableCSV](https://github.com/dehesa/CodableCSV) provides:
1313

1414
- Imperative CSV reader/writer (row-by-row and/or field-by-field).
15-
- Declarative `Codable` encoder/decoder.
16-
- Support for multiple inputs/outputs: `String`s, `Data` blobs, and `URL`s.
17-
- Support for multiple string encodings and Byte Order Markers (BOM).
18-
- Extremely configurable: delimiters, escaping scalar, trim strategy, presampling, and numerous codable strategies.
15+
- Declarative `Codable` encoder/decoder and lazy row decoder.
16+
- Support multiple inputs/outputs: `String`s, `Data` blobs, and `URL`s.
17+
- Support numerous string encodings and Byte Order Markers (BOM).
18+
- Extensive configuration: delimiters, escaping scalar, trim strategy, presampling, codable strategies, etc.
19+
- [RFC4180](https://tools.ietf.org/html/rfc4180) compliant with default configuration and CRLF (`\r\n`) row delimiter.
1920
- Multiplatform support with no dependencies.
2021

2122
> The Swift Standard Library and Foundation are considered implicit requirements.
@@ -83,7 +84,9 @@ A `CSVReadder` parses CSV data from a given input (`String`, or `Data`, or file)
8384
let data: Data = ...
8485
let result = try CSVReader.decode(input: data)
8586
```
87+
8688
Once the input is completely parsed, you can choose how to access the decoded data:
89+
8790
```swift
8891
let headers: [String] = result.headers
8992
// Access the CSV rows (i.e. raw [String] values)
@@ -107,7 +110,9 @@ A `CSVReadder` parses CSV data from a given input (`String`, or `Data`, or file)
107110
let reader = try CSVReader(input: string) { $0.headerStrategy = .firstLine }
108111
let rowA = try reader.readRow()
109112
```
113+
110114
Parse one row at a time until `nil` is returned, or exit the scope and the reader will clean up all used memory.
115+
111116
```swift
112117
// Let's assume the input is:
113118
let string = "numA,numB,numC\n1,2,3\n4,5,6\n7,8,9"
@@ -332,13 +337,13 @@ let decoder = CSVDecoder { $0.bufferingStrategy = .sequential }
332337
let content: [Student] = try decoder.decode([Student].self, from: URL("~/Desktop/Student.csv"))
333338
```
334339
335-
If you are dealing with a big CSV file, it is preferred to used direct file decoding, a `.sequential` or `.unrequested` buffering strategy, and set *presampling* to false; since then memory usage is drastically reduced.
340+
If you are dealing with a big CSV file, prefer direct file decoding, a `.sequential` or `.unrequested` buffering strategy, and set _presampling_ to false, since memory usage is then drastically reduced.
336341
337342
### Decoder Configuration
338343
339344
The decoding process can be tweaked by specifying configuration values at initialization time. `CSVDecoder` accepts the [same configuration values as `CSVReader`](#Reader-Configuration) plus the following ones:
340345
341-
- `nilStrategy` (default: `.empty`) indicates how the `nil` *concept* (absence of value) is represented on the CSV.
346+
- `nilStrategy` (default: `.empty`) indicates how the `nil` _concept_ (absence of value) is represented on the CSV.
342347
343348
- `boolStrategy` (default: `.insensitive`) defines how strings are decoded to `Bool` values.
344349
@@ -352,7 +357,7 @@ The decoding process can be tweaked by specifying configuration values at initia
352357
353358
- `bufferingStrategy` (default `.keepAll`) controls the behavior of `KeyedDecodingContainer`s.
354359
355-
Selecting a buffering strategy affects the the decoding performance and the amount of memory used during the process. For more information check this README's [Tips using `Codable`](#Tips-using-codable) section and the [`Strategy.DecodingBuffer` definition](sources/Codable/Decodable/DecodingStrategy.swift).
360+
Selecting a buffering strategy affects the decoding performance and the amount of memory used during the process. For more information check this README's [Tips using `Codable`](#Tips-using-codable) section and the [`Strategy.DecodingBuffer` definition](sources/Codable/Decodable/DecodingStrategy.swift).
356361
357362
The configuration values can be set during `CSVDecoder` initialization or at any point before the `decode` function is called.
358363
@@ -394,7 +399,7 @@ If you are dealing with a big CSV content, it is preferred to use direct file en
394399
395400
The encoding process can be tweaked by specifying configuration values. `CSVEncoder` accepts the [same configuration values as `CSVWriter`](#Writer-Configuration) plus the following ones:
396401
397-
- `nilStrategy` (default: `.empty`) indicates how the `nil` *concept* (absence of value) is represented on the CSV.
402+
- `nilStrategy` (default: `.empty`) indicates how the `nil` _concept_ (absence of value) is represented on the CSV.
398403
399404
- `boolStrategy` (default: `.deferredToString`) defines how Boolean values are encoded to `String` values.
400405
@@ -435,25 +440,25 @@ encoder.dataStrategy = .custom { (data, encoder) in
435440
436441
### Tips using `Codable`
437442
438-
`Codable` is fairly easy to use and most Swift standard library types already conform to it. However, sometimes it is tricky to get custom types to comply to `Codable` for specific functionality. That is why I am leaving here some tips and advices concerning its usage:
443+
`Codable` is fairly easy to use and most Swift standard library types already conform to it. However, sometimes it is tricky to get custom types to conform to `Codable` for specific functionality.
439444
440445
<ul>
441446
<details><summary>Basic adoption.</summary><p>
442447
443448
When a custom type conforms to `Codable`, the type is stating that it has the ability to decode itself from and encode itself to an external representation. Which representation depends on the decoder or encoder chosen. Foundation provides support for [JSON and Property Lists](https://developer.apple.com/documentation/foundation/archives_and_serialization) and the community provides many other formats, such as: [YAML](https://github.com/jpsim/Yams), [XML](https://github.com/MaxDesiatov/XMLCoder), [BSON](https://github.com/OpenKitten/BSON), and CSV (through this library).
444449
445-
Lets see a regular CSV encoding/decoding usage through `Codable`'s interface. Let's suppose we have a list of students formatted in a CSV file:
450+
Usually a CSV represents a long list of _entities_. The following is a simple example representing a list of students.
446451
447452
```swift
448-
let data = """
449-
name,age,hasPet
450-
John,22,true
451-
Marine,23,false
452-
Alta,24,true
453-
"""
453+
let string = """
454+
name,age,hasPet
455+
John,22,true
456+
Marine,23,false
457+
Alta,24,true
458+
"""
454459
```
455460
456-
In Swift, a _student_ has the following structure:
461+
A _student_ can be represented as a structure:
457462
458463
```swift
459464
struct Student: Codable {
@@ -463,11 +468,11 @@ struct Student: Codable {
463468
}
464469
```
465470
466-
To decode the CSV data, we just need to create a decoder and call `decode` on it passing the given data.
471+
To decode the list of students, create a decoder and call `decode` on it passing the CSV sample.
467472
468473
```swift
469474
let decoder = CSVDecoder { $0.headerStrategy = .firstLine }
470-
let students = try decoder.decode([Student], from: data)
475+
let students = try decoder.decode([Student].self, from: string)
471476
```
472477
473478
The inverse process (from Swift to CSV) is very similar (and simple).
@@ -485,9 +490,11 @@ When encoding/decoding CSV data, it is important to keep several points in mind:
485490
486491
</p>
487492
<ul>
488-
<details><summary>Default behavior requires a CSV with a headers row.</summary><p>
493+
<details><summary><code>Codable</code>'s automatic synthesis requires CSV files with a headers row.</summary><p>
489494
490-
The default behavior (i.e. not including `init(from:)` and `encode(to:)`) rely on the existance of the synthesized `CodingKey`s whose `stringValue`s are the property names. For these properties to match any CSV field, the CSV data must contain a _headers row_ at the very beginning. If your CSV doesn't contain a _headers row_, you can specify coding keys with integer values representing the field index.
495+
`Codable` is able to synthesize `init(from:)` and `encode(to:)` for your custom types when all their members/properties conform to `Codable`. This automatic synthesis creates a hidden `CodingKeys` enumeration containing all your property names.
496+
497+
During decoding, `CSVDecoder` tries to match the enumeration string values with a field position within a row. For this to work the CSV data must contain a _headers row_ with the property names. If your CSV doesn't contain a _headers row_, you can specify coding keys with integer values representing the field index.
491498
492499
```swift
493500
struct Student: Codable {
@@ -506,7 +513,7 @@ struct Student: Codable {
506513
> Using integer coding keys has the added benefit of better encoder/decoder performance. By explicitly indicating the field index, you let the decoder skip the functionality of matching coding keys string values to headers.
507514
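
As a small illustration of the integer coding keys described above, the following sketch declares an `Int`-backed `CodingKeys` enumeration and inspects the values the compiler synthesizes for it. The field order (`name`, `age`, `hasPet`) is assumed from the sample CSV shown earlier; nothing here calls into `CodableCSV` itself.

```swift
// Standalone sketch: an Int-backed CodingKeys enumeration.
// The indices are assumed to match the column order of the sample CSV.
struct Student: Codable {
    var name: String
    var age: Int
    var hasPet: Bool

    enum CodingKeys: Int, CodingKey {
        case name = 0
        case age = 1
        case hasPet = 2
    }
}

// For Int-backed coding keys, `intValue` is the raw value
// and `stringValue` is the case name.
let ageKey = Student.CodingKeys.age
let ageIndex = ageKey.intValue      // the field index (1)
let ageName = ageKey.stringValue    // the property name ("age")
```

Because each key carries both an index and a name, a decoder can look a field up by position without consulting a headers row.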
508515
</p></details>
509-
<details><summary>A CSV is a long list of records/rows.</summary><p>
516+
<details><summary>A CSV is a long list of rows/records.</summary><p>
510517
511518
CSV formatted data is commonly used with flat hierarchies (e.g. a list of students, a list of car models, etc.). Nested structures, such as the ones found in JSON files, are not supported by default in CSV implementations (e.g. a list of users, where each user has a list of services she uses, and each service has a list of the user's configuration values).
512519
@@ -602,7 +609,60 @@ struct Student: Codable {
602609
603610
<details><summary>Encoding/decoding strategies.</summary><p>
604611
605-
#warning("TODO:")
612+
The [SE-0167](https://github.com/apple/swift-evolution/blob/master/proposals/0167-swift-encoders.md) proposal introduced new JSON and property list encoders/decoders to Foundation. This proposal also featured encoding/decoding strategies as a new way to configure the encoding/decoding process. `CodableCSV` continues this _tradition_ and mirrors such strategies, adding some new ones specific to the CSV file format.
613+
614+
To configure the encoding/decoding process, you need to set the configuration values of the `CSVEncoder`/`CSVDecoder` before calling the `encode()`/`decode()` functions. There are two ways to set configuration values:
615+
616+
- At initialization time, passing the `Configuration` structure to the initializer.
617+
618+
```swift
619+
var config = CSVDecoder.Configuration()
620+
config.nilStrategy = .empty
621+
config.decimalStrategy = .local(.current)
622+
config.dataStrategy = .base64
623+
config.bufferingStrategy = .sequential
624+
config.trimStrategy = .whitespaces
625+
config.encoding = .utf16
626+
config.delimiters.row = "\r\n"
627+
628+
let decoder = CSVDecoder(configuration: config)
629+
```
630+
631+
Alternatively, there are convenience initializers accepting a closure with an `inout Configuration` value.
632+
633+
```swift
634+
let decoder = CSVDecoder {
635+
$0.nilStrategy = .empty
636+
$0.decimalStrategy = .local(.current)
637+
// and so on and so forth
638+
}
639+
```
640+
641+
- `CSVEncoder` and `CSVDecoder` implement `@dynamicMemberLookup` exclusively for their configuration values. Therefore you can set configuration values after initialization or after an encoding/decoding process has been performed.
642+
643+
```swift
644+
let decoder = CSVDecoder()
645+
decoder.bufferingStrategy = .sequential
646+
let students = try decoder.decode([Student].self, from: url1)
647+
648+
decoder.bufferingStrategy = .keepAll
649+
let pets = try decoder.decode([Pets].self, from: url2)
650+
```
651+
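
To make the `@dynamicMemberLookup` mechanism more concrete, here is a minimal standalone sketch of how a type can forward member access to a nested configuration value through key paths. `SketchConfiguration` and `SketchDecoder` are hypothetical names used only for illustration; the library's actual implementation may differ.

```swift
// Hypothetical stand-in for a configuration structure.
struct SketchConfiguration {
    var bufferingStrategy: String = "keepAll"
    var trimWhitespace: Bool = false
}

// Hypothetical decoder-like class forwarding unknown members to its configuration.
@dynamicMemberLookup
final class SketchDecoder {
    private(set) var configuration = SketchConfiguration()

    // Reads and writes of `decoder.<member>` are routed to `configuration.<member>`.
    subscript<Value>(dynamicMember keyPath: WritableKeyPath<SketchConfiguration, Value>) -> Value {
        get { configuration[keyPath: keyPath] }
        set { configuration[keyPath: keyPath] = newValue }
    }
}

let sketch = SketchDecoder()
sketch.bufferingStrategy = "sequential"   // forwarded to the configuration
sketch.trimWhitespace = true
```

Since only key paths into the configuration type are accepted, the lookup stays type-checked and exposes nothing beyond the configuration values.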
652+
The strategies labeled with `.custom` let you insert behavior into the encoding/decoding process without forcing you to manually implement `init(from:)` and `encode(to:)`. When set, they apply to the targeted type for the whole process. For example, to decode a CSV file where empty fields are marked with the word `null`, you could do the following:
653+
654+
```swift
655+
let decoder = CSVDecoder()
656+
decoder.nilStrategy = .custom({ (decoder) -> Bool in
657+
do {
658+
let container = try decoder.singleValueContainer()
659+
let field = try container.decode(String.self)
660+
return field == "null"
661+
} catch let error {
662+
return false
663+
}
664+
})
665+
```
606666
607667
</p></details>
608668
@@ -619,21 +679,19 @@ struct Student: Codable {
619679
<img src="docs/assets/Roadmap.svg" alt="Roadmap"/>
620680
</p>
621681
622-
The library has been heavily documented and any contribution is welcome. Please take a look at the [How to contribute](docs/CONTRIBUTING.md) document or peer into a more detailed roadmap on the [Github projects](https://github.com/dehesa/CodableCSV/projects).
682+
The library has been heavily documented and any contribution is welcome. Check the small [How to contribute](docs/CONTRIBUTING.md) document or take a look at the [Github projects](https://github.com/dehesa/CodableCSV/projects) for a more in-depth roadmap.
623683
624684
### Community
625685
626-
If `CodableCSV` is not of your liking, the Swift community has other CSV solutions:
627-
- [CSV.swift](https://github.com/yaslab/CSV.swift) is a simpler library with a focus on conforming to the [RFC4180](https://tools.ietf.org/html/rfc4180) standard.
628-
629-
It offers an imperative CSV reader/writer and a row decoder. However, it lacks a CSV encoder, whole file decoding, and configurability (e.g. custom field/row delimiters, escaping scalar selection, presampling, etc.).
630-
631-
- [SwiftCSV](https://github.com/swiftcsv/SwiftCSV) is an older/popular CSV parse-only library.
632-
633-
It offers a well-tested imperative CSV parser with a slower development cycle. It lacks an imperative writer, an encoder, a decoder, and parsing configuration values.
634-
635-
- [SwiftCSVExport](https://github.com/vigneshuvi/SwiftCSVExport) reads/writes CSV imperatively with great Objective-C support.
686+
If `CodableCSV` is not to your liking, the Swift community has developed other CSV solutions:
636687
637-
It offers an imperative CSV reader/writer relying on the Objective-C toolchain, which makes it great to use on Objective-C project.
688+
- [CSV.swift](https://github.com/yaslab/CSV.swift) offers an imperative CSV reader/writer and a _lazy_ row decoder and adheres to the [RFC4180](https://tools.ietf.org/html/rfc4180) standard.
689+
- [SwiftCSV](https://github.com/swiftcsv/SwiftCSV) is a well-tested parse-only library which loads the whole CSV in memory (not intended for large files).
690+
- [CSwiftV](https://github.com/Daniel1of1/CSwiftV) is a parse-only library which loads the CSV in memory and parses it in a single go (no imperative reading).
691+
- [CSVImporter](https://github.com/Flinesoft/CSVImporter) is an asynchronous parse-only library with support for big CSV files (incremental loading).
692+
- [SwiftCSVExport](https://github.com/vigneshuvi/SwiftCSVExport) reads/writes CSV imperatively with Objective-C support.
693+
- [swift-csv](https://github.com/brutella/swift-csv) offers an imperative CSV reader/writer based on Foundation's streams.
694+
- [CSV](https://github.com/skelpo/CSV) offers synchronous and asynchronous imperative CSV reader/writer and encoders/decoders.
695+
- [CommonCoding](https://github.com/Lantua/CommonCoding) provides CSV encoder/decoder conforming to the [RFC4180](https://tools.ietf.org/html/rfc4180) standard.
638696
639697
There are many good tools outside the Swift community. Since listing them all would be a hard task, I will just point you to the great [AwesomeCSV](https://github.com/secretGeek/awesomeCSV) GitHub repo. Take a look! There are a lot of treasures to be found there.

docs/CONTRIBUTING.md

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ The only contribution requirements are:
1515

1616
That is it. Thank you for taking the time to read this and for contributing.
1717

18-
## Looking where to start
18+
## Where to start
1919

2020
This repo's [Github projects](https://github.com/dehesa/CodableCSV/projects) are kept up to date and they are a good place to look if you don't know where to start.
2121

sources/declarative/Utils.swift

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ extension Optional {
2020
/// - parameter rhs: Swift error to throw in case of no value.
2121
/// - returns: The value (non-optional) passed as parameter.
2222
/// - throws: The Swift error returned on the right hand-side autoclosure.
23-
@inline(__always) internal static func ?!(lhs: Self, rhs: @autoclosure ()->Swift.Error) throws -> Wrapped {
23+
@_transparent internal static func ?!(lhs: Self, rhs: @autoclosure ()->Swift.Error) throws -> Wrapped {
2424
switch lhs {
2525
case .some(let v): return v
2626
case .none: throw rhs()
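
For readers unfamiliar with the `?!` operator touched by this hunk, the sketch below shows how such an "unwrap or throw" operator can be declared and used in isolation. The precedence group (`NilCoalescingPrecedence`) and the `MissingValue` error and `parseAge` function are assumptions made for this example; the library's own declarations are not shown in the diff.

```swift
// Assumed operator declaration; the precedence chosen by the library is not shown here.
infix operator ?!: NilCoalescingPrecedence

extension Optional {
    /// Returns the wrapped value, or throws the error produced by the right-hand side.
    static func ?! (lhs: Self, rhs: @autoclosure () -> Swift.Error) throws -> Wrapped {
        switch lhs {
        case .some(let value): return value
        case .none: throw rhs()
        }
    }
}

// Hypothetical error type and function used only for this example.
struct MissingValue: Error {}

func parseAge(_ text: String) throws -> Int {
    // `Int(text)` is nil for non-numeric input, which triggers the throw.
    return try Int(text) ?! MissingValue()
}
```

The `@autoclosure` parameter means the error is only constructed when the optional is actually `nil`.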

sources/declarative/decodable/Decoder.swift

Lines changed: 11 additions & 0 deletions
@@ -60,7 +60,9 @@ extension CSVDecoder {
6060
let source = ShadowDecoder.Source(reader: reader, configuration: self.configuration, userInfo: self.userInfo)
6161
return try T(from: ShadowDecoder(source: source, codingPath: []))
6262
}
63+
}
6364

65+
extension CSVDecoder {
6466
/// Returns a sequence for decoding each row from a CSV file (given as a `Data` blob).
6567
/// - parameter data: The data blob representing a CSV file.
6668
/// - throws: `CSVError<CSVReader>` exclusively.
@@ -69,6 +71,15 @@ extension CSVDecoder {
6971
let source = ShadowDecoder.Source(reader: reader, configuration: self.configuration, userInfo: self.userInfo)
7072
return LazySequence(source: source)
7173
}
74+
75+
/// Returns a sequence for decoding each row from a CSV file (given as a `String`).
76+
/// - parameter string: A Swift string representing a CSV file.
77+
/// - throws: `CSVError<CSVReader>` exclusively.
78+
open func lazy(from string: String) throws -> LazySequence {
79+
let reader = try CSVReader(input: string, configuration: self.configuration.readerConfiguration)
80+
let source = ShadowDecoder.Source(reader: reader, configuration: self.configuration, userInfo: self.userInfo)
81+
return LazySequence(source: source)
82+
}
7283

7384
/// Returns a sequence for decoding each row from a CSV file (being pointed by `url`).
7485
/// - parameter url: The URL pointing to the file to decode.
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
1+
extension CSVDecoder {
2+
/// Swift sequence type giving access to all the "undecoded" CSV rows.
3+
///
4+
/// The CSV rows are read *on-demand* and only decoded when explicitly told so (unlike the default *decode* functions).
5+
public struct LazySequence: IteratorProtocol, Sequence {
6+
/// The source of the CSV data.
7+
private let source: ShadowDecoder.Source
8+
/// The row to be read (not decoded) next.
9+
private var currentIndex: Int = 0
10+
/// Designated initializer passing all the required components.
11+
/// - parameter source: The data source for the decoder.
12+
internal init(source: ShadowDecoder.Source) {
13+
self.source = source
14+
}
15+
16+
/// Advances to the next row and returns a `RowDecoder`, or `nil` if no next row exists.
17+
public mutating func next() -> RowDecoder? {
18+
guard !self.source.isRowAtEnd(index: self.currentIndex) else { return nil }
19+
20+
defer { self.currentIndex += 1 }
21+
let decoder = ShadowDecoder(source: self.source, codingPath: [IndexKey(self.currentIndex)])
22+
return RowDecoder(decoder: decoder)
23+
}
24+
}
25+
}
26+
27+
extension CSVDecoder.LazySequence {
28+
/// Pointer to a row within a CSV file that is able to decode it to a custom type.
29+
public struct RowDecoder {
30+
/// The representation of the decoding process point-in-time.
31+
private let decoder: ShadowDecoder
32+
33+
/// Designated initializer passing all the required components.
34+
/// - parameter decoder: The `Decoder` instance in charge of decoding the CSV data.
35+
fileprivate init(decoder: ShadowDecoder) {
36+
self.decoder = decoder
37+
}
38+
39+
/// Returns a value of the type you specify, decoded from a CSV row.
40+
/// - parameter type: The type of the value to decode from the supplied file.
41+
/// - throws: `DecodingError`, or `CSVError<CSVReader>`, or the error raised by your custom types.
42+
@inline(__always) public func decode<T:Decodable>(_ type: T.Type) throws -> T {
43+
return try T(from: self.decoder)
44+
}
45+
}
46+
}
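
The `LazySequence` type above conforms to both `IteratorProtocol` and `Sequence`, so it acts as its own iterator and can be used directly in a `for` loop. The following standalone sketch shows the same pattern with a hypothetical `RowSequence` type that cuts one row off a string per call to `next()`. It is a deliberately naive splitter (no quoting or escaping support) and is not how the library parses CSV.

```swift
// Hypothetical type illustrating the IteratorProtocol + Sequence pattern.
struct RowSequence: IteratorProtocol, Sequence {
    /// The part of the input that has not been consumed yet.
    private var remaining: Substring

    init(csv: String) {
        self.remaining = Substring(csv)
    }

    /// Cuts the next line off the remaining input and splits it into fields.
    mutating func next() -> [String]? {
        guard !remaining.isEmpty else { return nil }

        let line: Substring
        if let newline = remaining.firstIndex(of: "\n") {
            line = remaining[..<newline]
            remaining = remaining[remaining.index(after: newline)...]
        } else {
            line = remaining
            remaining = Substring()
        }
        return line.split(separator: ",", omittingEmptySubsequences: false).map(String.init)
    }
}

// Because the type is its own iterator, it works directly in a `for` loop.
var parsedRows: [[String]] = []
for row in RowSequence(csv: "1,2,3\n4,5,6") {
    parsedRows.append(row)
}
```

Each row is only split into fields when the loop asks for it, which mirrors the on-demand behavior described in the `LazySequence` documentation comment.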
