@@ -279,6 +279,94 @@ few exceptions. Even though...
catch potential problems early, safety triggers.
+ `working-tree-encoding`
+ ^^^^^^^^^^^^^^^^^^^^^^^
+
+ Git recognizes files encoded in ASCII or one of its supersets (e.g.
+ UTF-8, ISO-8859-1, ...) as text files. Files encoded in certain other
+ encodings (e.g. UTF-16) are interpreted as binary and consequently
+ built-in Git text processing tools (e.g. 'git diff') as well as most Git
+ web front ends do not visualize the contents of these files by default.
+
+ In these cases you can tell Git the encoding of a file in the working
+ directory with the `working-tree-encoding` attribute. If a file with this
+ attribute is added to Git, then Git reencodes the content from the
+ specified encoding to UTF-8. Finally, Git stores the UTF-8 encoded
+ content in its internal data structure (called "the index"). On checkout
+ the content is reencoded back to the specified encoding.
+
+ Please note that using the `working-tree-encoding` attribute may have a
+ number of pitfalls:
+
+ - Alternative Git implementations (e.g. JGit or libgit2) and older Git
+ versions (as of March 2018) do not support the `working-tree-encoding`
+ attribute. If you decide to use the `working-tree-encoding` attribute
+ in your repository, then it is strongly recommended to ensure that all
+ clients working with the repository support it.
+
+ For example, Microsoft Visual Studio resource files (`*.rc`) or
+ PowerShell script files (`*.ps1`) are sometimes encoded in UTF-16.
+ If you declare `*.ps1` files as UTF-16 and you add `foo.ps1` with
+ a `working-tree-encoding` enabled Git client, then `foo.ps1` will be
+ stored as UTF-8 internally. A client without `working-tree-encoding`
+ support will check out `foo.ps1` as a UTF-8 encoded file. This will
+ typically cause trouble for the users of this file.
+
+ If a Git client that does not support the `working-tree-encoding`
+ attribute adds a new file `bar.ps1`, then `bar.ps1` will be
+ stored "as-is" internally (in this example probably as UTF-16).
+ A client with `working-tree-encoding` support will interpret the
+ internal contents as UTF-8 and try to convert it to UTF-16 on checkout.
+ That operation will fail and cause an error.
+
+ - Reencoding content to non-UTF encodings can cause errors as the
+ conversion might not be UTF-8 round trip safe. If you suspect that your
+ encoding is not round trip safe, then add it to
+ `core.checkRoundtripEncoding` to make Git check the round trip
+ encoding (see linkgit:git-config[1]). SHIFT-JIS (Japanese character
+ set) is known to have round trip issues with UTF-8 and is checked by
+ default.
+
+ - Reencoding content requires resources that might slow down certain
+ Git operations (e.g. 'git checkout' or 'git add').
+
+ Use the `working-tree-encoding` attribute only if you cannot store a file
+ in UTF-8 encoding and if you want Git to be able to process the content
+ as text.
+
+ As an example, use the following attributes if your '*.ps1' files are
+ UTF-16 encoded with byte order mark (BOM) and you want Git to perform
+ automatic line ending conversion based on your platform.
+
+ ------------------------
+ *.ps1 text working-tree-encoding=UTF-16
+ ------------------------
+
+ Use the following attributes if your '*.ps1' files are UTF-16 little
+ endian encoded without BOM and you want Git to use Windows line endings
+ in the working directory. Please note, it is highly recommended to
+ explicitly define the line endings with `eol` if the `working-tree-encoding`
+ attribute is used to avoid ambiguity.
+
+ ------------------------
+ *.ps1 text working-tree-encoding=UTF-16LE eol=crlf
+ ------------------------
+
+ You can get a list of all available encodings on your platform with the
+ following command:
+
+ ------------------------
+ iconv --list
+ ------------------------
+
+ If you do not know the encoding of a file, then you can use the `file`
+ command to guess the encoding:
+
+ ------------------------
+ file foo.ps1
+ ------------------------
+
+
`ident`
^^^^^^^
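
The reencoding round trip described in the `working-tree-encoding` text of this hunk can be approximated outside of Git with `iconv`. The following is only a sketch and not how Git implements the conversion internally: the sample content, the temporary file names, and the choice of UTF-16LE are assumptions made for illustration. It converts a working-tree style file to UTF-8 (the "add" direction), converts it back (the "checkout" direction), and compares the bytes.

```shell
#!/bin/sh
# Sketch: simulate the add/checkout reencoding round trip with iconv.
# All file names below are hypothetical and live in a temporary directory.
tmpdir=$(mktemp -d)

# A small UTF-8 sample standing in for the content of a PowerShell script.
printf 'Write-Host "hello"\n' > "$tmpdir/sample.utf8"

# Build the "working tree" file: UTF-16 little endian without BOM.
iconv -f UTF-8 -t UTF-16LE "$tmpdir/sample.utf8" > "$tmpdir/foo.ps1"

# "git add" direction: working-tree encoding -> UTF-8.
iconv -f UTF-16LE -t UTF-8 "$tmpdir/foo.ps1" > "$tmpdir/index.utf8"

# "git checkout" direction: UTF-8 -> working-tree encoding.
iconv -f UTF-8 -t UTF-16LE "$tmpdir/index.utf8" > "$tmpdir/checkout.ps1"

# The encoding is round trip safe for this content if the bytes match.
if cmp -s "$tmpdir/foo.ps1" "$tmpdir/checkout.ps1"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi
echo "round trip: $roundtrip"

rm -rf "$tmpdir"
```

To probe an encoding you suspect of not being round trip safe, replace `UTF-16LE` with a name reported by `iconv --list` and use sample content that contains the characters you care about; a mismatch for your content suggests the encoding is a candidate for `core.checkRoundtripEncoding`.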