Commit a37f029

Corrections after initial discussion
1 parent 2034c1f commit a37f029

1 file changed

+93
-23
lines changed

1-Draft/RFC0018-Get-StringHash.md

Lines changed: 93 additions & 23 deletions
@@ -8,18 +8,20 @@ Area: Cmdlet
88
Comments Due: 3/2/2017
99
---
1010

11-
# Get-StringHash cmdlet
11+
# Get-Hash cmdlet
1212

13-
Add the new cmdlet to PowerShell to get cryptographically strong hashes for strings.
13+
Add the new cmdlet to PowerShell to get cryptographically strong hashes for strings, files and file streams.
1414

15-
Current .Net implementation of GetHashCode method calculates a hash that is not a global (across application domains) safe and is not a permanent value.
15+
Currently PowerShell only has the `Get-FileHash` cmdlet, which computes file hashes.
16+
The new cmdlet enhances and replaces the `Get-FileHash` cmdlet so that it also supports strings. `Get-FileHash` becomes an alias of `Get-Hash`.
17+
18+
The current .NET implementation of the `GetHashCode` method calculates a hash for strings that is not stable across application domains and is not a permanent value.
1619
According to the documentation, this method has many practical limitations.
1720
https://msdn.microsoft.com/en-us/library/system.object.gethashcode%28v=vs.110%29.aspx
1821
https://msdn.microsoft.com/en-us/library/system.string.gethashcode%28v=vs.110%29.aspx
19-
>As a result, hash codes should never be used outside of the application domain in which they were created, they should never be used as key fields in a collection, and they should never be persisted.
22+
>As a result, the hash codes should never be used outside of the application domain in which they were created, they should never be used as key fields in a collection, and they should never be persisted.
2023
21-
Currently Powershell only has Get-FileHash cmdlet to get a file hashes.
22-
Users are forced to use a workaround to get a string hashes:
24+
Users are also forced to use a workaround to get string hashes:
2325
```powershell
2426
Get-FileHash -InputStream ([System.IO.MemoryStream]::new([System.Text.Encoding]::UTF8.GetBytes("test string")))
2527
```
@@ -28,20 +30,20 @@ Get-FileHash -InputStream ([System.IO.MemoryStream]::new([System.Text.Encoding]:
2830

2931
With the new cmdlet, users can use native PowerShell cmdlet syntax without a workaround.
3032

31-
With the new cmdlet users can get hashes which:
33+
With the new cmdlet, users can get string hashes that:
3234

3335
* are cryptographically strong
3436
* are cross-platform
3537
* may be sent across application domains
3638
* may be saved in databases
3739
* may be used as keys in collections
38-
* may be used to compare strings and texts
40+
* may be used to compare strings and files
3941

4042
## Specification
4143

4244
### Output
4345

44-
The cmdlet output objects of `StringHashInfo` type:
46+
The cmdlet outputs objects of the `StringHashInfo` type for the `StringHash` parameter set:
4547

4648
```csharp
4749
public class StringHashInfo
@@ -52,15 +54,41 @@ public class StringHashInfo
5254
public string HashedString { get; set;}
5355
}
5456
```
55-
`Hash` is a hash of the `HashedString` string calculated with `Algorithm` algorithm.
5657

57-
`Encoding` is a encoding of the `HashedString` string.
58+
The cmdlet outputs objects of the `FileHashInfo` type for the `PathParameterSet`, `LiteralPathParameterSet`, and `StreamParameterSet` parameter sets:
59+
60+
```csharp
61+
public class FileHashInfo
62+
{
63+
public string Algorithm { get; set;}
64+
public string Hash { get; set;}
65+
public string Path { get; set;}
66+
}
67+
```
68+
69+
`Algorithm` is a hash algorithm name.
70+
71+
`Hash` is the hash of the file or of the `HashedString` string, calculated with the `Algorithm` algorithm.
72+
73+
`Encoding` is the encoding of the `HashedString` string. It is not used for file hashes because files are read as binary.
74+
75+
`Path` comes from the `Path` or `LiteralPath` parameter. It is `null` for file streams.
76+
77+
### Parameter Sets
78+
79+
1. `StringHash`
80+
81+
2. `PathParameterSet`
82+
83+
3. `LiteralPathParameterSet`
84+
85+
4. `StreamParameterSet`
5886

5987
### Parameters
6088

61-
1. InputString
89+
1. InputString (`StringHash` parameter set)
6290

63-
Type = array of strings
91+
Type = `array of strings`
6492

6593
Position = 0
6694

@@ -72,27 +100,67 @@ public class StringHashInfo
72100

73101
AllowEmptyString()
74102

75-
2. Algorithm
103+
We allow `null` and empty strings because they can arrive from the pipeline. See the section `InputString accept null and empty input strings` below.
104+
105+
2. Algorithm (All parameter sets)
76106

77-
Type = string
107+
Type = `string`
78108

79109
Position = 1
80110

81111
##### Attributes
82112

83-
ValidateSet = SHA1, SHA256, SHA384, SHA512, MD5
113+
ValidateSet = `SHA1`, `SHA256`, `SHA384`, `SHA512`, `MD5`
84114

85-
3. Encoding
115+
3. Encoding (`StringHash` parameter set)
86116

87-
Type = [Microsoft.PowerShell.Commands.FileSystemCmdletProviderEncoding]
117+
Type = `[Microsoft.PowerShell.Commands.FileSystemCmdletProviderEncoding]`
88118

89119
Position = 2
90120

121+
4. Path (`PathParameterSet` parameter set)
122+
123+
Type = `array of strings`
124+
125+
Position = 0
126+
127+
##### Attributes
128+
129+
ValueFromPipelineByPropertyName = true
130+
131+
5. LiteralPath (`LiteralPathParameterSet` parameter set)
132+
133+
Type = `array of strings`
134+
135+
Position = 0
136+
137+
##### Attributes
138+
139+
Alias("PSPath")
140+
141+
6. InputStream (`StreamParameterSet` parameter set)
142+
143+
Type = `Stream`
144+
145+
Position = 0
146+
147+
##### Attributes
148+
149+
ValueFromPipelineByPropertyName = true
150+
151+
### Pipeline
152+
153+
The `InputString`, `Path`, and `LiteralPath` parameters all have string types, so `ValueFromPipeline` can be used for only one of them: `InputString`.
154+
The `Path` and `LiteralPath` parameters are accepted from the pipeline by property name.
155+
(If we want `ValueFromPipeline` for `Path`, we should split the cmdlet into `Get-StringHash` and `Get-FileHash`.)
156+
91157
### Encoding
92158

93-
.Net hash algorithm methods work on byte buffer. So the cmdlet must convert a input string to byte array.
94-
This process is a encoding sensitive.
95-
To correctly process the user must specify the encoding of incoming strings.
159+
.NET hash algorithm methods work on byte buffers.
160+
161+
We read files as binary, so no encoding is needed.
162+
163+
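For strings, the cmdlet must convert the input string to a byte array. This conversion is encoding-sensitive: the same string produces different bytes, and therefore a different hash, under different encodings. To get correct results, users must specify the encoding of incoming strings with the `Encoding` parameter.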
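
The encoding sensitivity described above can be illustrated with a minimal sketch. `Get-StringHashSketch` is a hypothetical helper written only for this illustration; it is not the proposed cmdlet. It assumes SHA256 as the algorithm and takes a `System.Text.Encoding` instance rather than the `FileSystemCmdletProviderEncoding` type named in the specification.

```powershell
# Hypothetical helper for illustration only; not the proposed Get-Hash cmdlet.
# Assumption: SHA256 is fixed, and -Encoding takes a System.Text.Encoding instance.
function Get-StringHashSketch {
    param(
        [string] $InputString,
        [System.Text.Encoding] $Encoding = [System.Text.Encoding]::UTF8
    )

    # The hash is computed over bytes, so the string is encoded first.
    $bytes = $Encoding.GetBytes($InputString)

    $sha256 = [System.Security.Cryptography.SHA256]::Create()
    try {
        $hashBytes = $sha256.ComputeHash($bytes)
    }
    finally {
        $sha256.Dispose()
    }

    # Render the hash bytes as an uppercase hex string without separators.
    [System.BitConverter]::ToString($hashBytes) -replace '-', ''
}

# The same text yields different hashes under UTF-8 and UTF-16 (Unicode).
$utf8Hash  = Get-StringHashSketch -InputString 'test string'
$utf16Hash = Get-StringHashSketch -InputString 'test string' -Encoding ([System.Text.Encoding]::Unicode)
$utf8Hash -eq $utf16Hash
```

The final comparison evaluates to `False`, which is why a hash is only reproducible when the encoding is stated alongside it.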
96164

97165
### InputString accept null and empty input strings
98166

@@ -101,10 +169,12 @@ To correctly process the user must specify the encoding of incoming strings.
101169
.NET hash algorithm methods throw for a null input buffer.
102170
This behavior is not convenient when processing an array of strings by using a pipeline:
103171
```powershell
104-
"test1", $null, "test2" | Get-StringHash
172+
"test1", $null, "test2" | Get-Hash
105173
```
106174
The best behavior is to return null as the resulting hash for a null string and to generate a non-terminating error.
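
A minimal sketch of this behavior, under stated assumptions: `Get-HashNullSketch` is a hypothetical function, not the proposed cmdlet. It fixes the algorithm to SHA256 and the encoding to UTF-8, and it types the pipeline parameter as `[object]` (rather than an array of strings) so that a `$null` pipeline element can be detected instead of being converted to an empty string.

```powershell
# Hypothetical function for illustration only; not the proposed Get-Hash cmdlet.
# Assumptions: SHA256 and UTF-8 are fixed; the parameter is [object] so that
# $null survives parameter binding and can be detected.
function Get-HashNullSketch {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true, Position = 0)]
        [AllowNull()]
        [object] $InputObject
    )

    begin {
        $sha256 = [System.Security.Cryptography.SHA256]::Create()
    }

    process {
        if ($null -eq $InputObject) {
            # Non-terminating error: the pipeline continues with the next item.
            Write-Error -Message 'Input string is null; returning a null hash.'
            [pscustomobject]@{ Algorithm = 'SHA256'; Hash = $null; HashedString = $null }
            return
        }

        $text  = [string] $InputObject
        $bytes = [System.Text.Encoding]::UTF8.GetBytes($text)
        $hex   = [System.BitConverter]::ToString($sha256.ComputeHash($bytes)) -replace '-', ''
        [pscustomobject]@{ Algorithm = 'SHA256'; Hash = $hex; HashedString = $text }
    }

    end {
        $sha256.Dispose()
    }
}

# Three objects come out; the second has a null Hash and an error is reported for it.
"test1", $null, "test2" | Get-HashNullSketch
```

Emitting an object for the null element keeps the output aligned one-to-one with the input, so callers can still match results to inputs by position.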
107175

108176
## Alternate Proposals and Considerations
109177

110-
None
178+
We consider `Base64` a special encoding, not a hash. It is recommended to implement `Base64` encoding in separate `ConvertTo-Base64` and `ConvertFrom-Base64` cmdlets.
179+
180+
We could split the `Get-Hash` functionality into `Get-FileHash` and `Get-StringHash` cmdlets, or even into `Get-FileHash`, `Get-StreamHash` and `Get-StringHash`.
