`hdfs-native` is an HDFS client written natively in Rust. It supports nearly all major features of an HDFS client, and several key client configuration options listed below.
## Supported HDFS features
Here is a list of currently supported features, along with unsupported but possible future features.
### HDFS Operations
- [x] Listing
- [x] Reading
- [x] Writing
- [x] Set timestamps
### HDFS Features
- [x] Name Services
- [x] Observer reads
- [x] ViewFS
- [x] Router based federation
- [x] Erasure coded reads and writes
  - RS schema only, no support for RS-Legacy or XOR
### Security Features
- [x] Kerberos authentication (GSSAPI SASL support) (requires libgssapi_krb5, see below)
- [ ] Encryption at rest (KMS support)
### Kerberos Support
The Kerberos (SASL GSSAPI) mechanism is supported through a runtime dynamic link to `libgssapi_krb5`. This must be installed separately, but is likely already installed on your system. If not, you can install it as follows:
#### Debian-based systems
```bash
apt-get install libgssapi-krb5-2
```
#### RHEL-based systems
```bash
yum install krb5-libs
```
#### MacOS
```bash
brew install krb5
```
#### Windows
Download and install the Microsoft Kerberos package from https://web.mit.edu/kerberos/dist/
Copy the `<INSTALL FOLDER>\MIT\Kerberos\bin\gssapi64.dll` file to a folder in `%PATH%` and rename it to `gssapi_krb5.dll`.
## Supported HDFS Settings
The client will attempt to read the Hadoop configs `core-site.xml` and `hdfs-site.xml` from the directory `$HADOOP_CONF_DIR`, or, if that is not set, `$HADOOP_HOME/etc/hadoop`. Passing configs at runtime is also supported via `client::ClientBuilder`. The currently supported configs are:
- `fs.defaultFS` - `Client::default()` support
- `dfs.ha.namenodes` - name service support
- `dfs.namenode.rpc-address.*` - name service support
- `dfs.client.failover.resolve-needed.*` - DNS based NameNode discovery
- `dfs.client.failover.resolver.useFQDN.*` - DNS based NameNode discovery
- `dfs.client.failover.random.order.*` - Randomize order of NameNodes to try
- `dfs.client.failover.proxy.provider.*` - Supports the behavior of the following proxy providers. Any other values will default back to the `ConfiguredFailoverProxyProvider` behavior:
- `fs.viewfs.mounttable.*.linkFallback` - ViewFS link fallback
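As an illustration, these settings use Hadoop's standard XML config format. A minimal highly available setup might look like the fragment below, where the `mycluster` name service and the host names are hypothetical placeholders (in a typical deployment `fs.defaultFS` lives in `core-site.xml` and the name service entries in `hdfs-site.xml`; they are shown merged here for brevity):

```xml
<configuration>
  <!-- Default filesystem, used by Client::default() -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <!-- NameNode IDs for the "mycluster" name service -->
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <!-- RPC address for each NameNode ID -->
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>namenode1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>namenode2.example.com:8020</value>
  </property>
</configuration>
```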
All other settings are currently assumed to be the defaults. For instance, security is assumed to be enabled and SASL negotiation is always performed, but on insecure clusters this simply falls back to SIMPLE authentication. Setups that require other customized Hadoop client configs may not work correctly.
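The config-directory lookup described in this section is simple enough to sketch. The following standalone Rust snippet mirrors the documented lookup order (prefer `$HADOOP_CONF_DIR`, otherwise fall back to `$HADOOP_HOME/etc/hadoop`); it is an illustration of the behavior, not the crate's actual code:

```rust
use std::env;
use std::path::PathBuf;

/// Resolve the Hadoop config directory: prefer HADOOP_CONF_DIR, otherwise
/// fall back to HADOOP_HOME/etc/hadoop. Returns None if neither is set.
fn conf_dir(hadoop_conf_dir: Option<&str>, hadoop_home: Option<&str>) -> Option<PathBuf> {
    match (hadoop_conf_dir, hadoop_home) {
        (Some(dir), _) => Some(PathBuf::from(dir)),
        (None, Some(home)) => Some(PathBuf::from(home).join("etc").join("hadoop")),
        (None, None) => None,
    }
}

fn main() {
    // core-site.xml and hdfs-site.xml would be read from this directory, if any.
    let dir = conf_dir(
        env::var("HADOOP_CONF_DIR").ok().as_deref(),
        env::var("HADOOP_HOME").ok().as_deref(),
    );
    println!("{:?}", dir);
}
```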
## Building
```bash
cargo build
```
## Object store implementation
An object_store implementation for HDFS is provided in the [hdfs-native-object-store](https://github.com/datafusion-contrib/hdfs-native-object-store) crate.
## Running tests
The tests are mostly integration tests that utilize a small Java application in `rust/minidfs/` that runs a custom `MiniDFSCluster`. To run the tests, you need to have Java, Maven, Hadoop binaries, and Kerberos tools available on your path. Any Java version between 8 and 17 should work.
```bash
cargo test -p hdfs-native --features integration-test
```
### Python tests
See the [Python README](./python/README.md)
## Running benchmarks
Some of the benchmarks compare performance to the JVM based client through libhdfs via the fs-hdfs3 crate. Because of that, some extra setup is required to run the benchmarks.

The `benchmark` feature is required to expose `minidfs` and the internal erasure coding functions to benchmark.
## Running examples
The examples make use of the `minidfs` module to create a simple HDFS cluster to run the example. This requires including the `integration-test` feature to enable the `minidfs` module. Alternatively, if you want to run the example against an existing HDFS cluster you can exclude the `integration-test` feature and make sure your `HADOOP_CONF_DIR` points to a directory with HDFS configs for talking to your cluster.