Commit a05c9d7

Merge pull request ceph#63080 from bluikko/doc-caching-improvements-radosgw
doc/radosgw: Improve rgw-cache.rst

Reviewed-by: Anthony D'Atri <[email protected]>
Reviewed-by: Zac Dover <[email protected]>
2 parents 794034e + 6e836f8 commit a05c9d7

1 file changed: +67 -73 lines changed

doc/radosgw/rgw-cache.rst

Lines changed: 67 additions & 73 deletions
@@ -1,6 +1,6 @@
1-
==========================
2-
RGW Data caching and CDN
3-
==========================
1+
========================
2+
RGW Data Caching and CDN
3+
========================
44

55
.. versionadded:: Octopus
66

@@ -9,148 +9,142 @@ RGW Data caching and CDN
99
This feature adds to RGW the ability to securely cache objects and offload the workload from the cluster, using Nginx.
1010
After an object is accessed the first time it will be stored in the Nginx cache directory.
1111
When data is already cached, it need not be fetched from RGW. A permission check will be made against RGW to ensure the requesting user has access.
12-
This feature is based on some Nginx modules, ngx_http_auth_request_module, https://github.com/kaltura/nginx-aws-auth-module, Openresty for Lua capabilities.
12+
This feature is based on the Nginx modules ``ngx_http_auth_request_module`` and `nginx-aws-auth-module <https://github.com/kaltura/nginx-aws-auth-module>`_, and OpenResty for Lua capabilities.
1313

14-
Currently, this feature will cache only AWSv4 requests (only s3 requests), caching-in the output of the 1st GET request
15-
and caching-out on subsequent GET requests, passing thru transparently PUT,POST,HEAD,DELETE and COPY requests.
14+
Currently this feature will cache only AWSv4 requests (only S3 requests), caching-in the output of the first GET request
15+
and caching-out on subsequent GET requests, passing through transparently PUT,POST,HEAD,DELETE and COPY requests.
1616

1717

1818
The feature introduces 2 new APIs: Auth and Cache.
1919

20-
NOTE: The `D3N RGW Data Cache`_ is an alternative data caching mechanism implemented natively in the RADOS Gateway.
20+
.. note:: The `D3N RGW Data Cache`_ is an alternative data caching mechanism implemented natively in the RADOS Gateway.
2121

2222
New APIs
23-
-------------------------
23+
--------
2424

2525
There are 2 new APIs for this feature:
2626

27-
Auth API - The cache uses this to validate that a user can access the cached data
28-
29-
Cache API - Adds the ability to override securely Range header, that way Nginx can use it is own smart cache on top of S3:
30-
https://www.nginx.com/blog/smart-efficient-byte-range-caching-nginx/
31-
Using this API gives the ability to read ahead objects when clients asking a specific range from the object.
32-
On subsequent accesses to the cached object, Nginx will satisfy requests for already-cached ranges from the cache. Uncached ranges will be read from RGW (and cached).
27+
- **Auth API:** The cache uses this to validate that a user can access the cached data.
28+
- **Cache API:** Adds the ability to override securely ``Range`` header so that Nginx can use its own `smart cache <https://www.nginx.com/blog/smart-efficient-byte-range-caching-nginx/>`_ on top of S3.
29+
Using this API gives the ability to read ahead objects when client is asking a specific range from the object.
30+
On subsequent accesses to the cached object, Nginx will satisfy requests for already-cached ranges from the cache. Uncached ranges will be read from RGW (and cached).
3331

3432
Auth API
35-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
33+
~~~~~~~~
3634

37-
This API Validates a specific authenticated access being made to the cache, using RGW's knowledge of the client credentials and stored access policy.
35+
This API validates a specific authenticated access being made to the cache, using RGW's knowledge of the client credentials and stored access policy.
3836
Returns success if the encapsulated request would be granted.
3937

4038
Cache API
41-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
39+
~~~~~~~~~
4240

43-
This API is meant to allow changing signed Range headers using a privileged user, cache user.
41+
This API is meant to allow changing signed ``Range`` headers using a privileged cache user.
4442

45-
Creating cache user
43+
Creating the cache user:
4644

47-
::
45+
.. prompt:: bash #
46+
47+
radosgw-admin user create --uid=<uid for cache user> --display-name="cache user" --caps="amz-cache=read"
48+
49+
This user can send to the RGW the Cache API header ``X-Amz-Cache``. This header contains the headers from the original request (before changing the ``Range`` header):
4850

49-
$ radosgw-admin user create --uid=<uid for cache user> --display-name="cache user" --caps="amz-cache=read"
51+
- Original headers are separated from each other by a character with ASCII code 177 decimal.
52+
- Each original header and its value are separated by a character with ASCII code 178 decimal.
5053

51-
This user can send to the RGW the Cache API header ``X-Amz-Cache``, this header contains the headers from the original request(before changing the Range header).
52-
It means that ``X-Amz-Cache`` built from several headers.
53-
The headers that are building the ``X-Amz-Cache`` header are separated by char with ASCII code 177 and the header name and value are separated by char ASCII code 178.
54-
The RGW will check that the cache user is an authorized user and if it is a cache user,
55-
if yes it will use the ``X-Amz-Cache`` to revalidate that the user has permissions, using the headers from the X-Amz-Cache.
56-
During this flow, the RGW will override the Range header.
54+
The RGW will check that the user is an authorized user and that the value is a cache user.
55+
If both checks succeed it will use the ``X-Amz-Cache`` to revalidate that the user has permissions, using the original headers stored in the ``X-Amz-Cache`` header.
56+
During this flow the RGW will override the ``Range`` header.
5757

5858

5959
Using Nginx with RGW
60-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
60+
--------------------
6161

62-
Download the source of Openresty:
62+
Download the source of OpenResty:
6363

64-
::
64+
.. prompt:: bash $
6565

66-
$ wget https://openresty.org/download/openresty-1.15.8.3.tar.gz
66+
wget https://openresty.org/download/openresty-1.15.8.3.tar.gz
6767

68-
git clone the AWS auth Nginx module:
68+
Use git to clone the Nginx AWS authentication module:
6969

70-
::
70+
.. prompt:: bash $
7171

72-
$ git clone https://github.com/kaltura/nginx-aws-auth-module
72+
git clone https://github.com/kaltura/nginx-aws-auth-module
7373

74-
untar the openresty package:
74+
Untar the OpenResty package:
7575

76-
::
76+
.. prompt:: bash $
7777

78-
$ tar xvzf openresty-1.15.8.3.tar.gz
79-
$ cd openresty-1.15.8.3
78+
tar xvzf openresty-1.15.8.3.tar.gz
79+
cd openresty-1.15.8.3
8080

81-
Compile openresty, Make sure that you have pcre lib and openssl lib:
81+
Compile OpenResty, make sure that you have ``pcre`` library and ``openssl`` library:
8282

83-
::
83+
.. prompt:: bash $
8484

85-
$ sudo yum install pcre-devel openssl-devel gcc curl zlib-devel nginx
86-
$ ./configure --add-module=<the nginx-aws-auth-module dir> --with-http_auth_request_module --with-http_slice_module --conf-path=/etc/nginx/nginx.conf
87-
$ gmake -j $(nproc)
88-
$ sudo gmake install
89-
$ sudo ln -sf /usr/local/openresty/bin/openresty /usr/bin/nginx
85+
sudo yum install pcre-devel openssl-devel gcc curl zlib-devel nginx
86+
./configure --add-module=<the nginx-aws-auth-module dir> --with-http_auth_request_module --with-http_slice_module --conf-path=/etc/nginx/nginx.conf
87+
gmake -j $(nproc)
88+
sudo gmake install
89+
sudo ln -sf /usr/local/openresty/bin/openresty /usr/bin/nginx
9090

9191
Put in-place your Nginx configuration files and edit them according to your environment:
9292

93-
All Nginx conf files are under:
93+
Example Nginx configuration files are available at
9494
https://github.com/ceph/ceph/tree/main/examples/rgw/rgw-cache
9595

96-
`nginx.conf` should go to `/etc/nginx/nginx.conf`
96+
- ``nginx.conf`` should go to ``/etc/nginx/nginx.conf``.
97+
- ``nginx-lua-file.lua`` should go to ``/etc/nginx/nginx-lua-file.lua``.
98+
- ``nginx-default.conf`` should go to ``/etc/nginx/conf.d/nginx-default.conf``.
9799

98-
`nginx-lua-file.lua` should go to `/etc/nginx/nginx-lua-file.lua`
100+
The parameters that are most likely to require adjustment according to the environment are located in the file ``nginx-default.conf``.
99101

100-
`nginx-default.conf` should go to `/etc/nginx/conf.d/nginx-default.conf`
101-
102-
The parameters that are most likely to require adjustment according to the environment are located in the file `nginx-default.conf`
103-
104-
Modify the example values of *proxy_cache_path* and *max_size* at:
102+
Modify the example values of ``proxy_cache_path`` and ``max_size`` at:
105103

106104
::
107105

108106
proxy_cache_path /data/cache levels=2:2:2 keys_zone=mycache:999m max_size=20G inactive=1d use_temp_path=off;
109107

110108

111-
And modify the example *server* values to point to the RGWs URIs:
109+
And modify the example ``server`` values to point to the RGWs URIs:
112110

113111
::
114112

115113
server rgw1:8000 max_fails=2 fail_timeout=5s;
116114
server rgw2:8000 max_fails=2 fail_timeout=5s;
117115
server rgw3:8000 max_fails=2 fail_timeout=5s;
118116

119-
| It is important to substitute the *access key* and *secret key* located in the `nginx.conf` with those belong to the user with the `amz-cache` caps
120-
| for example, create the `cache` user as following:
121-
122-
::
123-
124-
radosgw-admin user create --uid=cacheuser --display-name="cache user" --caps="amz-cache=read" --access-key <access> --secret <secret>
117+
| It is important to substitute the *access key* and *secret key* located in the ``nginx.conf`` file with those belonging to the user with the ``amz-cache`` caps.
118+
| For example, create the cache user as follows:
125119
126-
It is possible to use Nginx slicing which is a better method for streaming purposes.
120+
.. prompt:: bash #
127121

128-
For using slice you should use `nginx-slicing.conf` and not `nginx-default.conf`
122+
radosgw-admin user create --uid=cacheuser --display-name="cache user" --caps="amz-cache=read" --access-key <access> --secret <secret>
129123

130-
Further information about Nginx slicing:
124+
It is possible to use `Nginx slicing <https://docs.nginx.com/nginx/admin-guide/content-cache/content-caching/#byte-range-caching>`_ which is suitable for streaming purposes.
131125

132-
https://docs.nginx.com/nginx/admin-guide/content-cache/content-caching/#byte-range-caching
126+
To enable slicing you should use ``nginx-slicing.conf`` instead of ``nginx-default.conf``.
133127

134128

135-
If you do not want to use the prefetch caching, It is possible to replace `nginx-default.conf` with `nginx-noprefetch.conf`
136-
Using `noprefetch` means that if the client is sending range request of 0-4095 and then 0-4096 Nginx will cache those requests separately, So it will need to fetch those requests twice.
129+
If you do not want to use prefetch caching, it is possible to replace ``nginx-default.conf`` with ``nginx-noprefetch.conf``.
130+
If prefetch caching is disabled Nginx will cache each range request separately and possible overlap in the range requests will be fetched more than once. For example, if a client is sending a range request of 0-4095 and then 0-4096 both requests are fetched completely from RGW.
137131

138132

139-
Run Nginx(openresty):
133+
Run Nginx (OpenResty):
140134

141-
::
135+
.. prompt:: bash #
142136

143-
$ sudo systemctl restart nginx
137+
systemctl restart nginx
144138

145139
Appendix
146-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
147-
**A note about performance:** In certain instances like development environment, disabling the authentication by commenting the following line in `nginx-default.conf`:
140+
--------
141+
142+
**A note about performance:** In certain instances such as a development environment, disabling authentication may (depending on the hardware) increase performance significantly as it forgoes auth API calls to RADOS Gateway.
143+
This can be done by commenting the following line in ``nginx-default.conf``:
148144

149145
::
150146

151147
#auth_request /authentication;
152148

153-
may (depending on the hardware) increases the performance significantly as it forgoes the auth API calls to radosgw.
154-
155149

156150
.. _D3N RGW Data Cache: ../d3n_datacache/
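
The ``X-Amz-Cache`` header layout described in the diff above (original headers separated by the character with code 177 decimal, each header name and value separated by the character with code 178 decimal) can be sketched in Python as follows. This is an illustration only, not the code used by RGW or by the example Lua script: the function names `build_x_amz_cache` and `parse_x_amz_cache` are hypothetical, the sample header values are made up, and representing codes 177 and 178 as single characters via `chr()` is an assumption about the encoding.

```python
# Separators described in the documentation: character codes 177 and 178 (decimal).
HEADER_SEP = chr(177)      # placed between one original header and the next
NAME_VALUE_SEP = chr(178)  # placed between a header name and its value


def build_x_amz_cache(headers):
    """Join original request headers into one X-Amz-Cache style value (hypothetical helper)."""
    return HEADER_SEP.join(
        f"{name}{NAME_VALUE_SEP}{value}" for name, value in headers.items()
    )


def parse_x_amz_cache(value):
    """Split an X-Amz-Cache style value back into a dict of headers (hypothetical helper)."""
    headers = {}
    for item in value.split(HEADER_SEP):
        # partition() splits on the first separator only, so the value part stays intact.
        name, _, header_value = item.partition(NAME_VALUE_SEP)
        headers[name] = header_value
    return headers


# Made-up headers from an original client request, before the Range header is changed.
original = {
    "Host": "rgw.example.com",
    "Range": "bytes=0-4095",
    "X-Amz-Date": "20200101T000000Z",
}
packed = build_x_amz_cache(original)
assert parse_x_amz_cache(packed) == original
```

The round trip at the end shows only that the two separators are enough to recover the original header set. Which headers are actually packed, and in what order, is determined by the example Nginx and Lua configuration files and is not modeled here.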
