Commit 6305d20

Filter: Describe sanitization flags in constant page
1 parent 23e2d50 commit 6305d20

3 files changed: +99 -161 lines changed

reference/filter/book.xml

Lines changed: 38 additions & 28 deletions
@@ -1,42 +1,53 @@
 <?xml version="1.0" encoding="utf-8"?>
 <!-- $Revision$ -->
-
 <book xml:id="book.filter" xmlns="http://docbook.org/ns/docbook">
 <?phpdoc extension-membership="bundled" ?>
 <title>Data Filtering</title>
 <titleabbrev>Filter</titleabbrev>
 
 <preface xml:id="intro.filter">
 &reftitle.intro;
-<para>
-This extension filters data by either validating or sanitizing it. This is
-especially useful when the data source contains unknown (or foreign) data,
-like user supplied input. For example, this data may come from an HTML form.
-</para>
-<para>
+<simpara>
+This extension provides filters which can be used to validate or sanitize data.
+This is especially useful when the data source contains unknown (or foreign) data,
+like user supplied input.
+For example, this data may come from an <acronym>HTML</acronym> form.
+</simpara>
+<simpara>
 There are two main types of filtering:
 <emphasis>validation</emphasis> and <emphasis>sanitization</emphasis>.
-</para>
-<para>
-Validation is used to
-validate or check if the data meets certain qualifications. For example,
-passing in <constant>FILTER_VALIDATE_EMAIL</constant> will determine if
-the data is a valid email address, but will not change the data itself.
-</para>
-<para>
-<link linkend="filter.filters.sanitize">Sanitization</link> will
-sanitize the data, so it may alter it by removing undesired characters.
-For example, passing in <constant>FILTER_SANITIZE_EMAIL</constant> will
+</simpara>
+<simpara>
+A validation filter is used to check if the data meets certain criteria.
+These filters are identified by the
+<constant>FILTER_VALIDATE_<replaceable>*</replaceable></constant>
+constants.
+For example, the <constant>FILTER_VALIDATE_EMAIL</constant> filter
+can be used to determine if the data is a valid email address.
+However, it will never alter the input data.
+</simpara>
+<simpara>
+Sanitization on the other hand will "clean up" the data,
+therefore it may alter the input data by adding or removing characters.
+These filters are identified by the
+<constant>FILTER_SANITIZE_<replaceable>*</replaceable></constant>
+constants.
+For example, the <constant>FILTER_SANITIZE_EMAIL</constant> filter will
 remove characters that are inappropriate for an email address to contain.
-That said, it does not validate the data.
-</para>
-<para>
-<emphasis>Flags</emphasis> are optionally used with both validation and
-sanitization to tweak behaviour according to need. For example, passing
-in <constant>FILTER_FLAG_PATH_REQUIRED</constant> while filtering an
-<acronym>URL</acronym> will require a path (like <literal>/foo</literal>
-in <literal>http://example.org/foo</literal>) to be present.
-</para>
+However, the sanitized data is not validated to check if it is a valid
+email address.
+</simpara>
+<simpara>
+Most filters support optional <emphasis>flags</emphasis> that can tweak
+the behavior of the filter.
+These flags are identified by the
+<constant>FILTER_FLAG_<replaceable>*</replaceable></constant>
+constants.
+For example, using the <constant>FILTER_FLAG_PATH_REQUIRED</constant> with
+the <constant>FILTER_VALIDATE_URL</constant> validation filter
+requires that the <acronym>URL</acronym> has a path
+(e.g. <literal>/foo</literal> in <literal>https://example.org/foo</literal>).
+</simpara>
 </preface>
 
 &reference.filter.setup;
@@ -46,7 +57,6 @@
 &reference.filter.reference;
 
 </book>
-
 <!-- Keep this comment at the end of the file
 Local variables:
 mode: sgml
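
The rewritten preface in this file contrasts validation, sanitization, and flags. As a rough illustration (not part of the commit), the following PHP sketch exercises the three constants the new text names. The sample inputs and variable names are made up for this example.

```php
<?php
// Illustrative sketch only; inputs and variable names are hypothetical.

// Validation: the value is returned unchanged on success, false on failure.
$valid   = filter_var('user@example.org', FILTER_VALIDATE_EMAIL);
$invalid = filter_var('not an email', FILTER_VALIDATE_EMAIL);

// Sanitization: the value may be altered. Parentheses are not permitted
// by FILTER_SANITIZE_EMAIL, so they are removed. The result is NOT
// checked for being a valid address.
$clean = filter_var('user(comment)@example.org', FILTER_SANITIZE_EMAIL);

// Flags: FILTER_FLAG_PATH_REQUIRED makes URL validation demand a path.
$noPath   = filter_var('https://example.org', FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);
$withPath = filter_var('https://example.org/foo', FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);

var_dump($valid);    // string(16) "user@example.org"
var_dump($invalid);  // bool(false)
var_dump($clean);    // string(23) "usercomment@example.org"
var_dump($noPath);   // bool(false)
var_dump($withPath); // string(23) "https://example.org/foo"
```

Sanitizing first and validating afterwards is a common pairing, since neither step implies the other.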

reference/filter/constants.xml

Lines changed: 61 additions & 7 deletions
@@ -831,7 +831,13 @@
 </term>
 <listitem>
 <simpara>
-ID of "unsafe_raw" filter.
+This filter does nothing.
+</simpara>
+<simpara>
+However, it can strip or encode special characters if used together with
+the <constant>FILTER_FLAG_STRIP_<replaceable>*</replaceable></constant>
+and <constant>FILTER_FLAG_ENCODE_<replaceable>*</replaceable></constant>
+filter sanitization flags.
 </simpara>
 </listitem>
 </varlistentry>
@@ -853,10 +859,24 @@
 </term>
 <listitem>
 <simpara>
-ID of "string" filter.
-(<emphasis>Deprecated</emphasis> as of PHP 8.1.0,
-use <function>htmlspecialchars</function> instead.)
+This filter strips tags and HTML-encodes double and single quotes.
+</simpara>
+<simpara>
+Optionally it can strip or encode specified characters if used together with
+the <constant>FILTER_FLAG_STRIP_<replaceable>*</replaceable></constant>
+and <constant>FILTER_FLAG_ENCODE_<replaceable>*</replaceable></constant>
+filter sanitization flags.
 </simpara>
+<simpara>
+The behaviour of encoding quotes can be disabled by using the
+<constant>FILTER_FLAG_NO_ENCODE_QUOTES</constant> filter flag.
+</simpara>
+<warning>
+<simpara>
+<emphasis>Deprecated</emphasis> as of PHP 8.1.0,
+use <function>htmlspecialchars</function> instead.
+</simpara>
+</warning>
 </listitem>
 </varlistentry>
 <varlistentry xml:id="constant.filter-sanitize-stripped">
@@ -879,7 +899,13 @@
 </term>
 <listitem>
 <simpara>
-ID of "encoded" filter.
+This filter URL-encodes a string.
+</simpara>
+<simpara>
+Optionally it can strip or encode specified characters if used together with
+the <constant>FILTER_FLAG_STRIP_<replaceable>*</replaceable></constant>
+and <constant>FILTER_FLAG_ENCODE_<replaceable>*</replaceable></constant>
+filter sanitization flags.
 </simpara>
 </listitem>
 </varlistentry>
@@ -889,8 +915,22 @@
 (<type>int</type>)
 </term>
 <listitem>
+<para>
+This filter HTML-encodes
+<simplelist type="inline">
+<member><literal>'</literal></member>
+<member><literal>"</literal></member>
+<member><literal>&lt;</literal></member>
+<member><literal>&gt;</literal></member>
+<member><literal>&amp;</literal></member>
+</simplelist>
+and characters with an ASCII value less than 32.
+</para>
 <simpara>
-ID of "special_chars" filter.
+Optionally it can strip specified characters if used together with
+the <constant>FILTER_FLAG_STRIP_<replaceable>*</replaceable></constant>
+filter sanitization flags, and it can encode characters with ASCII value
+greater than 127 using <constant>FILTER_FLAG_ENCODE_HIGH</constant>.
 </simpara>
 </listitem>
 </varlistentry>
@@ -901,8 +941,22 @@
 </term>
 <listitem>
 <simpara>
-ID of "full_special_chars" filter.
+This filter is equivalent to calling <function>htmlspecialchars</function>
+with <constant>ENT_QUOTES</constant> set.
 </simpara>
+<simpara>
+The behaviour of encoding quotes can be disabled by using the
+<constant>FILTER_FLAG_NO_ENCODE_QUOTES</constant> filter flag.
+</simpara>
+<warning>
+<simpara>
+Like <function>htmlspecialchars</function>, this filter is aware of the
+<link linkend="ini.default-charset">default_charset</link> INI setting.
+If a sequence of bytes is detected that makes up an invalid character
+in the current character set then the entire string is rejected
+resulting in a empty string being returned.
+</simpara>
+</warning>
 </listitem>
 </varlistentry>
 <varlistentry xml:id="constant.filter-sanitize-email">
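
The descriptions added in this file say that <constant>FILTER_UNSAFE_RAW</constant> does nothing unless a strip or encode flag is passed, and that <constant>FILTER_SANITIZE_FULL_SPECIAL_CHARS</constant> behaves like <function>htmlspecialchars</function> with <constant>ENT_QUOTES</constant>. A minimal PHP sketch of that behaviour, assuming a default UTF-8 <literal>default_charset</literal>, might look like this. It is not part of the commit, and the inputs and variable names are hypothetical.

```php
<?php
// Illustrative sketch only; inputs and variable names are hypothetical.

// Without flags, FILTER_UNSAFE_RAW returns the string as-is.
$raw = filter_var("a\x01b", FILTER_UNSAFE_RAW);

// FILTER_FLAG_STRIP_LOW removes characters with an ASCII value below 32.
$stripped = filter_var("a\x01b", FILTER_UNSAFE_RAW, FILTER_FLAG_STRIP_LOW);

$input = '<a href="x">';

// Encodes <, >, & and both kinds of quotes.
$full = filter_var($input, FILTER_SANITIZE_FULL_SPECIAL_CHARS);

// Same filter, but quotes are left untouched.
$noQuotes = filter_var(
    $input,
    FILTER_SANITIZE_FULL_SPECIAL_CHARS,
    FILTER_FLAG_NO_ENCODE_QUOTES
);

var_dump($raw === "a\x01b");                              // bool(true)
var_dump($stripped);                                      // string(2) "ab"
var_dump($full === htmlspecialchars($input, ENT_QUOTES)); // bool(true)
var_dump($noQuotes);                                      // string(18) "&lt;a href="x"&gt;"
```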

reference/filter/filters.xml

Lines changed: 0 additions & 126 deletions
@@ -3,132 +3,6 @@
 <chapter xml:id="filter.filters" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink">
 <title>Types of filters</title>
 
-<!-- Sanitize filters: {{{-->
-<section xml:id="filter.filters.sanitize">
-<title>Sanitize filters</title>
-<para>
-<table>
-<title>List of filters for sanitization</title>
-<tgroup cols="5">
-<thead>
-<row>
-<entry>ID</entry>
-<entry>Name</entry>
-<entry>Flags</entry>
-<entry>Description</entry>
-</row>
-</thead>
-<tbody>
-<row>
-<entry><constant>FILTER_SANITIZE_ENCODED</constant></entry>
-<entry>"encoded"</entry>
-<entry>
-<constant>FILTER_FLAG_STRIP_LOW</constant>,
-<constant>FILTER_FLAG_STRIP_HIGH</constant>,
-<constant>FILTER_FLAG_STRIP_BACKTICK</constant>,
-<constant>FILTER_FLAG_ENCODE_LOW</constant>,
-<constant>FILTER_FLAG_ENCODE_HIGH</constant>
-</entry>
-<entry>URL-encode string, optionally strip or encode special characters.</entry>
-</row>
-<row>
-<entry><constant>FILTER_SANITIZE_SPECIAL_CHARS</constant></entry>
-<entry>"special_chars"</entry>
-<entry>
-<constant>FILTER_FLAG_STRIP_LOW</constant>,
-<constant>FILTER_FLAG_STRIP_HIGH</constant>,
-<constant>FILTER_FLAG_STRIP_BACKTICK</constant>,
-<constant>FILTER_FLAG_ENCODE_HIGH</constant>
-</entry>
-<entry>
-HTML-encode <literal>'"&lt;&gt;&amp;</literal> and characters with
-ASCII value less than 32, optionally strip or encode other special
-characters.
-</entry>
-</row>
-<row>
-<entry><constant>FILTER_SANITIZE_FULL_SPECIAL_CHARS</constant></entry>
-<entry>"full_special_chars"</entry>
-<entry>
-<constant>FILTER_FLAG_NO_ENCODE_QUOTES</constant>
-</entry>
-<entry>
-Equivalent to calling <function>htmlspecialchars</function> with <constant>ENT_QUOTES</constant> set. Encoding quotes can
-be disabled by setting <constant>FILTER_FLAG_NO_ENCODE_QUOTES</constant>. Like <function>htmlspecialchars</function>, this
-filter is aware of the <link linkend="ini.default-charset">default_charset</link> and if a sequence of bytes is detected that
-makes up an invalid character in the current character set then the entire string is rejected resulting in a 0-length string.
-When using this filter as a default filter, see the warning below about setting the default flags to 0.
-</entry>
-</row>
-<row>
-<entry><constant>FILTER_SANITIZE_STRING</constant></entry>
-<entry>"string"</entry>
-<entry>
-<constant>FILTER_FLAG_NO_ENCODE_QUOTES</constant>,
-<constant>FILTER_FLAG_STRIP_LOW</constant>,
-<constant>FILTER_FLAG_STRIP_HIGH</constant>,
-<constant>FILTER_FLAG_STRIP_BACKTICK</constant>,
-<constant>FILTER_FLAG_ENCODE_LOW</constant>,
-<constant>FILTER_FLAG_ENCODE_HIGH</constant>,
-<constant>FILTER_FLAG_ENCODE_AMP</constant>
-</entry>
-<entry>
-Strip tags and HTML-encode double and single quotes, optionally strip
-or encode special characters. Encoding quotes can be
-disabled by setting <constant>FILTER_FLAG_NO_ENCODE_QUOTES</constant>.
-(<emphasis>Deprecated</emphasis> as of PHP 8.1.0,
-use <function>htmlspecialchars</function> instead.)
-</entry>
-</row>
-<row>
-<entry><constant>FILTER_UNSAFE_RAW</constant></entry>
-<entry>"unsafe_raw"</entry>
-<entry>
-<constant>FILTER_FLAG_STRIP_LOW</constant>,
-<constant>FILTER_FLAG_STRIP_HIGH</constant>,
-<constant>FILTER_FLAG_STRIP_BACKTICK</constant>,
-<constant>FILTER_FLAG_ENCODE_LOW</constant>,
-<constant>FILTER_FLAG_ENCODE_HIGH</constant>,
-<constant>FILTER_FLAG_ENCODE_AMP</constant>
-</entry>
-<entry>
-Do nothing, optionally strip or encode special characters. This
-filter is also aliased to <constant>FILTER_DEFAULT</constant>.
-</entry>
-</row>
-</tbody>
-</tgroup>
-</table>
-</para>
-
-<simplesect role="changelog">
-&reftitle.changelog;
-<para>
-<informaltable>
-<tgroup cols="2">
-<thead>
-<row>
-<entry>&Version;</entry>
-<entry>&Description;</entry>
-</row>
-</thead>
-<tbody>
-<row>
-<entry>8.1.0</entry>
-<entry>
-<constant>FILTER_SANITIZE_STRING</constant> and
-<constant>FILTER_SANITIZE_STRIPPED</constant> have been deprecated.
-</entry>
-</row>
-</tbody>
-</tgroup>
-</informaltable>
-</para>
-</simplesect>
-
-</section>
-<!--}}}-->
-
 <!-- Filter flags: {{{-->
 <section xml:id="filter.filters.flags">
 <title>Filter flags</title>
