Commit 39e2a03

Add docs from nexB/aboutcode
Migrate all scancode-toolkit documentation from aboutcode till this commit - aboutcode-org/aboutcode@faea9fc
Last pull request adding scancode docs was aboutcode-org/aboutcode#33
Signed-off-by: Ayan Sinha Mahapatra <[email protected]>
1 parent cd59436 commit 39e2a03

113 files changed, +7791 -0 lines changed

Lines changed: 239 additions & 0 deletions
@@ -0,0 +1,239 @@
`Basic` Options
===============

.. include:: /scancode-toolkit/rst_snippets/basic_options.rst

----

.. include:: /scancode-toolkit/rst_snippets/note_snippets/synopsis_install_quickstart.rst

----

``--generated`` Options
-----------------------

The ``--generated`` option classifies automatically generated code files with a flag.

An example of using ``--generated`` in a scan::

    scancode -clpieu --json-pp output.json samples --generated

In the results, the following attribute is added for each file, with its corresponding
``true``/``false`` value ::

    "is_generated": true

In the samples folder, the following files have a ``true`` value for their ``is_generated``
attribute::

    "samples/zlib/dotzlib/LICENSE_1_0.txt"
    "samples/JGroups/licenses/apache-2.0.txt"

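As a minimal sketch of how this flag might be consumed after a scan, the Python snippet below
collects the paths of files flagged as generated. It assumes the JSON output has a top-level
``files`` list whose entries carry ``path`` and ``is_generated`` keys; the helper name
``generated_paths`` and the inline sample data are illustrative and are not real scan output.

```python
import json


def generated_paths(scan_results):
    """Return the paths of all files whose is_generated flag is true."""
    return [
        entry["path"]
        for entry in scan_results.get("files", [])
        if entry.get("is_generated")
    ]


# Illustrative stand-in for the contents of output.json (assumed layout).
sample = json.loads("""
{
  "files": [
    {"path": "samples/zlib/dotzlib/LICENSE_1_0.txt", "is_generated": true},
    {"path": "samples/README", "is_generated": false}
  ]
}
""")

print(generated_paths(sample))
```

For a real scan, the same function could be applied to the result of ``json.load`` on the
``output.json`` file, provided the layout matches the assumption above.
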
..
    [ToDo] Research and Write Better

----

``--max-email`` Options
-----------------------

.. admonition:: Dependency

    The option ``--max-email`` is a sub-option of and requires the option ``--email``.

If individual scanned files contain many email addresses (e.g. long lists of addresses) that are
unnecessary and clutter the scan results, the ``--max-email`` option can be used to cap the
number of emails reported per file.

Some important INTEGER values of the ``--max-email INTEGER`` option:

- 0 - No limit, include all emails.
- 50 - Default.

An example usage::

    scancode -clpieu --json-pp output.json samples --max-email 5

This reports only 5 email addresses per file and ignores the rest.
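
The limit semantics described above can be sketched in a few lines of Python. This is only an
illustration and not ScanCode's implementation: the function name ``cap_results`` is
hypothetical, and keeping the *first* ``limit`` entries is an assumption, since the text only
says that the rest are ignored.

```python
def cap_results(items, limit=50):
    """Keep at most `limit` items; a limit of 0 means no limit."""
    if limit < 0:
        raise ValueError("limit must be 0 or a positive integer")
    items = list(items)
    # 0 is the special "report everything" value; otherwise truncate.
    return items if limit == 0 else items[:limit]


# Hypothetical addresses found in a single file.
emails = ["user{}@example.com".format(i) for i in range(8)]

print(len(cap_results(emails, limit=5)))  # capped at 5
print(len(cap_results(emails, limit=0)))  # no limit, all 8 kept
```
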

----

``--max-url`` Options
---------------------

.. admonition:: Dependency

    The option ``--max-url`` is a sub-option of and requires the option ``--url``.

If individual scanned files contain many links to other websites (e.g. long URL lists) that are
unnecessary and clutter the scan results, the ``--max-url`` option can be used to cap the number
of URLs reported per file.

Some important INTEGER values of the ``--max-url INTEGER`` option:

- 0 - No limit, include all URLs.
- 50 - Default.

An example usage::

    scancode -clpieu --json-pp output.json samples --max-url 10

This reports only 10 URLs per file and ignores the rest.

----

``--license-score`` Options
---------------------------

.. admonition:: Dependency

    The option ``--license-score`` is a sub-option of and requires the option ``--license``.

..
    [ToDo] Research and Write License Matching Better

License matching strictness, i.e. how closely a license must match to be reported in a scan, can
be adjusted with the ``--license-score`` option.

Some important INTEGER values of the ``--license-score INTEGER`` option:

- **0** - Default and lowest value; all matches are reported.
- **100** - Highest value; only licenses with a much better match are reported.

Here, a bigger number means a better match, i.e. setting a higher license score translates to a
higher threshold for matching licenses (and therefore an equal or smaller number of license matches).
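
The thresholding idea can be illustrated with a small Python sketch. It is not ScanCode's
matching code: the ``score`` and ``key`` fields, the sample values, and the use of a
greater-than-or-equal comparison are all assumptions made for the example.

```python
def filter_by_score(matches, min_score=0):
    """Keep only matches whose score is at least min_score (0 keeps all)."""
    return [m for m in matches if m["score"] >= min_score]


# Hypothetical license matches for one file.
matches = [
    {"key": "zlib", "score": 100.0},
    {"key": "mit", "score": 72.5},
    {"key": "bsd-new", "score": 40.0},
]

for threshold in (0, 70, 100):
    kept = [m["key"] for m in filter_by_score(matches, threshold)]
    print(threshold, kept)
```

Raising the threshold can only shrink the list of reported matches, which is the behaviour the
paragraph above describes.
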

An example usage::

    scancode -clpieu --json-pp output.json samples --license-score 70

Here are the license results with the integer value set to 100, vs. the default value 0. This is
visualized using ScanCode Workbench in the License Info Dashboard.

.. list-table:: License scan results of Samples Directory.

    * - .. figure:: data/core_lic_score_0.png

            License Score 0 (Default).

      - .. figure:: data/core_lic_score_100.png

            License Score 100.

----


``--license-text`` Options
--------------------------

.. admonition:: Dependency

    The option ``--license-text`` is a sub-option of and requires the option ``--license``.

.. admonition:: Sub-Option

    The options ``--license-text-diagnostics`` and ``--is-license-text`` are sub-options of
    ``--license-text``. ``--is-license-text`` is a Post-Scan Option.

With the ``--license-text`` option, the ``matched_text`` attribute in the scan results includes
the matched text for the detected license.

An example scan::

    scancode -cplieu --json-pp output.json samples --license-text

An example matched text included in the results is as follows::

    "matched_text":
      " This software is provided 'as-is', without any express or implied
      warranty. In no event will the authors be held liable for any damages
      arising from the use of this software.
      Permission is granted to anyone to use this software for any purpose,
      including commercial applications, and to alter it and redistribute it
      freely, subject to the following restrictions:
      1. The origin of this software must not be misrepresented; you must not
      claim that you wrote the original software. If you use this software
      in a product, an acknowledgment in the product documentation would be
      appreciated but is not required.
      2. Altered source versions must be plainly marked as such, and must not be
      misrepresented as being the original software.
      3. This notice may not be removed or altered from any source distribution.

      Jean-loup Gailly Mark Adler


- The file in which this license was detected: ``samples/arch/zlib.tar.gz-extract/zlib-1.2.8/zlib.h``
- License name: "ZLIB License"

----


``--license-url-template`` Options
----------------------------------

.. admonition:: Dependency

    The option ``--license-url-template`` is a sub-option of and requires the option
    ``--license``.

The ``--license-url-template`` option sets the template URL used for the license reference URLs.

The default template URL is ``https://enterprise.dejacode.com/urn/urn:dje:license:{}``.
In a template URL, the curly braces (``{}``) are replaced by the license key.

So, by default the license reference URL points to the DejaCode page for that license.

A scan example using the ``--license-url-template TEXT`` option ::

    scancode -clpieu --json-pp output.json samples --license-url-template https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/{}.yml

In a normal scan, the reference URL for "ZLIB License" is as follows::

    "reference_url": "https://enterprise.dejacode.com/urn/urn:dje:license:zlib",

After using the option in the following manner::

    --license-url-template https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/{}.yml

the reference URL changes to this `zlib.yml file <https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/data/licenses/zlib.yml>`_::

    "reference_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/zlib.yml",

The reference URL changes for all detected licenses in the scan, across the scan result file.
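
The substitution of ``{}`` by the license key behaves like Python's ``str.format`` with a single
positional placeholder. The sketch below reproduces the two URLs shown above; the function name
``reference_url`` is illustrative, and whether ScanCode performs the substitution exactly this
way internally is an assumption.

```python
DEFAULT_TEMPLATE = "https://enterprise.dejacode.com/urn/urn:dje:license:{}"
GITHUB_TEMPLATE = (
    "https://github.com/nexB/scancode-toolkit/tree/develop/"
    "src/licensedcode/data/licenses/{}.yml"
)


def reference_url(license_key, template=DEFAULT_TEMPLATE):
    """Fill the {} placeholder of a template URL with the license key."""
    return template.format(license_key)


print(reference_url("zlib"))
print(reference_url("zlib", GITHUB_TEMPLATE))
```
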

----

``--license-text-diagnostics`` Options
--------------------------------------

.. admonition:: Dependency

    The option ``--license-text-diagnostics`` is a sub-option of and requires the options
    ``--license`` and ``--license-text``.

With this option, the matched license text includes diagnostic highlights: words that are not
matched are surrounded with square brackets ``[]``.

In a normal scan, whole lines of text are included in the matched license text, including parts
that are possibly unmatched.

An example scan::

    scancode -cplieu --json-pp output.json samples --license-text --license-text-diagnostics

Running a scan on the samples directory with the ``--license-text --license-text-diagnostics``
options causes the following difference in the scan result of the file
``samples/JGroups/licenses/bouncycastle.txt``.

Without Diagnostics::

    "matched_text":
      "License Copyright (c) 2000 - 2006 The Legion Of The Bouncy Castle
      (http://www.bouncycastle.org) Permission is hereby granted, free of charge, to any person
      obtaining a copy of this software and associated documentation files (the \"Software\"),
      to deal in the Software without restriction

With Diagnostics on::

    "matched_text":
      "License [Copyright] ([c]) [2000] - [2006] [The] [Legion] [Of] [The] [Bouncy] [Castle]
      ([http]://[www].[bouncycastle].[org]) Permission is hereby granted, free of charge, to any person
      obtaining a copy of this software and associated documentation files (the \"Software\"),
      to deal in the Software without restriction,
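
One way to work with the diagnostic output is to pull the bracketed (unmatched) words back out
of ``matched_text``. The following Python sketch does this with a regular expression; the
helper name ``unmatched_words`` is hypothetical, and the approach assumes the license text
itself contains no literal square brackets, which would otherwise show up as false positives.

```python
import re


def unmatched_words(matched_text):
    """Return the words wrapped in [] by the diagnostics option, in order."""
    return re.findall(r"\[([^\[\]]+)\]", matched_text)


# Excerpt adapted from the diagnostics example above.
text = (
    "License [Copyright] ([c]) [2000] - [2006] [The] [Legion] "
    "([http]://[www].[bouncycastle].[org]) Permission is hereby granted"
)

print(unmatched_words(text))
```
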
Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
`Core` Options
==============

.. _cli_core:

.. include:: /scancode-toolkit/rst_snippets/core_options.rst

----

.. include:: /scancode-toolkit/rst_snippets/note_snippets/synopsis_install_quickstart.rst

----

Comparing Progress Message Options
----------------------------------

**Default Progress Message**::

    Scanning files for: infos, licenses, copyrights, packages, emails, urls with 1 process(es)...
    Building license detection index...Done.
    Scanning files...
    [####################] 43
    Scanning done.
    Scan statistics: 43 files scanned in 33s.
    Scan options: infos, licenses, copyrights, packages, emails, urls with 1 process(es).
    Scanning speed: 1.4 files per sec.
    Scanning time: 30s.
    Indexing time: 2s.
    Saving results.

**Progress Message with** ``--verbose``::

    Scanning files for: infos, licenses, copyrights, packages, emails, urls with 1 process(es)...
    Building license detection index...Done.
    Scanning files...
    Scanned: screenshot.png
    Scanned: README
    ...
    Scanned: zlib/dotzlib/ChecksumImpl.cs
    Scanned: zlib/dotzlib/readme.txt
    Scanned: zlib/gcc_gvmat64/gvmat64.S
    Scanned: zlib/ada/zlib.ads
    Scanned: zlib/infback9/infback9.c
    Scanned: zlib/infback9/infback9.h
    Scanned: arch/zlib.tar.gz
    Scanning done.
    Scan statistics: 43 files scanned in 29s.
    Scan options: infos, licenses, copyrights, packages, emails, urls with 1 process(es).
    Scanning speed: 1.58 files per sec.
    Scanning time: 27s.
    Indexing time: 2s.
    Saving results.

So, with ``--verbose`` enabled, progress messages for individual files are shown.

**With the** ``--quiet`` **option enabled**, nothing is printed on the command line.


----

``--timeout`` Option
--------------------

This option sets the scan timeout for **each file** (and not the entire scan). If the scan of a
file exceeds the specified timeout, scanning of that file is stopped and scanning of the next
file starts. This helps avoid spending too long on very large files, and saves time.

The timeout value (in seconds) that follows this option can also be a
floating point number, e.g. 1.5467.

----

``--reindex-licenses`` Option
-----------------------------

ScanCode maintains a license index to search for and detect licenses. When ScanCode is
configured for the first time, a license index is built and used in every scan thereafter.

The ``--reindex-licenses`` option rebuilds the license index. Running a scan with this option
displays the following message in the terminal in addition to what it normally shows::

    Checking and rebuilding the license index...

..
    [ToDo] Research and Write Better

----

``--from-json`` Option
----------------------

If you want to load scan results from a .json file and process those same files again
with some other options or output format, you can do so using the ``--from-json`` option.

An example scan command using ``--from-json``::

    scancode --from-json sample.json --json-pp sample_2.json --classify

This loads the scan results from ``sample.json``, runs the post-scan plugin ``--classify`` and
outputs the results for this scan to ``sample_2.json``.

----

``--max-in-memory`` Option
--------------------------

During a scan, as individual files are scanned, the scan details for those files are kept in
memory until the scan is completed. After the scan is completed, they are written in the
specified output format.

If the scan involves a very large number of files, the results might not fit in memory during
the scan. For this reason, disk-caching can be used for some or all of the files.

Some important INTEGER values of the ``--max-in-memory INTEGER`` option:

- **0** - Unlimited memory, store all the file/directory scan results in memory
- **-1** - Use only disk-caching, store all the file/directory scan results on disk
- **10000** - Default, store 10,000 file/directory scan results in memory and the rest on disk

An example usage::

    scancode -clieu --json-pp sample.json samples --max-in-memory -1
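
The three cases listed above can be summarised in a short Python sketch. It only models the
decision described in the text and is not ScanCode's caching code; the function name
``storage_for`` and the idea of counting results with a 0-based index are assumptions made for
the example.

```python
def storage_for(index, max_in_memory=10000):
    """Decide where the index-th (0-based) scan result would be kept."""
    if max_in_memory == 0:
        return "memory"  # 0: unlimited memory, nothing goes to disk
    if max_in_memory == -1:
        return "disk"    # -1: disk-caching only
    # Otherwise the first max_in_memory results stay in memory, the rest on disk.
    return "memory" if index < max_in_memory else "disk"


print(storage_for(9999))    # last result that still fits with the default
print(storage_for(10000))   # first result that spills to disk
print(storage_for(5, -1))   # everything on disk
```
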