Commit 60d4a20

Documentation converted to .md
1 parent 9cbfdef commit 60d4a20

5 files changed, 92 additions and 150 deletions

doc/conf.py

Lines changed: 1 addition & 1 deletion
@@ -144,7 +144,7 @@
 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
 # so a file named "default.css" will overwrite the builtin "default.css".
-html_static_path = ['_static']
+# html_static_path = ['_static']
 
 # Add any extra paths that contain custom files (such as robots.txt or
 # .htaccess) here, relative to this directory. These files are copied

doc/widgets/associationrules.md

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+Association Rules
+=================
+
+Induction of association rules.
+
+**Inputs**
+
+- Data: Data set
+
+**Outputs**
+
+- Matching Data: Data instances matching the criteria.
+
+This widget implements the FP-growth [frequent pattern mining](https://en.wikipedia.org/wiki/Association_rule_learning) algorithm [1] with a bucketing optimization [2] for conditional databases of few items. For inducing classification rules, it generates rules for the entire itemset and skips the rules where the consequent does not match one of the class values.
+
+![](images/association-rules-stamped.png)
+
+1. Information on the data set.
+2. In *Find association rules* you can set criteria for rule induction:
+   - **Minimal support**: percentage of the entire data set covered by the entire rule (antecedent and consequent).
+   - **Minimal confidence**: proportion of the examples that fit the right side (consequent) among those that fit the left side (antecedent).
+   - **Max. number of rules**: limits the number of rules the algorithm generates. Too many rules can slow down the widget considerably.
+   If *Induce classification (itemset → class) rules* is ticked, the widget will only generate rules that have a class value on the right-hand side (consequent) of the rule.
+   If *Auto find rules* is on, the widget will run the search at every change of parameters. This might be slow for data sets with many attributes, so it is a good idea to press *Find rules* only once the parameters are set.
+3. *Filter rules* by
+
+   - **Antecedent**:
+      - *Contains*: filters rules by matching space-separated [regular expressions](https://en.wikipedia.org/wiki/Regular_expression) against antecedent items.
+      - *Min. items*: minimum number of items that have to appear in an antecedent.
+      - *Max. items*: maximum number of items that can appear in an antecedent.
+
+   - **Consequent**:
+      - *Contains*: filters rules by matching space-separated regular expressions against consequent items.
+      - *Min. items*: minimum number of items that have to appear in a consequent.
+      - *Max. items*: maximum number of items that can appear in a consequent.
+
+   If *Apply these filters in search* is ticked, the widget will limit rule generation to rules that match the filters. If unticked, all rules are generated, but only the matching ones are shown.
+
+4. If *Auto send selection* is on, data instances that match the selected association rules are output automatically. Alternatively, press *Send selection*.
+
+Example
+-------
+
+Association Rules can be used directly with the [File](../data/file.md) widget.
+
+![](images/association-rules-example1.png)
+
+References and further reading
+------------------------------
+
+[1]: J. Han, J. Pei, Y. Yin, R. Mao. (2004) [Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach](https://www.cs.sfu.ca/~jpei/publications/dami03_fpgrowth.pdf).
+
+[2]: R. Agarwal, C. Aggarwal, V. Prasad. (2000) [Depth first generation of long patterns](http://www.cs.tau.ac.il/~fiat/dmsem03/Depth%20First%20Generation%20of%20Long%20Patterns%20-%202000.pdf).
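
The *Minimal support* and *Minimal confidence* criteria described in this file can be illustrated with a short sketch. This is not the widget's code: the function names and the toy transactions are hypothetical, and the sketch simply counts matching rows instead of running FP-growth.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by the support of its antecedent."""
    whole_rule = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, whole_rule) / support(transactions, antecedent)


# Hypothetical toy data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Rule: bread -> milk
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
```

In this toy data the rule bread → milk covers two of the four transactions, and two of the three transactions containing bread also contain milk. Note that `confidence` divides by the antecedent's support, so it assumes the antecedent occurs at least once.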

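The *Filter rules* options (space-separated regular expressions plus minimum and maximum item counts for each side of a rule) could be sketched as follows. The helper `side_matches`, the item strings, and the choice that one matching pattern is enough are all assumptions made for illustration; the widget's exact matching semantics are not stated in this commit.

```python
import re


def side_matches(items, patterns="", min_items=1, max_items=1000):
    """Check one side of a rule (antecedent or consequent) against a filter.

    Assumption: the side passes if its size is within bounds and at least
    one of the space-separated patterns matches at least one item.
    """
    if not min_items <= len(items) <= max_items:
        return False
    regexes = [re.compile(p) for p in patterns.split()]
    if not regexes:
        return True  # an empty filter accepts everything
    return any(rx.search(item) for rx in regexes for item in items)


# Hypothetical rules as (antecedent, consequent) tuples of item strings.
rules = [
    (("age=adult", "sex=male"), ("survived=no",)),
    (("status=first",), ("survived=yes",)),
    (("status=crew", "sex=male", "age=adult"), ("survived=no",)),
]

kept = [
    (antecedent, consequent)
    for antecedent, consequent in rules
    if side_matches(antecedent, "^sex= ^status=first", max_items=2)
    and side_matches(consequent, "=no$")
]
```

Here only the first rule survives: the second fails the consequent pattern and the third has too many antecedent items.
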
doc/widgets/associationrules.rst

Lines changed: 0 additions & 85 deletions
This file was deleted.

doc/widgets/frequentitemsets.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+Frequent Itemsets
+=================
+
+Finds frequent itemsets in the data.
+
+**Inputs**
+
+- Data: Data set
+
+**Outputs**
+
+- Matching Data: Data instances matching the criteria.
+
+The widget finds [frequent itemsets](https://en.wikipedia.org/wiki/Association_rule_learning) in a data set based on a measure of
+support for the itemset.
+
+![](images/frequent-itemsets-stamped.png)
+
+1. Information on the data set. 'Expand all' expands the frequent itemsets tree, while 'Collapse all' collapses it.
+2. In *Find itemsets by* you can set criteria for itemset search:
+   - **Minimal support**: a minimal ratio of data instances that must support (contain) the itemset for it to be generated. For large data sets it is normal to set a lower minimal support (e.g. between 0.01% and 2%).
+   - **Max. number of itemsets**: sets an upper limit on the number of generated itemsets. Itemsets are generated in no particular order.
+   If *Auto find itemsets* is on, the widget will run the search at every change of parameters. This might be slow for large data sets, so it is a good idea to press *Find itemsets* only once the parameters are set.
+3. *Filter itemsets*:
+   If you're looking for a specific item or itemsets, filter the results by [regular expressions](https://en.wikipedia.org/wiki/Regular_expression). Separate regular expressions with commas to filter by more than one word.
+   - **Contains**: filters itemsets by regular expressions.
+   - **Min. items**: minimum number of items that have to appear in an itemset. If 1, all the itemsets will be displayed. Increasing it to, say, 4, will only display itemsets with four or more items.
+   - **Max. items**: maximum number of items that can appear in an itemset. If you wish to find, say, only itemsets with at most 5 items, set this parameter to 5.
+   If *Apply these filters in search* is ticked, the widget will filter the results in real time. Preferably leave it unticked for large data sets.
+4. If *Auto send selection* is on, changes are communicated automatically.
+   Alternatively, press *Send selection*.
+
+Example
+-------
+
+Frequent Itemsets can be used directly with the [File](../data/file.md) widget.
+
+![](images/frequent-itemsets-example1.png)
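
To make the *Minimal support* and *Max. number of itemsets* settings concrete, here is a minimal brute-force sketch. It deliberately does not implement the algorithm the widget uses; it enumerates candidate itemsets level by level and keeps those that reach the threshold. The function name, parameters, and toy data are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_itemsets=10000):
    """Return {itemset: support} for itemsets with support >= min_support.

    Brute-force illustration only; stops early once `max_itemsets` is reached.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, len(items) + 1):
        found_at_size = False
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                found[frozenset(candidate)] = count / n
                found_at_size = True
                if len(found) >= max_itemsets:
                    return found
        if not found_at_size:
            # No frequent itemset of this size means none of a larger size,
            # because adding items can never increase support.
            break
    return found


# Hypothetical toy data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

itemsets = frequent_itemsets(transactions, min_support=0.5)
```

With a minimal support of 50%, the three single items and the pairs {bread, butter} and {bread, milk} qualify, while {butter, milk} appears in only one of four transactions and is dropped. Lowering `min_support` grows the result quickly, which is why a cap such as `max_itemsets` is useful.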

doc/widgets/frequentitemsets.rst

Lines changed: 0 additions & 64 deletions
This file was deleted.
