
Commit 6ca13f3

Merge branch 'release-1.2.0'
2 parents f7ca4ba + 6c3cc03 commit 6ca13f3

22 files changed: 973 additions, 279 deletions

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -1,6 +1,8 @@
 *.pyc
+/*.egg-info
 /MANIFEST
 /build
 /dist
+/docs/_build
 /env
 /env3

CHANGELOG

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 1.2.0
+- Multiprocessing (by loisaidasam)
+- Python 3 support
 - Split up one big file into smaller more logical sub-modules
+- Fixed https://github.com/exhuma/python-cluster/issues/11
+- Documentation update.
 
 1.1.1b3
 - Fixed bug #1727558

INSTALL

Lines changed: 12 additions & 29 deletions
@@ -1,44 +1,27 @@
 INSTALLATION
 ============
 
-Linux
------
+Simply run::
 
-RPM-Installation
-~~~~~~~~~~~~~~~~
+    pip install cluster
 
-I'm not familiar with RPM-distributions but as far as I know it should be
-something like::
+Or, if you run it in a virtualenv:
 
-    rpm -i <filename.rpm>
+    /path/to/your/env/bin/pip install cluster
 
-RPM-source Installation
-~~~~~~~~~~~~~~~~~~~~~~~
 
-This is something I don't know. If somebody can enlighten me, please do!
+Source installation
+~~~~~~~~~~~~~~~~~~~
 
-Binary/Source installation
-~~~~~~~~~~~~~~~~~~~~~~~~~~
+Untar the archive::
 
-Untar the package with you favourite archive tool. On the console it will be
-something along the lines::
-
-    tar xzf <filename.tar.gz>
+    tar xf <filename.tar.gz>
 
 Next, go to the folder just created. It will have the same name as the package
-(for example "cluster-1.0.0b1") and run::
-
-    python setup.py install
-
-For this step you need root-priviledges
-
-Windows
--------
+(for example "cluster-1.2.0") and run::
 
-Execute the executable file and follow the instructions displayed. Default
-values will be fine in most cases.
+    python setup.py install
 
-MacOS-X
--------
+This will require superuser privileges unless you install it in a virtual environment::
 
-Simply follow the same instructions as with the Linux-Source installation.
+    /path/to/your/env/bin/python setup.py install

README renamed to README.rst

Lines changed: 11 additions & 7 deletions
@@ -9,13 +9,17 @@ between two of those objects. For simple datatypes, like integers, this can be
 as simple as a subtraction, but more complex calculations are possible. Right
 now, it is possible to generate the clusters using a hierarchical clustering
 and the popular K-Means algorithm. For the hierarchical algorithm there are
-different "linkage" (single, complete, average and uclus) methods available. I
-plan to implement other algoithms as well on an
-"as-needed" or "as-I-have-time" basis.
+different "linkage" (single, complete, average and uclus) methods available.
 
 Algorithms are based on the document found at
 http://www.elet.polimi.it/upload/matteucc/Clustering/tutorial_html/
 
+.. note::
+    The above site is no longer avaialble, but you can still view it in the
+    internet archive at:
+    https://web.archive.org/web/20070912040206/http://home.dei.polimi.it//matteucc/Clustering/tutorial_html/
+
+
 USAGE
 =====
 
@@ -33,10 +37,10 @@ Note, that when you retrieve a set of clusters, it immediately starts the
 clustering process, which is quite complex. If you intend to create clusters
 from a large dataset, consider doing that in a separate thread.
 
-For K-Means clustering it would look like this:
+For K-Means clustering it would look like this::
 
->>> from cluster import KMeansClustering
->>> cl = KMeansClustering([(1,1), (2,1), (5,3), ...])
->>> clusters = cl.getclusters(2)
+    >>> from cluster import KMeansClustering
+    >>> cl = KMeansClustering([(1,1), (2,1), (5,3), ...])
+    >>> clusters = cl.getclusters(2)
 
 The parameter passed to getclusters is the count of clusters generated.
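
The README's advice to run the clustering in a separate thread could look roughly like the sketch below. It is an illustration only and not part of this commit: `slow_cluster` and `cluster_in_background` are hypothetical names, and `slow_cluster` merely stands in for an expensive call such as `cl.getclusters(2)` (it does not implement K-Means). The sketch assumes Python 3, where `concurrent.futures` ships with the standard library.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_cluster(points, k):
    """Hypothetical stand-in for an expensive call such as cl.getclusters(k).

    It only splits the sorted points into contiguous chunks; it is NOT
    the K-Means algorithm.
    """
    ordered = sorted(points)
    size = -(-len(ordered) // k)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def cluster_in_background(points, k):
    """Submit the clustering job to a worker thread and return a Future."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(slow_cluster, points, k)
    executor.shutdown(wait=False)  # the already submitted job still runs
    return future


future = cluster_in_background([(1, 1), (2, 1), (5, 3), (6, 3)], 2)
# ... the calling thread is free to do other work here ...
clusters = future.result()  # blocks only when the result is needed
```

In CPython a worker thread keeps the caller responsive but does not make pure-Python computation faster, which is presumably why this release also lists multiprocessing in its CHANGELOG.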

cluster/cluster.py

Lines changed: 34 additions & 37 deletions
@@ -15,6 +15,8 @@
 # Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 #
 
+from __future__ import print_function
+
 
 class Cluster(object):
     """
@@ -34,16 +36,13 @@ def __init__(self, level, *args):
         """
         Constructor
 
-        PARAMETERS
-            level - The level of this cluster. This is used in hierarchical
-                    clustering to retrieve a specific set of clusters. The
-                    higher the level, the smaller the count of clusters
-                    returned. The level depends on the difference function
-                    used.
-            *args - every additional argument passed following the level value
-                    will get added as item to the cluster. You could also pass
-                    a list as second parameter to initialise the cluster with
-                    that list as content
+        :param level: The level of this cluster. This is used in hierarchical
+            clustering to retrieve a specific set of clusters. The higher the
+            level, the smaller the count of clusters returned. The level depends
+            on the difference function used.
+        :param *args: every additional argument passed following the level value
+            will get added as item to the cluster. You could also pass a list as
+            second parameter to initialise the cluster with that list as content
         """
         self.__level = level
         if len(args) == 0:
@@ -55,18 +54,16 @@ def append(self, item):
         """
         Appends a new item to the cluster
 
-        PARAMETERS
-            item - The item that is to be appended
+        :param item: The item that is to be appended.
         """
         self.__items.append(item)
 
     def items(self, new_items=None):
         """
         Sets or gets the items of the cluster
 
-        PARAMETERS
-            new_items (optional) - if set, the items of the cluster will be
-                                   replaced with that argument.
+        :param new_items: if set, the items of the cluster will be replaced with
+            that argument.
         """
         if new_items is None:
             return self.__items
@@ -80,8 +77,7 @@ def fullyflatten(self, *args):
         some items of the cluster are clusters in their own right and you only
         want the items.
 
-        PARAMETERS
-            *args - only used for recursion.
+        :param *args: only used for recursion.
         """
         flattened_items = []
         if len(args) == 0:
@@ -99,39 +95,41 @@ def fullyflatten(self, *args):
 
     def level(self):
         """
-        Returns the level associated with this cluster
+        Returns the level associated with this cluster.
         """
         return self.__level
 
     def display(self, depth=0):
         """
-        Pretty-prints this cluster. Useful for debuging
+        Pretty-prints this cluster. Useful for debuging.
         """
-        print depth * " " + "[level %s]" % self.__level
+        print(depth * " " + "[level %s]" % self.__level)
         for item in self.__items:
             if isinstance(item, Cluster):
                 item.display(depth + 1)
             else:
-                print depth * " " + "%s" % item
+                print(depth * " " + "%s" % item)
 
     def topology(self):
         """
         Returns the structure (topology) of the cluster as tuples.
 
-        Output from cl.data:
-
-        <[email protected](['34.xls',
-
-        <[email protected](['ChangeLog', 'ChangeLog.txt'])>])>,
-        <[email protected](['20060730.py',
-        <[email protected](['.cvsignore',
-        <[email protected](['About.py', <[email protected](['.idlerc',
-        '.pylint.d'])>])>])>])>])>])>])>]
+        Output from cl.data::
+
+
+            <[email protected](['34.xls',
+
+            <[email protected](['ChangeLog', 'ChangeLog.txt'])>])>,
+            <[email protected](['20060730.py',
+            <[email protected](['.cvsignore',
+            <[email protected](['About.py', <[email protected](['.idlerc',
+            '.pylint.d'])>])>])>])>])>])>])>]
+
+        Corresponding output from cl.topo()::
 
-        Corresponding output from cl.topo():
-        ('CVS', ('34.xls', (('0.txt', ('ChangeLog', 'ChangeLog.txt')),
-        ('20060730.py', ('.cvsignore', ('About.py',
-        ('.idlerc', '.pylint.d')))))))
+            ('CVS', ('34.xls', (('0.txt', ('ChangeLog', 'ChangeLog.txt')),
+            ('20060730.py', ('.cvsignore', ('About.py',
+            ('.idlerc', '.pylint.d')))))))
         """
 
         left = self.__items[0]
@@ -157,10 +155,9 @@ def getlevel(self, threshold):
         receive and the higher you set it, you will receive less but bigger
         clusters.
 
-        PARAMETERS
-            threshold - The level threshold
+        :param threshold: The level threshold:
 
-        NOTE
+        .. note::
             It is debatable whether the value passed into this method should
             really be as strongly linked to the real cluster-levels as it is
             right now. The end-user will not know the range of this value
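
The `topology` and `fullyflatten` docstrings in the hunks above describe two ways of walking a nested cluster. The sketch below illustrates both ideas on a simplified stand-in and is not the code from this commit: `Node` is a hypothetical class, it assumes every node holds exactly two items, and the level values are arbitrary.

```python
class Node(object):
    """Hypothetical stand-in for Cluster: a level plus exactly two items."""

    def __init__(self, level, left, right):
        self.level = level
        self.items = [left, right]

    def topology(self):
        """Return the nesting as tuples, dropping the level values."""
        return tuple(
            item.topology() if isinstance(item, Node) else item
            for item in self.items
        )

    def fullyflatten(self):
        """Return all leaf items in one flat list, left to right."""
        flat = []
        for item in self.items:
            if isinstance(item, Node):
                flat.extend(item.fullyflatten())
            else:
                flat.append(item)
        return flat


# Level values below are arbitrary and chosen only for illustration.
inner = Node(0.18, 'ChangeLog', 'ChangeLog.txt')
root = Node(0.55, '0.txt', inner)

topo = root.topology()      # ('0.txt', ('ChangeLog', 'ChangeLog.txt'))
flat = root.fullyflatten()  # ['0.txt', 'ChangeLog', 'ChangeLog.txt']
```

The tuple form keeps the shape of the hierarchy while the flat list keeps only the leaves, which matches the distinction the two docstrings draw.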
