Commit 7d4a92b

Merge pull request #24 from DataboyUsen/main
MF formal pull request 1
2 parents 0859759 + d8ec4af commit 7d4a92b

File tree

9 files changed: +1497 −4 lines changed


doc/source/example.rst

Lines changed: 3 additions & 1 deletion
@@ -16,6 +16,7 @@ Example Gallery
     examples/Path_solution.ipynb
     examples/Warm_start.ipynb
     examples/Sklearn_Mixin.ipynb
+    examples/MF.ipynb
 
 List of Examples
 ----------------
@@ -30,4 +31,5 @@ List of Examples
     examples/RankRegression.ipynb
     examples/Path_solution.ipynb
     examples/Warm_start.ipynb
-    examples/Sklearn_Mixin.ipynb
+    examples/Sklearn_Mixin.ipynb
+    examples/MF.ipynb

doc/source/examples/MF.ipynb

Lines changed: 413 additions & 0 deletions
Large diffs are not rendered by default.

doc/source/tutorials.rst

Lines changed: 1 addition & 0 deletions
@@ -63,4 +63,5 @@ List of Tutorials
     ./tutorials/constraint
     ./tutorials/ReHLine_sklearn
     ./tutorials/warmstart
+    ./tutorials/ReHLine_MF
 

doc/source/tutorials/ReHLine_MF.rst

Lines changed: 294 additions & 1 deletion
ReHLine: Matrix Factorization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This tutorial illustrates how to conduct Matrix Factorization (MF) with multiple PLQ loss functions through ReHLine.
We provide two versions of the prediction rule:

.. math::

    \begin{aligned}
    &\text{Including bias terms:} && \hat{r}_{ui} = \mathbf{p}_u^T \mathbf{q}_i + \alpha_u + \beta_i \\
    &\text{Excluding bias terms:} && \hat{r}_{ui} = \mathbf{p}_u^T \mathbf{q}_i \\
    \end{aligned}

Mathematical Formulation
------------------------

Given a User-Item-Rating triplet dataset :math:`(u, i, r_{ui})` derived from the target sparse matrix, the corresponding optimization problem is:

.. math::

    \min_{\substack{
    \mathbf{P} \in \mathbb{R}^{n \times r},\,
    \pmb{\alpha} \in \mathbb{R}^n \\
    \mathbf{Q} \in \mathbb{R}^{m \times r},\,
    \pmb{\beta} \in \mathbb{R}^m
    }}
    \left[
    \sum_{(u,i)\in \Omega} C \cdot \text{PLQ}(r_{ui}, \ \mathbf{p}_u^T \mathbf{q}_i + \alpha_u + \beta_i)
    \right]
    +
    \left[
    \frac{\rho}{n}\sum_{u=1}^n(\|\mathbf{p}_u\|_2^2 + \alpha_u^2)
    + \frac{1-\rho}{m}\sum_{i=1}^m(\|\mathbf{q}_i\|_2^2 + \beta_i^2)
    \right]

.. math::

    \text{ s.t. } \
    \mathbf{A} \begin{bmatrix}
    \pmb{\alpha} & \mathbf{P}
    \end{bmatrix}^T +
    \mathbf{b}\mathbf{1}_{n}^T \geq \mathbf{0}
    \ \text{ and } \
    \mathbf{A} \begin{bmatrix}
    \pmb{\beta} & \mathbf{Q}
    \end{bmatrix}^T +
    \mathbf{b}\mathbf{1}_{m}^T \geq \mathbf{0}

where

- :math:`\text{PLQ}(\cdot , \cdot)` is a convex piecewise linear-quadratic loss function. You can find built-in loss functions in the `Loss <./loss.rst>`_ section.

- :math:`\mathbf{A}` is a :math:`K \times r` matrix and :math:`\mathbf{b}` is a :math:`K`-dimensional vector representing :math:`K` linear constraints. See `Constraints <./constraint.rst>`_ for more details.

- :math:`\Omega` is the collection of user-item pairs observed in the training data.

- :math:`n` is the number of users and :math:`m` is the number of items.

- :math:`r` is the dimension of the latent factors (the rank of the MF).

- :math:`C` is the regularization parameter, and :math:`\rho` balances the regularization strength between users and items.

- :math:`\mathbf{p}_u` and :math:`\alpha_u` are the latent vector and individual bias of the u-th user; specifically, :math:`\mathbf{p}_u` is the u-th row of :math:`\mathbf{P}`, and :math:`\alpha_u` is the u-th element of :math:`\pmb{\alpha}`.

- :math:`\mathbf{q}_i` and :math:`\beta_i` are the latent vector and individual bias of the i-th item; specifically, :math:`\mathbf{q}_i` is the i-th row of :math:`\mathbf{Q}`, and :math:`\beta_i` is the i-th element of :math:`\pmb{\beta}`.
Implementation Guide
--------------------

A simple synthetic dataset is used for illustration. The implementation can be easily adapted to your specific triplet data, allowing you to experiment with various loss functions.

Setup
^^^^^

To proceed, ensure that you have already installed :code:`rehline`:

.. code-block:: bash

    pip install rehline

Basic Usage
^^^^^^^^^^^

.. code-block:: python

    # 1. Necessary Packages
    import numpy as np
    from rehline import plqMF_Ridge, make_ratings
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_absolute_error


    # 2. Data Preparation
    # Generate synthetic data (replace with your own data in practice)
    user_num, item_num = 1200, 4000
    ratings = make_ratings(n_users=user_num, n_items=item_num,
                           n_interactions=50000, seed=42)

    # Split into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(
        ratings['X'], ratings['y'], test_size=0.3, random_state=42)


    # 3. Model Construction
    clf = plqMF_Ridge(
        C=0.001,              ## Regularization strength
        rank=6,               ## Latent factor dimension
        loss={'name': 'mae'}, ## Use absolute loss
        n_users=user_num,     ## Number of users
        n_items=item_num,     ## Number of items
    )
    clf.fit(X_train, y_train)


    # 4. Evaluation
    y_pred = clf.decision_function(X_test)
    mae_score = mean_absolute_error(y_test, y_pred)
    print(f"Test MAE: {mae_score:.3f}")
Advanced Configuration
^^^^^^^^^^^^^^^^^^^^^^

Choose different `loss functions <./loss.rst>`_ through :code:`loss`:

.. code-block:: python

    # Square loss
    clf_mse = plqMF_Ridge(
        C=0.001,
        rank=6,
        loss={'name': 'mse'}, ## Choose square loss
        n_users=user_num,
        n_items=item_num)

    # Hinge loss (suitable for binary data)
    clf_hinge = plqMF_Ridge(
        C=0.001,
        rank=6,
        loss={'name': 'hinge'}, ## Choose hinge loss
        n_users=user_num,
        n_items=item_num)
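Since the hinge loss targets binary data, its predictions are naturally read as scores whose sign gives the predicted class. A minimal evaluation sketch, assuming the ratings in :code:`y` are coded as -1/+1:

.. code-block:: python

    import numpy as np
    from sklearn.metrics import accuracy_score

    # Assumes y_train/y_test hold -1/+1 labels when using the hinge loss
    clf_hinge.fit(X_train, y_train)
    y_score = clf_hinge.decision_function(X_test)
    print(f"Test accuracy: {accuracy_score(y_test, np.sign(y_score)):.3f}")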
`Linear constraints <./constraint.rst>`_ can be applied via :code:`constraint`:

.. code-block:: python

    # Implement a linear constraint
    clf_nonnegative = plqMF_Ridge(
        C=0.001,
        rank=6,
        loss={'name': 'mae'},
        n_users=user_num,
        n_items=item_num,
        constraint=[{'name': '>=0'}] ## Use nonnegative constraint
    )

The algorithm includes bias terms by default. To disable them, set :code:`biased=False`:

.. code-block:: python

    # Exclude user and item biases
    clf_unbiased = plqMF_Ridge(
        C=0.001,
        rank=6,
        loss={'name': 'mae'},
        n_users=user_num,
        n_items=item_num,
        biased=False ## Disable bias terms
    )

Impose different regularization strengths on users and items through :code:`rho`:

.. code-block:: python

    # Imbalanced penalty
    clf_asymmetric = plqMF_Ridge(
        C=0.001,
        rank=6,
        loss={'name': 'mae'},
        n_users=user_num,
        n_items=item_num,
        rho=0.7 ## Add heavier penalties for user parameters
    )
Parameter Tuning
^^^^^^^^^^^^^^^^

The model complexity is mainly controlled by :code:`C` and :code:`rank`.

.. code-block:: python

    for C_value in [0.0002, 0.001, 0.005]:
        clf = plqMF_Ridge(
            C=C_value, ## Try different regularization strengths
            rank=6,
            loss={'name': 'mae'},
            n_users=user_num,
            n_items=item_num
        )
        clf.fit(X_train, y_train)
        y_pred = clf.decision_function(X_test)
        mae = mean_absolute_error(y_test, y_pred)
        print(f"C={C_value}: MAE = {mae:.3f}")


    for rank_value in [4, 8, 12]:
        clf = plqMF_Ridge(
            C=0.001,
            rank=rank_value, ## Try different latent factor dimensions
            loss={'name': 'mae'},
            n_users=user_num,
            n_items=item_num
        )
        clf.fit(X_train, y_train)
        y_pred = clf.decision_function(X_test)
        mae = mean_absolute_error(y_test, y_pred)
        print(f"rank={rank_value}: MAE = {mae:.3f}")
Convergence Tracking
^^^^^^^^^^^^^^^^^^^^

You can customize the optimization process by setting your preferred iteration counts and tolerance levels.
Training progress can be monitored either by enabling :code:`verbose` output during fitting or by examining the :code:`history` attribute after fitting.

.. code-block:: python

    clf = plqMF_Ridge(
        C=0.001,
        rank=6,
        loss={'name': 'mae'},
        n_users=user_num,
        n_items=item_num,
        max_iter_CD=15, ## Outer CD iterations
        tol_CD=1e-5,    ## Outer CD tolerance
        max_iter=8000,  ## ReHLine solver iterations
        tol=1e-2,       ## ReHLine solver tolerance
        verbose=1,      ## Enable progress output
    )
    clf.fit(X_train, y_train)

    print(clf.history) ## Check the training trace of cumulative loss and objective value
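To visualize the trace instead of printing it, a minimal sketch along these lines works, assuming :code:`clf.history` behaves like a sequence of per-iteration objective values (the exact structure of the attribute is not documented in this tutorial):

.. code-block:: python

    import matplotlib.pyplot as plt

    # Assumption: clf.history is iterable over per-iteration objective values
    objectives = list(clf.history)
    plt.plot(range(1, len(objectives) + 1), objectives, marker='o')
    plt.xlabel('Outer CD iteration')
    plt.ylabel('Objective value')
    plt.title('plqMF_Ridge training trace')
    plt.show()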
Different Gaussian initial conditions can be set manually via :code:`init_mean` and :code:`init_sd`:

.. code-block:: python

    # Initialize the model with a positively shifted normal
    clf = plqMF_Ridge(
        C=0.001,
        rank=6,
        loss={'name': 'mae'},
        n_users=user_num,
        n_items=item_num,
        init_mean=1.0, ## Manually set mean of the normal distribution
        init_sd=0.5    ## Manually set sd of the normal distribution
    )
Practical Guidance
^^^^^^^^^^^^^^^^^^

- The first column of :code:`X` corresponds to **users** and the second column to **items**; see the sketch below. Please ensure this matches your :code:`n_users` and :code:`n_items` parameters.
- The default penalty strength is relatively weak; it is recommended to start with a relatively small :code:`C` value.
- When using larger :code:`C` values, consider increasing :code:`max_iter` to avoid a ConvergenceWarning.
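As a quick illustration of the expected layout, this sketch builds a tiny triplet dataset by hand; the index values and sizes are made up for the toy example:

.. code-block:: python

    import numpy as np
    from rehline import plqMF_Ridge

    # Each row of X is (user_index, item_index); y holds the ratings.
    # User indices must lie in [0, n_users) and item indices in [0, n_items).
    X = np.array([[0, 2],   # user 0 rated item 2
                  [0, 5],   # user 0 rated item 5
                  [3, 2]])  # user 3 rated item 2
    y = np.array([4.0, 2.5, 3.0])

    clf = plqMF_Ridge(C=0.001, rank=2, loss={'name': 'mae'},
                      n_users=4, n_items=6)
    clf.fit(X, y)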
Regularization Conversion
-------------------------

The regularization in this algorithm is tuned via :math:`C` and :math:`\rho`. If you prefer to set the penalty strengths directly, you can convert between the two parameterizations with the following formulas:

.. math::

    \lambda_{\text{user}} = \frac{\rho}{Cn}
    \quad\text{and}\quad
    \lambda_{\text{item}} = \frac{1 - \rho}{Cm}

.. math::

    C = \frac{1}{m \cdot \lambda_{\text{item}} + n \cdot \lambda_{\text{user}}}
    \quad\text{and}\quad
    \rho = \frac{1}{\frac{m \cdot \lambda_{\text{item}}}{n \cdot \lambda_{\text{user}}} + 1}
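As a quick sanity check of these formulas, here is a small round-trip sketch; the helper functions are illustrative only, not part of the rehline API:

.. code-block:: python

    # Hypothetical helpers implementing the conversion formulas above
    def to_lambdas(C, rho, n, m):
        """(C, rho) -> (lambda_user, lambda_item)."""
        return rho / (C * n), (1 - rho) / (C * m)

    def to_C_rho(lam_user, lam_item, n, m):
        """(lambda_user, lambda_item) -> (C, rho)."""
        C = 1.0 / (m * lam_item + n * lam_user)
        rho = 1.0 / ((m * lam_item) / (n * lam_user) + 1.0)
        return C, rho

    n, m = 1200, 4000
    lam_user, lam_item = to_lambdas(C=0.001, rho=0.7, n=n, m=m)
    print(to_C_rho(lam_user, lam_item, n, m))  # recovers (0.001, 0.7)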
Example
-------

.. nblinkgallery::
    :caption: Empirical Risk Minimization
    :name: rst-link-gallery

    ../examples/MF.ipynb

rehline/__init__.py

Lines changed: 5 additions & 1 deletion
@@ -5,14 +5,18 @@
 from ._internal import rehline_internal, rehline_result
 from ._path_sol import plqERM_Ridge_path_sol
 from ._sklearn_mixin import plq_Ridge_Classifier, plq_Ridge_Regressor
+from ._mf_class import plqMF_Ridge
+from ._data import make_ratings
 
 __all__ = ("_BaseReHLine",
            "ReHLine_solver",
            "ReHLine",
            "plqERM_Ridge",
            "CQR_Ridge",
+           "plqMF_Ridge",
            "plqERM_Ridge_path_sol",
            "plq_Ridge_Classifier",
            "plq_Ridge_Regressor",
            "_make_loss_rehline_param",
-           "_make_constraint_rehline_param")
+           "_make_constraint_rehline_param",
+           "make_ratings")

rehline/_base.py

Lines changed: 14 additions & 0 deletions
@@ -393,6 +393,20 @@ def _make_loss_rehline_param(loss, X, y):
         U = np.array([[1.0] * n, [-1.0] * n])
         V = np.array([-y , y])
 
+    elif (loss['name'] == 'SVM square') \
+        or (loss['name'] == 'svm square') \
+        or (loss['name'] == 'hinge square'):
+        Tau = np.inf * np.ones((1, n))
+        S = - np.sqrt(2) * y.reshape(1,-1)
+        T = np.sqrt(2) * np.ones((1, n))
+
+    elif (loss['name'] == 'MSE') \
+        or (loss['name'] == 'mse') \
+        or (loss['name'] == 'mean square error'):
+        Tau = np.inf * np.ones((2, n))
+        S = np.array([[np.sqrt(2)] * n, [-np.sqrt(2)] * n])
+        T = np.array([-np.sqrt(2) * y , np.sqrt(2) * y])
+
 
     else:
         raise Exception("Sorry, ReHLine currently does not support this loss function, \
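For readers checking the math behind this hunk: ReHLine builds losses from ReHU functions, where ReHU_tau(z) equals z^2/2 on [0, tau] (zero for negative z, linear beyond tau), evaluated at S*z + T. A small numpy sketch, ours rather than the library's, verifying that the new parameters reproduce the squared and squared-hinge losses:

.. code-block:: python

    import numpy as np

    def rehu(z, tau=np.inf):
        """ReHU_tau(z): 0 for z<=0, z^2/2 on (0, tau], tau*(z - tau/2) beyond tau."""
        z = np.maximum(z, 0.0)
        return np.where(z <= tau, 0.5 * z ** 2, tau * (z - 0.5 * tau))

    rng = np.random.default_rng(0)
    z = rng.normal(size=1000)                  # model scores
    y = rng.normal(size=1000)                  # regression targets
    yb = rng.choice([-1.0, 1.0], size=1000)    # binary labels

    # 'mse': two ReHU terms, S = +/- sqrt(2), T = -/+ sqrt(2)*y  ->  (z - y)^2
    mse = rehu(np.sqrt(2) * z - np.sqrt(2) * y) + rehu(-np.sqrt(2) * z + np.sqrt(2) * y)
    assert np.allclose(mse, (z - y) ** 2)

    # 'hinge square': one ReHU term, S = -sqrt(2)*y, T = sqrt(2)  ->  max(0, 1 - y*z)^2
    sq_hinge = rehu(-np.sqrt(2) * yb * z + np.sqrt(2))
    assert np.allclose(sq_hinge, np.maximum(0.0, 1.0 - yb * z) ** 2)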

0 commit comments
