Although I followed the exact settings outlined in the reproducibility file, my experiments consistently yield worse results than those reported in the paper. Are there any suggestions, recommendations, or additional details I might have overlooked?
WEB30K: Results in paper

| Loss | Self-attention | Self-attention | Self-attention | MLP | MLP | MLP |
|---|---|---|---|---|---|---|
| | NDCG@5 | NDCG@10 | NDCG@30 | NDCG@5 | NDCG@10 | NDCG@30 |
| NDCGLoss 2++ | 52.65 ± 0.37 | 54.49 ± 0.27 | 59.80 ± 0.08 | 49.15 ± 0.44 | 51.22 ± 0.34 | 57.14 ± 0.23 |
| LambdaRank | 52.29 ± 0.31 | 54.08 ± 0.19 | 59.48 ± 0.12 | 48.77 ± 0.38 | 50.85 ± 0.28 | 56.72 ± 0.17 |

WEB30K: Reproduced results

| Loss | Self-attention | Self-attention | Self-attention | MLP | MLP | MLP |
|---|---|---|---|---|---|---|
| | NDCG@5 | NDCG@10 | NDCG@30 | NDCG@5 | NDCG@10 | NDCG@30 |
| NDCGLoss 2++ | 48.825 ± 0.025 | 50.587 ± 0.062 | 56.473 ± 0.012 | 48.084 ± 0.118 | 49.623 ± 0.106 | 55.497 ± 0.108 |
| LambdaRank | 48.015 ± 0.351 | 49.602 ± 0.147 | 55.466 ± 0.180 | 41.739 ± 0.341 | 43.562 ± 0.159 | 49.656 ± 0.134 |