Experiments

Marco Fossati edited this page May 8, 2019 · 4 revisions

Default evaluation technique

Applies to all experiments:

  • stratified 5-fold cross validation over training/test splits;
  • mean performance scores over the folds.
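The evaluation scheme above can be illustrated with a minimal standard-library sketch. It is not soweego's implementation: every function name below is hypothetical, and the toy threshold classifier only stands in for a real linker.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that preserve the label ratio."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_label.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def precision_recall_f1(truth, predicted):
    """Binary scores for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cross_validate(samples, labels, train_and_predict, k=5):
    """Return mean (precision, recall, F-score) over k stratified folds."""
    scores = []
    for test_indices in stratified_folds(labels, k):
        held_out = set(test_indices)
        train_indices = [i for i in range(len(labels)) if i not in held_out]
        predicted = train_and_predict(
            [samples[i] for i in train_indices],
            [labels[i] for i in train_indices],
            [samples[i] for i in test_indices],
        )
        truth = [labels[i] for i in test_indices]
        scores.append(precision_recall_f1(truth, predicted))
    # Average each score column (precision, recall, F-score) over the folds.
    return tuple(mean(column) for column in zip(*scores))


def threshold_classifier(train_x, train_y, test_x):
    # Hypothetical stand-in for a real linker: the training data is ignored
    # and a pair counts as a match when its similarity is at least 0.5.
    return [1 if x >= 0.5 else 0 for x in test_x]


# Toy data: similarity scores with binary match labels.
samples = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1, 0.05] * 5
labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0] * 5
print(cross_validate(samples, labels, threshold_classifier))
```

Stratification keeps the share of positive and negative pairs the same in every fold, so the per-fold scores are comparable before they are averaged.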

Single-layer perceptron optimizers

https://github.com/Wikidata/soweego/issues/285

Setting

  • run: May 3 2019;
  • output folder: soweego-2.eqiad.wmflabs:/srv/dev/20190503/;
  • head commit: d0d390e622f2782a49a1bd0ebfc64478ed34aa0c;
  • command: python -m soweego linker evaluate slp ${Dataset} ${Entity} optimizer=${Optimizer}.
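As an illustration, the command template above can be expanded into one command line per run. The sketch below only builds the strings and does not execute them. The dataset and entity names are taken from the result tables, and their lowercase command-line spellings are an assumption.

```python
from itertools import product

# Catalog/entity pairs and optimizers as they appear in the result tables.
# The lowercase spellings passed on the command line are an assumption.
TARGETS = {
    "discogs": ["band", "musician"],
    "imdb": ["director", "musician", "producer", "writer"],
    "musicbrainz": ["band", "musician"],
}
OPTIMIZERS = ["sgd", "rmsprop", "nadam", "adamax", "adam", "adagrad", "adadelta"]


def build_commands():
    """Expand the command template into one shell command per run."""
    commands = []
    for dataset, entities in TARGETS.items():
        for entity, optimizer in product(entities, OPTIMIZERS):
            commands.append(
                f"python -m soweego linker evaluate slp {dataset} {entity} "
                f"optimizer={optimizer}"
            )
    return commands


commands = build_commands()
print(len(commands))  # 8 dataset/entity pairs x 7 optimizers = 56
print(commands[0])
```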

Discogs band

Optimizer  Precision  Recall  F-score
sgd        .782       .945    .856
rmsprop    .801       .930    .860
nadam      .805       .925    .861
adamax     .795       .938    .861
adam       .800       .929    .860
adagrad    .802       .927    .859
adadelta   .799       .934    .861

Discogs musician

Optimizer  Precision  Recall  F-score
sgd        .815       .985    .892
rmsprop    .816       .985    .893
nadam      .816       .986    .893
adamax     .817       .985    .893
adam       .816       .985    .893
adagrad    .816       .986    .893
adadelta   .815       .986    .892

IMDb director

Optimizer  Precision  Recall  F-score
sgd        .918       .954    .936
rmsprop    .895       .954    .923
nadam      .908       .954    .930
adamax     .907       .955    .930
adam       .909       .953    .931
adagrad    .867       .950    .907
adadelta   .902       .954    .927

IMDb musician

Optimizer  Precision  Recall  F-score
sgd        .912       .927    .920
rmsprop    .913       .929    .921
nadam      .913       .929    .921
adamax     .913       .928    .921
adam       .913       .928    .921
adagrad    .873       .860    .866
adadelta   .913       .928    .921

IMDb producer

Optimizer  Precision  Recall  F-score
sgd        .917       .942    .929
rmsprop    .916       .938    .927
nadam      .916       .938    .927
adamax     .916       .940    .928
adam       .916       .938    .927
adagrad    .852       .684    .756
adadelta   .916       .939    .928

IMDb writer

Optimizer  Precision  Recall  F-score
sgd        .929       .943    .936
rmsprop    .927       .940    .934
nadam      .930       .940    .935
adamax     .930       .941    .935
adam       .930       .940    .935
adagrad    .872       .923    .896
adadelta   .931       .941    .936

MusicBrainz band

Optimizer  Precision  Recall  F-score
sgd        .952       .869    .909
rmsprop    .949       .875    .911
nadam      .949       .877    .911
adamax     .952       .871    .910
adam       .951       .875    .911
adagrad    .932       .886    .909
adadelta   .952       .874    .911

MusicBrainz musician

Optimizer  Precision  Recall  F-score
sgd        .942       .957    .949
rmsprop    .941       .958    .949
nadam      .941       .958    .949
adamax     .941       .958    .949
adam       .941       .958    .949
adagrad    .946       .953    .950
adadelta   .941       .958    .950

Takeaways

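  • On Discogs and MusicBrainz, all optimizers do a similar job: F-scores stay within .005 of each other;
  • on IMDb, adagrad scores lowest for every entity type, most visibly for producer (.756 F-score against .927 to .929 for the others);
  • apart from adagrad on IMDb, the choice of optimizer has no notable impact on the performance.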
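The takeaways can be checked against the tables with a short sketch. It copies the F-score column of each table above and reports, per dataset, the gap between the best and worst optimizer. The helper names are illustrative only.

```python
OPTIMIZERS = ["sgd", "rmsprop", "nadam", "adamax", "adam", "adagrad", "adadelta"]

# F-scores copied from the tables above, in the order of OPTIMIZERS.
F_SCORES = {
    "Discogs band": [.856, .860, .861, .861, .860, .859, .861],
    "Discogs musician": [.892, .893, .893, .893, .893, .893, .892],
    "IMDb director": [.936, .923, .930, .930, .931, .907, .927],
    "IMDb musician": [.920, .921, .921, .921, .921, .866, .921],
    "IMDb producer": [.929, .927, .927, .928, .927, .756, .928],
    "IMDb writer": [.936, .934, .935, .935, .935, .896, .936],
    "MusicBrainz band": [.909, .911, .911, .910, .911, .909, .911],
    "MusicBrainz musician": [.949, .949, .949, .949, .949, .950, .950],
}


def spread(scores):
    """Gap between the best and the worst F-score, rounded to 3 decimals."""
    return round(max(scores) - min(scores), 3)


def worst_optimizer(scores):
    """Name of the optimizer with the lowest F-score (first one on ties)."""
    return OPTIMIZERS[scores.index(min(scores))]


for dataset, scores in F_SCORES.items():
    print(f"{dataset}: spread {spread(scores):.3f}, lowest {worst_optimizer(scores)}")
```

A small spread means the optimizer choice barely matters for that dataset, while a large one points at a single optimizer worth inspecting.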

Max versus average Levenshtein

Discogs band

Algorithm  Precision  Recall  F-score
nb max     .787       .955    .863
nb avg     .789       .941    .859
lsvm max   .780       .960    .861
lsvm avg   .785       .946    .858
svm max    .777       .963    .860
svm avg    .777       .963    .860
slp max    .784       .954    .861
slp avg    .776       .956    .857
mlp max    .822       .925    .870

Discogs musician

Algorithm  Precision  Recall  F-score
nb max     .831       .975    .897
nb avg     .836       .958    .893
lsvm max   .818       .985    .894
lsvm avg   .814       .986    .892
svm max    .815       .985    .892
svm avg    .815       .985    .892
slp max    .821       .983    .895
slp avg    .815       .985    .892
mlp max    .852       .963    .904

IMDb director

Algorithm  Precision  Recall  F-score
nb max     .896       .971    .932
nb avg     .897       .971    .932
lsvm max   .919       .943    .931
lsvm avg   .919       .942    .930
svm max    .911       .950    .930
svm avg    .908       .958    .932
slp max    .917       .953    .935
slp avg    .867       .953    .908
mlp max    .913       .964    .938

IMDb musician

Algorithm  Precision  Recall  F-score
nb max     .889       .962    .924
nb avg     .891       .960    .924
lsvm max   .917       .938    .927
lsvm avg   .917       .937    .927
svm max    .904       .944    .924
svm avg    .908       .942    .924
slp max    .924       .929    .926
slp avg    .922       .914    .918
mlp max    .912       .951    .931

IMDb producer

Algorithm  Precision  Recall  F-score
nb max     .870       .971    .918
nb avg     .871       .970    .918
lsvm max   .920       .940    .930
lsvm avg   .920       .938    .929
svm max    .923       .927    .925
svm avg    .923       .926    .925
slp max    .914       .940    .927
slp avg    .862       .914    .883
mlp max    .911       .956    .933

IMDb writer

Algorithm  Precision  Recall  F-score
nb max     .904       .975    .938
nb avg     .910       .961    .935
lsvm max   .936       .949    .943
lsvm avg   .936       .948    .942
svm max    .932       .954    .943
svm avg    .932       .954    .943
slp max    .938       .946    .942
slp avg    .903       .955    .928
mlp max    .930       .963    .946

MusicBrainz band

Algorithm  Precision  Recall  F-score
nb max     .821       .987    .896
nb avg     .822       .985    .896
lsvm max   .944       .879    .910
lsvm avg   .943       .888    .914
svm max    .930       .891    .910
svm avg    .939       .893    .915
slp max    .953       .865    .907
slp avg    .930       .885    .907
mlp max    .906       .918    .911

MusicBrainz musician

Algorithm  Precision  Recall  F-score
nb max     .955       .936    .946
nb avg     .955       .936    .946
lsvm max   .941       .963    .952
lsvm avg   .941       .962    .952
svm max    .951       .938    .944
svm avg    .950       .938    .944
slp max    .942       .957    .949
slp avg    .943       .956    .949
mlp max    .939       .970    .954

Takeaways

The impact of max Levenshtein on the F-score, compared with the average one:

  • NB is always improved or left untouched;
  • LSVM is mostly improved, left untouched for IMDb musician and MusicBrainz musician, but worsens for MusicBrainz band;
  • SVM is often left untouched, but worsens for IMDb director and MusicBrainz band;
  • SLP is improved with the highest impact on Discogs and IMDb, left untouched for MusicBrainz;
  • conclusion: max Levenshtein should replace the average one.
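To compare each bullet with the tables, the following sketch computes the F-score difference between the max and avg variants per algorithm and dataset. The scores are copied from the tables above; MLP is left out because only its max variant is reported. The helper names are illustrative only.

```python
DATASETS = [
    "Discogs band", "Discogs musician",
    "IMDb director", "IMDb musician", "IMDb producer", "IMDb writer",
    "MusicBrainz band", "MusicBrainz musician",
]

# (max, avg) F-score pairs copied from the tables above, in DATASETS order.
F_SCORES = {
    "nb":   [(.863, .859), (.897, .893), (.932, .932), (.924, .924),
             (.918, .918), (.938, .935), (.896, .896), (.946, .946)],
    "lsvm": [(.861, .858), (.894, .892), (.931, .930), (.927, .927),
             (.930, .929), (.943, .942), (.910, .914), (.952, .952)],
    "svm":  [(.860, .860), (.892, .892), (.930, .932), (.924, .924),
             (.925, .925), (.943, .943), (.910, .915), (.944, .944)],
    "slp":  [(.861, .857), (.895, .892), (.935, .908), (.926, .918),
             (.927, .883), (.942, .928), (.907, .907), (.949, .949)],
}


def deltas(algorithm):
    """Map each dataset to F-score(max) minus F-score(avg)."""
    return {
        dataset: round(with_max - with_avg, 3)
        for dataset, (with_max, with_avg) in zip(DATASETS, F_SCORES[algorithm])
    }


def summarize(algorithm):
    """Count the datasets where max improves, ties with, or worsens avg."""
    values = list(deltas(algorithm).values())
    return {
        "improved": sum(v > 0 for v in values),
        "untouched": sum(v == 0 for v in values),
        "worsened": sum(v < 0 for v in values),
    }


for algorithm in F_SCORES:
    print(algorithm, summarize(algorithm))
```

Calling `deltas("slp")` lists the per-dataset gains, which makes it easy to see where the switch to max Levenshtein matters most.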