@kweonwooj

Problem

The current implementation fails to reproduce the semi-supervised method described in solution.pdf.
The cause is data.py #L103~104, where every test sample is assigned the label len(self.label_words_dict), which is the unknown label.
No execution error occurs, but the Kaggle submission scores only ~0.15.

Solution

  • Return both the list of data and the list of labels from get_sub_list() so that the correct labels are fed in the semi-supervised setting.
  • With this fix, the public/private leaderboard score is ~0.85, which is decent, but it does not improve the overall ensemble score in the way solution.pdf mentions.
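
The idea behind the fix can be sketched roughly as follows. This is only an illustration, not the code in data.py: the standalone function signature, the (path, word) input format, and the example file names are assumptions made for the sketch. Only the name get_sub_list() and the use of len(label_words_dict) as the unknown label come from the description above.

```python
from typing import Dict, List, Sequence, Tuple


def get_sub_list(
    samples: Sequence[Tuple[str, str]],
    label_words_dict: Dict[str, int],
) -> Tuple[List[str], List[int]]:
    """Return parallel lists of data and labels (hypothetical signature).

    `samples` is assumed to hold (file path, word) pairs. For the
    semi-supervised setting the word would be a label predicted for a
    test clip rather than a ground-truth annotation.
    """
    # Words outside the vocabulary fall back to the unknown label,
    # which is the index len(label_words_dict).
    unknown_index = len(label_words_dict)

    data: List[str] = []
    labels: List[int] = []
    for path, word in samples:
        data.append(path)
        labels.append(label_words_dict.get(word, unknown_index))
    return data, labels


# Hypothetical usage: two test clips with predicted words.
label_words_dict = {"yes": 0, "no": 1}
pseudo_labeled = [("clip_001.wav", "yes"), ("clip_002.wav", "bed")]
data, labels = get_sub_list(pseudo_labeled, label_words_dict)
```

Returning the labels alongside the data means the caller no longer has to assign one fixed label to every test sample, which is the behavior the problem description attributes to the low score.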
