I would also be interested in the answer to this. I have found it impacts my rel model training, but not on every dataset. From a theoretical standpoint, my understanding is that dropout randomly zeroes out a fraction of the network's activations on each training step, so the corresponding weights are effectively ignored for that forward/backward pass. It is meant to help prevent overfitting, which is often due to over-representation of certain kinds of examples: they all imply roughly the same information, so each one is individually less useful. Randomly dropping a fixed fraction of units therefore makes it easier for the less-represented portions of the data, which carry more surprising information, to shine.
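To make that mechanism concrete, here is a minimal sketch of "inverted" dropout, the per-unit form most libraries implement. This is illustrative only, not spaCy's internals: each unit is zeroed with probability `rate`, and the survivors are scaled by `1 / (1 - rate)` so the expected activation stays the same between training and inference.

```python
import random

def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`,
    scale the survivors by 1 / (1 - rate) so the expected value
    of the layer's output is unchanged."""
    return [v / (1.0 - rate) if rng.random() >= rate else 0.0
            for v in values]

rng = random.Random(0)
out = dropout([1.0] * 10_000, 0.2, rng)

# Roughly 20% of the units come back as zero; the rest are 1.25.
dropped_fraction = sum(1 for v in out if v == 0.0) / len(out)
mean_activation = sum(out) / len(out)
```

In spaCy the rate is controlled from the training config (the `dropout` setting under `[training]`), so it applies across components rather than being something you wire up by hand like this.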

I would guess that to better understand your specific case it would be us…

Answer selected by mlawson35
Labels
feat / rel Feature: Relation Extractor