Unreasonable Samples in MANS. #2

@ZhuohanX

Hi,

Thank you so much for providing this work. It is very inspiring, and we are keen to use the resources to compare other newly proposed metrics.
However, I am not sure I understand the paper and data correctly.
In Table 3, you split the unreasonable samples into 4 categories, while in the provided data each generation of each model has a score that is a list of 5 integers (which I assume are the overall scores given by 5 annotators?), but there is no label indicating which unreasonable type each story belongs to.
Have I missed a detail about how you decide which story belongs to which error type?
Also, when you mention in Section 4.2 that you assign reasonable and unreasonable samples the binary labels 1 and 0, does that mean all reasonable samples are reused once for each of the four problem types?
For example, for ROC, you would have 46 reasonable samples labeled 1 and 22 unreasonable samples labeled 0 for the Rept type, and then the same 46 reasonable samples labeled 1 and 319 unreasonable samples labeled 0 for the Unrel type.
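To make my reading concrete, here is a minimal sketch of the setup as I currently understand it. The function and variable names are my own placeholders, not anything from your code or data, and it assumes the reasonable pool is reused unchanged for every error type, which is exactly the part I would like confirmed. Only the two ROC counts mentioned above are included.

```python
from typing import Dict, List

# ROC counts as I read them from Table 3 (only the two types mentioned above).
NUM_REASONABLE = 46
NUM_UNREASONABLE = {"Rept": 22, "Unrel": 319}


def build_binary_sets(num_reasonable: int,
                      num_unreasonable: Dict[str, int]) -> Dict[str, List[int]]:
    """Build one binary label list per error type.

    Assumption (mine, not confirmed by the paper): the same reasonable
    samples (label 1) are paired with each type's unreasonable samples
    (label 0), so the reasonable pool is counted once per error type.
    """
    sets: Dict[str, List[int]] = {}
    for error_type, n_bad in num_unreasonable.items():
        sets[error_type] = [1] * num_reasonable + [0] * n_bad
    return sets


binary_sets = build_binary_sets(NUM_REASONABLE, NUM_UNREASONABLE)
for error_type, labels in binary_sets.items():
    # Print the evaluation set size and the number of positives per type.
    print(error_type, len(labels), sum(labels))
```

Under this reading, the Rept evaluation set would contain 68 samples and the Unrel set 365 samples, each with the same 46 positives.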
Any clarification on this would be much appreciated.
Thank you in advance!
