About the scoring mechanism #6

@HuangTH96

Hi,

Thank you for your excellent work on LFG and for making the code publicly available.

I'm currently studying your paper and code to reproduce your results. I have a question regarding the score_clusters function in language_tools.py.

In the paper, you mention that "logprobs" are inappropriate for representing task-relevant probability. Instead, the method described uses Detic (VLM) to extract textual descriptors, and then queries GPT multiple times (n_s samples) to approximate the task-relevant probability through averaged scores.
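To make my reading of the paper concrete, here is a minimal sketch of the sampling-based estimate as I understand it. This is only my interpretation, not code from the repository: `estimate_cluster_scores` and `sample_choice` are hypothetical names, and `sample_choice` stands in for one GPT query that returns the index of the chosen cluster.

```python
from collections import Counter
from typing import Callable, List


def estimate_cluster_scores(
    sample_choice: Callable[[], int],  # hypothetical: one LLM query -> chosen cluster index
    num_clusters: int,
    n_s: int,
) -> List[float]:
    """Approximate a per-cluster probability as the fraction of n_s samples choosing it."""
    if n_s <= 0:
        raise ValueError("n_s must be positive")
    counts = Counter()
    for _ in range(n_s):
        choice = sample_choice()
        # Ignore answers that do not map to a valid cluster index.
        if 0 <= choice < num_clusters:
            counts[choice] += 1
    return [counts[i] / n_s for i in range(num_clusters)]


# Stand-in for the LLM: a fixed sequence of made-up answers, for illustration only.
_fake_answers = iter([0, 0, 1, 2])
scores = estimate_cluster_scores(lambda: next(_fake_answers), num_clusters=3, n_s=4)
print(scores)  # [0.5, 0.25, 0.25]
```

In this sketch the score of each cluster is simply the empirical frequency over the n_s samples, which is what I take "averaged scores" to mean.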

However, in the score_clusters implementation, it appears to directly use logprobs from a single GPT query:

logprob = choice["logprobs"]["token_logprobs"][0]
top_logprobs = choice["logprobs"]["top_logprobs"][0]
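For comparison, this is roughly how I read the logprob-based path. It is a sketch under assumptions, not the repository's code: I assume `top_logprobs[0]` is a dict mapping candidate tokens to log-probabilities for the first generated token, the helper name `normalized_option_probs` is hypothetical, and the numbers in `example_choice` are made up.

```python
import math
from typing import Dict, List


def normalized_option_probs(
    top_logprobs: Dict[str, float], options: List[str]
) -> Dict[str, float]:
    """Convert first-token logprobs into probabilities renormalized over the given options."""
    # exp() turns a log-probability back into a probability.
    raw = {opt: math.exp(top_logprobs[opt]) for opt in options if opt in top_logprobs}
    total = sum(raw.values())
    if total == 0:
        return {opt: 0.0 for opt in options}
    return {opt: raw.get(opt, 0.0) / total for opt in options}


# Made-up response fragment with the same shape as the two lines above.
example_choice = {
    "logprobs": {
        "token_logprobs": [math.log(0.6)],
        "top_logprobs": [{"1": math.log(0.6), "2": math.log(0.3), "3": math.log(0.1)}],
    }
}
top = example_choice["logprobs"]["top_logprobs"][0]
probs = normalized_option_probs(top, ["1", "2"])
```

Here a single query yields a distribution over the answer tokens directly, whereas the sampling approach needs n_s queries to estimate it.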

Could you clarify the reasoning behind this implementation choice? Is this perhaps a simplified version for efficiency, or am I misunderstanding how the method maps to the paper's description?

Thank you very much for your time and assistance!

Best regards,
Huang
