Critic Training pre-processing steps #47

@xylankant

Hello,

Thanks for open-sourcing the code for this project, it is really great!

We are using CodeRL as a really nice starting point for student projects, and we have a few questions to make sure we understand it correctly.
In the "Critic Training" section, you say the following:

We can train a critic model as a classifier that predicts the test outcomes of generated samples. For each training sample, we can follow the prior processes (generating programs and running unit tests) to obtain synthetic samples and their annotations of unit test outcomes. On average, we generate 20 programs per training sample (we provided some example generated programs in data/APPS/train/).

  • You don't say so explicitly, but from context I assume you are using the CodeT5-large-ntp-py model for this?
  • What do you mean by "on average" 20 programs per training sample? The generation code does not allow for an "average" number of generated solutions; it always produces the specified number of outputs per instance.
  • Related to that, when we look at the provided example outputs in data/APPS/train/, all of the solutions in the gen_solutions.json files look like "good" code, and sometimes there are fewer than n=20 of them. However, when we use the CodeT5-large-ntp-py model to generate solutions ourselves, we always get n solutions. Sometimes the model outputs code, but much of the time it produces no code at all and instead emits other output such as repeated natural-language descriptions, e.g.:
print(gen_data['0']['code'][0])
�� the number of words that played the game.


ANSWER:


"""

class Solution(object):
    def reverse(self, n):
        """
        :type n: int
        :rtype: int
        """
        if n == 0:
            return -1
        l = list(bin(n))
        l.reverse()
        return sum(l)

if __name__ == '__main__':
    print Solution().reverse(int(raw_input()))

[...]

print(gen_data['0']['code'][2])
�� the answer.

ANSWER:

for all the test cases in the input, print answer for all the test cases in the order they appear.

for all the test cases in the input, print answer for all the test cases in the order they appear.

for all the test cases in the input, print answer for all the test cases in the order they appear.

for all the test cases in the input, print answer for all the test cases in the order they appear.
[...]
  • Is there some post-processing going on that we are overlooking?
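
To make the last question concrete, here is a minimal sketch of the kind of filter we have in mind. It is only our guess at what such a step might look like, not something we found in the CodeRL repository; the function name `filter_candidates` and the sample strings are hypothetical. It drops empty outputs, exact duplicates, and anything that does not parse as Python 3 source, which would explain why a filtered file can contain fewer than n solutions.

```python
import ast


def filter_candidates(candidates):
    """Keep unique candidates that parse as Python 3 source.

    Hypothetical post-processing step, not taken from the CodeRL code base.
    """
    kept, seen = [], set()
    for code in candidates:
        text = code.strip()
        if not text or text in seen:
            continue  # skip empty outputs and exact duplicates
        try:
            ast.parse(text)
        except (SyntaxError, ValueError):
            continue  # natural-language rambling usually fails to parse
        seen.add(text)
        kept.append(text)
    return kept


# Made-up stand-ins for entries of gen_data['0']['code']
samples = [
    "print(int(input()) + 1)",
    "print(int(input()) + 1)",  # exact duplicate
    "for all the test cases in the input, print answer",  # not code
    "",
]
print(len(filter_candidates(samples)))  # 1
```

One caveat with this sketch: the Python 3 parser also rejects Python 2 style code such as `print Solution().reverse(...)`, so a filter like this would discard the first example above even though it contains a class definition.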
