Scope out question generation #3

@kcarnold

  • Collect a bunch of example inputs and outputs from last summer's exploratory work
    • We have one or two examples already.
  • Wrangle these examples into something we can use to train (fine-tune) a LM
    • We could start with the approach used for the Interviewer, even though it's not perfect...
  • Pick an LM that we can feasibly run inference on
    • OpenAI API?
    • One of the existing open-source ones we've used (maybe an encoder-decoder one like flan-ul2)
    • One of the new batch of open-source models (LLaMA etc.), perhaps fine-tuned
  • Fine-tune the LM to generate questions like our examples
  • Collect ranking data on LM generations
    • Build a Streamlit app for this? Maybe there's already an app that people are using; e.g., any open-source ChatGPT replication project (Vicuna, Alpaca...; see the llama.cpp repo) will have something like this. See also https://arxiv.org/abs/2204.05862
  • Use ranking data to optimize the LM
  • Deploy the optimized LM as an API that the frontend can access.
    • We've already got an API for the interviewer model.
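
One possible shape for the ranking data, as a rough sketch: the issue doesn't say how rankings will be stored or consumed, so this assumes each annotator ranks several generations for one prompt (best first) and that the optimization step wants pairwise comparisons. The function name `ranking_to_pairs` and the `prompt` / `chosen` / `rejected` keys are made up for illustration.

```python
from itertools import combinations


def ranking_to_pairs(prompt, ranked_generations):
    """Expand one ranking (best first) into pairwise preference records.

    Hypothetical helper: the record keys are illustrative, not an agreed format.
    """
    # combinations() preserves input order, so the first element of each
    # pair is always the one ranked higher.
    return [
        {"prompt": prompt, "chosen": better, "rejected": worse}
        for better, worse in combinations(ranked_generations, 2)
    ]


pairs = ranking_to_pairs(
    "The cat sat on the mat.",
    ["Why did the cat choose the mat?", "What is a mat?", "Cat?"],
)
```

A ranking of n generations yields n * (n - 1) / 2 pairs, so even a small labeling app could produce a fair amount of comparison data.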

Design a simple format for input and output. E.g., the input is document_text, cursor_position, and an optional question_type; the output is question, start_position, and end_position (where positions are character offsets from the beginning of the text).
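
As a starting point for discussion, the format could be sketched like this. The field names come from the description above; the class names, the helper functions, and the choice to treat end_position as exclusive (Python slice style) are assumptions, not decisions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuestionRequest:
    document_text: str
    cursor_position: int  # character offset from the start of document_text
    question_type: Optional[str] = None  # optional; allowed values not yet decided


@dataclass
class QuestionResponse:
    question: str
    start_position: int  # character offset into document_text
    end_position: int    # ASSUMPTION: exclusive, like a Python slice


def is_valid_span(request: QuestionRequest, response: QuestionResponse) -> bool:
    """Check that the response offsets fall inside the request's text."""
    return (
        0 <= response.start_position
        <= response.end_position
        <= len(request.document_text)
    )


def referenced_span(request: QuestionRequest, response: QuestionResponse) -> str:
    """Return the part of the document the question refers to."""
    return request.document_text[response.start_position:response.end_position]


request = QuestionRequest(document_text="The cat sat on the mat.", cursor_position=7)
response = QuestionResponse(
    question="Why did the cat sit there?", start_position=4, end_position=7
)
```

Both classes serialize to JSON via `dataclasses.asdict`, which would keep the API payload and the training examples in the same shape.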
