TellMeWhy-Context-Injection

Fine-tunes a T5-small model on the TellMeWhy dataset using context injection from a large language model (Gemini) to improve causal reasoning for “why” questions in narratives. Combines efficient training with human and automated evaluations to assess impact.

Original Source

Our project is based on the paper and project linked below. We used their dataset and trained our models with the Hugging Face Transformers API.

Original paper where the idea came from
Original Git project where the idea came from
Data used from that project

Modified Files

We ended up not reusing any files from the original paper; however, we did modify their dataset: Data used from that project

How to Train and Test our Models

We used a Google Colab notebook that can be run sequentially on a GPU instance to train, save, and evaluate our models. Without a GPU instance the notebook will crash, since in some places we move tensors to the GPU explicitly. The notebook requires access to the runner's Google Drive and will open a folder named "CSE_354_project", creating it if it doesn't exist. Preprocessing the data requires loading the JSON file named "ALL_CONTEXT_DATA_1.json" that we created and linked in the data section below. Link to Notebook
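
For reference, the Drive setup the notebook performs looks roughly like the following. This is a minimal sketch, not the verbatim notebook code; the folder and file names come from this README, everything else is illustrative.

from google.colab import drive
import json
import os

# Mount the runner's Google Drive (Colab will prompt for access).
drive.mount('/content/drive')

# Open the project folder, creating it if it doesn't exist.
project_dir = '/content/drive/MyDrive/CSE_354_project'
os.makedirs(project_dir, exist_ok=True)

# Load the context data we generated with Gemini (linked below).
with open(os.path.join(project_dir, 'ALL_CONTEXT_DATA_1.json')) as f:
    context_data = json.load(f)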

Additionally, inside the notebook we link to the context generation we did with Gemini, which we ran over the course of several days due to Gemini's API limits. To run that notebook successfully you need to add a Google Gemini API key in the Secrets section of Colab; a sketch of the call appears after the prompt below.

Our Models and Data

Drive link to our no-context tuned model
Drive link to our context tuned model
Drive link to our modified data in JSON format
Drive link to the folder containing all files above

Prompts used

The prompt we gave to Gemini for context injection:

prompt = '''Given the following narrative sentences that describe a story, produce a sequence of concise and to the point sentences that bring in commonsense information, and external world knowledge that is relevant. Be very verbose about commonsense knowledge and explain the reason why things are done.

  

Here is an example:

narrative: Cam ordered a pizza and took it home. He opened the box to take out a slice. Cam discovered that the store did not cut the pizza for him. He looked for his pizza cutter but did not find it. He had to use his chef knife to cut a slice.

Pizza is a food. People eat food when they are hungry. Pizza is usually already cut. Cam got the pizza from the store.

  

Produce context sentences to the following narrative without any formatting, just as a sequence of 4 short, simple, and single clause sentences, do NOT reason through multiple sentences, each sentence should state commonsense information related to the narrative:

{narrative}

'''
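
A minimal sketch of how this prompt can be sent to Gemini from Colab. The secret name "GOOGLE_API_KEY" and the model name "gemini-pro" are assumptions, not taken from the notebook, and the narrative value is a placeholder.

from google.colab import userdata
import google.generativeai as genai

# Read the API key stored in Colab's Secrets panel; the secret name
# is an assumption, so use whatever name you saved the key under.
genai.configure(api_key=userdata.get('GOOGLE_API_KEY'))

model = genai.GenerativeModel('gemini-pro')  # model name is an assumption

narrative = "Cam ordered a pizza and took it home."  # placeholder narrative
response = model.generate_content(prompt.format(narrative=narrative))
context_sentences = response.text  # four short context sentences, unformatted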

The prompt format we trained our models on was one of:

f"Narrative: {narrative} Question: {question}"

or

f"Narrative: {narrative} Context: {context} Question: {question}"

Requirements

All requirements are installed in the notebook either up front or at the point where they are needed. The notebook works if run sequentially.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Disclaimer: This project was completed as part of a university assignment.
