
Extract text about papers from "related work" sections #14085

@koppor


In science, authors write papers and relate their paper to other papers. This "related work" text is very interesting, as it contains two aspects:

  • other papers of interest
  • descriptions of those papers

Example: https://github.com/JabRef/jabref-demo-libraries/blob/main/chocolate/pdfs/LunaOstos_2024%20-%20Social%20Life%20Cycle%20Assessment%20in%20the%20Chocolate%20Industry%20-%20A%20Colombian%20Case%20Study%20with%20Luker%20Chocolate.pdf

Colombia is a middle-income country with a population
of approximately 50 million (CIA 2021), with at least 11
million people living in rural areas (DANE 2018). It is the
third most biodiverse country globally, following Brazil
and Indonesia (Nash 2022). 
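
A passage like the one above could be scanned for author-year citation markers as a first step. The following is only a minimal sketch, assuming markers of the form `(Name 2021)` and a naive sentence splitter; the class and method names (`CitationSentenceFinder`, `find`, `CitedSentence`) are hypothetical and not part of JabRef. It returns the whole sentence for each marker, so narrowing it down to the clause that belongs to one citation would still need something smarter, for example an AI-based parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CitationSentenceFinder {

    /** A citation marker such as "CIA 2021" plus the sentence it appears in. */
    public record CitedSentence(String marker, String sentence) {}

    // Assumption: markers are parenthesised author-year pairs, e.g. "(CIA 2021)" or "(Nash 2022)".
    private static final Pattern MARKER =
            Pattern.compile("\\(([A-Z][\\p{L}.\\- ]*? \\d{4}[a-z]?)\\)");

    public static List<CitedSentence> find(String text) {
        // PDF text usually has hard line wraps; collapse all whitespace to single spaces.
        String flat = text.replaceAll("\\s+", " ").trim();
        List<CitedSentence> result = new ArrayList<>();
        // Naive sentence split: end punctuation, whitespace, then an uppercase letter.
        for (String sentence : flat.split("(?<=[.!?])\\s+(?=\\p{Lu})")) {
            Matcher m = MARKER.matcher(sentence);
            while (m.find()) {
                result.add(new CitedSentence(m.group(1), sentence));
            }
        }
        return result;
    }
}
```

Each marker would then have to be matched against the paper's reference list to find the full bibliographic entry.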

JabRef should do the following:

For each reference:

  • Look it up in the paper's reference list
  • Add it to the library, or update the entry if it already exists
  • Find the text in the citing paper that describes the referenced paper
  • Add the descriptive text to comment-{username}, prefixed with [{citation-key}]: where the citation key is that of the citing paper ([LunaOstos_2024] in our example). If comment-{username} already has content, append the new text, separated by an empty line.
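
The last step above (prefix, then append with an empty line in between) could look roughly like this. It is a sketch on plain strings, not on JabRef's entry model; `CommentAppender`, `append`, and `fieldName` are hypothetical names, and the field name follows the `comment-koppor` field used in this issue's example.

```java
public class CommentAppender {

    /** Field name for a user-specific comment, e.g. "comment-koppor". */
    public static String fieldName(String username) {
        return "comment-" + username;
    }

    /**
     * Returns the new field value: the description prefixed with the citing
     * paper's citation key, appended to any existing content.
     */
    public static String append(String existing, String citingKey, String description) {
        String entry = "[" + citingKey + "]: " + description.strip();
        if (existing == null || existing.isBlank()) {
            // Nothing there yet: the new entry becomes the whole field value.
            return entry;
        }
        // Existing content is kept; one empty line separates it from the new entry.
        return existing.stripTrailing() + "\n\n" + entry;
    }
}
```

Calling `append(null, "LunaOstos_2024", "Colombia is a middle-income country with a population of approximately 50 million.")` yields the comment value in the example result; a second call with that value as `existing` adds the next description below an empty line.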

Example result:

@Misc{Agency2021,
  author         = {{Central Intelligence Agency}},
  note           = {Accessed 4 Mar 2023},
  title          = {The world factbook: Colombia},
  year           = {2021},
  comment-koppor = {[LunaOstos_2024]: Colombia is a middle-income country with a population of approximately 50 million.},
  url            = {https://www.cia.gov/the-world-factbook/countries/colombia/},
}

Related: Citation relations. However, they do not have the full text.

[Image: screenshot from the linked PDF]

[Image: fuller context of the same passage]

Hint: It is perfectly OK to use langchain4j's AI interface for parsing etc.


This is NOT citation relations, because this issue is about harvesting knowledge from a PDF.
