GitHub - jiaqiwang1105/AI-ML-Project

1/ Parse the PDF and extract all the text
2/ Store the text (locally or in a database? depends on the size)

Where should the files be stored?
Should we count the words or tokens in the text and choose the storage based on the token count?
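
One way to approach the size question is sketched below: estimate word and token counts from the extracted text, then pick a storage target from the estimate. This assumes roughly 4 characters per token, a rule of thumb for English text rather than a real tokenizer count. The names `estimateTextStats`, `chooseStorage`, and the threshold value are hypothetical, not part of this repo.

```typescript
// Rough size estimate for extracted PDF text.
// ASSUMPTION: ~4 characters per token (rule of thumb, not an exact tokenizer).
interface TextStats {
  words: number;
  approxTokens: number;
}

function estimateTextStats(text: string): TextStats {
  const trimmed = text.trim();
  // Split on runs of whitespace; an empty string has zero words.
  const words = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  const approxTokens = Math.ceil(trimmed.length / 4);
  return { words, approxTokens };
}

type StorageTarget = "local-file" | "database";

// Hypothetical cut-off; tune it against real documents.
const DATABASE_TOKEN_THRESHOLD = 50000;

function chooseStorage(
  stats: TextStats,
  threshold: number = DATABASE_TOKEN_THRESHOLD
): StorageTarget {
  // Small texts stay on disk; larger ones go to the database.
  return stats.approxTokens > threshold ? "database" : "local-file";
}
```

For an exact count, the estimate could be swapped for the tokenizer that matches whichever embedding model is chosen.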

3/ Figure out how much context to embed

Should we embed a summary of each chapter?
What's the best way to embed the text of an entire PDF? The PDF could be anywhere from 5 to 300 pages.
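
One common option, offered here as a sketch rather than a decision, is to split the text into fixed-size overlapping chunks and embed each chunk separately, so a 5-page and a 300-page PDF go through the same path. The function name `chunkByWords` and the default sizes (200 words, 40 words of overlap) are assumptions for illustration only.

```typescript
// Split text into overlapping word windows so each chunk can be embedded on its own.
// ASSUMPTION: chunk sizes are measured in words; the defaults are placeholders.
interface Chunk {
  index: number;
  text: string;
}

function chunkByWords(text: string, chunkSize = 200, overlap = 40): Chunk[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("need chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  const step = chunkSize - overlap; // how far the window advances each time
  const chunks: Chunk[] = [];
  for (let start = 0; start < words.length; start += step) {
    const slice = words.slice(start, start + chunkSize);
    chunks.push({ index: chunks.length, text: slice.join(" ") });
    // Stop once the window has reached the end of the text.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}
```

The overlap keeps sentences that straddle a boundary visible in both neighbouring chunks. Per-chapter summaries would be an alternative, but they need reliable chapter detection first.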

Repository contents: client, server, .gitignore, README.md, package-lock.json, package.json
