README.md: 9 additions & 10 deletions
@@ -1,6 +1,6 @@
-# GPT-4 & LangChain - Create a ChatGPT Chatbot for Your PDF Docs
+# GPT-4 & LangChain - Create a ChatGPT Chatbot for Your PDF Files
 
-Use the new GPT-4 api to build a chatGPT chatbot for Large PDF docs (56 pages used in this example).
+Use the new GPT-4 api to build a chatGPT chatbot for multiple Large PDF files.
 
 Tech stack used includes LangChain, Pinecone, Typescript, Openai, and Next.js. LangChain is a framework that makes it easier to build scalable AI/LLM apps and chatbots. Pinecone is a vectorstore for storing embeddings and your PDF in text to later retrieve similar docs.
 
@@ -48,15 +48,15 @@ PINECONE_INDEX_NAME=
 
 5. In `utils/makechain.ts` chain change the `QA_PROMPT` for your own usecase. Change `modelName` in `new OpenAIChat` to `gpt-3.5-turbo`, if you don't have access to `gpt-4`. Please verify outside this repo that you have access to `gpt-4`, otherwise the application will not work with it.
 
-## Convert your PDF to embeddings
+## Convert your PDF files to embeddings
 
-1. In `docs` folder replace the pdf with your own pdf doc.
+**This repo can load multiple PDF files**
 
-2. In `scripts/ingest-data.ts` replace `filePath` with `docs/{yourdocname}.pdf`
+1. Inside `docs` folder, add your pdf files or folders that contain pdf files.
 
-3. Run the script `pnpm run ingest` to 'ingest' and embed your docs
+2. Run the script `npm run ingest` to 'ingest' and embed your docs. If you run into errors troubleshoot below.
 
-4. Check Pinecone dashboard to verify your namespace and vectors have been added.
+3. Check Pinecone dashboard to verify your namespace and vectors have been added.
 
 ## Run the app
 
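Note on the hunk above: the new step 1 says the `docs` folder may hold PDF files directly or in nested folders. This diff does not show how `scripts/ingest-data.ts` discovers them, so the following is only a minimal sketch of one way to collect such files with Node's standard library. The helper name `findPdfFiles` is hypothetical and not taken from the repo.

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical helper (not from the repo): recursively collect the paths of
// all .pdf files under `dir`, including files inside nested folders.
function findPdfFiles(dir: string): string[] {
  const results: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Descend into sub-folders, since `docs` may contain folders of PDFs.
      results.push(...findPdfFiles(fullPath));
    } else if (
      entry.isFile() &&
      path.extname(entry.name).toLowerCase() === ".pdf"
    ) {
      // Match the extension case-insensitively so `.PDF` is also picked up.
      results.push(fullPath);
    }
  }
  // Sort for a stable, repeatable processing order.
  return results.sort();
}
```

Calling `findPdfFiles("docs")` before running the ingest script is one way to confirm which files would be picked up. Whether the repo's own loader treats extensions case-insensitively is an assumption here.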
@@ -73,16 +73,15 @@ In general, keep an eye out in the `issues` and `discussions` section of this re
 - Check that you've created an `.env` file that contains your valid (and working) API keys, environment and index name.
 - If you change `modelName` in `OpenAIChat` note that the correct name of the alternative model is `gpt-3.5-turbo`
 - Make sure you have access to `gpt-4` if you decide to use. Test your openAI keys outside the repo and make sure it works and that you have enough API credits.
+- Your pdf file is corrupted and cannot be parsed.
 
 **Pinecone errors**
 
 - Make sure your pinecone dashboard `environment` and `index` matches the one in the `pinecone.ts` and `.env` files.
 - Check that you've set the vector dimensions to `1536`.
 - Make sure your pinecone namespace is in lowercase.
 - Pinecone indexes of users on the Starter(free) plan are deleted after 7 days of inactivity. To prevent this, send an API request to Pinecone to reset the counter.
-- Retry with a new Pinecone index.
-
-If you're stuck after trying all these steps, delete `node_modules`, restart your computer, then `pnpm install` again.
+- Retry from scratch with a new Pinecone index and cloned repo.
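Note on the troubleshooting list above: several of these checks (API key, environment and index name present, vector dimension `1536`, lowercase namespace) can be verified mechanically before running the ingest script. The sketch below is illustrative only. The `PineconeSettings` shape and the `checkPineconeSettings` function are hypothetical names, not part of the repo, and how the values are read from `.env` or `pinecone.ts` is left to the caller.

```typescript
// Hypothetical settings shape for illustration; the repo's own config may differ.
interface PineconeSettings {
  apiKey?: string;
  environment?: string;
  indexName?: string;
  namespace?: string;
  dimension?: number;
}

// Vector dimension named in the troubleshooting list.
const EXPECTED_DIMENSION = 1536;

// Return a list of human-readable problems; an empty list means every check passed.
function checkPineconeSettings(settings: PineconeSettings): string[] {
  const problems: string[] = [];
  if (!settings.apiKey) problems.push("missing API key");
  if (!settings.environment) problems.push("missing environment");
  if (!settings.indexName) problems.push("missing index name");
  if (
    settings.namespace !== undefined &&
    settings.namespace !== settings.namespace.toLowerCase()
  ) {
    problems.push("namespace must be lowercase");
  }
  if (
    settings.dimension !== undefined &&
    settings.dimension !== EXPECTED_DIMENSION
  ) {
    problems.push(`vector dimension should be ${EXPECTED_DIMENSION}`);
  }
  return problems;
}
```

For example, the index name could be passed in as `process.env.PINECONE_INDEX_NAME`, which is the only variable name visible in this diff. The names of the other environment variables are not shown here, so they are left to the caller.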