Using the GPT Index to explore the content of Scrapbox.
@robbalian: Hey GPT: When did I peak? I created a model that searches through thousands of pages of my emails and personal notes. You can use it too at https://t.co/tSHSzWoM6q Here’s what I learned… 🧵
- Looks like I can use this for collaboration.
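import json
import os

# Open the Scrapbox JSON export
with open('scrapbox_export.json', encoding='utf-8') as json_file:
    data = json.load(json_file)

# Make sure the output directory exists
os.makedirs('data', exist_ok=True)

# Iterate through the pages
for page in data['pages']:
    title = page['title']
    lines = page['lines']
    # Join the lines with newline characters
    content = "\n".join(lines)
    # "/" would be read as a path separator, so replace it in the file name
    title = title.replace("/", "-")
    # Write each page's content to its own text file
    with open("data/" + title + ".txt", "w", encoding='utf-8') as f:
        f.write(content)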
- For now, this converts the JSON export into one txt file per page.
- There seems to be an error with /blu3mo, but it’s unresolved.
- With that, you can do Semantic Search and Q&A.
- However, the step of retrieving useful files with Semantic Search doesn't seem to be working well.
- Could it be because of the Japanese language?
- I need to understand the mechanisms of Embedding and Indexing (blu3mo).
- I think it’s impossible with Japanese! (tkgshn)
- I think it will work if we translate and put it in.
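To get a feel for the Embedding and Indexing mechanism mentioned above, here is a minimal sketch of embedding-based retrieval using only the standard library. It is not what GPT Index does internally: the `embed` function is a toy stand-in that counts character bigrams, where a real pipeline would call a learned embedding model. The names `embed`, `cosine`, `build_index`, and `search`, and the sample pages, are all hypothetical and made up for illustration.

```python
import math
from collections import Counter


def embed(text, n=2):
    """Toy embedding (assumption): counts of character n-grams.

    A real system would call a learned embedding model here instead.
    """
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(pages):
    """Indexing: embed every page once and keep the vectors by title."""
    return {title: embed(content) for title, content in pages.items()}


def search(index, query, top_k=2):
    """Retrieval: embed the query and return the most similar page titles."""
    q = embed(query)
    scored = sorted(
        ((cosine(q, vec), title) for title, vec in index.items()),
        reverse=True,
    )
    return [title for _score, title in scored[:top_k]]


# Hypothetical sample pages, standing in for the exported txt files
pages = {
    "embedding": "An embedding maps text to a vector of numbers.",
    "scrapbox": "Scrapbox is a wiki for linked notes.",
    "検索": "意味検索は文章をベクトルに変換して比較する。",
}
index = build_index(pages)
print(search(index, "linked notes wiki", top_k=1))
print(search(index, "ベクトルに変換", top_k=1))
```

The retrieval step only ever compares vectors, so how well it works for Japanese depends on how good the embedding model is for Japanese text. That fits both suspicions in the notes: a weak embedding would make the "pulling useful files" stage fail, and translating before embedding would be one way around it.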