- I want to share some unusual things I created for fun over the weekend.
- I built a system in which a large language model (LLM) reads several existing research papers and then generates a fictional follow-up study. By repeating this process and expanding a directed acyclic graph (DAG) of papers, I aimed to simulate how researchers read papers and come up with new studies.
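Concretely, the generation step looks roughly like this (a minimal sketch of the idea, not the exact code; the `Paper` structure, model name, and prompt wording are simplified placeholders):

```python
# Sketch of one expansion step: an LLM reads parent papers and invents a
# fictional follow-up. Assumes an OpenAI-style chat API; model and prompts
# are placeholders.
from dataclasses import dataclass, field

from openai import OpenAI

client = OpenAI()

@dataclass
class Paper:
    title: str
    abstract: str
    parents: list = field(default_factory=list)  # the papers this node "read"

def generate_followup(parents: list[Paper]) -> Paper:
    """Ask the LLM to propose a fictional next study building on its parents."""
    context = "\n\n".join(f"Title: {p.title}\nAbstract: {p.abstract}" for p in parents)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder: any chat model works here
        messages=[
            {"role": "system", "content": "You are a researcher proposing follow-up studies."},
            {"role": "user", "content": f"Read these papers:\n\n{context}\n\n"
                                        "Propose a fictional next paper. Reply with a "
                                        "title on the first line, then an abstract."},
        ],
    )
    title, _, abstract = response.choices[0].message.content.partition("\n")
    return Paper(title=title.strip(), abstract=abstract.strip(), parents=parents)
```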
- The simulation is driven by an MCTS/UCB-style tree-search algorithm. Each generated paper is assessed by another LLM acting as a reviewer, and the review score guides the search: promising directions are expanded further while unpromising paths are pruned.
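The selection rule is essentially UCB1. A minimal sketch (the real bookkeeping is messier; I assume reviewer scores are normalized to [0, 1]):

```python
# UCB1-style selection over the paper tree: pick the child that balances a
# high mean reviewer score (exploitation) against few visits (exploration).
import math

class Node:
    def __init__(self, paper, parent=None):
        self.paper = paper
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_score = 0.0  # sum of reviewer scores backed up through this node

def ucb_score(node: Node, c: float) -> float:
    """Mean reviewer score plus an exploration bonus for rarely-visited nodes."""
    if node.visits == 0:
        return float("inf")  # always try unvisited children first
    exploit = node.total_score / node.visits
    explore = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return exploit + explore

def select(root: Node, c: float) -> Node:
    """Walk down the tree, always following the child with the best UCB score."""
    node = root
    while node.children:
        node = max(node.children, key=lambda child: ucb_score(child, c))
    return node

def backpropagate(node: Node, score: float) -> None:
    """Push a reviewer score up to the root so ancestors get credit too."""
    while node is not None:
        node.visits += 1
        node.total_score += score
        node = node.parent
```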
- In the visualization, the top node (yellow) represents a real paper, while the other nodes are generated by the LLM.
- The results shown here are somewhat cherry-picked, so I am not fully confident in the conclusions. Nevertheless, I observe the evaluation scores of the nodes improving as the tree grows, which suggests that exploiting promising nodes is effective.
- It’s intriguing to observe how the growth patterns shift when we modify the definition of a ‘good paper’ provided to the LLMs.
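Since the reviewer’s notion of a ‘good paper’ is just part of its prompt, changing it is a one-string edit. The two rubrics below are made-up illustrations of the kind of definitions that can be swapped in:

```python
# Two illustrative (made-up) definitions of a "good paper" handed to the
# reviewer LLM. Swapping this one string changes which branches the search
# rewards, and therefore how the graph grows.
RUBRIC_NOVELTY = (
    "A good paper introduces a genuinely new idea, even if the evaluation is thin. "
    "Score 0-10, rewarding originality above all."
)
RUBRIC_RIGOR = (
    "A good paper makes a modest claim and supports it with thorough experiments. "
    "Score 0-10, rewarding soundness above all."
)

def review_prompt(rubric: str, paper_text: str) -> str:
    return f"{rubric}\n\nReview the following paper and reply with only a score:\n\n{paper_text}"
```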
- I also notice changes in growth when adjusting the exploration parameter c of UCB (a toy numerical illustration follows this list):
  - Giving more importance to exploration (c=0.3):
  - Giving more importance to exploitation (c=0.1):
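The effect of c is easy to see numerically. A toy, self-contained example with made-up scores and visit counts:

```python
# How c shifts the exploration/exploitation balance: a parent with 100 visits
# and two children, one well-scored and well-explored, one weaker but
# barely visited. All numbers here are invented for illustration.
import math

def ucb(mean_score, visits, parent_visits, c):
    return mean_score + c * math.sqrt(math.log(parent_visits) / visits)

for c in (0.1, 0.3):
    strong = ucb(mean_score=0.70, visits=50, parent_visits=100, c=c)
    novel = ucb(mean_score=0.60, visits=5, parent_visits=100, c=c)
    print(f"c={c}: strong={strong:.3f}, novel={novel:.3f}")
# c=0.1: strong=0.730, novel=0.696  -> the higher-scored branch keeps growing
# c=0.3: strong=0.791, novel=0.888  -> the barely-visited branch wins
```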
Some possible future directions:
- Incorporating human feedback (e.g., gaze attention) to guide the graph towards relevant directions.
- Applying a similar approach to other brainstorming tasks (e.g., generating a game idea, story plot, etc.).
- Exploring the idea of creating multiple agents with different “interests” and allowing them to collaboratively expand the graph.