• I want to share some unusual things I created for fun over the weekend.

  • I built a system in which a large language model (LLM) reads several existing research papers and then generates a fictional follow-up study. By repeating this process and growing a directed acyclic graph (DAG) of papers, I tried to simulate how researchers read papers and come up with new studies.

  • The simulation uses an algorithm similar to Monte Carlo tree search (MCTS) with UCB-based selection. Each generated paper is assessed by another LLM acting as a reviewer, and the resulting score guides the search: promising directions are expanded and unpromising paths are cut off.

  • In the visualization, the top node (yellow) represents a real paper, while the other nodes are generated by the LLM.

    • image image image
  • The results are somewhat cherry-picked, so I am not fully confident in the conclusions. Still, I do see the reviewer's evaluation scores improving across nodes, which suggests that exploiting promising nodes is working.

    • image
  • It’s intriguing to observe how the growth patterns shift when we modify the definition of a ‘good paper’ provided to the LLMs.

  • I also notice changes in how the graph grows when I adjust c, the parameter in UCB that weights exploration against exploitation.

    • Giving more importance to exploration (c=0.3):
    • image
    • Giving more importance to exploitation (c=0.1):
    • image
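
For readers unfamiliar with UCB, here is a minimal sketch of how a UCB-style rule could choose which paper to expand next and why c changes the growth pattern. This is not my actual code: the `PaperNode` class, the `ucb` and `select_child` functions, the 0-to-1 score scale, and the example numbers are all assumptions made for illustration, and the sketch treats the graph as a tree for simplicity.

```python
import math
from dataclasses import dataclass, field


@dataclass
class PaperNode:
    """One paper in the graph (hypothetical structure for illustration)."""
    title: str
    total_score: float = 0.0  # sum of reviewer scores backed up to this node
    visits: int = 0           # how many times this node has been selected
    children: list["PaperNode"] = field(default_factory=list)

    @property
    def mean_score(self) -> float:
        # Average reviewer score; 0.0 for a node that was never visited.
        return self.total_score / self.visits if self.visits else 0.0


def ucb(node: PaperNode, parent_visits: int, c: float) -> float:
    """UCB1-style value: mean reviewer score plus an exploration bonus."""
    if node.visits == 0:
        return math.inf  # unvisited nodes are always tried first
    bonus = c * math.sqrt(math.log(parent_visits) / node.visits)
    return node.mean_score + bonus


def select_child(parent: PaperNode, c: float) -> PaperNode:
    """Pick the child with the highest UCB value under the given c."""
    return max(parent.children, key=lambda ch: ucb(ch, parent.visits, c))


# Assumed example: scores lie in [0, 1]; the numbers are made up.
established = PaperNode("established", total_score=10.5, visits=15)  # mean 0.70
newer = PaperNode("newer", total_score=3.2, visits=5)                # mean 0.64
root = PaperNode("seed paper", visits=20, children=[established, newer])

print(select_child(root, c=0.1).title)  # established (exploitation wins)
print(select_child(root, c=0.3).title)  # newer (exploration bonus wins)
```

With a small c the well-reviewed, frequently visited branch keeps getting expanded, while a larger c gives the less-visited branch enough bonus to be picked, which is consistent with the different growth patterns in the two images above.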

Some possible future directions:

  • Incorporating human feedback (e.g., gaze attention) to guide the graph towards relevant directions.
  • Applying a similar approach to other brainstorming tasks (e.g., generating a game idea, story plot, etc.).
  • Exploring the idea of creating multiple agents with different “interests” and allowing them to collaboratively expand the graph.
