HINTs: Sensemaking on Large Collections of Documents with Hypergraph Visualization and INTelligent Agents.

Sam Yu-Te Lee, Kwan-Liu Ma


Sensemaking on a large collection of documents (corpus) is a challenging task often found in fields such as market research, legal studies, intelligence analysis, political science, and computational linguistics. Previous work approaches this problem from either a topic- or entity-based perspective, but lacks interpretability and trust due to poor model alignment. In this paper, we present HINTs, a visual analytics approach that combines topic- and entity-based techniques seamlessly and integrates Large Language Models (LLMs) as both a general NLP task solver and an intelligent agent. By leveraging the extraction capability of LLMs in the data preparation stage, we model the corpus as a hypergraph that matches the user's mental model when making sense of the corpus. The constructed hypergraph is hierarchically organized with an agglomerative clustering algorithm that combines semantic and connectivity similarity. The system further integrates an LLM-based intelligent chatbot agent in the interface to facilitate sensemaking. To demonstrate the generalizability and effectiveness of the HINTs system, we present two case studies on different domains and a comparative user study. We report our insights on the behavior patterns and challenges when intelligent agents are used to facilitate sensemaking. We find that while intelligent agents can address many challenges in sensemaking, the visual hints that visualizations provide are necessary to address the new problems brought by intelligent agents. We discuss limitations and future work for combining interactive visualization and LLMs more profoundly to better support corpus analysis.
Text visualization, sensemaking, hypergraph, hierarchical clusters, corpus analysis, large language models
