Paper on New Data Model for Humanitext Published at PNC 2025

12/4/2025

PNC 2025 Conference

We are pleased to announce that the research results from the Humanitext project, presented at the 2025 Pacific Neighborhood Consortium Annual Conference (PNC) in Hanoi, Vietnam, have been officially published on IEEE Xplore.

The paper, titled “A Data Model for Integrating Annotations with Western Classical Texts for an AI Dialogue System,” represents a collaborative effort by Jun Ogawa (The University of Tokyo), Naoya Iwata (Nagoya University), and Ikko Tanaka (J. F. Oberlin University).

Background and Objectives

The Humanitext project has been developing “Humanitext Antiqua,” an AI dialogue generation system grounded in Western Classics. While the system uses Retrieval-Augmented Generation (RAG) to base its answers on primary texts, it has had difficulty incorporating background information and related context that the source material does not state directly.

In Classical Studies, “annotations” made by scholars over centuries are as vital as the primary texts themselves. However, integrating this extensive layer of interpretative data into the AI’s generation process has been a significant challenge.

A New Data Model

To address this limitation, the paper proposes a new schema that structures primary texts and their annotations as a Linked Data-based knowledge graph.

Structuring the material this way lets both search and text generation draw on information beyond what the primary text itself states. Because specific segments of a primary text are linked to their corresponding scholarly annotations within the knowledge graph, the system gains a foundation for retrieving that context and generating responses with greater contextual depth.
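As a rough illustration of the general idea, and not the schema proposed in the paper, the sketch below stores text segments and annotations as subject-predicate-object triples and looks up the annotations attached to a segment. The `ex:` prefix, every predicate name, and the sample records are hypothetical placeholders chosen for this example.

```python
from collections import defaultdict

# Hypothetical data: the "ex:" identifiers, predicate names, and sample
# texts are placeholders, not the vocabulary used in the paper.
Triple = tuple[str, str, str]

TRIPLES: list[Triple] = [
    ("ex:seg/1", "ex:partOf", "ex:work/sample"),
    ("ex:seg/1", "ex:text", "Sample passage one."),
    ("ex:seg/2", "ex:partOf", "ex:work/sample"),
    ("ex:seg/2", "ex:text", "Sample passage two."),
    ("ex:ann/1", "ex:annotates", "ex:seg/1"),
    ("ex:ann/1", "ex:body", "Placeholder note on passage one."),
    ("ex:ann/2", "ex:annotates", "ex:seg/1"),
    ("ex:ann/2", "ex:body", "Second placeholder note on passage one."),
]


def build_index(triples):
    """Group objects by subject and predicate for quick lookup."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        index[subject][predicate].append(obj)
    return index


def annotations_for(segment_id, triples):
    """Return the bodies of all annotations that target the given segment."""
    index = build_index(triples)
    notes = []
    for subject, predicate, obj in triples:
        if predicate == "ex:annotates" and obj == segment_id:
            notes.extend(index[subject]["ex:body"])
    return notes


def context_for(segment_id, triples):
    """Combine a passage with its annotations into one block of context."""
    index = build_index(triples)
    passage = " ".join(index[segment_id]["ex:text"])
    lines = [f"Passage: {passage}"]
    lines += [f"Note: {note}" for note in annotations_for(segment_id, triples)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(context_for("ex:seg/1", TRIPLES))
```

In a retrieval-augmented setup, a block like the one returned by `context_for` could be passed to the language model together with the retrieved passage, so that the annotations travel with the text they comment on. A production system would more likely use an RDF store and a query language such as SPARQL than an in-memory list.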

Development of a Prototype Viewing Tool

To demonstrate the utility of the constructed knowledge graph, the team also developed a prototype viewing tool. The application displays primary texts and their annotations side by side, providing a digital environment for the traditional scholarly practice of reading texts alongside commentaries.

This research is a foundational step toward integrating domain-specific knowledge graphs with AI generation, and it is expected to enhance the future capabilities of the Humanitext system.

The full paper is available via IEEE Xplore: