Humanitext Developer Lectures at Niigata University's "Generative AI Practice Exercise"
11/13/2025

Niigata University’s “Generative AI Practice Exercise” course gives students hands-on experience applying generative AI in a variety of fields.
The course’s sixth session, “Application in Specific Fields (Philosophy)”, featured a special lecture by Associate Professor Kazutaka Tanaka of J. F. Oberlin University, who leads the Humanitext project. His lecture was titled “Humanities Majors Should Be the Ones Mastering AI! -Building and Utilizing Text Databases-”. Across the two days of lectures, a total of 400 students did more than listen: they worked enthusiastically through the practical assignments presented.
📚 Applying Humanitext Insights to Education
The lecture first introduced the development of the “Humanitext” series (Antiqua, Aozora, etc.), an AI specialized for humanities research, particularly in Western Classics and Japanese literature.
The core of Humanitext lies in RAG (Retrieval-Augmented Generation) technology. This mechanism generates answers by referencing a curated database of primary texts, selected by experts, rather than relying solely on a general Large Language Model (LLM).
This approach allows the AI to accurately cite the sources for its answers, significantly reducing the risk of “hallucinations” (misinformation) in research and learning.
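The retrieve-then-cite flow described above can be illustrated with a minimal sketch. This is not Humanitext's implementation: the corpus entries, source labels, stopword list, and the word-overlap scoring are all placeholders chosen for illustration, and a real system would more likely use embedding-based search and pass the assembled prompt to an LLM.

```python
import re
from dataclasses import dataclass

# Tiny stopword list so that common words do not count as matches (assumption).
STOPWORDS = {"a", "about", "in", "is", "of", "on", "the", "what"}


@dataclass
class Passage:
    source: str  # citation label; the labels below are hypothetical
    text: str


# Placeholder corpus standing in for an expert-curated set of primary texts.
CORPUS = [
    Passage("Hypothetical Work A, section 1",
            "Placeholder passage discussing justice in the city."),
    Passage("Hypothetical Work B, section 4",
            "Placeholder passage discussing the education of guardians."),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its word tokens minus stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Return up to k passages that share words with the query, best match first."""
    q = tokenize(query)
    scored = [(len(q & tokenize(p.text)), p) for p in corpus]
    scored = [(score, p) for score, p in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(query: str, passages: list[Passage]) -> str:
    """Assemble a prompt telling the model to answer only from numbered sources."""
    if not passages:
        return f"No relevant sources were found for: {query}"
    sources = "\n".join(
        f"[{i}] {p.source}: {p.text}" for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"{sources}\n"
        f"Question: {query}"
    )


question = "What is said about justice in the city?"
prompt = build_prompt(question, retrieve(question, CORPUS))
```

Because each retrieved passage carries its own source label into the prompt, the model's answer can point back to a specific text, which is what makes citation possible.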
🎓 An Assignment in “Database Building” Using Expertise
In the second half of the lecture, drawing from the insights of Humanitext development, a unique, practical assignment was presented. It was designed to shift students’ perspectives from merely using AI to actively embedding expert knowledge into it.
The assignment asked students to use a simple RAG-building tool, such as Google’s “NotebookLM”, to design and construct a small-scale text database (chatbot) that draws on their own specialized field of study (domain knowledge). Students also had to consider the copyright issues involved in releasing it publicly.
The Goal of the Assignment
The purpose of this exercise is not simply to learn how to operate an AI tool. It is to gain hands-on experience with the fundamental challenges of AI development:
- Which texts should the AI read? (leveraging domain knowledge for data curation)
- Can this data be publicly released? (consideration for copyright)
- What information is missing for the stated purpose? (data evaluation)
- What information should the AI reference? (database design)
These questions directly address the critical “preprocessing” and “search design” that underpin AI systems.
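As a rough illustration, the questions above can be written down as a small manifest check before any text is loaded into a tool. The sketch below is hypothetical throughout: the `SourceText` fields, the titles, and the topic labels are invented for the example, and the `releasable` flag is assumed to record the outcome of a human copyright review rather than anything code can decide.

```python
from dataclasses import dataclass


@dataclass
class SourceText:
    title: str
    topics: set[str]   # subjects the text covers, assigned by the curator
    releasable: bool   # outcome of a human copyright review (assumption)


def audit(sources: list[SourceText], required_topics: set[str]) -> dict:
    """Summarize which texts can be published and which topics lack coverage."""
    public = [s for s in sources if s.releasable]
    covered = set().union(*(s.topics for s in public))
    return {
        "public_titles": [s.title for s in public],
        "held_back_titles": [s.title for s in sources if not s.releasable],
        "missing_topics": sorted(required_topics - covered),
    }


# Hypothetical two-text collection for a chatbot meant to cover two topics.
report = audit(
    [
        SourceText("Text A", {"ethics"}, releasable=True),
        SourceText("Text B", {"politics"}, releasable=False),
    ],
    required_topics={"ethics", "politics"},
)
```

Here the report shows that excluding "Text B" for copyright reasons leaves the "politics" topic uncovered, which is the kind of gap the data evaluation step is meant to surface.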
The lecture’s central point was that the real skill lies not in chasing the trends of rapidly evolving generative AI technologies, but in designing “clean databases” that hold enduring value independent of them. Specialized expertise (domain knowledge) embedded in such sustainable databases can be connected to and used by whatever AI technology is current. In showing this, the lecture conveyed what it really means to “master and utilize AI.”