Humanitext Aozora
Humanitext Aozora is a platform for exploring the Japanese digital library "Aozora Bunko" (over 17,000 works by approximately 1,000 authors) through a conversational AI. It draws on a database of works by literary giants such as Natsume Soseki and Akutagawa Ryunosuke, and uses Retrieval-Augmented Generation (RAG) so that every AI-generated answer cites its source. This allows users to explore thematic elements, character emotions, and the beauty of the Japanese language in a reliable environment suited to academic work. It is a research and reading tool for a new era, designed for everyone from experts and students to literary enthusiasts.
Highlights
- Explore over 17,000 works from more than 1,000 authors in a conversational format with AI
- Ensure academic reliability with source citations from Aozora Bunko (RAG)
- Utilize five unique output modes for everything from research to creative writing
- Leverage advanced filtering and exclusion functions by genre, author, and work
What is Humanitext Aozora?
Humanitext Aozora is a project dedicated to the deep exploration of the forest of modern Japanese literature, using the shared cultural heritage of “Aozora Bunko” as its stage and AI as its guide. The name “Aozora” signifies our respect for the monumental efforts of our predecessors who made this library possible.
The technical heart of this system is a hybrid search pipeline combining Dense search, which retrieves based on semantic similarity, and Sparse search, which captures keyword matches. A user’s question in Japanese is first translated by an AI into a context-aware English search query. This query is then used to rapidly search the Aozora Bunko database with both methods. Furthermore, an AI reranks the retrieved text fragments to select only the highest-quality contexts that best match the user’s intent.
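As a rough illustration of the retrieve-then-rerank step described above, the following Python sketch fuses a dense and a sparse ranking and then reranks the fused candidates. The text does not say how the two result lists are combined, so reciprocal rank fusion (RRF) is used here only as one common choice. The function names, the constant `k=60`, and the `rerank_score` callback (a stand-in for the AI reranker) are all assumptions for illustration, not the project's actual implementation.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of passage IDs into a single ranking.

    Each passage receives sum(1 / (k + rank)) over the lists it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            scores[passage_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))

def hybrid_search(dense_ranking, sparse_ranking, rerank_score, top_n=3):
    """Fuse dense and sparse results, then keep the top_n reranked passages."""
    fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
    reranked = sorted(fused, key=rerank_score, reverse=True)
    return reranked[:top_n]

# Hypothetical passage IDs returned by each retriever, best match first.
dense = ["p1", "p2", "p3"]
sparse = ["p3", "p1", "p4"]
# Hypothetical relevance scores from a reranking model.
relevance = {"p1": 0.2, "p2": 0.9, "p3": 0.7, "p4": 0.1}

best = hybrid_search(dense, sparse, relevance.get, top_n=2)
```

Passages found by both retrievers (here `p1` and `p3`) rise in the fused list, while the reranker has the final say on which contexts are passed to the language model.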
Because the Large Language Model (LLM) generates its response based only on these carefully selected contexts, it effectively suppresses “hallucinations”—inaccurate information generated from the LLM’s training data alone. This architecture ensures high reliability, with every answer grounded in a verifiable source.
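To make the grounding idea concrete, here is a minimal Python sketch of how selected contexts might be assembled into a prompt that restricts the model to those sources. The prompt wording, the `build_grounded_prompt` name, and the context fields (`author`, `title`, `text`) are hypothetical; the actual prompts used by Humanitext Aozora are not described in the text.

```python
def build_grounded_prompt(question, contexts):
    """Build a prompt that asks the model to answer only from numbered sources.

    `contexts` is a list of dicts with 'author', 'title', and 'text' keys
    (an assumed shape, chosen for this sketch).
    """
    if not contexts:
        raise ValueError("at least one retrieved context is required")
    lines = [
        "Answer using only the numbered sources below.",
        "Cite sources as [n]. If the sources do not contain the answer, say so.",
        "",
    ]
    for number, ctx in enumerate(contexts, start=1):
        lines.append(f"[{number}] {ctx['author']}, '{ctx['title']}': {ctx['text']}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)

prompt = build_grounded_prompt(
    "How does the narrator describe Sensei?",
    [{"author": "Natsume Soseki", "title": "Kokoro", "text": "(retrieved passage text)"}],
)
```

Numbering the sources in the prompt is what lets each statement in the answer be traced back to a specific passage.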
Purpose: To Traverse the Sea of Words and Touch the Souls of Authors
Modern Japanese literature is a world woven with beautiful Japanese, depicting the struggles and joys of people who lived through turbulent times and exploring universal themes that resonate today. Humanitext Aozora provides a compass for everyone to freely navigate this rich sea of words.
- For Professionals and Researchers: It dramatically accelerates tasks like stylistic analysis of a specific author or comparative research on how different authors handle a particular theme. It brings new perspectives and inspiration to research by uncovering unexpected connections between works.
- For Students: It allows for a deeper understanding of literary works by enabling students to ask the AI about archaic expressions and historical backgrounds. Questions like, “Explain the main character’s feelings at this moment,” or “What does this metaphor mean?” support their reading comprehension and understanding of the work’s content.
- For All Literature Lovers: It serves as a conversational partner for enjoying works from multiple angles, whether by searching for common themes in a favorite author’s different pieces or inquiring about the motivations behind a character’s actions. You can deepen your own interpretation of works through casual questions like, “Tell me about the female characters in Osamu Dazai’s writing.”
Core Functions and How to Use Them
While capable of profound analysis, Humanitext Aozora is designed for simple and intuitive operation. Anyone can instantly begin an intellectual dialogue with the great authors of Japan.
1. Select Genre, Author, and Work (Optional)
Flexibly define the scope of your conversation to enhance the precision of your search.
- Genre Filter: Based on the Nippon Decimal Classification (NDC), you can narrow down authors by genre, from broad categories like “Novels & Stories” and “Poetry” to more specific sub-classifications.
- Author Filter: Select multiple authors of interest from the displayed list.
- Work Filter: After selecting an author, you can press the “Further filter by work” button to narrow your search to specific works by that author.
- Exclusion Function: By checking “Exclude the selected authors (and works) from your search,” you can conduct advanced queries such as, “the concept of ‘self’ in works other than those by Natsume Soseki.”
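The filters above can be pictured as a single selection step over work metadata. The Python sketch below is one possible reading, assuming each work carries an author, a title, and an NDC code; the `Work` class, the `filter_works` function, and the sample records are hypothetical and only illustrate how an exclusion flag might invert the author and work selection while the genre filter still applies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Work:
    author: str
    title: str
    ndc: str  # NDC code, e.g. "913" (Japanese fiction) or "911" (Japanese poetry)

def filter_works(works, ndc_prefix=None, authors=None, titles=None, exclude=False):
    """Return the works that fall inside the chosen search scope.

    With exclude=True, works matching the author/title selection are dropped
    instead of kept. The genre (NDC prefix) filter is always applied as-is.
    """
    author_set = set(authors or [])
    title_set = set(titles or [])
    has_selection = bool(author_set or title_set)
    kept = []
    for work in works:
        if ndc_prefix and not work.ndc.startswith(ndc_prefix):
            continue
        matches = ((not author_set or work.author in author_set)
                   and (not title_set or work.title in title_set))
        # Keep matches in normal mode, non-matches in exclusion mode.
        if has_selection and matches == exclude:
            continue
        kept.append(work)
    return kept

catalog = [
    Work("Natsume Soseki", "Kokoro", "913"),
    Work("Natsume Soseki", "I Am a Cat", "913"),
    Work("Akutagawa Ryunosuke", "Rashomon", "913"),
    Work("Miyazawa Kenji", "Spring and Asura", "911"),
]

# "Works other than those by Natsume Soseki"
others = filter_works(catalog, authors=["Natsume Soseki"], exclude=True)
```

Here `others` contains only the Akutagawa and Miyazawa entries, which mirrors the exclusion query quoted above.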
2. Choose an Output Mode
The platform features five unique output modes to cater to a wide variety of needs, plus a No Context Mode that skips source retrieval.
- Q&A Mode: Best for getting concise answers to factual questions, along with precise source citations (e.g., (Natsume Soseki, 'Kokoro', Section 15)).
- Detailed Analysis Mode: Receive a multi-faceted interpretation, much like a literary scholar would provide, that compares various sources and supplements the analysis with historical context.
- Dialogue Mode: The AI adopts the persona (writing style, philosophy, tone) of a specified author (e.g., Akutagawa Ryunosuke) or character, allowing for an immersive conversational experience.
- Creative Writing Mode: The AI learns the style of a specified author and uses it to create new, original text. This is intended to support creative activities with prompts like, “What if Sakaguchi Ango wrote about modern-day Shibuya?”
- Custom Mode: This mode allows you to build your own optimal AI assistant. By fine-tuning settings such as the language of citations (original/translation), the level of detail in explanations, and the response style, you can elicit responses that perfectly match your research and learning style.
- No Context Mode: In this mode, the system does not perform RAG (source retrieval) and answers solely based on the LLM’s vast internal knowledge. This allows for more free-form discussions beyond the scope of Aozora Bunko, such as translating classical Japanese to modern Japanese. However, answers may contain inaccuracies as they are not source-grounded.
3. Enter Your Question
Type what you want to know or think about into the chat box in plain Japanese. The key to unexpected discoveries is not to stop at a single answer but to continue the conversation, using the AI’s responses as a springboard for further questions.
(Example Questions)
- What are the common themes in Akutagawa Ryunosuke's 'Rashomon' and 'The Nose'?
- Please provide specific examples of how "nature" is depicted in the works of Miyazawa Kenji.
- From the perspective of the cat in 'I Am a Cat', please satirically describe the society of that time.
4. Review the Answer and Its Sources
Except in “No Context Mode,” the contexts that formed the basis of the AI’s answer are displayed in the “Sources” panel. They are listed with their relevance score, as judged by the AI, like [Work ID Author Name ‘Work Title’ / Score: 0.85]. By expanding the details (▼), you can view the full text of the cited passage and find a direct link to Aozora Bunko.
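As a small illustration of the source listing described above, the Python sketch below orders retrieved contexts by relevance score and renders each one in the bracketed form shown. The field names (`work_id`, `author`, `title`, `score`), the `format_sources` function, and the sample IDs and scores are hypothetical; straight quotes are used around the title for simplicity.

```python
def format_sources(contexts):
    """Render source entries, highest relevance score first."""
    ordered = sorted(contexts, key=lambda c: c["score"], reverse=True)
    return [
        f"[{c['work_id']} {c['author']} '{c['title']}' / Score: {c['score']:.2f}]"
        for c in ordered
    ]

# Hypothetical retrieved contexts with made-up IDs and scores.
entries = format_sources([
    {"work_id": "W001", "author": "Akutagawa Ryunosuke", "title": "Rashomon", "score": 0.62},
    {"work_id": "W002", "author": "Natsume Soseki", "title": "Kokoro", "score": 0.85},
])
```

The first element of `entries` is then the `Kokoro` passage, since it carries the higher score.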
This back-and-forth movement—between the “dialogue with AI” and the “return to the source”—is the new form of learning and research that Humanitext Aozora proposes.
Embark on a journey to discover new facets of modern Japanese literature with Humanitext Aozora.