RG-KCR

The pipeline of the proposed restoration-guided Kuzushiji character recognition (RG-KCR) framework consists of three stages: Kuzushiji character detection (Stage 1), Kuzushiji document restoration (Stage 2), and Kuzushiji character classification (Stage 3). A training-free document restoration algorithm is introduced in Stage 2 to mitigate seal interference prior to character classification in Stage 3.
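
The three-stage flow described above can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the names `Recognized`, `run_rg_kcr`, `detect`, `restore`, and `classify` are hypothetical, and the sketch assumes that restoration is applied to the whole page and that the boxes from Stage 1 are reused to locate characters on the restored page.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Recognized:
    box: Box   # where the character was detected (Stage 1)
    char: str  # modern character predicted for that box (Stage 3)


def run_rg_kcr(
    image: Any,
    detect: Callable[[Any], List[Box]],
    restore: Callable[[Any], Any],
    classify: Callable[[Any, Box], str],
) -> List[Recognized]:
    """Chain the three stages; each stage is passed in as a callable."""
    boxes = detect(image)      # Stage 1: character detection
    restored = restore(image)  # Stage 2: training-free document restoration
    # Stage 3: classify each detected character on the restored page.
    return [Recognized(box, classify(restored, box)) for box in boxes]
```

Passing the stages in as callables keeps the sketch independent of any particular detector, restoration routine, or classifier.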

Abstract

Kuzushiji was one of the most popular writing styles in pre-modern Japan and was widely used in both personal letters and official documents. Because of its highly cursive forms and extensive glyph variations, however, most modern Japanese readers cannot read Kuzushiji characters directly. Recent research has therefore focused on automated Kuzushiji character recognition, and existing methods achieve satisfactory performance on relatively clean Kuzushiji document images. These methods struggle to maintain recognition accuracy under seal interference (e.g., when seals overlap characters), even though seals occur frequently in pre-modern Japanese documents. To address this challenge, we propose a three-stage restoration-guided Kuzushiji character recognition (RG-KCR) framework specifically designed to mitigate seal interference. We also construct datasets for evaluating Kuzushiji character detection (Stage 1) and classification (Stage 3). Experimental results show that the YOLOv12-medium detector [Tian et al. 2025] achieves a precision of 98.0% and a recall of 93.3% on the constructed test set. We quantitatively evaluate the restoration quality of Stage 2 using PSNR and SSIM. In addition, an ablation study shows that Stage 2 improves the Top-1 accuracy of Metom, a Vision Transformer (ViT)-based Kuzushiji classifier used in Stage 3, from 93.45% to 95.33%.
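
For readers unfamiliar with the metrics named in the abstract, the sketch below gives the standard definitions of precision, recall, and PSNR. It is not the authors' evaluation code: the function names are hypothetical, images are assumed to be 8-bit grayscale values stored as nested lists, and SSIM is omitted because it needs a windowed computation.

```python
import math
from typing import Sequence


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted boxes that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of ground-truth boxes that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def psnr(
    reference: Sequence[Sequence[float]],
    restored: Sequence[Sequence[float]],
    max_value: float = 255.0,
) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    ref = [p for row in reference for p in row]
    out = [p for row in restored for p in row]
    if not ref or len(ref) != len(out):
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(ref, out)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the restored page is closer to the reference page, and identical images give an infinite PSNR.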

Motivation

To the best of our knowledge, existing Kuzushiji character recognition systems have not explicitly addressed seal interference, even though seals frequently overlap characters and substantially degrade recognition performance. We present example recognition results produced by different methods under seal interference. The left column shows the input Kuzushiji characters “尚書堂梓”, and the three columns to its right show the corresponding outputs of three methods: Komonjo Camera (Fuminoha) [TOPPAN Inc. 2023], NDLkotenOCR-Lite [Aoike et al. 2024], and Metom [Imajuku et al. 2024]. The first row uses the raw document as input, and the second row shows the recognition results after applying our proposed document restoration method.

Dataset

(1) Data Correction

The raw Kuzushiji document images are obtained from the dataset provided by the Center for Open Data in the Humanities (CODH), with source materials preserved by the National Institute of Japanese Literature (NIJL) [NIJL 2016]. We select 13 representative documents and exclude pages that contain no Kuzushiji characters, resulting in a final dataset of 1,000 images. During annotation review, we found that 267 of these images contain incomplete labels. We manually correct them by adding the missing character bounding boxes. We present representative examples of incomplete annotations in the raw dataset, where green boxes indicate the original annotations and red boxes denote the ones we added manually.
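
One way to fold manually added boxes into an existing annotation file is sketched below. This is an assumption about tooling, not the procedure used for the dataset: `iou`, `merge_annotations`, and the `dup_iou` threshold are hypothetical, and boxes are assumed to be `(x_min, y_min, x_max, y_max)` tuples. The intersection-over-union check only guards against adding a box that nearly duplicates one already present.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_annotations(
    original: List[Box], added: List[Box], dup_iou: float = 0.5
) -> List[Box]:
    """Append manually added boxes, skipping near-duplicates of kept boxes."""
    merged = list(original)
    for box in added:
        if all(iou(box, kept) < dup_iou for kept in merged):
            merged.append(box)
    return merged
```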

(2) Synthetic Data Generation

We compare raw (real) Kuzushiji documents with synthetic Kuzushiji documents. The seals in the first row are naturally present in the original documents, whereas those in the second row are synthetically added.
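
The page does not describe how the synthetic seals are composited, so the following is only a plausible sketch of the idea: a binary seal mask is alpha-blended onto the page in a seal-like red. The function `overlay_seal`, its parameters, the default color, and the default opacity are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # RGB, 0-255 per channel


def overlay_seal(
    page: List[List[Pixel]],
    seal_mask: Sequence[Sequence[int]],
    top: int,
    left: int,
    seal_rgb: Pixel = (200, 30, 30),  # assumed seal color
    alpha: float = 0.6,               # assumed seal opacity
) -> List[List[Pixel]]:
    """Return a copy of `page` with the seal blended in at (top, left)."""
    out = [row[:] for row in page]  # copy rows so the input page is untouched
    for dy, mask_row in enumerate(seal_mask):
        for dx, on in enumerate(mask_row):
            y, x = top + dy, left + dx
            inside = 0 <= y < len(out) and 0 <= x < len(out[y])
            if on and inside:
                # Per-channel blend: (1 - alpha) * page + alpha * seal.
                out[y][x] = tuple(
                    round((1 - alpha) * p + alpha * s)
                    for p, s in zip(out[y][x], seal_rgb)
                )
    return out
```

Because the blend keeps part of the underlying ink, a seal placed over a character leaves the strokes partially visible, which is the overlap case the restoration stage targets.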

Experiments

(3) Final Visual Output

We present Kuzushiji document images and their corresponding final outputs produced by the proposed RG-KCR framework. For visualization, green bounding boxes and the associated modern Japanese characters are overlaid on the documents with a uniform font size of 64 pixels to enhance readability. After processing by the RG-KCR framework, readers can intuitively interpret the content of the original Kuzushiji documents.

References

[TOPPAN Inc. 2023] TOPPAN Inc.: ふみのは (Fuminoha): 古文書カメラ (Komonjo Camera) (2023), https://camera.fuminoha.jp.

[Aoike et al. 2024] Aoike, T.: Development of ndlkotenocr-lite, a lightweight OCR that runs at high speed in a CPU environment. In: IPSJ SIG Computers and the Humanities Symposium. pp. 181–186 (2024).

[Imajuku et al. 2024] Imajuku, Y., Clanuwat, T.: Metom (2024), https://huggingface.co/SakanaAI/Metom.

[NIJL 2016] National Institute of Japanese Literature (国文学研究資料館): Japanese Classical Cursive Script Kuzushiji Dataset (日本古典籍くずし字データセット) (2016), https://doi.org/10.20676/00000340.

[Tian et al. 2025] Tian, Y., Ye, Q., Doermann, D.: YOLOv12: Attention-centric real-time object detectors. In: Advances in Neural Information Processing Systems (2025).

BibTeX

If you find our paper useful in your research, please consider citing:

@article{ju2026rgkcr,
  title={Restoration-Guided Kuzushiji Character Recognition Framework under Seal Interference},
  author={Ju, Rui-Yang and Yamashita, Kohei and Kameko, Hirotaka and Mori, Shinsuke},
  journal={arXiv preprint arXiv:2602.19086},
  year={2026}
}

The following is the citation of the original Kuzushiji dataset; please cite it when using our constructed dataset:

『日本古典籍くずし字データセット』（国文研所蔵／CODH加工） doi:10.20676/00000340
