The Tripitaka Koreana Knowledgebase Project : Project Summary

01 Project Summary

■ Project Title

The Tripitaka Koreana Knowledgebase Project

■ Purpose of the Project
Background of the Project
  • - Because the Tripitaka Koreana was systematically computerized earlier than comparable material in other fields, its digital edition already contributes greatly to research and helps scholars overcome the difficulty of a text written in ancient Chinese characters. Digitization has brought epochal change to scripture research, and scholars in this field depend on computerized information more than those in other fields. Given the massive amount of related information, research is difficult to carry out without it.
  • - From 2001 until today, Haein-sa Temple has been the site of the continuous collection of accurate information about the Tripitaka Koreana, and this effort has helped to preserve the data scientifically and systematically. Digitizing the images and basic information of the Tripitaka Koreana is aimed at permanent preservation. The Research Institute of Tripitaka Koreana is therefore making every effort to collect all available information and to make it widely available as a web-based resource, in line with the needs of all those who are interested in its use.
  • - The Research Institute of Tripitaka Koreana has not only digitized the more than 55,000,000 characters of the Tripitaka Koreana, but has also carried out numerous supplemental projects considered crucial to Tripitaka Koreana research, including the addition of notational mark-ups, the publication of the Dictionary of the Chinese Variant Characters, the collection of various sutras from all over the world, and a standardized comparison between the Tripitaka Koreana and Japan’s Taisho Shinshu Tripitaka. In addition, the Institute has developed further basic research aids, including the Tripitaka Koreana index, a Buddhist vocabulary dictionary, an electronic dictionary, critical essays about Korea-Buddhism relations, and various other information.
  • - Though the Research Institute of Tripitaka Koreana has provided all this material through CD-ROM, the internet, books, and various other media, the massive breadth and scope of the information has made it difficult to provide these resources efficiently and productively. As computers and the internet have advanced and scholars’ requests have become more sophisticated, there has been a growing need to provide this vast information effectively through an upgrade of the current system.
Purpose of the Project
  • - To propagate the Buddha's light of wisdom and contribute to the development of humankind by researching the Tripitaka Koreana and providing for its perpetual preservation.
  • - The Tripitaka Koreana is not simply a scriptural repository, but a treasure trove of historical data. It contains records on the Mongolian invasion of Korea and other aspects of international political history, critical data on the domestic political, economic, and environmental state of affairs, and information about early Korean arts, sciences, and cultural customs. These rare historical materials should be examined not only from philosophical, literary, and Buddhist points of view, but also from the perspectives of political science, economics, history, cultural studies, anthropology, and other fields.
  • - Since the Tripitaka Koreana is not only for Buddhist scholars, a system is needed that can provide its information effectively to scholars in all fields, so that the information is accessible and useful to all.
  • - Accordingly, by simultaneously preserving and digitizing the Tripitaka Koreana, we aim to reconstruct its original text.
  • - Previously, computerization was done piecemeal, as the need arose. However, by linking the images of the Tripitaka Koreana’s text with its woodblock text, we seek to construct a web service system that also computerizes the Dictionary of Chinese Variant Characters, the use of its index, a vertical comparison between the Tripitaka Koreana and Japan’s Taisho Shinshu Tripitaka, and more.
Project Necessities
  • - To develop techniques for: choosing suitable tree species for wood engraving, engraving vast amounts of text in a standardized format, preserving the wood over time, making inks and papers, printing, binding large numbers of books, and more.
  • - The systematically collected and organized wooden engravings in Haein-sa represent an accumulation of information about basic scriptures, dictionaries, and commentary books necessary to the study of Buddhism. Therefore, Haein-sa monastery needs a complex system for organizing a library where this information can be catalogued and preserved.
  • - The information in the Tripitaka Koreana has extremely high utility and preservation value, and thus needs to be shared.
  • - To facilitate the preservation of the woodblocks, we need to computerize their images so that damage, for example from dust or warping, can be identified quickly and remedied.
  • - To dramatically improve the preservation environment of the Tripitaka Koreana, we must collect all pertinent information about the images of the wood, previous examples, and research results by scholars and experts.
  • - The Research Institute of Tripitaka Koreana has collected vast amounts of digitized information through 10 years of computerization. We now need to accumulate experience in cooperation and communication by providing this information to scholars in various fields.
  • - Therefore, if we can build a system where this information and knowledge can be effectively organized and distributed, we can create the circumstances through which research on the Tripitaka Koreana can be done easily, at a very low expense, and in a very short period of time.
■ Project Goals
Project Summary
  • - By digitizing the Tripitaka Koreana, which is a national treasure as well as a UNESCO World Cultural Heritage, we can revitalize research on domestic and international wood prints and old books, and contribute to greater pride in traditional cultures and national identity.
  • - We are trying to create new interpretations of the Tripitaka Koreana by digitizing and connecting the wood prints, original images, and text all in one space.
  • - Based on the database constructed through this process, we will try to increase the quality of the system so that both experts and ordinary citizens will be able to freely access this database and use the original text of the Tripitaka Koreana effectively.
  • - The newly created database service will use the latest information technology and will grow out of the old system, which simply provided woodblock images. We will use an image streaming system so that we can provide contents to users quickly, clearly, and accurately.
Main details of the Project
  • - Through this project, we will digitize images of the Tripitaka Koreana’s wooden engravings, turn the images into digital text, index the contents, and make them accessible together with the Tripitaka Koreana's digital input texts.
  • - The Tripitaka Koreana’s Unicode text version and Variant Characters version can be automatically interlocked.
  • - By including the Dictionary of Variant Chinese Characters of Tripitaka Koreana, a direct comparison between the Unicode text version and Variant Characters version is possible.
  • - Vertical comparison access will be possible between the Tripitaka Koreana and the Taisho Shinshu Tripitaka by indexing the collated data of the Tripitaka Koreana.
  • - A detailed index of the Tripitaka Koreana will be produced, allowing for the Tripitaka Koreana to be hyperlinked from the index.
  • - The database's table of contents and searched meta information will be producible in both XML and HTML form.
  • - Collected information will be provided through the Tripitaka Koreana Knowledgebase web site.
  • - Original images will be available in high-resolution color so that people can have a genuine sense of the real texture of the original text.
  • - Original images will be provided through a high-speed image streaming system.
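
The interlocking of the Unicode and Variant Characters versions, and the dual XML/HTML output of table-of-contents entries, can be pictured with a minimal sketch. Everything named here is an assumption made for illustration: the two-entry mapping table stands in for the Dictionary of Variant Chinese Characters, the choice of which form counts as "standard" is arbitrary, and the entry fields (`id`, `title`, `href`) are hypothetical rather than the project's actual schema.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical variant -> standard mapping (illustrative only; the real
# table would come from the Dictionary of Variant Chinese Characters).
VARIANT_TO_STANDARD = {
    "\u70ba": "\u7232",  # U+70BA -> U+7232
    "\u771f": "\u771e",  # U+771F -> U+771E
}


def to_standard(text):
    """Replace every mapped variant character; leave all others unchanged."""
    return "".join(VARIANT_TO_STANDARD.get(ch, ch) for ch in text)


def interlock(variant_text):
    """List (position, variant char, standard char) wherever the two versions differ."""
    standard_text = to_standard(variant_text)
    return [
        (i, v, s)
        for i, (v, s) in enumerate(zip(variant_text, standard_text))
        if v != s
    ]


def entry_to_xml(entry):
    """Serialize one table-of-contents entry (hypothetical fields) as XML."""
    elem = ET.Element("entry", {"id": entry["id"]})
    ET.SubElement(elem, "title").text = entry["title"]
    ET.SubElement(elem, "href").text = entry["href"]
    return ET.tostring(elem, encoding="unicode")


def entry_to_html(entry):
    """Render the same entry as an HTML list item with a hyperlink."""
    return '<li><a href="{}">{}</a></li>'.format(
        html.escape(entry["href"], quote=True),
        html.escape(entry["title"]),
    )


if __name__ == "__main__":
    sample = "\u70ba\u4eba"  # first character has a mapped variant, second does not
    print(interlock(sample))
    entry = {"id": "example-001", "title": "Example Sutra", "href": "/text/example-001"}
    print(entry_to_xml(entry))
    print(entry_to_html(entry))
```

Because the mapping here is one character to one character, positions in the two versions stay aligned, which is what makes a side-by-side comparison straightforward; a real variant table may need richer alignment than this sketch provides.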
Yearly Plan
Project | Own project | The Tripitaka Koreana Knowledgebase Project | File format
2004 2005 2006 2007 2008 2009
The Tripitaka Koreana Knowledgebase system | Multi Web service solution finished. | United DB Web service system
The Tripitaka Koreana wood-print (162,516 pages) | 64,195 pages | 49,161 pages | 49,161 pages | text
Unicode version digital texts of Tripitaka Koreana (162,516 pages) | 162,516 pages (Unicode-text) | 64,195 pages | 49,161 pages | 49,161 pages | text (XML)
The notational mark data of the photocopied version of the Tripitaka Koreana (13,000,000 letters) | 13,000,000 letters | 5,135,094 letters (64,195 pages) | 3,932,453 letters (49,161 pages) | 3,932,453 letters (49,161 pages) | integrated with the Standard version Text (XML)
The Tripitaka Koreana images (162,516 pages) | 70,000 pages | 54,172 pages | 54,172 pages | 54,172 pages | image (XML-ization for interlocking)
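
As a quick arithmetic check on the Yearly Plan, the sketch below sums the per-period figures for each item and compares them with the stated totals. The figures are copied from the plan; reading the first figure of each item as its overall total and the three following figures as per-period amounts is an assumption, as is leaving the 70,000-page image figure out of the image sum.

```python
# Stated total and per-period figures for each item in the Yearly Plan.
# Assumption: the three per-period figures are meant to add up to the total.
plan = {
    "wood-print pages": (162_516, [64_195, 49_161, 49_161]),
    "Unicode text pages": (162_516, [64_195, 49_161, 49_161]),
    "notational mark letters": (13_000_000, [5_135_094, 3_932_453, 3_932_453]),
    "image pages": (162_516, [54_172, 54_172, 54_172]),
}


def differences(items):
    """Return, per item, the sum of the per-period figures minus the stated total."""
    return {name: sum(parts) - total for name, (total, parts) in items.items()}


if __name__ == "__main__":
    for name, diff in differences(plan).items():
        print(f"{name}: {diff:+d}")
```

Under this reading, the letter and image figures match their totals exactly, while the page figures for the wood-print and Unicode text items add up to 162,517, one more than the stated 162,516, which suggests a rounding adjustment in one of the per-period figures.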