Development of New Digital Library Applications in the Context of a Basic Ontology for Biosystematics Information Using the Literature of Entomology (Ants)

  • Contact:

    G. Sautter

  • Project Group:

    Systeme der Informationsverwaltung

Description

The GoldenGATE Document Markup & Retrieval System / plazi.org

With the literature on ants as a use case, this project lays the theoretical and practical groundwork for large-scale digitization and markup of biosystematical documents, and for making these documents available online through a semantics-aware search portal. The technologies developed in this project, however, are in large part readily applicable to documents from other domains as well.
In particular, sophisticated markup tools are the major product of this project. The tools are constantly being refined on the basis of users' practical working experience, so that they support users ever better in marking up digitized documents. Most of this experience comes from the markup of the Madagascar corpus, a corpus of 2,500 pages of literature on the ant fauna of Madagascar.
For storing the documents throughout the markup process and for making them available online, we have created a server that backs the markup work as a centralized data store. As soon as the markup of a document has reached a certain degree of completeness, the server makes the document's individual parts (biosystematical treatments) available online through a portal offering semantic search functionality based on the markup. Besides this search portal, there are several XML-based interfaces that allow other applications to retrieve the documents. The entire server is highly modular and flexible, so new features integrate seamlessly, allowing for new ways of making the documents available to ever more people and applications.
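
As a rough illustration of how search based on markup differs from plain full-text search, the following minimal Python sketch filters treatments by a marked-up attribute rather than by matching words in the text. Everything in it is an assumption made for illustration: the element names (treatment, taxonName), the genus attribute, the sample data, and the function find_treatments are hypothetical and do not reflect the schema or interfaces actually used by the server.

```python
import xml.etree.ElementTree as ET

# Invented sample data. Element and attribute names are hypothetical,
# not the project's actual markup schema.
SAMPLE = """
<document>
  <treatment>
    <taxonName genus="Tetramorium">Tetramorium sp. A</taxonName>
    <text>Workers collected together with Pheidole.</text>
  </treatment>
  <treatment>
    <taxonName genus="Pheidole">Pheidole sp. B</taxonName>
    <text>Description of the worker caste.</text>
  </treatment>
</document>
"""


def find_treatments(xml_text, genus):
    """Return the taxon names of all treatments whose marked-up genus matches."""
    root = ET.fromstring(xml_text)
    hits = []
    for treatment in root.iter("treatment"):
        for name in treatment.iter("taxonName"):
            # Match on the markup attribute, not on words in the running text.
            if name.get("genus") == genus:
                hits.append(name.text)
                break  # count each treatment at most once
    return hits


if __name__ == "__main__":
    print(find_treatments(SAMPLE, "Pheidole"))
```

In this sketch, a query for the genus Pheidole returns only the second treatment, even though the word "Pheidole" also occurs in the running text of the first. A plain keyword search would return both.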