Monday, November 25, 2019

OpenText 2.0: A Stratified Annotation for Multi-Layer Searching

I attended an interesting presentation at the 2019 AAR/SBL by Ryder Wishart, Francis Pang, and Christopher Land. They described an upcoming release of OpenText 2.0. OpenText has been a long-running project (1998?) which I have previously commented upon. Its self-description is:
The OpenText.org project is a web-based initiative to develop annotated Greek texts and tools for their analysis. The project aims both to serve, and to collaborate with, the scholarly community. Texts are annotated with various levels of linguistic information, such as text-critical, grammatical, semantic and discourse features.  
Here is their summary of the upcoming update:
The upcoming OpenText 2.0 analysis of the GNT is an open annotation derived from data released by the Global Bible Initiative 2016. In addition to various minor modifications to the GBI syntax model, OpenText 2.0 introduces a stratified model that includes explicit distinctions between graphological, morphological, lexico-grammatical, semantic, and discourse-level markup. It also introduces feature annotations beyond just morphological parsing, allowing other units to be queried for meaningful features that have been identified in advance (to give a simple example, the clausal analysis explicitly identifies intransitive and transitive clauses). All of this data, however, is encoded in a single XML document using in-line markup, so that it is intuitive to query using standard XQuery—and even easier with the custom query resources that we are developing. In this presentation, we will present a series of queries that demonstrate the richness of the stratified data, focusing specifically on investigating phenomena that cut across the different annotation layers. We will also show how this markup is useful for teaching Greek by demonstrating a simple web page that allows students to investigate the syntax and semantics of a specific Greek lexeme. For both examples, we will show how the feature annotations facilitate the display of meaningful quantitative information about the relevant search results.
There have been some delays, so they were not able to show the new tool fully in action, but the graphic above gives an example of what it will look like. The color bands provide a way to visualize the various semantic elements in the sentence. Because the annotation uses well-defined XML markup, it allows for advanced semantic searches. One example they gave was how OpenText 2.0 makes it possible to discern different types of narrative, e.g., the parables of Jesus are distinguishable from their surrounding context.
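To get a feel for what querying in-line markup across layers might involve, here is a minimal sketch in Python rather than XQuery. The element and attribute names (`clause`, `word`, `transitivity`, `lemma`) and the sample data are my own invention for illustration; they are not the actual OpenText 2.0 schema, which was not shown in detail. The sketch finds every clause containing a given lexeme and then tallies those clauses by a clause-level feature.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical in-line markup. Element names, attribute names, and the
# feature values are assumptions, NOT the real OpenText 2.0 encoding.
SAMPLE = """
<text>
  <clause id="c1" transitivity="transitive">
    <word lemma="ἀγαπάω" pos="verb">ἠγάπησεν</word>
    <word lemma="θεός" pos="noun">θεὸς</word>
    <word lemma="κόσμος" pos="noun">κόσμον</word>
  </clause>
  <clause id="c2" transitivity="intransitive">
    <word lemma="ἔρχομαι" pos="verb">ἦλθεν</word>
  </clause>
  <clause id="c3" transitivity="transitive">
    <word lemma="ἀγαπάω" pos="verb">ἀγαπᾷ</word>
    <word lemma="πατήρ" pos="noun">πατὴρ</word>
    <word lemma="υἱός" pos="noun">υἱόν</word>
  </clause>
</text>
"""


def clauses_with_lemma(root, lemma):
    """Return the clause elements that contain a word with this lemma."""
    return [
        clause
        for clause in root.iter("clause")
        if any(word.get("lemma") == lemma for word in clause.iter("word"))
    ]


def transitivity_counts(clauses):
    """Tally clauses by their (hypothetical) transitivity feature."""
    return Counter(clause.get("transitivity") for clause in clauses)


root = ET.fromstring(SAMPLE.strip())
hits = clauses_with_lemma(root, "ἀγαπάω")
print([clause.get("id") for clause in hits])
print(transitivity_counts(hits))
```

The point of the sketch is only that a lexical feature on a word and a feature on its enclosing clause can be combined in one pass because both live in the same document. Presumably the real data and the custom query resources will make this kind of search much richer.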
I'm looking forward to their work becoming available. (Perhaps January 2020, they said?)
