Text-Fabric: handling Biblical data with IKEA logistics
The BHSA (Biblia Hebraica Stuttgartensia Amstelodamensis) is the BHS text plus the linguistic annotations of the Eep Talstra Centre for Bible and Computer.
The BHSA is available as a dataset in Text-Fabric format. Text-Fabric is a minimalistic model for representing text: it provides addresses for all textual objects, so that arbitrary information can be added at any textual level, precisely and firmly anchored. A Text-Fabric resource resembles an IKEA warehouse: the parts are neatly separated and stacked, so that they can be retrieved easily and combined into meaningful output later on. As a consequence, different teams with divergent purposes can still add to the same body of work with a minimum of interference or duplication. Text-Fabric has helped with various kinds of data construction work, of which the most visible is the website SHEBANQ. We focus on two recent data combination jobs: (A) treebanks derived from the BHSA data and (B) a detailed comparison of the morphology in the BHSA with the morphology of the Open Scriptures effort (OSM). Because the OSM is not yet finished, the comparison has been made repeatable, so that it can be rerun as the OSM progresses.
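The addressing idea described above can be illustrated with a small sketch. It is a toy model only, not the Text-Fabric library or its file format: the node numbers, the transliterated words, the feature names, and the helpers `add_feature` and `text_of` are all hypothetical and chosen for illustration. It shows textual objects addressed by integers, with each kind of annotation kept as a separate mapping from address to value, so that a second team can add a feature without touching existing ones.

```python
# Toy illustration of the "warehouse" idea; NOT the Text-Fabric API or data format.
# All names and values below are made up for this sketch.

# The smallest textual objects (here: words) get consecutive integer addresses.
slots = {1: "bereshit", 2: "bara", 3: "elohim"}

# Larger objects get the next free addresses and record which slots they cover.
node_type = {1: "word", 2: "word", 3: "word", 4: "clause"}
covers = {4: [1, 2, 3]}

# Each feature is stored separately as a mapping: address -> value.
features = {
    "part_of_speech": {1: "noun", 2: "verb", 3: "noun"},  # hypothetical team A
}


def add_feature(name, values):
    """Add a new feature without modifying any existing feature."""
    unknown = set(values) - set(node_type)
    if unknown:
        raise KeyError(f"unknown addresses: {sorted(unknown)}")
    if name in features:
        raise ValueError(f"feature {name!r} already exists")
    features[name] = dict(values)


def text_of(node):
    """Retrieve the words under an address and combine them into output."""
    slot_ids = covers.get(node, [node])
    return " ".join(slots[s] for s in slot_ids)


# A hypothetical team B anchors its own annotation at the clause level.
add_feature("clause_kind", {4: "verbal"})

print(text_of(4))
print(features["clause_kind"][4])
```

Because every annotation is anchored to an address rather than embedded in the text, the two hypothetical teams never edit the same mapping, which is the property the warehouse analogy points to.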