
Noscemus: Transkribus Model "Noscemus GM v1" Released

Tuesday, January 7, 2020  

The Transkribus model “Noscemus GM v1” has been released by Stefan Zathammer as part of the Innsbruck-based project NOSCEMUS (Nova Scientia: Early Modern Scientific Literature and Latin). The model reads texts set in Antiqua-based typefaces from the 16th, 17th and 18th centuries with high accuracy, outperforming most standard OCR engines. Although it is tailored to transcribing (Neo-)Latin texts, it also gives convincing results for other languages such as French, Italian or English. The Noscemus model can therefore help not only Neo-Latinists but all kinds of researchers working with large text corpora from the Early Modern Period.

The model is based on training data drawn from the project's Digital Sourcebook, currently (December 2019) about 1,000 fully corrected pages. To give users maximum freedom, standardization in the transcription process was kept to a minimum. Normalizations were made only in the following cases: ligatures (e.g. ae, oe, ct, ff) and abbreviations (e.g. -que, -us, -tur, …mm…) were expanded, long s (ſ) was transcribed as a normal s, and small caps were transcribed as majuscules.
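For readers who want to compare the model's output against an existing diplomatic transcription, the conventions above can be approximated with a small normalization step. The following Python sketch is illustrative only and is not the project's own tooling: the mapping table `NORMALIZATION` and the function `normalize` are hypothetical names, and the sketch covers only ligatures that have their own Unicode code points, plus the long s. The ct ligature, the abbreviations and small caps would need additional, source-specific rules.

```python
# Illustrative sketch, not the NOSCEMUS project's actual code.
# Maps precomposed ligatures and the long s to the expanded forms
# described in the transcription conventions.
NORMALIZATION = {
    "\u00e6": "ae",  # æ
    "\u00c6": "AE",  # Æ
    "\u0153": "oe",  # œ
    "\u0152": "OE",  # Œ
    "\ufb00": "ff",  # ﬀ ligature
    "\u017f": "s",   # ſ (long s)
}

# str.maketrans accepts single-character keys mapped to replacement strings.
_TABLE = str.maketrans(NORMALIZATION)


def normalize(text: str) -> str:
    """Expand the ligatures listed above and replace long s with s."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(normalize("c\u00e6le\u017ftis"))  # input is "cæleſtis"
```

Applying the same function to a reference transcription before comparing it with the model's output avoids counting these deliberate normalizations as recognition errors.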

There are still a few known issues. Some inconsistencies remain in the transcription of quotation marks. The error rate for Greek words or passages is still high; to a lesser degree, the same applies to words set in (German) Fraktur.
