Reviewing Sorting Phase Data: Visual Characteristics
To celebrate our volunteers’ hard work and review the data produced in the sorting phase, we’re sharing a series of blog posts that answer some questions about this project. Part 1 reviews the question of whether a subject was Hebrew or Arabic script. Part 2 reviews the question of whether a subject was written in formal or informal script. This part reviews the presence of various visual characteristics. Part 4 reviews the classification tags from the talk boards.
What characteristics did volunteers identify, and why are these characteristics important?
At the start of the project, we asked volunteers to use the point tools to identify the following visual characteristics on a given fragment:
- Diagonal and/or Perpendicular Text in the Margin. This typically indicates a text reflecting everyday life, such as a letter.
- Seals (only if the volunteer had sorted the fragment as Arabic). Seals were created by the impressions from signet rings and were used on official documents. A seal typically indicates an official state document.
- Horizontal Line Above Word (only if the volunteer had sorted the fragment as Hebrew). This indicates the fragment likely contains a literary text.
- Use of a Colon in the Text (only if the volunteer had sorted the fragment as Hebrew). The colon “:” symbol written on these fragments demarcates the end of a line or verse, suggesting this fragment may have been part of a prayer book or poetry.
In June 2018, we stopped asking volunteers to identify a horizontal line above a word. We decided this was too specific for our sorting purposes, and instead replaced it with the following:
- Diacritic, also referred to as Dot, Vowel, or Diacritic (only if the volunteer had sorted the fragment as Hebrew). These markings give us information about how to pronounce and accentuate each word in each phrase, how to sing each word, and where to pause in the phrase while reading. Their presence indicates that the fragment was meant to be read aloud.
At the same time, we asked volunteers to identify evidence of additional characteristics on a given fragment:
- Evidence of Binding indicates this fragment was likely part of a book.
- Justified Margins indicate this may be a literary fragment, as documentary fragments tend to have more irregularity to their margins.
- Top Corner Page Wear indicates the fragment may have been turned while reading.
These characteristics help us discern whether a fragment was used for everyday purposes or religious purposes. Not all of these features were available as options at every stage of the project. Volunteers also had the option to select “None”. Volunteers could identify the visual characteristics on either side of the subject.
How many subjects were identified as having evidence of these visual characteristics?
9,108 subjects (22%) were classified as having evidence of diagonal and/or perpendicular text in the margin, which means at least one volunteer identified it as such. 187 of these subjects were eventually sorted into the Arabic transcription workflows, and 8,792 were eventually sorted into the Hebrew transcription workflows.
416 subjects (1%) were classified as having evidence of seals, which means at least one volunteer identified it as such. Even combined with the subjects tagged #seals on the Talk boards, this may seem like a small number, but it is incredibly important for identifying official state documents written in Arabic.
3,734 subjects (9%) were classified as having evidence of horizontal line above word, which means at least one volunteer identified it as such.
11,978 subjects (29.8%) were classified as having evidence of a colon in the text, which means at least one volunteer identified it as such.
4,398 subjects (10.9%) were classified as having evidence of a dot, vowel, or diacritic, which means at least one volunteer identified it as such.
How many subjects were identified as having evidence of the additional characteristics?
6,457 subjects (16%) were classified as having evidence of justified margins by at least one volunteer. 225 of these subjects were eventually sorted into the Arabic transcription workflows, and 6,163 were eventually sorted into the Hebrew transcription workflows.
5,958 subjects (14.8%) were classified as having evidence of binding by at least one volunteer. 308 of these subjects were eventually sorted into the Arabic transcription workflows, and 5,593 were eventually sorted into the Hebrew transcription workflows.
3,707 subjects (9%) were classified as having evidence of top corner page wear by at least one volunteer. 170 of these subjects were eventually sorted into the Arabic transcription workflows, and 3,503 were eventually sorted into the Hebrew transcription workflows.
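The workflow split in the three counts above can be checked with a little arithmetic. The following is a rough Python sketch, not our actual analysis code: the function name `workflow_breakdown` and the dictionary layout are made up for illustration, and only the numbers come from this post. It computes the share of each group that went to the Arabic and Hebrew transcription workflows, plus the remainder that the figures above do not account for.

```python
# Counts quoted in this post: (total subjects, sorted to Arabic, sorted to Hebrew).
counts = {
    "justified margins": (6457, 225, 6163),
    "binding": (5958, 308, 5593),
    "top corner page wear": (3707, 170, 3503),
}

def workflow_breakdown(total, arabic, hebrew):
    """Return percentage shares per workflow and the unaccounted remainder.

    Hypothetical helper for illustration only.
    """
    return {
        "arabic_pct": round(100 * arabic / total, 1),
        "hebrew_pct": round(100 * hebrew / total, 1),
        # Subjects in neither workflow count; the post does not say where they went.
        "other": total - arabic - hebrew,
    }

breakdown = {name: workflow_breakdown(*c) for name, c in counts.items()}

for name, row in breakdown.items():
    print(name, row)
```

In each case the large majority of flagged subjects ended up in the Hebrew workflows, and a few dozen subjects per characteristic appear in neither count.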
Did volunteers identify visual characteristics correctly?
As noted in previous posts, this doesn’t mean that the subjects definitively have these visual characteristics. It means only that, based on the set of instructions given, volunteers identified various visual characteristics of the fragment as such. In the above dataset, we considered a subject to have evidence of a visual characteristic if at least one volunteer marked it as such. As our content specialists review this list, we hope to improve upon the field guide for effective and accurate identification by volunteers.
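To make the "at least one volunteer" rule concrete, here is a minimal Python sketch. It assumes each classification can be reduced to a subject ID, a volunteer ID, and a list of marked characteristics. That record layout, the sample data, and the function name `subjects_with_evidence` are all hypothetical and are not the project's real export format.

```python
from collections import defaultdict

def subjects_with_evidence(classifications):
    """Map each characteristic to the set of subjects that at least
    one volunteer marked as showing it (hypothetical helper)."""
    evidence = defaultdict(set)
    for subject_id, _volunteer_id, marks in classifications:
        for mark in marks:
            # A single mark from any volunteer is enough to include the subject.
            evidence[mark].add(subject_id)
    return dict(evidence)

# Made-up sample: (subject, volunteer, characteristics marked).
sample = [
    ("s1", "v1", ["colon"]),
    ("s1", "v2", []),                  # a second volunteer saw nothing on s1
    ("s2", "v1", ["colon", "binding"]),
    ("s3", "v3", []),                  # nobody marked anything on s3
]

result = subjects_with_evidence(sample)
counts = {name: len(subjects) for name, subjects in result.items()}
print(counts)
```

Note that subject s1 is counted for the colon even though one of its two volunteers marked nothing. This is why the counts in this post are an upper bound on agreement rather than a confirmed tally.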