Search for fraction entities in Author/Editor. (Check the Entity Set: Local & Active under "Entities/Insert Entities." The fractions listed there, e.g. "frac12," "frac34," etc., are the only ones that need checking.) Some dates have been recorded in the original texts as, for example, 133¾, meaning 1333/4. In some cases the vendor has mimicked the ¾ with a fraction entity; these should be changed to 3/4. Some dates where fractions occur have been marked with $x$. Change these to the correct form as above.
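The fraction check above can also be sketched outside Author/Editor. The following is a minimal Python illustration, not part of the existing toolset. The function name, the entity map, and the rule "only touch an entity that directly follows a digit" are all assumptions made for the example; the results should still be checked by eye, since a genuine fraction may also follow a digit.

```python
import re

# Assumed entity-to-text map. Only frac12 and frac34 are named in these
# guidelines; extend it from the "Entity Set: Local & Active" list as needed.
FRACTIONS = {"frac12": "1/2", "frac34": "3/4"}

# A fraction entity that directly follows a digit, as in 133&frac34;
# (assumption: this is how the vendor keyed the date form).
DATE_FRACTION = re.compile(r"(?<=[0-9])&(frac[0-9]+);")

def replace_fraction_entities(text):
    """Turn date-like fraction entities into plain digits, e.g. 133&frac34; into 1333/4."""
    def swap(match):
        # Entities missing from the map are left exactly as found.
        return FRACTIONS.get(match.group(1), match.group(0))
    return DATE_FRACTION.sub(swap, text)

if __name__ == "__main__":
    print(replace_fraction_entities("anno 133&frac34; and a &frac12; part"))
```

Entities that do not follow a digit are left alone, so ordinary fractions elsewhere in the text are not rewritten.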
"dj." (with a swung dash (~) above the "j") =
<ABBR>dj</ABBR>
"&c9." (where "9" is a character that looks more or less like a superscript "9") =
<ABBR EXPAN="et cetera">&c.</ABBR>
"l." (with a loop like a backward "c" crossing the upright, then looping up and back over it) =
<ABBR EXPAN="et cetera"><GAP></ABBR>
A Perl script called pinarg.pl exists for removing excess paragraph tags from <ARGUMENT> elements. (It was devised for the Oseney Register, but may prove useful elsewhere.)
cd C:\Markup\mecorp\text\XX\toproof
where XX is the two-letter alphabetic identifier for the file.
At the C:\Markup\mecorp\text\XX\toproof> prompt, run the Perl script:
perl -i.bak C:\Markup\code\perl\cmeB\pinarg.pl xxxxxxx.sgm
where xxxxxxx is the seven-letter/digit NOTIS identifier.
This will produce a new .sgm file with a single <P> per <ARGUMENT> and save the previous .sgm file as xxxxxxx.sgm.bak.
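For readers without access to pinarg.pl, the effect described above can be sketched in Python. This is not the actual script, whose internals are not shown here. It assumes <ARGUMENT> and <P> tags carry no attributes, that <ARGUMENT> elements are not nested, and that joining paragraphs with a single space is acceptable; the function name is hypothetical.

```python
import re

# Assumption: <ARGUMENT> and <P> appear without attributes and <ARGUMENT> is not nested.
ARGUMENT = re.compile(r"(<ARGUMENT>)(.*?)(</ARGUMENT>)", re.DOTALL)
P_BREAK = re.compile(r"</P>\s*<P>")

def collapse_argument_paragraphs(sgml):
    """Merge consecutive <P> elements inside each <ARGUMENT> into a single <P>."""
    def collapse(match):
        opening, body, closing = match.groups()
        # Replace each paragraph boundary with one space; text outside
        # <ARGUMENT> never reaches this function and is left untouched.
        return opening + P_BREAK.sub(" ", body) + closing
    return ARGUMENT.sub(collapse, sgml)

if __name__ == "__main__":
    sample = "<ARGUMENT><P>One.</P>\n<P>Two.</P></ARGUMENT><P>A</P><P>B</P>"
    print(collapse_argument_paragraphs(sample))
```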
For a quick look at tables in order to see if they are properly handled and generally reflect what is in the original text, do the following:
A Perl script called labinarg.pl exists for removing <LIST>, <LABEL>, and <ITEM> tags from <ARGUMENT> elements.
cd C:\Markup\mecorp\text\XX\toproof
where XX is the two-letter alphabetic identifier for the file.
At the C:\Markup\mecorp\text\XX\toproof> prompt, run the Perl script:
perl -i.bak C:\Markup\code\perl\cmeB\labinarg.pl xxxxxxx.sgm
where xxxxxxx is the seven-letter/digit NOTIS identifier.
This will produce a new .sgm file with <LIST>, <LABEL>, and <ITEM> tags removed from <ARGUMENT>s. It will save the previous .sgm file as xxxxxxx.sgm.bak.
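The tag removal described above can be illustrated with a short Python sketch. It is not labinarg.pl itself. It assumes <ARGUMENT> has no attributes and is not nested, and that only the tags are dropped while their text content is kept; the function name is hypothetical.

```python
import re

# Assumption: <ARGUMENT> appears without attributes and is not nested.
ARGUMENT = re.compile(r"(<ARGUMENT>)(.*?)(</ARGUMENT>)", re.DOTALL)
# Opening or closing LIST, LABEL, or ITEM tags, with or without attributes.
LIST_TAGS = re.compile(r"</?(?:LIST|LABEL|ITEM)\b[^>]*>")

def strip_list_tags(sgml):
    """Delete <LIST>, <LABEL>, and <ITEM> tags inside <ARGUMENT>, keeping their text."""
    def strip(match):
        opening, body, closing = match.groups()
        return opening + LIST_TAGS.sub("", body) + closing
    return ARGUMENT.sub(strip, sgml)

if __name__ == "__main__":
    sample = ("<ARGUMENT><P><LIST><LABEL>1.</LABEL> <ITEM>Of kings</ITEM></LIST></P></ARGUMENT>"
              "<LIST><ITEM>x</ITEM></LIST>")
    print(strip_list_tags(sample))
```

Lists outside <ARGUMENT> are left as they are, matching the scope stated above.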
For quick removal of specific tags in Author/Editor: place the cursor directly to the right of the opening tag and press Ctrl+D.
A Perl script called italand.pl exists for removing italic "and"s from Roman texts.
cd C:\Markup\mecorp\text\XX\toproof
where XX is the two-letter alphabetic identifier for the file.
At the C:\Markup\mecorp\text\XX\toproof> prompt, run the Perl script:
perl -i.bak C:\Markup\code\perl\cmeB\italand.pl xxxxxxx.sgm
where xxxxxxx is the seven-letter/digit NOTIS identifier.
This will produce a new .sgm file with italic "and"s removed; it will save the previous .sgm file as xxxxxxx.sgm.bak.
Show Structure View (hotkey: F11) is useful not only for seeing how a document has been organized into <DIV>s and other elements, but also for adding missing TYPEs (and other attributes) to those elements more quickly than when working in the complete document. The additions can be made individually by placing the cursor to the right of the <DIV> and opening Edit Attributes (hotkey: F6) under Markup. Alternatively, additions (or changes) can be made globally by opening Find and Replace (hotkey: Ctrl-F) under Find. E.g.:
Find: <DIV2
Replace: <DIV2 TYPE="chapter"
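The same global change can be sketched in Python for files edited outside Author/Editor. This is an illustration only; the function name and the default value "chapter" are taken from the example above or invented for the sketch. Unlike a plain find and replace, it skips any <DIV2> that already carries a TYPE attribute, so no element ends up with two.

```python
import re

# <DIV2 opening tags that do not already contain a TYPE attribute.
BARE_DIV2 = re.compile(r"<DIV2\b(?![^>]*\bTYPE=)")

def add_div2_type(sgml, div_type="chapter"):
    """Add TYPE="..." to every <DIV2> opening tag that lacks one."""
    return BARE_DIV2.sub(lambda m: '<DIV2 TYPE="%s"' % div_type, sgml)

if __name__ == "__main__":
    print(add_div2_type('<DIV2><DIV2 N="3"><DIV2 TYPE="book">'))
```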
Using TextPad, to find and replace globally an item that includes a sequence of numbers, the following regular expression is useful: N="\([0-9]+\)" E.g.:
Find: N="\([0-9]+\)"
Replace: N="\1a"
The replacement will add the letter "a" to each number in the sequence.
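For comparison, the same substitution can be written in Python, where the capturing group uses plain parentheses rather than the escaped \( \) form shown for TextPad. This is a minimal sketch with a hypothetical function name. The leading \b is an addition not present in the TextPad pattern; it keeps the pattern from matching the tail of a longer attribute name such as EXPAN.

```python
import re

# \b stops the pattern matching inside longer attribute names (e.g. EXPAN="12").
N_ATTRIBUTE = re.compile(r'\bN="([0-9]+)"')

def append_letter_to_n(text, letter="a"):
    """Append a letter to every purely numeric N attribute value."""
    return N_ATTRIBUTE.sub(lambda m: 'N="%s%s"' % (m.group(1), letter), text)

if __name__ == "__main__":
    print(append_letter_to_n('<PB N="12"><PB N="13">'))
```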
Use the <GAP> element to indicate printed marks that are otherwise unpresentable. The DESC attribute is available to describe the marks but in general is used only to name unpresentable languages, e.g., Greek, Hebrew, etc. Thus passages in Greek may be marked as <GAP DESC="Greek">.
Use the <ABBR> element to indicate abbreviations. (These may include printed marks that are otherwise unpresentable, and which may be indicated with a <GAP> within the <ABBR> and </ABBR> tags.) The EXPAN attribute is available to spell out the unabbreviated word. Some examples:
<ABBR EXPAN="[dram]"><GAP></ABBR>
<ABBR>mal</ABBR>
<ABBR>Ihc</ABBR>