Specifying the file and folder names, for example:
In files: *.sgm
In folder: C:\Work Docs\mecorp\text\aa\toproof
search for
<PB[^>]*>
with Text, Regular expression, All matching lines, and Binary files all checked.
The results will appear in a Search Results document. Examine them for missing numbers, duplications, etc., and check any discrepancies against the original text and/or the Author/Editor file.
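The same check for missing and duplicated page numbers can also be scripted. The following is only a minimal sketch in Python, not part of the TextPad procedure. It assumes that each <PB> carries a quoted, purely numeric N attribute, as in the N="103" example in this document; the function names and the sample text are invented for illustration.

```python
import re

# Same pattern as the TextPad search. In Python, [^>]* also matches line
# breaks, so a tag split across two lines is still found.
PB_TAG = re.compile(r"<PB[^>]*>")
# Assumption: the page number is a quoted, purely numeric N attribute.
N_ATTR = re.compile(r'\bN="(\d+)"')


def page_numbers(text):
    """Return the numeric N values of all <PB> tags, in document order."""
    numbers = []
    for tag in PB_TAG.findall(text):
        match = N_ATTR.search(tag)
        if match:
            numbers.append(int(match.group(1)))
    return numbers


def find_discrepancies(numbers):
    """Return (missing, duplicated) page numbers as sorted lists."""
    seen = set()
    duplicated = set()
    for n in numbers:
        if n in seen:
            duplicated.add(n)
        seen.add(n)
    missing = []
    if seen:
        missing = [n for n in range(min(seen), max(seen) + 1) if n not in seen]
    return missing, sorted(duplicated)


# Hypothetical sample: page 102 appears twice and page 103 is absent.
sample = '<PB N="101">\n<P>text</P>\n<PB N="102">\n<PB N="102">\n<PB N="104">\n'
missing, duplicated = find_discrepancies(page_numbers(sample))
print("missing:", missing, "duplicated:", duplicated)
```

Page numbers that are not plain digits (roman numerals, for instance) are skipped by this sketch and would still need to be checked by eye.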
Similar searches may be made in order to examine the presence and accuracy of
<MILESTONE>s (<MILESTONE[^>]*>),
<DIV>s of various sizes (<DIV[^>]*>),
<LG>s (<LG[^>]*>), and
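<L>s (<L[^>]*>).
In each case the search is a useful tool for spotting anomalies, particularly in numbers and attributes. Note that <L[^>]*> also matches any other tag whose name begins with L, such as <LG>; to list <L> tags only, search for <L[ >] instead.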
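For anyone who prefers to script these listings, the sketch below shows one possible way to do it in Python. It is an illustration under stated assumptions rather than an equivalent of the TextPad dialog: the helper names tag_pattern and list_tags and the sample text are hypothetical. The lookahead (?=[\s>]) requires whitespace or ">" directly after the element name, so a search for L does not also report <LG> tags.

```python
import re


def tag_pattern(name):
    """Build a pattern for start tags of the element `name`.

    `name` is inserted into the pattern as a regular-expression fragment.
    The lookahead demands whitespace or '>' right after the name, so 'L'
    does not match <LG ...>.
    """
    return re.compile(r"<" + name + r"(?=[\s>])[^>]*>")


def list_tags(text, name):
    """Return (line_number, tag) pairs for every start tag of `name`."""
    pattern = tag_pattern(name)
    hits = []
    # Like TextPad's Find In Files, this works one line at a time, so a
    # tag broken across two lines is not reported.
    for line_number, line in enumerate(text.splitlines(), start=1):
        for tag in pattern.findall(line):
            hits.append((line_number, tag))
    return hits


# Hypothetical sample text.
sample = (
    '<DIV1 TYPE="poem">\n'
    '<LG N="1">\n'
    '<L N="1">First line</L>\n'
    '<L>Second line</L>\n'
    '</LG>\n'
)
for name in ("MILESTONE", "LG", "L"):
    print(name, list_tags(sample, name))
```

To catch numbered <DIV>s such as <DIV1>, the name can be passed as the fragment r"DIV\d*"; that the <DIV>s are numbered this way is an assumption about the markup.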
NOTE:
As indicated in FAQ sheet #4, Author/Editor sometimes introduces unwanted carriage returns into the .sgm files that it exports, often in the middle of tags (between the element name and the attribute name):
<PB
N="103">
When this happens, the above searches will fail to list those tags in the Search Results. To remove the carriage returns that Author/Editor has placed in the middle of tags, in TextPad, replace:
<\([^>]+\)\n\([^>]+\)>
with
<\1 \2>
Then run the "Search/Find In Files" again.
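The same repair can be sketched outside TextPad. The Python version below is an illustration only: TextPad's \( \) groups are written ( ) in Python, and the function name rejoin_split_tags is invented here. It assumes the file was read in text mode, so that each line break is a single "\n". Because one pass repairs only one break per tag, the sketch repeats the replacement until nothing changes.

```python
import re

# Python spelling of the TextPad pattern: text inside a tag, a line break,
# then the rest of the tag up to the closing '>'.
SPLIT_TAG = re.compile(r"<([^>]+)\n([^>]+)>")


def rejoin_split_tags(text):
    """Replace line breaks inside tags with single spaces.

    Assumes line breaks are plain "\n" (file read in text mode).
    """
    while True:
        repaired = SPLIT_TAG.sub(r"<\1 \2>", text)
        if repaired == text:
            return repaired
        text = repaired


# Hypothetical sample: one tag broken once, another broken twice.
sample = 'text <PB\nN="103"> more <MILESTONE\nUNIT="stanza"\nN="2">'
print(rejoin_split_tags(sample))
```

Line breaks between tags, or inside running text, are left alone, since the pattern only matches a break that falls between "<" and the next ">".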
One further test for the presence of unwanted carriage returns is to use "Search/Find" (hotkey: F5) to find any lines that do not begin with "<". Search for the regular expression ^[^<].
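As a rough scripted counterpart to this test, the Python sketch below reports every line whose first character is not "<". The function name suspicious_lines and the sample text are hypothetical. As with ^[^<] in the search, empty lines are not reported, because the pattern needs at least one character to match.

```python
import re

# Same expression as in the search: a line whose first character is not '<'.
NOT_TAG_START = re.compile(r"^[^<]")


def suspicious_lines(text):
    """Return (line_number, line) pairs for lines not beginning with '<'."""
    return [
        (line_number, line)
        for line_number, line in enumerate(text.splitlines(), start=1)
        if NOT_TAG_START.match(line)
    ]


# Hypothetical sample: the <PB> tag is broken after the element name.
sample = '<PB\nN="103">\n<P>Some text</P>\n\n<L>line</L>\n'
for line_number, line in suspicious_lines(sample):
    print(line_number, line)
```

A reported line is not necessarily an error; it only marks a place worth looking at, such as the second half of a broken tag.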
For now, consider page breaks as occurring at the "bottom" of each page. This means that the total number of <PB>s found will always be one less than the number of pages in the text.