Middle English Compendium

How to QC Map Files

For other MEC files, see the MEC INDEX.

1. Check map files:

   a. add doctype and validate.

   b. extract list of IDs. Make sure that none 
      has been duplicated in the file.

   c. check muds superficially:

       search for <mud>.*\.</AUTHOR>{space}
                  <mud>.*[^.]</AUTHOR>[^ ]
                  <mud>.*[^.]</TITLE>[^ <]
                  <mud>.*/DATE>[^ ]
       get rid of some extraneous blank lines
       by replacing </med>\n\n with </med>\n
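
       The checks in 1b and 1c can be scripted. A minimal sketch,
       assuming that IDs appear as ID="..." attributes (hypothetical;
       adjust the pattern to the attribute the map files actually
       use) and that a grep with the -o option (GNU or BSD) is
       available. The function names dup_ids and mud_check are
       illustrative only; the four search patterns are the ones
       listed in 1c.

```shell
#!/bin/sh
# Sketch only. ASSUMPTION: IDs are written as ID="..." attributes.

# Print every ID value that occurs more than once in the file $1.
# No output means no duplicated IDs.
dup_ids() {
    grep -o '[[:space:]]ID="[^"]*"' "$1" |
        sed 's/^.*ID="//; s/"$//' |
        sort | uniq -d
}

# Print (with line numbers) every line of file $1 that matches one of
# the four superficial mud patterns from step 1c.
# Exit status is 0 if anything matched, 1 if nothing did.
mud_check() {
    grep -n \
        -e '<mud>.*\.</AUTHOR> ' \
        -e '<mud>.*[^.]</AUTHOR>[^ ]' \
        -e '<mud>.*[^.]</TITLE>[^ <]' \
        -e '<mud>.*/DATE>[^ ]' \
        "$1"
}
```

       Usage, once the functions are loaded into the shell
       (the file name is only an example):

           dup_ids mapfile.take1.sgm
           mud_check mapfile.take1.sgm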

2. Prepare bib files for indexing:

   a. move to a temp directory
   b. attach current mslib to all files and validate
   c. normalize
   d. reattach doctype
   e. reentrify, using reentry.pl or reenter.postid.sgm depending
      on whether the file has been merged before. Use ch-entry.pl
      on the Chaucer (c2) file; it also works in place of
      reentry.pl on the other files that have no IDs.

3. Compare stencils from map file(s) and bib files.

   a. extract stencils from both sets thus:

      perl ex-stencil.all.pl mapfile.take?.sgm >> sts.mapped.txt
      perl ex-stencil.all.pl ??.sgm >> sts.bibbed.txt
      perl ex-stencil.all.pl ??.id.sgm >> sts.bibbed.txt

      or some such.

   b. open sts.bibbed.txt and sts.mapped.txt in a text editor. Compare
      the numbers of lines. If they are similar, remove the IDs from
      the bibbed stencils

      (replace <STENCIL[^>]+> with <STENCIL> in TextPad)

   c. upload both files to dns /work/pfs/merge or your equivalent

   d. create lists of stencils unique to map files and bib files:

dns:merge % sort sts.bibbed.txt > ss.bibbed.sort
dns:merge % sort sts.mapped.txt > ss.mapped.sort

dns:merge % comm -23 ss.bibbed.sort ss.mapped.sort > solely.bibbed
dns:merge % comm -13 ss.bibbed.sort ss.mapped.sort > solely.mapped
dns:merge % cat solely.bibbed
dns:merge % cat solely.mapped
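
The ID removal and the sort/comm comparison above can also be run as one
script instead of partly in a text editor. A minimal sketch, assuming a
POSIX shell with sed, sort and comm; the function name compare_stencils
is hypothetical, and the output file names are the ones used above. The
sed expression is the basic-regex equivalent of the <STENCIL[^>]+>
replacement. LC_ALL=C is set so that sort and comm agree on ordering.

```shell
#!/bin/sh
# Sketch only. Usage: compare_stencils BIBBED_FILE MAPPED_FILE
# Writes ss.bibbed.sort, ss.mapped.sort, solely.bibbed and
# solely.mapped into the current directory.
compare_stencils() {
    # Drop attributes (the IDs) from the bibbed <STENCIL ...> start tags,
    # then sort so that comm can compare the two lists.
    sed 's/<STENCIL[^>][^>]*>/<STENCIL>/g' "$1" | LC_ALL=C sort > ss.bibbed.sort
    LC_ALL=C sort "$2" > ss.mapped.sort

    # Lines found only in the bib files.
    LC_ALL=C comm -23 ss.bibbed.sort ss.mapped.sort > solely.bibbed
    # Lines found only in the map files.
    LC_ALL=C comm -13 ss.bibbed.sort ss.mapped.sort > solely.mapped
}
```

Usage, once the function is loaded into the shell; both result files
being empty means the two sets of stencils agree:

    compare_stencils sts.bibbed.txt sts.mapped.txt
    cat solely.bibbed solely.mapped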

4. Resolve differences, if any, between bib files and map files.

5. Revalidate the map files and the bib files.

6. Hand both over to Nigel.

