
Automated Text Comparison


In our previous blog, we discussed how the text from the brief given to the designer gets inserted into an artwork or label. An ideal automated text comparison tool must be able to read and compare files even when they are in different formats. It becomes a must-have when it also offers multi-language support, manual selection of text within a region, and options to match or ignore case, whitespace and font attributes. The clincher is a tool that renders the comparison results as an annotated PDF file in which differences are marked in different colors and changes are flagged with sticky notes.

The inputs to an automated text comparison tool are the source document and the target document. Based on the variations described earlier, the following combinations exist:

  1. PDF vs. PDF: Typically, a comparison between two versions of an artwork.
  2. Word vs. PDF: Comparison between an artwork and a brief in the form of a leaflet, QRD or copy text for CPG companies.
  3. Excel vs. PDF: Comparison between copy text and artwork in CPG companies. European companies use more Excel briefs than the rest of the world.
  4. XML vs. PDF: Comparison between the SPL (Structured Product Labeling) file and the PDF artwork in the pharma industry.

Comparison Options

The following options and features are available in a text comparison tool:

  1. Ability to compare text in any language: The comparison must be done in Unicode so that it can identify any type of character, including Chinese, Arabic and Hebrew. The same character may have a different Unicode value in the Word document than in the PDF document (e.g. single and double quotes, hyphens), so such variants need to be treated as equivalent.
  2. Option to extract text within a certain region: Artworks typically have a trimbox, which contains the printable artwork, with legends outside it. The text for comparison is obtained by extracting only the text within the trimbox. Sometimes only certain sections of the artwork (a specific page, a specific column, etc.) need to be compared. The system should be able to extract text from a marked-up area to support this.
  3. Text options can be enabled and disabled for matching text. These include:
    • Match or Ignore Case (uppercase or lowercase)
    • Match or Ignore Whitespaces
    • Match or Ignore font attributes (Bold, Italics, Underline)
    • Match or Ignore font name
    • Match or Ignore font size
    • Match or Ignore font color
  4. The result of the comparison can be shown in multiple ways:
    • Each paragraph or line in the source file is matched with its corresponding text in the target file, and the output is written as Source/Target pairs of text. Any differences in the target text are highlighted in color (deleted in red, added in green and changed in orange).
    • The source and target documents can be shown visually in the same formatting as the originals, with the matched or unmatched text highlighted in the documents.
    • An annotated output PDF file can be generated with the changes marked as sticky notes within the PDF.
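
The matching options and the deleted/added/changed classification described above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library, not how any particular product works: the names `PUNCT_MAP`, `normalize`, `tokenize` and `compare` are hypothetical, the set of characters treated as equivalent is an assumption, and it presumes the text has already been extracted from both documents.

```python
import difflib
import re
import unicodedata

# Assumed equivalences: typographic characters that Word and PDF files
# often encode differently, mapped to plain ASCII forms.
PUNCT_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",                  # curly single quotes
    "\u201c": '"', "\u201d": '"',                  # curly double quotes
    "\u2010": "-", "\u2011": "-", "\u2013": "-",   # hyphen variants, en dash
    "\u00a0": " ",                                 # no-break space
})

def normalize(text, match_case=True):
    """Unify the Unicode representation and, optionally, the case."""
    text = unicodedata.normalize("NFC", text).translate(PUNCT_MAP)
    return text if match_case else text.casefold()

def tokenize(text, ignore_whitespace):
    """Split into words; keep whitespace runs as tokens when they matter."""
    if ignore_whitespace:
        return text.split()
    return re.findall(r"\s+|\S+", text)

def compare(source, target, match_case=True, ignore_whitespace=False):
    """Return (kind, source_text, target_text) for every difference."""
    src = tokenize(normalize(source, match_case), ignore_whitespace)
    tgt = tokenize(normalize(target, match_case), ignore_whitespace)
    sep = " " if ignore_whitespace else ""
    labels = {"delete": "deleted", "insert": "added", "replace": "changed"}
    matcher = difflib.SequenceMatcher(a=src, b=tgt, autojunk=False)
    return [
        (labels[tag], sep.join(src[i1:i2]), sep.join(tgt[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

if __name__ == "__main__":
    brief = "Take \u201cone\u201d tablet daily"        # curly quotes, as in Word
    artwork = 'Take "one" tablet twice daily'          # straight quotes, as in PDF
    print(compare(brief, artwork, match_case=False, ignore_whitespace=True))
```

In this example the curly and straight quotes are treated as the same character, so the only reported difference is the added word "twice". Font attributes are not covered here, since they require style information from the document rather than plain text.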

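For the region-based extraction mentioned in the list above, one possible approach is to keep only the words whose bounding boxes lie entirely inside the trimbox or a marked-up area. The sketch below assumes that a PDF library has already produced words with coordinates; the `Word` and `Box` types and the `text_in_region` function are hypothetical names used for illustration, and the words are assumed to arrive in reading order.

```python
from typing import Iterable, NamedTuple

class Box(NamedTuple):
    """Rectangle in page coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float

class Word(NamedTuple):
    """A word and its bounding box, as supplied by some PDF extractor."""
    text: str
    box: Box

def inside(inner: Box, outer: Box) -> bool:
    """True when `inner` lies completely within `outer`."""
    return (outer.x0 <= inner.x0 and inner.x1 <= outer.x1
            and outer.y0 <= inner.y0 and inner.y1 <= outer.y1)

def text_in_region(words: Iterable[Word], region: Box) -> str:
    """Join the words that fall inside the region, in the given order."""
    return " ".join(w.text for w in words if inside(w.box, region))

if __name__ == "__main__":
    trimbox = Box(0, 0, 100, 100)
    words = [
        Word("Paracetamol", Box(10, 10, 60, 20)),
        Word("500mg", Box(65, 10, 90, 20)),
        Word("Legend", Box(110, 10, 150, 20)),   # outside the trimbox
    ]
    print(text_in_region(words, trimbox))
```

Requiring full containment is a design choice; a tool could equally accept words that merely overlap the region, which matters for text that straddles the trim edge.
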
ManageArtworks is a Packaging Artwork Management Software that helps regulated industries such as Pharmaceuticals and CPG ensure regulatory compliance of their pack labels. It connects all stakeholders in an automated workflow, empowers users with sophisticated proofing tools, including Text Comparison tools, that reduce errors and speed up the process, and gives complete transparency through approval request tracking, audit trails and dashboards.

ManageArtworks is available as a ready-to-use cloud product or as a configurable on-premises solution.

© 2018 All Rights Reserved. Karomi Inc