
Sample Code / Script to Preserve Spacing and Underscores in InvoiceNumber Field While Removing Spaces Around Hyphen Characters (Doc ID 2854714.1)

Last updated on MARCH 13, 2024

Applies to:

Oracle WebCenter Forms Recognition - Version 12.2.1.4.0 to 12.2.1.4.0 [Release 12c]
Information in this document applies to any platform.

Goal

By default, underscore ("_") characters within the Invoice Number are not recognized by OCR or extracted. The script in this article addresses that issue.

In addition, OCR results treat each hyphen character ("-") as a separate word within a string. For example, an Invoice Number value of "ABC-123" is treated as 3 separate words: "ABC", "-", and "123".

When this text is extracted to the InvoiceNumber field, the extraction has to be told whether to keep the spaces between these words, so the text can be extracted as either "ABC-123" or "ABC - 123". This is generally controlled by the AP Invoices project setting:

NUM_OP_RemoveSpaces=NO

A further complication arises when an InvoiceNumber string has spaces around some hyphen characters but not others (e.g. "ABC-DEF - 123"). With this setting alone there is no way to keep the spacing as printed on the document: the extracted text will be either "ABC-DEF-123" or "ABC - DEF - 123".
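The all-or-nothing behavior described above can be illustrated with a short Python sketch. This is not the script from this article and does not use any product API; the function join_words and its remove_spaces parameter are hypothetical names that only loosely model the effect of the project setting, assuming each hyphen arrives from OCR as its own word.

```python
def join_words(words, remove_spaces):
    """Join OCR words using one global rule: no spaces at all, or a space everywhere."""
    separator = "" if remove_spaces else " "
    return separator.join(words)


# Assumed OCR word split: every hyphen is a separate word.
simple = ["ABC", "-", "123"]
mixed = ["ABC", "-", "DEF", "-", "123"]  # printed on the page as "ABC-DEF - 123"

print(join_words(simple, remove_spaces=True))   # ABC-123
print(join_words(simple, remove_spaces=False))  # ABC - 123
print(join_words(mixed, remove_spaces=True))    # ABC-DEF-123
print(join_words(mixed, remove_spaces=False))   # ABC - DEF - 123
```

Because a single rule is applied to every word boundary, neither call on the mixed example reproduces the printed value "ABC-DEF - 123".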

The sample script provided in this article will attempt to preserve the spacing as it exists within the InvoiceNumber string so that the text printed on the page as "ABC-DEF - 123" gets extracted to the field as "ABC-DEF - 123".
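One possible way to preserve the printed spacing is sketched below in Python, purely as an illustration. It is not the sample script from this article, and it rests on assumptions: that each OCR word carries a horizontal position and width, and that a pixel gap larger than a chosen threshold indicates a printed space. The Word class, the rebuild function, the coordinates, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    left: int   # assumed x coordinate of the word's left edge, in pixels
    width: int  # assumed width of the word, in pixels


def rebuild(words, gap_threshold):
    """Rejoin words, inserting a space only where the gap between neighbors is wide."""
    parts = []
    prev_right = None
    for word in words:
        if prev_right is not None and word.left - prev_right > gap_threshold:
            parts.append(" ")
        parts.append(word.text)
        prev_right = word.left + word.width
    return "".join(parts)


# Hypothetical positions for a value printed as "ABC-DEF - 123".
words = [
    Word("ABC", 100, 60),
    Word("-", 162, 8),    # 2 px after "ABC": no space
    Word("DEF", 172, 60), # 2 px after "-": no space
    Word("-", 247, 8),    # 15 px after "DEF": space
    Word("123", 270, 60), # 15 px after "-": space
]

print(rebuild(words, gap_threshold=8))  # ABC-DEF - 123
```

The decision is made per word boundary rather than once for the whole field, which is what allows the mixed spacing to survive. A suitable threshold would depend on font size and scan resolution, so a fixed value like the one here is only a placeholder.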


 

Solution

To view full details, sign in with your My Oracle Support account.




