JRA Software, Publisher
JRA FirstPage is a powerful but simple-to-use visual image
enhancement editor for bitonal TIFF files that offers the unique
features of Rapid Manual Cleanup and Virtual Document Separation.
3.8 Mb
To run the JRA FirstPage evaluation, you will need an evaluation code. Send an e-mail request for an evaluation code and one will be promptly sent to you. Include name and contact information with your request. If you decide to purchase, you will receive an unlock code and your evaluation version will become your registered version.
Note: On Windows 2000 or Windows XP, you do not need to reboot after the install.
E-mail or call in a FirstPage Evaluation Code Request
(610) 983-3682
Would you like some test TIFF files to work with? You can download this set of memo and report images that need both cleaning and virtual separation! The 293 images are contained in a single self-extracting .exe file.
11.5 Mb
System Requirements:
Operating System: Windows 95, 98, ME, NT, 2000, XP
Graphics Adapter: High Color (16 bit), True Color (24 bit), or True Color (32 bit); 600 x 800 or greater resolution (use full 1200 x 1600 for best results)
Monitor: 17" or greater (21" or flat screen recommended)
Viewing Help File: Acrobat Reader required
Introduction
JRA FirstPage is used to enhance folders of bitonal TIFF files produced by a document scanner, in preparation for submitting these images to an OCR (optical character recognition) engine. The biggest factor in OCR accuracy is the quality of the page images submitted to the OCR engine.
JRA FirstPage satisfies the image enhancement step of the paper-to-web conversion process. It is well suited for processing collections of business documents.
Rapid Manual Cleanup
The exclusive focus of other image enhancement applications like ScanFix has been to provide functions for the batch enhancement of images. But there is a limitation to what batch enhancement can do. For example, specks can be automatically removed, but only those specks that are smaller than the dot of an "i". Often there are marks in the margin areas of a scan that need removal. Automatic margin cleaning functions are available but are dangerous to use, because they will sometimes also remove text.
JRA FirstPage solves this problem by giving you the ability to clean up the images manually with blazing speed! Draw a bounding box around the area to be cleaned using your left mouse button, and when you release the button, the area will be cleaned. When you are ready to move to the next image, simply click the right mouse button. The current image in the folder will be saved and the next one will pop up.
When you have finished the cleaning, you can run the deskew function again on just those images that have been cleaned. The cleaned images will deskew more accurately now that the marks are removed.
With JRA FirstPage, it is possible to clean hundreds and even thousands of images at one sitting!
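The size limit on automatic speck removal described above can be illustrated with a minimal Python sketch. It is not the algorithm used by FirstPage or ScanFix. It assumes a bitonal image stored as a list of rows (1 = black, 0 = white), and the function name `despeckle` and the threshold `max_speck_pixels` are hypothetical. Any connected group of black pixels at or below the threshold is removed, which is why the threshold must stay smaller than the dot of an "i".

```python
def despeckle(image, max_speck_pixels):
    """Return a copy of a bitonal image with small black specks removed.

    image is a list of rows; 1 = black pixel, 0 = white pixel.
    A speck is a 4-connected group of black pixels no larger than
    max_speck_pixels (a hypothetical threshold chosen by the caller).
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Collect every black pixel connected to (y, x) with a flood fill.
            component, stack = [], [(y, x)]
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                component.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Erase the group only if it is small enough to count as a speck.
            if len(component) <= max_speck_pixels:
                for cy, cx in component:
                    result[cy][cx] = 0
    return result


page = [
    [1, 0, 0, 0, 0],  # a single stray pixel in the corner
    [0, 0, 1, 1, 0],  # a 2 x 2 mark standing in for real text
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
cleaned = despeckle(page, max_speck_pixels=1)
```

With the threshold set to 1, the stray corner pixel is erased and the larger mark survives. A margin mark bigger than the threshold would survive too, which is the case that manual cleanup handles.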
Virtual Document Separation
Virtual Document Separation offers an alternative to the use of physical document separator sheets.
Let's take a look at when and why document separation is needed. If you have a large volume of multipage documents to scan and convert, you need a way to tell the conversion software which pages go into each multipage conversion (PDF) document. The scanner can only make single-page images.
The traditional way to identify which page images are part of a given document is through the application of file naming conventions. A simple way to do this is to have the scanner operator enter a new base name for each new document, then the scanning software adds an incremental number to the base name for each page of the document. This is not a desirable method, however, since the scanner is sitting idle while the operator performs the name entries. It is important to use a method that allows for the continuous scanning of document pages, with minimum interruption.
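As a minimal sketch of this naming convention, the Python snippet below builds page file names from an operator-entered base name plus an incrementing page number. The function name `page_file_names`, the underscore separator, and the four-digit padding are assumptions made for illustration, not the behavior of any particular scanning software.

```python
def page_file_names(base_name, page_count, digits=4):
    """Return one TIFF file name per page: base name plus a zero-padded counter.

    The underscore separator and the padding width are illustrative assumptions.
    """
    return [f"{base_name}_{n:0{digits}d}.tif" for n in range(1, page_count + 1)]


names = page_file_names("memo", 3)
print(names)  # ['memo_0001.tif', 'memo_0002.tif', 'memo_0003.tif']
```

Every new document needs a new `base_name`, and typing that name is the step that leaves the scanner idle.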
A more advanced method of applying file naming conventions, and one that avoids scanner interruption, is to use patch or barcode separator sheets. Special separator sheets are printed with a patch code or barcode that can be recognized by the scanning software during the scanning process. When a separator sheet is recognized, it "triggers" the scanning software to create a new base name for the pages that follow.
Pre-printed separator sheets allow the scanner to run continuously, but this approach has drawbacks. To begin with, it increases the volume of paper to be scanned. If the average document size is 4 pages, one separator sheet per document increases the volume of paper to be scanned by 25%. If the project involves scanning 100,000 pages, then 25,000 separator sheets are required. The separator sheets can be reused, but this then requires the labor of removing them from the scanned document set for reuse.
Then there is the cost of printing the separator sheets, and the labor cost of inserting them between documents. As you can see, this method is not without substantial costs in both material and labor.
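The paper overhead of separator sheets can be checked with a short calculation. The sketch below assumes one separator sheet per document and documents of equal length. The function name `separator_overhead` is hypothetical.

```python
def separator_overhead(total_pages, pages_per_document):
    """Return (separator sheets needed, percent increase in sheets scanned).

    Assumes one separator sheet per document and documents of equal length.
    The increase is measured relative to the original page count.
    """
    documents = total_pages // pages_per_document
    separators = documents  # one sheet for every document
    percent_increase = 100.0 * separators / total_pages
    return separators, percent_increase


sheets, increase = separator_overhead(100_000, 4)
print(sheets, increase)  # 25000 25.0
```

For 100,000 pages in 4-page documents, this gives 25,000 separator sheets, which is 25% more paper through the scanner.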
Virtual Document Separation is a method that does not use file naming conventions. Instead, it inserts separator files between multipage document images AFTER the documents have been scanned. The separator files are simple (and small) text files with a special file extension.
JRA FirstPage is an application which permits an operator to "tag" the first page of each document, which then causes a separator file to be created before the first page of each document. Document images are displayed in FirstPage with a high degree of clarity and speed, and the labor to manually tag the first page of each document is a fraction of the labor involved with the use of physical separator sheets. And there is no material cost!
Applications such as PDFPublish 2.0 are able to interpret the separator files created by FirstPage and build appropriate multipage documents using this intelligence. The multipage documents can be created in PDF or multipage TIFF format.
JRA FirstPage Guided Tour
FirstPage Application Icon
JRA FirstPage, from James Rile Associates, is a virtual document separation and image-enhancement program for bitonal page images in the TIFF format. It allows you to process folders of TIFF files generated by a high-speed document scanner, and to enhance and prepare these images for digital publication using PageGenie Enterprise Edition.
FirstPage is both a visual processing application and a batch processing application. FirstPage features a unique method of virtual document separation.
The operations which can be performed on the TIFF files are:
Remove unwanted artifacts from the images (visual).
Rotate images so the text is in horizontal orientation (visual).
Tag (identify) the first page of each multi-page document (visual).
Despeckle and deskew the document images (batch).
Clean up the borders of the images (batch).
The tagging of first pages generates document separator files which are interpreted by the PageGenie application to construct multipage digital documents in PDF or other formats, following OCR (optical character recognition).
JRA FirstPage Button Bar
The button bar provides quick access to all of the most common functions of FirstPage. From left to right, these buttons have the following functions:
Rotate 90º Right
Rotate 180º
Rotate 90º Left
Save and Next
Beginning of Document
Scale-to-Gray Display Method
Favor-Black Display Method
First Image
Previous Image
Next Image
Last Image
Folder/File Open
Run a Batch Process
Running the Pre-Process and Post-Process Batch Jobs
The Batch Process Button
Click the icon illustrated above to enter the dialog box for Pre-Process and Post-Process.
Both the Pre-Process and the Post-Process Jobs are run from the same dialog box.
During Batch Processing
During the processing, a status meter bar will display at the bottom of the window.
When all TIFF files in the folder have been processed, a Completed message will display in the status bar. While processing, the application can be minimized and other tasks can be performed on the PC.
Running the Pre-Process
To run the Pre-Process, first select the enhancements to be performed. By default, Despeckle and Deskew are checked. If margin cleaning is needed, check this box as well. Enter the fraction of an inch that will be cleaned from each edge of the document image.
Next, select the SOURCE folder containing the images to be processed. Then, select the OUTPUT folder where the enhanced images will be written to. If the OUTPUT folder is the same as the SOURCE folder, the source images will be overwritten with the enhanced images. If different, the source images will remain unmodified in the Source folder.
When this has been done, click on the Run button for Pre-Process.
Pre-Process is normally run before any of the images in the folder are edited. The deskewed and despeckled images are easier to view and edit.
FirstPage can deskew even severely skewed images (up to 45 degrees), but quality is slightly reduced in an image that needed severe deskewing.
To minimize skew in the images, be sure that the paper input to the scanner is accurately aligned.
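The margin-cleaning setting above is entered as a fraction of an inch, so the width of the cleaned border in pixels depends on the scan resolution: for example, 0.25 inch at 300 dpi is 75 pixels. The Python sketch below shows one way such a border could be white-filled. It is an illustration, not FirstPage's implementation. It assumes a bitonal image stored as a list of rows (1 = black, 0 = white), and the name `clean_margins` is hypothetical.

```python
def clean_margins(image, margin_inches, dpi):
    """White-fill a border of margin_inches on every edge of a bitonal image.

    image is a list of rows; 1 = black, 0 = white. dpi is the scan resolution.
    Returns a new image; the input is left unchanged.
    """
    border = round(margin_inches * dpi)  # border width in pixels
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            in_border = (y < border or y >= height - border
                         or x < border or x >= width - border)
            if in_border:
                cleaned[y][x] = 0
    return cleaned


# Tiny demonstration: an all-black 6 x 6 "scan" at 4 dpi, cleaning 0.25 inch.
page = [[1] * 6 for _ in range(6)]
cleaned = clean_margins(page, margin_inches=0.25, dpi=4)
```

The sketch also shows why automatic margin cleaning is risky: everything inside the border is erased, including any text that runs into it.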
Running the Post-Process
To run the Post-Process, first select the enhancements to be performed. Deskew Changes is checked, and cannot be modified. This function will deskew any document images that were edited during the visual editing process. The deskew will be more accurate than the first time these images were deskewed. Only the images that were "pixel-modified" will be deskewed again.
Next, select the SOURCE folder containing the images to be processed. Then, select the OUTPUT folder where the enhanced images will be written to. If the OUTPUT folder is the same as the SOURCE folder, the source images will be overwritten with the enhanced images. If different, the source images will remain unmodified in the source folder.
Since the SOURCE folder will contain document separator files for PageGenie, these will be copied to the OUTPUT folder along with the document images that are processed.
When this has been done, click on the Run button for Post-Process.
The Post-Process is run after all of the images in the folder have been edited. It produces files that are ready for optical character recognition and conversion to digital documents using PageGenie.
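The Post-Process rules described above can be summarized in a small Python sketch that decides what happens to each file in the SOURCE folder. It models the description only, not the application's internals. The function name `plan_post_process` and the action labels are hypothetical, and copying unmodified images unchanged is an assumption, since the text says only that they are not deskewed again.

```python
def plan_post_process(source_files, pixel_modified):
    """Decide what a Post-Process-style job would do with each file.

    source_files: file names found in the SOURCE folder.
    pixel_modified: names of images changed during visual editing.
    Returns a list of (file name, action) pairs in file-name order.
    """
    plan = []
    for name in sorted(source_files):
        if name.lower().endswith(".eod"):
            plan.append((name, "copy separator"))   # separator files travel along
        elif name in pixel_modified:
            plan.append((name, "deskew again"))     # edited images are re-deskewed
        else:
            plan.append((name, "copy unchanged"))   # assumption, see lead-in
    return plan


files = ["00000001.tif", "00000002.tif", "00000002zzz.eod", "00000003.tif"]
plan = plan_post_process(files, pixel_modified={"00000002.tif"})
for name, action in plan:
    print(name, "->", action)
```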
Opening a Folder of TIFF Files for Visual Editing
File Open Button
File Open Dialog Box
Navigate to the folder you wish to process. Select the first file in the folder if you are processing this folder for the first time. Select a subsequent file if you are resuming where you previously left off.
You can use the File Open box as a "Go To" function if you need to go to a file in the middle of the folder.
Stepping Through the Document Images
The first file you selected will be displayed in the application. The default magnification for this image is Fit Page.
Tagging the Document Image as a First Page
Clicking on the Beginning of Document icon above will tag the page image as a document first page. The appearance of the icon will then change to look as follows:
The changed icon will display whenever you return to display this page image.
Integration feature for PDFPublish
Upon tagging an image as a First Page, a document separator file for PDFPublish 2.0 is written to disk. Upon un-tagging an image, the document separator file is deleted. The end-of-document separator file has a .EOD extension.
Example:
00000001.tif
00000002.tif
00000002zzz.eod
00000003.tif
In the OUTPUT folder of the Post-Process, the EOD files are written in sequence with the TIFF files.
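To see how a separator file keeps its place in a sorted folder listing, and how a downstream program might use it, here is a minimal Python sketch. It is not PDFPublish's or FirstPage's code. It assumes eight-digit page file names and a separator named after the preceding page with "zzz" appended, a pattern inferred from the example listing. The function names `separator_name` and `split_documents` are hypothetical.

```python
from pathlib import PurePath


def separator_name(previous_page):
    """Name a separator after the page that precedes the tagged first page.

    Assumption inferred from the example listing: stem + "zzz" + ".eod".
    """
    return PurePath(previous_page).stem + "zzz.eod"


def split_documents(file_names):
    """Group page files into documents, closing a group at each .eod file."""
    documents, current = [], []
    for name in sorted(file_names):
        if name.lower().endswith(".eod"):
            if current:
                documents.append(current)
            current = []
        else:
            current.append(name)
    if current:
        documents.append(current)
    return documents


pages = ["00000001.tif", "00000002.tif", "00000003.tif"]
# Tagging 00000003.tif as a first page places a separator after 00000002.tif.
listing = pages + [separator_name("00000002.tif")]
print(sorted(listing))
print(split_documents(listing))
```

Because the "zzz" suffix sorts after ".tif" for the same page number and before the next page number, a plain alphabetical listing already places the separator between the two documents.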
You can set up PDFPublish to "watch" the output folder, and then PDFPublish will begin processing the output folder even while the FirstPage Post-Process is still running.
Rotating Page Images
Frequently, pages which are printed in landscape orientation are rotated to portrait orientation before being placed into a multipage document. The text lines in such page images are in vertical orientation. These pages, referred to as turn-pages, need to be rotated so that the text lines are in horizontal orientation.
The Page Rotation Buttons
A single click on one of the rotation buttons will rotate the page image. The right button, Rotate 90º Left, is most commonly used.
Turn-page before rotation
Turn-page after rotation
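The three rotations can be sketched in a few lines of Python on an image stored as a list of rows. This illustrates the operations only, not FirstPage's implementation. The function names are hypothetical.

```python
def rotate_right(image):
    """Rotate a list-of-rows image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate_left(image):
    """Rotate a list-of-rows image 90 degrees counterclockwise."""
    return [list(row) for row in zip(*image)][::-1]


def rotate_180(image):
    """Rotate a list-of-rows image by 180 degrees."""
    return [row[::-1] for row in image[::-1]]


# Numbers stand in for pixels so the movement is easy to follow.
page = [
    [1, 2, 3],
    [4, 5, 6],
]
print(rotate_right(page))  # [[4, 1], [5, 2], [6, 3]]
print(rotate_left(page))   # [[3, 6], [2, 5], [1, 4]]
print(rotate_180(page))    # [[6, 5, 4], [3, 2, 1]]
```

A 90-degree rotation swaps the width and height of the page, which is why a turn-page changes shape on screen after rotation.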
Changing the Magnification of the Page Image
Fit Visible & Fit Width Buttons
The button bar provides an easy way to change between the default magnification of Fit Visible and the alternate magnification of Fit Width. Fit Width stretches the page image horizontally to fit the window, but only the top portion of the page is displayed.
A scrolling bar on the right will allow you to view the entire page in Fit Width magnification.
Example of Fit Width Magnification
Changing the Display Method
Scale-to-Gray and Favor-Black Display Buttons
The default display method is Scale-to-Gray.
Scale-to-Gray is the easiest way to view the page image in general. If you are not performing fine clean-up work on the image, this is the preferred display method.
If you are performing detailed cleanup to remove specks and marks on the image, then use the Favor-Black display method. When rendering the reduced image that you see on your screen, this method does not "hide" any black marks during the reduction. All black marks are visible, giving the image an accentuated-black appearance.
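The difference between the two display methods comes down to how a block of scanned pixels is reduced to one screen pixel. The Python sketch below illustrates the general idea. It is not FirstPage's rendering code. It assumes a bitonal image stored as a list of rows (1 = black, 0 = white) with dimensions that are multiples of the reduction factor. The function name `reduce_image` and the method labels are hypothetical.

```python
def reduce_image(image, factor, method):
    """Shrink a bitonal image by an integer factor for on-screen display.

    image is a list of rows; 1 = black, 0 = white.
    method "scale_to_gray": each output pixel is the fraction of black
        pixels in its block (0.0 = white, 1.0 = black).
    method "favor_black": each output pixel is black (1) if any pixel
        in its block is black.
    Assumes the image dimensions are multiples of factor.
    """
    reduced = []
    for top in range(0, len(image), factor):
        out_row = []
        for left in range(0, len(image[0]), factor):
            block = [image[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            if method == "scale_to_gray":
                out_row.append(sum(block) / len(block))
            elif method == "favor_black":
                out_row.append(1 if any(block) else 0)
            else:
                raise ValueError(f"unknown method: {method}")
        reduced.append(out_row)
    return reduced


page = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(reduce_image(page, 2, "scale_to_gray"))  # [[0.25, 1.0], [0.0, 0.0]]
print(reduce_image(page, 2, "favor_black"))    # [[1, 1], [0, 0]]
```

In the top-left block, a single black pixel becomes a faint 25% gray under Scale-to-Gray but a fully black pixel under Favor-Black, which is why small specks are easier to spot with Favor-Black.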
Favor-Black Display Method:
Scale-to-Gray Display Method
File Navigation
File Navigation Buttons
With the four navigation buttons, you can move forwards and backwards through the folder of page images. The functions of the four buttons are: First, Previous, Next and Last.
If you have made changes to the page you are viewing and you press one of the navigation buttons, you will be prompted to save your changes first. The Save & Next button, by contrast, has no confirmation message and automatically saves the changes each time.
Use the navigation buttons to browse and proofread the documents. Use the Save & Next button when processing a folder of images.
Save and Next
Save & Next Button
Left-clicking on the Save & Next button has the dual function of automatically saving the displayed page image and navigating to the next page image in the folder.
The Save & Next button is optimally located between the Rotate 90º and First Page buttons, which are the most frequently-used buttons.
The Save & Next function is also executed when you right-click on the page image.
The White-Fill Function
The default action of the cursor in the page image region is to draw a bounding rectangle for a white-fill operation.
Position the cursor at one corner of the rectangular area you want to fill with white. While holding the left mouse button down, move the cursor to the opposite corner. Upon releasing the left mouse button, the area will be filled with white.
In this example, the page image has a photocopy edge mark and a speck at the lower right. The bounding area will be white-filled, and the marks will disappear into the white background.
When this image is deskewed again in the Post-Process, it will become completely straight.
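The white-fill operation itself is simple to sketch in Python: given the two corners of the dragged rectangle, every pixel inside it is set to white. This illustrates the idea and is not FirstPage's code. It assumes a bitonal image stored as a list of rows (1 = black, 0 = white), and the function name `white_fill` is hypothetical.

```python
def white_fill(image, corner_a, corner_b):
    """Fill the rectangle spanned by two opposite corners with white (0).

    image is a list of rows, modified in place; 1 = black, 0 = white.
    Corners are (x, y) pixel positions and may be given in any order,
    just as a drag can start from any corner of the area.
    """
    (xa, ya), (xb, yb) = corner_a, corner_b
    left, right = sorted((xa, xb))
    top, bottom = sorted((ya, yb))
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            image[y][x] = 0
    return image


# An all-black 4 x 3 image; drag from the lower-right corner up and to the left.
page = [[1] * 4 for _ in range(3)]
white_fill(page, (3, 2), (2, 1))
for row in page:
    print(row)
```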
The Menu Bar
All of the functions of the Button Bar, plus a few more, can be found in the menus. Also, Alt Function Keys are available for each menu item and function.
The File Menu
Using the Save As menu item allows you to save a selected image out to another folder.
Using the Revert Image menu item is very useful if you make a mistake with the White-Fill function. This will reload the image file from disk, undoing any changes that you made.
The Edit Menu
Use the Invert function if you have a page image with a black background and white text. Invert will reverse the black and white in the image.
By default the White-Fill function is active, and this status is indicated by a check mark next to the White-Fill menu item. Select the menu item again and White-Fill will toggle off.
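For a bitonal image, the Invert function described above amounts to swapping every pixel value. A minimal Python sketch, assuming an image stored as a list of rows (1 = black, 0 = white) and a hypothetical function name `invert`:

```python
def invert(image):
    """Return a copy of a bitonal image with black and white swapped."""
    return [[1 - pixel for pixel in row] for row in image]


# White text (0) on a black background (1) becomes black text on white.
negative = [
    [1, 1, 1],
    [1, 0, 1],
]
print(invert(negative))  # [[0, 0, 0], [0, 1, 0]]
```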
The View Menu
On the View Menu, you can magnify the page image to be the Actual Size, or a custom Magnification factor that you specify.
In addition to the Scale-to-Gray and Favor-Black display options, you can display in the Normal option, which is a more traditional method of screen rendering.