Old Documents Meet New Technology
Washington, DC – April 21, 2008
Scanning paper, microfilm represents a growth industry for imaging companies
By Doug Beizer
Even with today’s digital databases and electronic forms, government warehouses are filled with mountains of paper and film documents.
For example, medical records of military personnel must be maintained throughout their lives. That could mean retaining microfilm records for about 100 years in some cases.
Despite decades of computer-centric processes, scanning and data mining of old documents are still growth industries, said several technology vendors at this year’s FOSE show in Washington. FOSE is produced by 1105 Government Information Group, publisher of Washington Technology.
NextScan’s NextStar software, for example, was designed to scan microfilm and microfiche.
Traditionally, microfilm scanners must center film frames in a window to capture images, but NextStar captures more than a window view, said Mike Oris, a NextScan sales manager.
“We scan edge to edge of the film, from the beginning all the way through to the end,” he said. “The ribbon image guarantees the capture of every image on that roll — nothing is skipped, nothing is truncated — and that gives you better quality images and minimizes the need for rescanning.”
NextStar draws a box around each frame and numbers it. If an image is not found, tools alert the operator that something was skipped. If an image was chopped in half, the operator can go back and fix it. No rescanning is necessary because all the frames are available in the ribbon image.
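To illustrate the idea of boxing and numbering frames within a ribbon image, here is a minimal Python sketch. It is not NextScan's algorithm. It assumes the ribbon has already been reduced to a hypothetical one-dimensional density profile (one number per column of film), and the names `find_frames`, `suspicious_gaps`, `threshold` and `max_gap` are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    number: int  # 1-based position of the frame on the roll
    start: int   # first column of the frame
    end: int     # one past the last column of the frame


def find_frames(profile, threshold=50):
    """Box each run of columns denser than `threshold` and number it."""
    frames = []
    start = None
    for x, value in enumerate(profile):
        if value > threshold and start is None:
            start = x  # a frame begins here
        elif value <= threshold and start is not None:
            frames.append(Frame(len(frames) + 1, start, x))
            start = None
    if start is not None:
        # The ribbon ended while still inside a frame.
        frames.append(Frame(len(frames) + 1, start, len(profile)))
    return frames


def suspicious_gaps(frames, max_gap):
    """Return pairs of neighboring frame numbers separated by more than
    `max_gap` blank columns, which may mean a frame was missed."""
    return [
        (a.number, b.number)
        for a, b in zip(frames, frames[1:])
        if b.start - a.end > max_gap
    ]
```

Because the whole ribbon is kept, an operator alerted to a suspicious gap could redraw the box from the stored data instead of rescanning the roll.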
Oris said the need to scan vast quantities of film will continue for years. “Since the 1960s to the early 1990s, microfilm was the major storage medium used by banks, insurance companies and government agencies for the long-term preservation of information and images,” he said. “They needed a medium that could span a long time frame, and microfilm and fiche were the only mediums used at the time.”
The reservoir of microfilm still exists. Microfilm is no longer manufactured, but people are concerned about making the information from existing film more readily available.
“They need to scan the microfilm and bring that information into the digital world so they can distribute it, manipulate it and do whatever they need with it,” Oris said.
Once old documents are scanned into a database, government agencies need better ways to search and manipulate the data.
Kyos Systems Inc.’s TransFORM lets organizations scan and transfer documents securely. Its analytics can detect patterns in the information, track data and audit records whether they originated on paper or electronically. The system converts the contents of each piece of paper into entries in a relational database. Traditional scanning simply turns documents into a static form such as a PDF, said Kevin Pang, Kyos’ president and chief executive officer.
“PDFs are pretty … difficult to search,” he said. “There’s a huge data penalty because as you scan more and more things into your system, it begins to slow down. The memory is more difficult to manage, [and] search becomes a very untenable experience for a lot of end users.”
Rather than treating a form as one file, Kyos breaks it down into its elements, which frees the data from the paper and allows it to move through Kyos’ system, Pang said. The Defense Department has warehouses full of paper that are difficult to search and share securely. Traditional scanning methods often rely on a Google-style keyword search that requires the user to visually determine whether the correct document was found.
Kyos semantically understands each document and tags all the data elements. Then a variety of search and security rules can be applied and the data aggregated on demand. “One of the things that we do for the military is allow them to ask questions like ‘Show me the last 20 blood pressure readings taken over the last 10 years, sorted by date,’” Pang said. “Then, ‘Show me all the data that’s related to that, for example, medications and procedures.’”
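The kind of question Pang describes maps naturally onto a relational query. The following Python sketch uses the standard `sqlite3` module to show the idea. The `elements` table, its columns and the sample rows are all made up for illustration and do not reflect Kyos’ actual schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of tagged data elements (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE elements "
        "(patient_id TEXT, kind TEXT, value TEXT, taken_on TEXT)"
    )
    rows = [  # invented sample data
        ("p1", "blood_pressure", "120/80", "2007-03-01"),
        ("p1", "blood_pressure", "135/85", "2005-06-15"),
        ("p1", "blood_pressure", "140/90", "1995-01-10"),
        ("p1", "medication", "lisinopril", "2005-06-15"),
        ("p1", "procedure", "ECG", "2007-03-01"),
        ("p2", "blood_pressure", "110/70", "2006-02-02"),
    ]
    conn.executemany("INSERT INTO elements VALUES (?, ?, ?, ?)", rows)
    return conn


def last_readings(conn, patient_id, kind, since, limit=20):
    """Most recent `limit` elements of one kind on or after `since`, newest first."""
    sql = (
        "SELECT taken_on, value FROM elements "
        "WHERE patient_id = ? AND kind = ? AND taken_on >= ? "
        "ORDER BY taken_on DESC LIMIT ?"
    )
    return conn.execute(sql, (patient_id, kind, since, limit)).fetchall()


def related_elements(conn, patient_id, dates, kinds=("medication", "procedure")):
    """Elements of other kinds recorded for the patient on the same dates."""
    if not dates:
        return []
    date_marks = ",".join("?" * len(dates))
    kind_marks = ",".join("?" * len(kinds))
    sql = (
        "SELECT taken_on, kind, value FROM elements "
        f"WHERE patient_id = ? AND taken_on IN ({date_marks}) "
        f"AND kind IN ({kind_marks}) ORDER BY taken_on, kind"
    )
    return conn.execute(sql, (patient_id, *dates, *kinds)).fetchall()
```

Dates are stored as ISO-format text so that plain string comparison sorts them chronologically. A real system would likely link related records through explicit keys rather than matching on dates, which is used here only to keep the sketch short.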
With all that aggregated data, a doctor and patient can sit down together and monitor the patient’s progress.
To achieve that level of search capability, Kyos used algorithms originally developed for the study of genetics. When a new form comes into a system, it is not considered an exception; it is viewed as a mutation. In essence, the algorithms allow the form to evolve.
“Forms change for a reason,” Pang said. Generally it is because someone wants to add data to a form or create new data relationships. The system maps all the structural elements that constitute a form and builds semantic knowledge about what the form is designed to do.
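The notion of treating a changed form as a mutation rather than an exception can be shown with a deliberately simple Python sketch. It is not the genetics-derived algorithm Kyos uses, which the article does not detail. It only compares field names between versions, and the functions `diff_form` and `evolve_template` are hypothetical.

```python
def diff_form(known_fields, new_fields):
    """Describe how a new form's fields differ from a known version."""
    known, new = set(known_fields), set(new_fields)
    return {
        "added": sorted(new - known),
        "removed": sorted(known - new),
        "kept": sorted(known & new),
    }


def evolve_template(versions, new_fields):
    """Accept a changed form as a new version instead of rejecting it.

    `versions` is a list of field-name lists, oldest first. It is
    extended in place when the incoming form differs from the latest one.
    """
    change = diff_form(versions[-1], new_fields)
    if change["added"] or change["removed"]:
        versions.append(sorted(set(new_fields)))
    return change
```

In this sketch, a form that arrives with an extra field yields a non-empty `added` list and becomes the newest version of the template, while an identical form leaves the version history untouched.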
Kodak Co. has also addressed the need to manipulate scanned files by releasing its Capture Pro Software, which is designed to assist in complex capture jobs, said Craig Carlisle, a Kodak technical manager. Compatible with nearly all Kodak document scanners, Capture Pro Software provides improved methods for capturing and extracting information, including intelligent selective image display and post-scan image-processing features.
The software has single-click capture capability, which helps condense complex capture tasks to the push of a button or selection of a user-definable shortcut, such as Scan and Send to E-mail, Scan to PDF, Scan and Automatically Index, or Name and Output Image Files.
It also has automatic page-orientation correction to improve throughput and ensure that images are captured accurately. Its Batch Explorer gives a complete overview of what has been captured, which aids in the quick selection of documents.
Doug Beizer (dbeizer@1105govinfo.com) is a staff writer at Washington Technology.