From microfilm to PDF
ProQuest is a key partner for newspapers, universities, libraries, and other content holders, preserving and enabling access to their rich and varied information. Those worldwide partnerships have built a growing content collection that now encompasses 90,000 authoritative sources and six billion digital pages, spanning 600 years.
Over the past decade, the department that digitizes newspapers and archives them to microfilm underwent a transformation: its workflow shifted from scanning hardcopy pages to receiving content in digital (PDF) form. Although this was an advantage, it was also a complete change in the production workflow, and it made expanding the service a challenge.
“Once we started to receive PDF pages we had to change the way we worked,” said Rick Griffith, Director of Manufacturing at ProQuest. “PDFs needed to be downloaded, organized and RIP’d for imaging, manually by an operator,” continued Marc Yeiter, Applications Administrator.
With this manual workflow, two full-time employees were “maxed out” and could process only ten titles. Soon after, one of ProQuest’s largest clients agreed to start sending PDFs instead of printed pages for the 80 titles it published. At that point, it was obvious to ProQuest that it had to find ways to automate or it would be buried trying to keep up with its clients’ move to PDF submission.
Automation was a must
Marc Yeiter, Applications Administrator, was tasked with investigating automation options. Looking first at some of the enterprise-level automation options, Marc quickly determined that these solutions were not the right fit. The initial investment was too high, and the team needed a solution that was scalable and that they could maintain themselves.
Marc soon discovered Enfocus Switch while investigating upgrades to their Enfocus PitStop Server. He could see that Switch was more workflow-centric and looked very easy to administer, whereas the enterprise systems they had investigated were much more complex. In addition, Marc was impressed by Switch's open approach, which suggested it could be adapted to their unique workflow. They decided to learn more and contacted Enfocus.
After some communication with Enfocus, All Systems Integration (ASI) was recommended for their automation project. ASI is an Enfocus certified integration partner located outside of Boston, Massachusetts. Working with ASI, they built an example of their current workflow in Switch. From this introduction, the ProQuest team was able to see clearly how their production could be improved.
“It was important that Switch was able to address our current workflow. After a few meetings with ASI, we felt very comfortable that Switch was going to do what we needed and it was in our budget.”
Starting with the New York Times, ProQuest began rolling out its automated workflows. Switch took over the tedious tasks that operators had previously handled, such as receiving PDFs from the FTP server and sorting them into editions.
“In the beginning, our staff were skeptical that the Switch workflow could sort the editions in the right order. Every page would come in as a single page PDF and would have to be sorted based on a spreadsheet that described the order for that edition,” said Rick.
With the help of ASI, they used some of the advanced features of Switch to read the spreadsheet file and sort the incoming page files into the correct order automatically. In addition, since every publisher had a different set-up, ASI trained ProQuest to modify the workflow for other publishers. This way ProQuest could manage the system themselves and add more clients and editions as needed. This was a key advantage for ProQuest, as they didn’t want to rely on an outside vendor to expand their automation system every time a new publisher came on board.
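The sorting step described above can be illustrated with a minimal Python sketch. It is not the Switch configuration ASI built, and it does not use any Switch API. It assumes the ordering spreadsheet has been exported as CSV with two hypothetical columns, `filename` and `position`, and it shows only the core logic: place known pages in edition order and report pages that are unexpected or still missing.

```python
import csv
import io


def load_page_order(csv_text):
    """Map each page file name to its position in the edition.

    Assumes hypothetical CSV columns named 'filename' and 'position'.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["filename"].strip(): int(row["position"]) for row in reader}


def sort_edition(incoming_files, page_order):
    """Return (ordered, unknown, missing) for one edition.

    ordered: incoming files listed in the spreadsheet, in edition order
    unknown: incoming files the spreadsheet does not mention
    missing: files the spreadsheet expects that have not arrived
    """
    known = [f for f in incoming_files if f in page_order]
    ordered = sorted(known, key=lambda f: page_order[f])
    unknown = sorted(f for f in incoming_files if f not in page_order)
    missing = sorted(
        set(page_order) - set(incoming_files),
        key=lambda f: page_order[f],
    )
    return ordered, unknown, missing


# Illustrative data only; the file names are made up.
order_csv = "filename,position\nA1.pdf,1\nA2.pdf,2\nB1.pdf,3\n"
incoming = ["B1.pdf", "A2.pdf", "X9.pdf"]

ordered, unknown, missing = sort_edition(incoming, load_page_order(order_csv))
print(ordered, unknown, missing)
```

Reporting unknown and missing pages separately lets an operator check exceptions instead of every page, which fits the shift toward quality control that the case study describes.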
Soon after implementing their first client, the staff fully embraced the new automation system, and ProQuest moved forward with bringing on other clients such as Gannett Publishing, which had 80 newspapers under its brand. Today, ProQuest has five employees processing 150 titles, with the capacity to add more titles without increasing staff.
The benefits of Switch were very clear for ProQuest. Switch allows them to expand their services quickly and respond to changes in customer requirements. Their employees focus on quality rather than spending time on mundane tasks, such as moving files on the network. Plus, Switch is flexible and scalable enough that they can expand their workflows to accommodate other products and create new ones, helping to keep their services innovative and fresh.
Soon after getting their workflow operating for their newspaper microfilm services, they turned their focus to automating their dissertation service. Today, ProQuest runs two Switch servers internally processing over 1.2 million pages a month.
ProQuest has also implemented a third Switch server to process incoming XML data files. With the open architecture of Switch, ASI was able to build a full-stack web app, WorkIT, that tightly integrates with Switch. Switch assembles the appropriate publications for creating each microfiche. Then, it populates the custom WorkIT database, which manages the entire process. Management can see real-time updates on jobs as they go through the system. Multiple Switch servers not only give ProQuest additional capacity but also provide redundancy for all of the Switch processes.
Looking to the future, the ProQuest team is planning to expand their Switch automation into their hardcopy digitization work. In these workflows, scanned images will need to be sorted, run through OCR to make them searchable, and processed for microfiche and for the web. Today, ProQuest scans about 24 million pages a year, and soon Switch will be a key part of this workflow too.