Papers
Linearization and Byte Serving Explained: Linearization is a feature of PDF and byte serving is a feature of HTTP.
Text Storage and Text Indexing: Traditional full-text indexing schemes extract the text for searching but fail to extract page layout and word coordinate information, so the mapping of text to image is lost.
SearchPDF for Millions of Documents: SearchPDF can be used to create a searchable index of millions of documents when the documents are correctly organized for optimal performance. This paper discusses the organization of documents and search-indexes when the document collection is in the millions.
Highlight File Format - Adobe Technical Note 5172: This technical note describes the file format and URL specification that allows a Web server to highlight text in a PDF file being displayed by version 3.0 and above of the Adobe Acrobat Viewers (Acrobat and Reader).
XMP in PDF Metadata Dictionaries and Streams: Metadata can be stored in a PDF document in either of the following ways: In a document information dictionary associated with the document. In a metadata stream (PDF 1.4) associated with the document or a component of the document.
SearchPDF Client Platform Requirements: SearchPDF requires no special client software other than a web browser and the Acrobat Reader. However, there is a weakness on Mac OS X, and other installation and configuration issues for the Acrobat Reader need to be considered. Includes a complete listing of Acrobat Reader downloads.
User Authentication for SearchPDF and PDFMetamaker: User Authentication can be added to SearchPDF and PDFMetamaker using any of a large number of authentication product add-ons for Microsoft IIS.  User Authentication is reviewed here, including a featured authentication product and a live demonstration.
Text-Image Maps for Document Delivery on the Web: Text-image maps are a valuable tool for coordinating the full text and page images of documents in electronic libraries. In different situations, either the text or the image is a more convenient representation -- and text-image maps make it easy to change between representations as the situation dictates.
FineReader Inside: FineReader is the first OCR system of an entirely new generation substantially different from all its predecessors. The object of recognition for FineReader is the entire document and not just printed text. This has become possible thanks to ABBYY’s sophisticated recognition technology based on the principles of Integrity, Purposefulness and Adaptivity (IPA Technology).
A Diversity of Languages: It is a well-known fact that hundreds of languages are spoken around the world, but we in the digital publishing and document imaging industries often do not pause to consider that written and published documents exist in all of these languages as well. We have considered this need and offer the PDFPublish product with the capability of OCRing in 176 languages.
PDF Viewing Tip: Did you know that it is possible to change the Search Hit Highlighting color in Windows?
PDF WebSearch at MIT: Division 3 of MIT - Lincoln Laboratory, located in Lexington, MA, has implemented an innovative solution for document management on its corporate intranet. Web-based document management is seamlessly integrated with calendar and project management. Document management is now fully automated, with users making document submissions directly into the system.
Intelligent Documents for the Web: While HTML files are native to the web and web browsers, other file formats are not. A few file formats besides HTML can be displayed natively by the web browser. Others require web browser plug-ins. Each format behaves differently on the web because of differences in both the architecture of the file format and the way it is managed inside the web browser. In this mixed and evolving environment, SearchPDF is putting forward operational standards for what constitutes "intelligent documents" for the web.
Metadata at the Sub-Document Level: Intelligent documents, such as PDF and DjVu, can contain multiple pages. There is intelligence to be encapsulated in metadata at the document-level, and in many cases there is also intelligence to be encapsulated at the sub-document level.
Formerly Known As...: Adobe terminology for PDF, then and now.
PDF Migration from CD-ROM to WEB: Digital Publications, once entirely associated with CDs, will move to the Web and to DVDs.  The Web will dominate.  The competitive advantage will be in how rich and effective the delivery can be.  SearchPDF can take you to the leading edge today.
"Browse Before You Buy" & "Take It With You": The concepts of "Browse Before You Buy" & "Take It With You" are central to purchasing books in a bookstore, yet they are missing from the selling of books on the web.  That is, until now!
Adobe Support Knowledgebase Document 321627 - Searching the Contents of PDF Files on a Web Site: While Adobe Acrobat 4.x and later and other Adobe products enable you to create PDF files that are optimal for posting on a Web site, Adobe Systems does not make a search engine that enables you to search the contents of all PDF files on a Web site. Therefore, you can't search the contents of a site's PDF files the same way you can search a site's HTML pages using a Web search engine (e.g., Hotbot or Yahoo). To search the contents of PDF files on a Web site, you can use Acrobat's Find and Search commands on individual PDF files as you view them, use a third-party PDF search engine, or use the Adobe PDF IFilter plug-in for Microsoft Index Server.
Developing and Deploying Web Applications using ASP and Microsoft Internet Information Server: With IIS 4.0, Microsoft introduces a new paradigm to the Web--transactional applications. Transactions are the plumbing that now makes it possible to run real business applications with rapid development, easy scalability and ATOMIC reliability.
Acrobat Catalog Problems and Workarounds
Precision In Searching: Precision In Searching means using the shortest, fastest way to find exactly what you are looking for. It also means the ability to easily modify and refine a search so you can zero in on your target.
MetaFeatures Comparison Chart - PDF, DjVu, LuraDocument
dtSearch Case Study: dtSearch Reviews PDF WebSearch  8-29-2000
Competing Search Products: Like PDF WebSearch and dtSearch Web, ISYS Web and Verity Search both support search-term-highlighting and hit-page-navigation.  RetrievalWare WebExpress was a disappointment.  When you launch the PDF file, it flashes a message saying "Creating Highlighted Document and Jumping To It..." but then the PDF file opens without any highlighting.
Comparison of Excalibur and PDF WebSearch Boolean Commands
XML and ASP.NET: XML and ASP.NET are promising technologies for the evolution of the SearchPDF product.
Plugin Detection Test
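
As a brief illustration of the byte serving mentioned in the first paper above: byte serving relies on HTTP range requests, in which a client asks for part of a file with a Range header and the server answers with a partial response whose Content-Range header names the bytes returned. The sketch below is only an assumption-laden illustration in Python, not code from any of the papers or from SearchPDF; the helper names range_header and parse_content_range are hypothetical.

```python
import re


def range_header(first_byte: int, last_byte: int) -> dict:
    """Build the request header asking for an inclusive byte range."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    return {"Range": f"bytes={first_byte}-{last_byte}"}


def parse_content_range(value: str) -> tuple:
    """Parse 'bytes first-last/total' as sent with a partial response.

    Returns (first, last, total); total is None when the server
    reports the full length as '*' (unknown).
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized Content-Range: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total
```

A viewer that wants only the first kilobyte of a PDF could send range_header(0, 1023) and then use parse_content_range on the reply to learn the total file size before requesting further ranges.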