Linearization and Byte Serving Explained

Linearization is a feature of PDF and byte serving is a feature of HTTP.

Byte serving means that an HTTP request can specify which bytes of a file it wants (using the Range request header), and the server will deliver only that range.
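To make the request side concrete, here is a minimal Python sketch of a range request. The helper names (build_range_header, parse_content_range, fetch_range) are made up for this illustration, and the URL is whatever the caller supplies; only the Range and Content-Range headers and the 206 status code come from HTTP itself.

```python
import re
import urllib.request


def build_range_header(first_byte: int, last_byte: int) -> str:
    """Return the value of an HTTP Range header for an inclusive byte range."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("expected 0 <= first_byte <= last_byte")
    return f"bytes={first_byte}-{last_byte}"


def parse_content_range(value: str):
    """Parse 'bytes 0-1023/146515' into (first, last, total).

    total is None when the server reports the length as '*'.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unrecognized Content-Range: {value!r}")
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)


def fetch_range(url: str, first_byte: int, last_byte: int) -> bytes:
    """Request one byte range from url (hypothetical helper, not called here)."""
    request = urllib.request.Request(
        url, headers={"Range": build_range_header(first_byte, last_byte)}
    )
    with urllib.request.urlopen(request) as response:
        # 206 Partial Content means the server honored the range;
        # 200 means it ignored the header and is sending the whole file.
        if response.status != 206:
            raise RuntimeError(f"range not honored, status {response.status}")
        return response.read()
```

Asking for the first kilobyte of a file would send "Range: bytes=0-1023", and a server that does byte serving would answer with status 206 and a Content-Range header describing what it returned.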

A PDF file consists of a collection of arbitrarily linked objects. To be able to find these objects, their byte offsets from the start of the file are stored in one or more cross-reference tables. The cross-reference tables are located through an entry (startxref) at the very end of the PDF file.
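As a rough illustration of why the end of the file matters, the following Python sketch pulls the cross-reference offset out of the last bytes of a PDF. The function name find_startxref and the 1024-byte tail size are assumptions made for this example, and the regular expression is a simplification rather than a real PDF parser.

```python
import re


def find_startxref(pdf_bytes: bytes, tail_size: int = 1024) -> int:
    """Return the byte offset stored after the last 'startxref' keyword.

    Only the final tail_size bytes are searched, because the entry
    sits just before the %%EOF marker at the end of the file.
    """
    tail = pdf_bytes[-tail_size:]
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not matches:
        raise ValueError("no startxref entry found near the end of the file")
    # An incrementally updated file can carry several entries; the last wins.
    return int(matches[-1])


# A tiny hand-made file, just enough to exercise the function.
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< >>\nendobj\n"
    b"xref\n0 1\n0000000000 65535 f \n"
    b"trailer\n<< /Size 1 >>\n"
    b"startxref\n30\n%%EOF\n"
)
xref_offset = find_startxref(sample)
```

In the sample, the offset read from the tail points at the byte where the "xref" keyword begins, which is the lookup a reader has to do before it can find any other object.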

What linearization does is add hint tables that record where the objects for each page are located. The hint information is placed near the top of the file, and the linearization dictionary that points to it must appear within the first 1024 bytes to be recognized. As you navigate the file across the web, information from the hint tables is used to work out which byte ranges to request over HTTP.
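A reader can tell whether a file is linearized by looking only at its first 1024 bytes. The Python sketch below does that with regular expressions; read_linearization_params is a name invented for this example, and it extracts only three of the dictionary entries (/L for the file length, /N for the page count, /H for the offset and length of the primary hint stream). Treat it as an approximation, since a proper reader would parse the dictionary rather than pattern-match it.

```python
import re


def read_linearization_params(head: bytes):
    """Return selected linearization entries, or None if not linearized.

    head is the start of the file; only its first 1024 bytes are examined.
    """
    window = head[:1024]
    if re.search(rb"/Linearized\s+[\d.]+", window) is None:
        return None

    params = {}
    # /L is the total file length, /N the number of pages.
    for key in (b"L", b"N"):
        match = re.search(rb"/" + key + rb"\s+(\d+)", window)
        if match:
            params[key.decode("ascii")] = int(match.group(1))
    # /H holds [offset length] of the primary hint stream.
    hint = re.search(rb"/H\s*\[\s*(\d+)\s+(\d+)", window)
    if hint:
        params["H"] = (int(hint.group(1)), int(hint.group(2)))
    return params


# Made-up header bytes for illustration; the numbers are not from a real file.
head = (
    b"%PDF-1.4\n1 0 obj\n"
    b"<< /Linearized 1 /L 146515 /H [ 1024 220 ] /O 3 /E 9000 /N 12 /T 146000 >>\n"
    b"endobj\n"
)
params = read_linearization_params(head)
```

With the made-up header above, the hint stream would occupy bytes 1024 through 1243 of the file, which is the kind of range a viewer would then ask the server for.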

Without linearization, a reader has to get to the end of the file before it can determine the file's structure.

It is all described in great detail in the PDF Reference doc.

How do you use it?

Byte serving is something that needs to be enabled on the web server.
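One way to see whether a server has it turned on is to look at the Accept-Ranges response header. The Python sketch below checks a set of response headers for it; advertises_byte_serving is a hypothetical helper name, and the headers are passed in as a plain dict so the check can be tried without a network connection. Note that the header is advisory: a server may honor range requests without advertising them, so a negative result here is not conclusive.

```python
def advertises_byte_serving(headers: dict) -> bool:
    """Return True if the response headers include 'Accept-Ranges: bytes'."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value for name, value in headers.items()}
    units = normalized.get("accept-ranges", "")
    return "bytes" in [unit.strip().lower() for unit in units.split(",")]


# Example header sets, written by hand for illustration.
with_ranges = {"Content-Type": "application/pdf", "Accept-Ranges": "bytes"}
without_ranges = {"Content-Type": "application/pdf", "Accept-Ranges": "none"}
```

The headers from a HEAD request to the PDF's URL could be passed through this check before deciding whether range requests are worth attempting.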

Linearization happens for you when you select "Optimize for Fast Web View" while distilling or saving the file in Acrobat.


Is there any way to serve up only a portion of a PDF file at a time to a remote client reader? For instance, if the user goes to page 10 of a document, can just that page be served?

In theory, that's what "linearization" (aka Fast Web View) is supposed to provide when used in conjunction with a modern web server and a properly configured web browser.

Unfortunately, not all software that performs linearization of PDFs does it correctly, not all web servers are 100% compliant with the "byte-serving" spec, and not everyone's browser and/or Acrobat is set up to "do the right thing".

As defined, PDF is a random access format and therefore the entire PDF file needs to be present before viewing any of it.

Actually, the random access nature isn't what requires that (anymore). Rather, it's the fact that in a normal PDF the cross-reference table is at the end, not the beginning. That's the big thing that linearization brings: a cross-reference table at the beginning, where it can be read right away.

Also, HTTP/1.1 has this concept of "byte-serving", whereby a client can request a range of bytes from the server (instead of all bytes, the default). This is how the PDF plugin can read selected parts of the PDF.