I am thinking hard today about the problem of creating a physical book, with its attendant technologies, that correlates easily with its electronic counterpart, without simply creating a digital facsimile of the book (that is, I want to avoid simulating its physical technology to achieve correlation). From now on any time I say “book” or “codex” I mean the physical representation of the data as an object with bound pages.
The first step is to dissect the book in order to identify what is the text or content and what is part of the technology of the book. This is more complicated than it looks — my first guess was that any letters or numbers would be obviously part of the text, but this is not true. So let’s look at the pieces:
The page. This is an integral part of the book. It need have no relation to the text.
The table of contents. This is a part of the book’s technology but also probably a necessary part of the e-book’s technology. It is a way to rapidly skip to a relevant logical section of the text (chapter, heading, maybe even sub-heading). In the book it references a page. In the ebook it references a logical position (that is, the user doesn’t care how it’s referenced — she clicks and arrives — whereas in the book the user is part of this machine by manually converting a page number to a page by flipping through the codex).
The page number. This is book technology because pages are book technology.
The heading. This is part of the text. It allows the subdivision of the contents into logical texts that have a consistent context. This allows improved navigation (by table of contents), serves as a landmark within the text (through typographic differentiation from the body text), acts as a referent for discussing the text (I can say “Chapter 3 was especially clever,” and you know how to get there), and acts as a tool for the writer to organize thought.
The paragraph. This is the fundamental unit of text as a thought delivery mechanism. It is the smallest unit of the body text worth referring to explicitly. Note that the book usually provides references (for an exception, see any Bible) to a larger unit (the page) that is not textually relevant. The page reference is handy only because the book is a book. It’s a workaround for real referencing. Evidence? Every time you have had trouble finding an indexed item on a page.
The word and the letter. These are sub-units of the paragraph and only of typographic interest. Shared by all forms.
The line. An artifact of the book, usually, though verse makes the line a relevant part of the content. Note, however, that in the case of verse a versal line might span more than one book line. This indicates that it might be more valuable to think of the versal line as a paragraph. The line then becomes purely an artifact of physical printing and is not electronically interesting to the reader.
Illustration. Firmly part of the content.
Tables. Content.
Notes (end, side, or otherwise). Certainly part of the content and even books fail to find a consistent way to manage them. Ebooks similarly fail or succeed in different measures with notes. They are the deepest parenthesis we have, though, and need representation in all forms, and therefore are content.
Index. It is tempting to see the index as part of book technology because it refers to pages (artifact) and appears to be replaced by electronic searches. I think this is too shallow an observation, though. Certainly a bad or automatically generated index is no better than a search, but we should only be interested in a good index. A good index is a selection of words that are contextually interesting (that is, not all words) combined with a subset of all possible references in the text that represents the “best” places to find that word. It is a deliberate anticipation of the interests of the reader and of the ways in which the book will be used. This has electronic value either as a way to sort search results (indexer’s ranking before numerical ranking) or as a kind of inside-out table of contents (words linking to some reference in the text — a page in a book and something else in an ebook).
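A minimal sketch of the "indexer's ranking before numerical ranking" idea, assuming the text is already split into numbered paragraphs and the index maps a lower-case term to the paragraphs the indexer chose, best first. The function name and data shapes are invented for illustration and are not part of any ebook format.

```python
import re

def ranked_search(paragraphs, index, term):
    """Return paragraph numbers mentioning `term`, indexer's picks first.

    paragraphs: dict of paragraph number -> text (hypothetical shape)
    index: dict of lower-case term -> paragraph numbers in the indexer's order
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    # Plain search: every paragraph containing the term, in reading order.
    hits = [n for n in sorted(paragraphs) if pattern.search(paragraphs[n])]
    # The indexer's deliberate choices, kept in the indexer's order.
    picks = [n for n in index.get(term.lower(), []) if n in paragraphs]
    # Indexer's ranking first, then the remaining hits in numerical order.
    return picks + [n for n in hits if n not in picks]

paragraphs = {
    1: "An amphora is a tall jar.",
    2: "Traders shipped wine in an amphora.",
    3: "The amphora in the museum is cracked.",
}
index = {"amphora": [3, 1]}
print(ranked_search(paragraphs, index, "amphora"))  # [3, 1, 2]
```

A term the indexer left out simply falls back to an ordinary search in reading order.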
Now, what do we need to do with these elements? Here come the requirements. All representations of the data (book or ebook) shall:
…allow reading the author’s text in the order the author intended. Books have an order. Ebooks must have an order.
…allow readers to share the location of points of interest in the text. Books traditionally do this by page number. Ebooks so far fail. A successful scheme will work for both.
…deliver legible content to the consumer regardless of the consumer’s choice of technology. This is easy for the book because the user doesn’t choose the technology — she gets a book. This is also where the PDF falls apart — it makes assumptions about something that is intrinsically part of the user’s choice by requiring page shapes, font choices, and other elements that are artifacts of its simulation of a physical object.
…allow rapid motion throughout the content, forwards and backwards, with cues as to progress along the way. Books allow you to riffle through pages to find the rough location you recall seeing previously. This process is extremely fast, though pinning down the exact location isn’t. The actual underlying requirement here is that progressing an arbitrary distance through the text should be a roughly atomic action. Most e-readers fail us here: finding a spot much further in the text than the current location is usually very multi-modal (raise a menu, make a choice, type some text, click a thing), and this is antithetical to the desire of the user. This has to change in the reader technology. It hasn’t been a priority because the target of the current reader is the paperback, which is expected to be read only in one direction. For pedagogical or reference works (obviously I’m thinking of games here too) this is not nearly enough. The PDF comes through here with page thumbnails, and some similar kind of landmarking (or mapmaking, perhaps) would be something to explore on the reader side.
…allow simple reference from a table of contents item to its referent location in the text.
…allow simple reference from an index item to its referent location in the text.
Not all of these are addressable by manipulating the text. Internal referencing is, for example — the book solves it with page numbers and I can think of several ways to solve it in an ebook. I can only think of a few that will work for both and still correlate — give the same results regardless of how you have chosen to purchase and view the text. Some are purely technological and await smarter software.
But I promised to talk about correlation because that’s what I was thinking about, so finally here that is. You are at a party talking about your favourite text and you whip yours out and skip to a bookmark you made and find the note you took on the content and read out the bit and the observation to your friend. She gets out her book and asks you where that is exactly so she can read it herself and check to see if she made notes.
Scenario A: You have a book and she has a reader. You announce, “Page 123, about halfway down.” She is baffled. “I don’t have page numbers.”
Scenario B: You both have books. You announce, “Page 123, about halfway down.” She flips and smiles. “I never spotted that!”
Scenario C: You both have readers. You announce, “Location 4322 or so.” She is baffled. “Mine doesn’t use the same locations.”
Obviously A and C are unacceptable. Now, I can’t fix the technology from here but, as with the textual addition of codex technology we see in the page number, we can fix the text to allow a mutually comprehensible reference language:
Scenario A`: You have a book and she has a reader. You announce, “Amphora, 2:16,” because your running header says Amphora on the left page and Section Two on the right and the paragraph you have underlined has the number 16 beside it. She clicks to chapter Amphora and again on section two, then scans for the paragraph with “16” leading it. She smiles. “I never spotted that!”
Scenario B`: You both have books. You announce, “Page 123, about halfway down.” She flips and smiles. “I never spotted that!” Because books still need page numbers because they are books and have pages! But, alternatively, the same mechanism as in A` applies: perhaps she has a different printing or even a translation into a different language, and so you announce, “Amphora, 2:16,” because your running header says Amphora on the left page and Section Two on the right and the paragraph you have underlined has the number 16 beside it. She checks the table of contents and finds the heading for Amphora and the subheading for section two and goes to that page number. She scans for the paragraph with “16” leading it. She smiles. “I never spotted that!” Note that you cannot do this now. If your friend has the Penguin Classic of Crime and Punishment and you have a first edition in Russian, there is no useful correlation for the two of you.
Scenario C`: You both have readers. You announce, “Amphora, 2:16,” because your running header says Amphora Section 2 and the paragraph you have underlined has the number 16 beside it. She clicks to chapter Amphora and again on section two, then scans for the paragraph with “16” leading it. She smiles. “I never spotted that!” Alternatively, you just click “share by Bluetooth with Amelia” and she sees what you see.
So achieving correlation is not hard and can be done without adding a burden of selection on all parties. A simple typographic convention (like the page number) provides the functionality.
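To make the "Amphora, 2:16" convention concrete, here is a rough sketch of how reader software might turn such a spoken reference into a location. It assumes the text is stored as chapter, then section number, then paragraph number; that layout, and the names `parse_reference` and `resolve`, are made up for illustration. A person with a codex does the same lookup by hand through the table of contents.

```python
import re

# "Chapter name, section:paragraph", for example "Amphora, 2:16".
REF = re.compile(r"^\s*(?P<chapter>.+?),\s*(?P<section>\d+):(?P<paragraph>\d+)\s*$")

def parse_reference(ref):
    """Split a reference into (chapter, section, paragraph)."""
    m = REF.match(ref)
    if m is None:
        raise ValueError(f"not a 'Chapter, section:paragraph' reference: {ref!r}")
    return m["chapter"], int(m["section"]), int(m["paragraph"])

def resolve(book, ref):
    """Look the reference up in a nested dict (hypothetical storage layout)."""
    chapter, section, paragraph = parse_reference(ref)
    return book[chapter][section][paragraph]

book = {"Amphora": {2: {16: "The passage you underlined."}}}
print(parse_reference("Amphora, 2:16"))  # ('Amphora', 2, 16)
print(resolve(book, "Amphora, 2:16"))    # The passage you underlined.
```

Nothing in the reference depends on page size, font, or device, which is the point of the convention.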
–BMurray
I’m nodding a lot here.
Especially on the “allow rapid motion throughout the content” failure of e-products. I’m thinking also about how some rapid page flipping can let you compare two “distant” segments of the text side-by-side. Most (perhaps all) e-readers don’t let you show two disparate positions in the content at the same time, analogous to “sticking your index finger in page 123, then thumbing over to page 97 to look at the time increments table, then lifting up page 123 to see the power description that talks about manipulating something X time increments”.
Multi position viewing is one way to address this, but even more radical in e-content would be somehow *embedding* the time increments table in a pop-up window every time it’s mentioned — or even empowering the reader to edit and annotate the document in a way that allows him to reorder/duplicate the content in useful ways peculiar to his own concept of utility.
I think font (and indeed a great deal of layout) is part of the art of making a book. I don’t think it necessarily has the same place in an electronic form (we’ll want to find new things that do allow artistic expression of course!) I can see an advantage in giving tactical cues (emphasize this, use a different family here than there) but to tie ourselves to font selection in electronic books has a minefield of issues associated with it, not the least of which is font licensing — creatively, the font itself has the same copyright burden as illustration.
When I choose a font for a book I can make assumptions about physical presentation — print resolution, page size and shape, whitespace, and so on. The electronic reader places all these choices in the hands of the user and so font selection must (sadly!) reside there also.
Fred has identified my pet peeve with digital publications, and I didn’t even know I had one, until I read his post. It would be ideal if by pointing to, or hovering over a word or phrase, the ‘device’ would automatically bring up additional related content, that could then be marked, annotated, re-ordered, archived etc…
That would be a huge advance in reader technology and not incredibly hard to do. Not something that the content provider can make happen, sadly, but something that would lift the device from paperback audience (for which it is wonderful) to general-purpose (research, pedagogical, reference) audiences. Very cool idea.
I’m not completely sold on the idea of the “new world” giving us a complete and total freedom from font/presentation elements. We had that for a time before you could inflict your own style and layout choices in webpages, and frankly it was something of a barrier to commercial interest in that.
Commercial interest — a potential prime driver of technology in this case, at least as far as adoption goes — is going to demand some preservation of presentation control, because presentation control allows the commercial entities to enforce brand awareness and to control how the commercial value of the content is delivered to the end consumer.
TiVo crushed the rest of the DVR market early on because they deliberately worked on strategies to play well with the commercial interests of advertisers (which is why a 30-second skip option on TiVos is a deeply buried feature and not enabled by default, among other things). So ceding some amount of presentation control to commercial interests is going to better enable the rest of the ideal implementation of our hypothetical badass e-reader of tomorrow.
These are all good observations, but couldn’t most of these issues be solved by improving the ability of e-readers to handle PDF formats?
I understand that you don’t want simply a digital facsimile, but it does seem that this is the best way to get a correlation, especially since it can be done improving the already available technology, rather than (re-)inventing entirely new ways of doing essentially the same thing.
Fred, yes, I think that there is an artistic and commercial interest that is not well addressed. However, pointing to web pages is an illuminating example: your web page cannot tell me what font to use, and yet we see a proliferation of web pages (commercial, some of them!). What it can do is offer hints, and this would be a good addition to ebook formats (and, frankly, it’s already in most of them). Your web page can say “use Garamond and, failing that, any serifed font” but the only way it can say “these letters look exactly like this” is to send an image rather than logical text. That is also available in ebook formats. I think what is missing is solely sophistication in the producers of electronic text — right now they are mostly automatic (and naive) conversions from old data.
Lanfranc, my issue with a facsimile is that it fails to take advantage of the possibilities inherent in unpaginated text. Paginated text is, logically, a subset of unpaginated text and therefore less powerful intrinsically. Adobe itself chose to back “reflowable” formats (ePub) rather than retrofit PDF precisely because (I think) a correctly retooled PDF would in fact be a reflowable format, and patching an already aging construct was probably less viable for them than building from the foundation of an intrinsically more powerful idea.
I don’t think I was clear there. Yes, commercial interests are a barrier to the success of a “better” new technology and this has always been the case. Frequently, however, a wave of failures (cf. Xerox) is followed (when the time is right commercially) by sudden adoption (cf. Windows and Mac). The web is already pointing the way with font and layout: all of what is done on the web is friendly to a hypothetical e-reader of today.
Sorry, not all of what happens on the web. Whenever we make assumptions about absolute space (pixels) we design badly for the web but we design anyway. The same criteria for design quality would apply and be abused for any reflowable content device.
Just realised that a paragraph number as an index into the text needs to be uniquely searchable if it’s to be independent from reader software functions. That means a unique character (already have a pilcrow I guess) and an increasing number (never reset to 1 in the same book). Or leading zeroes and increasing.
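A small sketch of why the leading zeroes matter, assuming a pilcrow followed by a fixed-width, never-reset number. The width of four digits is an arbitrary choice here and would need to be large enough for the whole book.

```python
def marker(n, width=4):
    """Fixed-width paragraph marker, e.g. ¶0016 (the width is an assumption)."""
    return f"¶{n:0{width}d}"

def mark(paragraphs, width=4):
    """Prefix each paragraph with an increasing marker that never resets."""
    return "\n".join(f"{marker(n, width)} {p}"
                     for n, p in enumerate(paragraphs, start=1))

paragraphs = [f"Paragraph text {n}." for n in range(1, 21)]
text = mark(paragraphs)

# A plain text search for the padded marker matches exactly one place.
print(text.count(marker(2)))  # 1

# Without padding, a search for ¶2 also lands on ¶20.
unpadded = "\n".join(f"¶{n} {p}" for n, p in enumerate(paragraphs, start=1))
print(unpadded.count("¶2"))   # 2
```

Any reader with an ordinary find function can then jump to a marker, with no special support for references.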
There is still some assumption about content built into the Book, Chapter, and Verse style of content markup, which is that the Verse that is marked up by a number is going to be a significant amount of text and so the markup will not have too much overhead. Visually, a paragraph number is not very much noise while reading if the paragraphs are large, but there are classes of content that don’t match that assumption.
The example I have foremost in mind is large sections of dialogue, where the paragraph markers would become line markers and so become fairly intrusive to the text. I imagine there are other situations where the resolution of the paragraph is inconvenient as a reference (such as anything written by John Locke). Formatting issues could alleviate the noise (consider the contemporary example set by ruby characters).
That said, it’s a small niggle of an insightful look at ways to break ebooks out of their book heritage. I keep getting excited at the thought of putting a PDF on my iPod Touch for easy reference on the go, but the reality is that it’s a royal pain to actually reference them in that format.
This post and the last one got me thinking about HTML again. It’s overlooked quite a bit, because it’s foundational. No one walks into a new house and thinks first off, “Wow! I bet there’s a great concrete slab under here!” But HTML is, pardon the nerdgasm, one of the most impressive communication technologies ever devised.
And it was devised by academics (and eventually commercial interests) too, for exactly these kinds of problems. How do you point out a “page” when all the content is in one stream? Hyperlinks. How do you enforce layout, font, color, and style when the user can manipulate the raw text all they want? You use CSS to implement a default style.
Admittedly it has limits and is not perfect, but it’s a solid model to move from. You seem to touch on this a bit in the comments. Another interesting element of HTML is the ability to hide data and manipulate it. Notes needn’t be at the bottom, side, or appendix. They can exist “behind” their marker in the text, only to appear when the user focuses on them. Page numbers, line numbers, even word numbers can all exist as hidden information; a rich backdrop of meta-content to be explored as needed by the users (by users I mean content providers and consumers).
But I think, in a lot of ways, Google was right: it’s all about search. I feel like “Check page 123” or “Go to section 3:14” are both archaic and artificial to how we naturally approach text. Closer is “it’s in the Ship section, near the part about lasers.” Search technology turns that phrasing into a position in the text through something like “Ship : Lasers : Cool Thing”. It’s not referential referencing, but contextual referencing.
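As a very rough sketch of what "contextual referencing" could mean mechanically: narrow by chapter heading, then by section heading, then search for the remembered phrase. The flat list of (chapter, section, text) tuples and the function name are assumptions for illustration, and real search would need fuzzier matching than an exact substring.

```python
def resolve_context(paragraphs, path):
    """Turn 'Chapter : Section : phrase' into a position in the text.

    paragraphs: list of (chapter, section, text) tuples (hypothetical shape)
    Returns the index of the first matching paragraph, or None.
    Raises ValueError if the path does not have exactly three parts.
    """
    chapter, section, phrase = (part.strip().lower() for part in path.split(":"))
    for position, (ch, sec, text) in enumerate(paragraphs):
        if (ch.lower() == chapter and sec.lower() == section
                and phrase in text.lower()):
            return position
    return None

paragraphs = [
    ("Ship", "Hull", "The hull is rated for deep space."),
    ("Ship", "Lasers", "Lasers draw power from the main reactor."),
    ("Ship", "Lasers", "The cool thing about lasers is the recoil: none."),
    ("Crew", "Lasers", "Crew sidearms are also lasers, a cool thing."),
]
print(resolve_context(paragraphs, "Ship : Lasers : Cool Thing"))  # 2
```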
One more thought on correlation that Toph pointed out. There is a real case he reviewed of a German book considered canonical and its English translation. As this is a book of academic interest, referring to its contents requires somehow referring to the location in the canonical (the German) form even if you only have the English. The English translation solves this by placing a marker in the English text every time the German text changes page. So the text:
“This is the English translation [177] of some German text.”
…indicates that in the German, page 177 begins with the German for “of some German text.”
That’s pretty elegant. It’s partly cool because it moves backwards — rather than require a new markup for books, it only places that onus on new material. That smells “right” to me.
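The bracketed page markers are easy to use mechanically too. This sketch assumes markers look exactly like "[177]", that no other bracketed numbers (citations, say) appear in the text, and that the caller knows which canonical page the excerpt starts on. The function name and the starting page of 176 are illustrative.

```python
import re

MARKER = re.compile(r"\[(\d+)\]")

def canonical_page(text, phrase, first_page):
    """Return the canonical (source edition) page on which `phrase` falls.

    first_page is the canonical page in effect before the first marker.
    Returns None if the phrase does not occur in the text.
    """
    offset = text.find(phrase)
    if offset < 0:
        return None
    page = first_page
    for m in MARKER.finditer(text):
        if m.start() > offset:
            break  # this marker comes after the phrase
        page = int(m.group(1))
    return page

text = "This is the English translation [177] of some German text."
print(canonical_page(text, "English translation", first_page=176))  # 176
print(canonical_page(text, "German text", first_page=176))          # 177
```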
I find it odd that you don’t include ‘font’ as an integral element of content. Surely, given your own passion for selecting ‘the right font’, it should be part of the artistic product, rather than a choice to be made by the reader.