Erik Benson's wonderful All Consuming book site continues to delight me. The newest feature, First Line Trivia, presents the first line of a book on each refresh of the home page. You try to guess the book, and click through to see the answer. Members, who can edit book metadata, add the first-line data, IMDb-style. Example:
First Line Trivia
"This is a tale of two cities. Cities of the near future, say 10 or 20 years from now."
This could get addictive!
The first-line-trivia feature pushed me over the activation threshold, and I registered for the site. As a member, you can create a list of friends, which is seeded for you with candidates gleaned from Google's what's related and bl.ogs' related blogs. When friends add books to their All Consuming lists, you can receive them as Web (and optionally email) recommendations--and vice versa, your list can recommend books to them.
In fact, I'm unlikely to maintain an explicit book list because the blog universe that All Consuming inhabits already disseminates book awareness very effectively. Bloggers mention books on their blogs; All Consuming picks up on those references; its RSS feed brings them to my attention.
I'm surprised that there isn't more chatter about All Consuming on the weblogs I read. Increasingly, when I link to a book, I'm now likely to offer its All Consuming URL rather than its Amazon URL. Of course, as I just realized when reading this interview, Erik is an Amazon employee. Perfect! All Consuming is, in my view, one of the cleverest imaginable marketing schemes for Amazon--and for books in general. More and more books are available, at ever higher prices, but fewer and fewer people read. Boosting demand is the only hope for publishing, and Erik's service does that magnificently well. I'm more aware of books now than I have been in years. And since All Consuming's URIs are compatible with LibraryLookup, it's easier than ever to satisfy the increased demand.
Update
When you use All Consuming, you may be surprised by unintended consequences. The other day, I was puzzled to see it attribute to me a reference to a book I hadn't mentioned on this blog. I wrote to Erik about it, and he was stumped for a while too. Then he realized that it must have been something my Google box found and made available to All Consuming. It just now happened again. I put the phrase "all consuming" into my Google box. A few minutes later, All Consuming attributed a reference to Affluenza: The All-Consuming Epidemic to my blog. I find these spontaneous interactions fascinating and delightful. I can foresee, though, that a time will come when we'll want to be able to control these effects--for example, by applying robots.txt-like technology at the level of page components.
The most compelling effect in Minority Report, for me, was the visualization of active paper. Last night we watched it again, and later some friends dropped by. To put this in context, I live in smalltown New Hampshire, not Silicon Valley or Silicon Alley. There is lots of dial-up Internet happening here, and DSL is growing, but Wi-Fi households are rare. When a topic came up in conversation, and I flipped open the TiBook to check it out, I had an epiphany. The future really is here, albeit not evenly distributed. I didn't mention, and I'm sure it didn't occur to my friends, that I was connecting wirelessly to the Internet. It seemed completely natural that "the Internet" would be "in" this little box, whether or not wires were running to it. The technology is disappearing into the woodwork, as it should. It is becoming a small-i internet.
The emergence of Wi-Fi really has to be the story of the year. I'm currently reading The Wireless Networking Starter Kit, an excellent primer. The authors, Adam Engst and Glenn Fleishman, explaining how and why Wi-Fi is transformative, finally conclude: "It's just freaking cool." Amen to that!
The recent discussion about active intermediaries (Sam Ruby, Phil Windley) sent me in an unexpected direction. What I meant to do was revisit some earlier writing on Web proxies, email proxies, and SOAP routing, and try to draw some conclusions. Instead, I invented another bookmarklet.
Here was the problem. It's nice that I can now look up a book in my local library, but what if it's not in the collection? My library's OPAC (online public access catalog) enables you to ask the library to acquire a book, but the required fill-in form creates an activation threshold that I am rarely motivated to leap over.
The basic LibraryLookup bookmarklet is a kind of intermediary. It coordinates two classes of services--Amazon/BN/isbn.nu/AllConsuming and your local library's OPAC--to facilitate a lookup. I couldn't resist trying to create another intermediary that would facilitate a purchase request.
The solution I'll present here is less general than the basic lookup in several ways, but also interesting in several ways. Here are the ways in which it is less general:
Amazon-only. The basic lookup works with any site whose URL matches either /ISBN or isbn=ISBN. But to fill out a purchase request, more information is needed. This solution relies on Amazon-specific markup to find that information.
Innovative-only. The basic lookup works with any of four OPAC systems (and potentially others, as users discover and report the URI patterns that can enable them). But since I only have an account at my own library, which uses an Innovative OPAC, that's the only case I could try. Further, the solution is likely not to work with your Innovative OPAC. A little spelunking reveals that the /acquire function (e.g. http://your.library.baseurl/acquire) produces differently-constituted forms from one Innovative OPAC to the next. Sometimes name/password, sometimes PIN and library-card number, etc. Mine uses name and library-card number.
Nevertheless, here are the reasons I find the solution interesting.
It works. Specifically, it works for my library, but the geek-inclined should find it easy to adapt it to another Innovative OPAC, and--presumably--to other OPACs.
It's a simple but compelling demonstration of the JavaScript DOM.
It uses JavaScript to set and get Amazon cookies. I'm sure JS hackers take this for granted, but I've never had occasion to try it.
It's a live example of the technique (which Derek Robinson mentioned to me and Art Rhyno showed me) that removes the MSIE bookmarklet size limit.
Intermediating a library purchase request
Here is the bookmarklet you can drag to your link toolbar:
Please Acquire
Here is an Amazon page against which to test it: The Eighth Day of Creation: Makers of the Revolution in Biology.
Clicking the bookmarklet's link should bring up a screen like the one shown here. It's OK to click the button. I've neutered the script so it will just pop up a message rather than send the request. To unneuter it, rewrite the form's action= attribute to specify your OPAC's acquisition-request URL.
A few points to note in the code that follows:
Amazon's consistent use of the first META tag makes it very easy to pick out the book's title and author, like so:
var m0 = document.getElementsByTagName('META')[0];
var titleAuthor = m0.getAttribute('content');
Gotta love that million-dollar markup!
There's no million-dollar markup for the publisher's name and date. Digging that out of the page is feasible, but much harder. I punt, in this case, by referring the librarian to the book's Amazon URL.
The setCookie script written into the generated form uses Amazon's domain. I'd never thought about this, but cookies are a two-way street. Amazon can use them to coordinate with me, but I can also use them to coordinate with Amazon. In this example, the generated form looks for Amazon cookies named MyLibraryUserName and MyLibraryUserID. If it finds them, it defaults two fields to their values. Otherwise, whatever you type there is remembered (via the onChange() handler) in Amazon cookies.
All in all, an instructive little exercise. This sort of technique won't replace active intermediaries, including the local kind that work at the level of HTTP or SMTP. Rather, it will complement them. Users need to be able to see, and approve, what intermediaries propose to do on their behalf. I like the idea of an interactive intermediary that prepares a connection between two services, previews it for the user, and then makes the connection.
Update
The script below contains a privacy bomb which, after a few minutes of reflection, I removed from the live version invoked by the bootloader. It's a fascinating scenario, actually:
1. You don't want Amazon to see your library-card number.
1a. Don't store/send it at all. This, of course, eliminates most of the convenience of the solution.
1b. Store/send an encrypted version.
2. You do want Amazon to see your library-card number. Does that sound crazy? Maybe not. Reasons I might trust Amazon with that information:
2a. Because it could use it to coordinate my library activity with my Amazon activity, and make better-informed Amazon recommendations. In particular, Amazon could emphasize books known not to be available to me in my local library. This would certainly seem to be a fair quid-pro-quo for the use of that handy ISBN in its URI!
2b. Because it could use it to offer me an email-notification service alerting me to overdue library books.
I find (2b) especially intriguing. It's not really in Amazon's interest for me to be aware of what's available in the local library, and it's not really in the library's interest for me to be made promptly aware of fines accumulating there. By yoking them together, I might be able to play the two services off against one another to my benefit--and to theirs.
The bookmarklet's bootloader
javascript:void((function() {var%20element=document.createElement('script'); element.setAttribute('src', 'http://weblog.infoworld.com/udell/gems/acquire.js'); document.body.appendChild(element)})())
The script loaded by the bootloader
var setCookieScript = 'function setCookie(Name1, Value1) { var expires = new Date(); expires.setFullYear(expires.getFullYear()+1); var cookie = Name1 + \'=\' + escape(Value1) + \';domain=amazon.com;path=/;expires=\' + expires.toGMTString(); alert(cookie); document.cookie = cookie; }';
function getCookie(Name)
{
var s = '; '+document.cookie+';';
var i = s.indexOf('; '+Name+'=');
if (i == -1)
{ return ''; }
else
{
i += 3 + Name.length;
var j = s.indexOf(';', i);
return unescape(s.substring(i, j));
}
}
var myLibraryUserID = getCookie('MyLibraryUserID');
var myLibraryUserName = getCookie('MyLibraryUserName');
var m0 = document.getElementsByTagName('META')[0];
var titleAuthor = m0.getAttribute('content');
var re = /(.+),\s*([^,]+)$/;
re.test(titleAuthor);
var title = RegExp.$1;
var author = RegExp.$2;
var win = window.open('','LibraryAcquisitionRequest', 'resizable=1,scrollbars=1,width=600,height=400');
win.document.write('<html><head><title>Request acquisition of: ' + titleAuthor + '</title><scr' + 'ipt>' + setCookieScript + '</scr' + 'ipt></head><body>');
win.document.write('<p>Request acquisition of: ' + titleAuthor + '</p>');
win.document.write('<form name="acquire" method="post" action="javascript:alert(\'Demonstration only!\');"><table><tr><td align="right">Author: </td> <td><input name="author" value="' + author + '" size="40" maxlength="255"></td> </tr><tr><td align="right">Title:</td> <td><input name="title" value="' + title + '" size="40" maxlength="255"></td> </tr><tr><td align="right">Where/when published:</td> <td><input name="publish" value="See Amazon: ' + location.href + '" size="60" maxlength="255"></td> </tr><tr><td align="right">Where mentioned:</td> <td><input name="mention" value="Amazon" size="60" maxlength="255"></td> </tr><tr><td align="right">Other info:</td> <td><input name="other" value="Intermediated by the LibraryLookup project" size="40" maxlength="255"></td> </tr><tr><td align="right">Your name:</td> <td><input name="name" value="' + myLibraryUserName + '" size="40" maxlength="255" onChange="javascript:setCookie (\'MyLibraryUserName\',forms[0].name.value);"></td> </tr> <tr><td align="right">14-digit library card #:</td> <td><input name="barcode" type="text" value="' + myLibraryUserID + '" size="40" maxlength="40" /* use with caution! OnChange="javascript:setCookie (\'MyLibraryUserID\',forms[0].barcode.value);" */></td> </tr><tr><td align="left" colspan="2"><br><input name="submit" type="submit" value="Ask library to acquire this book"></td> </tr></table></form><p></body></html>');
win.document.close();
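The title/author split in the script above reads the static RegExp.$1 and RegExp.$2 properties, which keep whatever the last successful match left behind if the pattern fails on the current page. Here is a minimal sketch of the same split as a standalone function. It assumes the META content has the shape "Title, Author", with the author after the last comma; splitTitleAuthor is a hypothetical name, not part of the live script.

```javascript
// Sketch only: split a "Title, Author" string at its last comma.
// splitTitleAuthor is a hypothetical helper, not part of acquire.js.
function splitTitleAuthor(content) {
  // Greedy (.+) runs to the last comma, so commas inside the title survive.
  const match = /(.+),\s*([^,]+)$/.exec(content);
  if (!match) {
    // No comma at all: treat the whole string as the title.
    return { title: content, author: '' };
  }
  return { title: match[1], author: match[2] };
}

const parts = splitTitleAuthor(
  'The Eighth Day of Creation: Makers of the Revolution in Biology, Horace Freeland Judson'
);
console.log(parts.title);
console.log(parts.author);
```

Returning an empty author when there is no comma lets the form fall back to a blank field that the user can fill in by hand.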
In an essay called Peer and non-peer review, Andrew Odlyzko pooh-poohs the fear that blogging (although he doesn't call it that) will undermine the classical system of scholarly peer review:
With the development of more flexible communication systems, especially the Internet, we are moving towards a continuum of publication. I have argued, starting with [3] [note 1], that this requires a continuum of peer review, which will provide feedback to scholars about articles and other materials as they move along the continuum, and not just in the single journal decision process stage.
Obviously I agree. I'm not a scientist, but when asked in mid-2000 to produce a report on how Internet-based communication could improve scientific collaboration, I focused (in part) on weblogs and RSS as engines of distributed awareness and precise feedback.
Back in September, Sébastien Paquet wrote me a thoughtful email, which I cited with permission, on the subject of blogging and research culture. His assessment bears repeating:
Here are reasons why Sébastien thinks blogging and research culture should naturally go together:
- Scholars value knowledge. They have a lot of it to manage and track.
- A scholar's professional survival depends on name recognition. A K-log can help provide visibility and recognition.
- Scholars are used to writing; most of them can write well.
- Scholars are geographically disparate. They need to nurture relationships with people that they seldom meet in person.
- Scholars need to interlink in a person-to-person fashion (see Interlinktual)
- Scholars already rely heavily on interpersonal trust and direct communication to determine what new stuff is worth looking at. Such filtering is one of the central functions weblog communities excel at.
- For many scholars, the best collaborations come about when they find someone who shares their values and goals (this is argued e.g. in section 3 of Phil Agre's excellent Networking on the Network). The personal output that is reflected in one's weblog makes it much easier to check for such a match than work that is published through other channels.
- Scholars recognize the value of serendipity. Serendipity can come pretty quickly through weblogging; see Manufactured Serendipity.
- Every scholar must strive to be a knowledge hub in his niche, and an expert in related areas. A K-log is a good medium for this, as it is a way of letting knowledge flow through you while adding your personal spin.
- Scholars pride themselves on being independent thinkers. K-logs epitomize independent thought.
Here are reasons why Sébastien thinks blogging has failed to become a research nexus:
- It takes time.
- "The technology is not well-established and tested at this point."
- Many people don't like being among the first ones doing something.
- Not all scholars are used to the Web and hypertext.
- Shyness and fear of public mistakes. Many scholars won't write unless they have to. They may especially be reluctant to publicly expose ideas that they haven't tested.
- Fear that someone else will pick up their ideas and work them out before they do.
The sixth objection probably looms largest. The enterprise of science is at once exquisitely collaborative and fiercely competitive. One of the most poignant examples of the resulting dilemma is detailed in Horace Freeland Judson's The Eighth Day of Creation, the authoritative history of the elucidation of DNA's structure. Rosalind Franklin came very close to solving the riddle. But in the end, her X-ray crystallographic photos of DNA, conveyed indirectly to James Watson, triggered the crucial insight. She was denied the opportunity to collaborate directly, died of cancer a few years later, and is now a historical footnote.
Obviously the world of science was less kind to women then than it is now. But Robert Axelrod's The Evolution of Cooperation suggests that Franklin probably would have been out of luck in any case. In his analysis, cooperation can arise and be sustained only when the Prisoner's Dilemma is iterated--that is, when there is reason to expect many future interactions, and when there is no clearly-defined endgame. The hunt for the structure of DNA wasn't like that. A once-in-a-lifetime career-making Nobel-prize-winning goal was in view, and that distorted the payoff matrix.
In science (and in business) we might as well admit that, in such cases, competition will suppress cooperation. Rarely, we're pursuing a quest for a once-in-a-lifetime payoff. Usually, though, we're playing a game that looks more like an iterated prisoner's dilemma. A kind of meta-prisoner's-dilemma then arises. How can you tell the difference?
[note 1] Tragic loss or good riddance? The impending demise of traditional scholarly journals: There are obvious dangers in discontinuous change away from a system that has served the scholarly community well [Quinn]. However, I am convinced that future systems of communication will be much better than the traditional journals. Although the transition may be painful, there is the promise of a substantial increase in the effectiveness of scholarly work. Publications delays will disappear, and reliability of the literature will increase with opportunities to add comments to papers and attach references to later works that cite them.
I tinkered a bit more with the LibraryLookup project yesterday. First, I noticed that the Build your own bookmarklet feature was broken in Mozilla. It turns out that any undeclared variable in the JavaScript will break it. Some kind of security feature, perhaps? Anyway, fixed. While I was at it, I added a feature that previews the link that will be embedded in the bookmarklet, so you can test it first. It's the same principle as the ASP.NET test page.
The bookmarklet generator also now emits a streamlined script. The original version, I'm embarrassed to say, went like so:
var re=/[\/-](\d{9,9}[\dX])|isbn=(\d{9,9}[\dX])/i;
if ( re.test ( location.href ) == true )
{
var isbn=RegExp.$1;
if ( isbn.length == 0 )
{ isbn = RegExp.$2 };
...
Of course, all that was really necessary was:
var re= /([\/-]|isbn=)(\d{9,9}[\dX])/i;
if ( re.test ( location.href ) == true )
{
var isbn = RegExp.$2
...
How did this happen? The usual way: when I expanded the original pattern to include the "isbn=" case, I didn't refactor. An instinctive programmer would have refactored on the fly. I'm not one, so I didn't see this until later. The problem with seeing it later is that you run smack into Don's Amazing Puzzle. It's far too easy to see a written text in terms of what we think it should say, rather than what it actually says.
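For anyone who wants to poke at the streamlined pattern outside a bookmarklet, here is a small self-contained sketch. The function name extractIsbn and the example.com URLs are assumptions made for illustration; only the pattern itself comes from the script above.

```javascript
// Sketch: pull a 10-character ISBN out of a URL shaped like /ISBN, -ISBN, or isbn=ISBN.
// extractIsbn is a hypothetical helper name, not part of the generated bookmarklet.
function extractIsbn(url) {
  // Group 1 is the delimiter; group 2 is nine digits plus a digit or X check character.
  const match = /([\/-]|isbn=)(\d{9}[\dX])/i.exec(url);
  return match ? match[2] : null;
}

console.log(extractIsbn('http://www.example.com/exec/obidos/ASIN/0879694785')); // 0879694785
console.log(extractIsbn('http://www.example.com/book?isbn=087969477X'));        // 087969477X
console.log(extractIsbn('http://www.example.com/about'));                       // null
```

Using the array returned by exec() instead of RegExp.$2 keeps the result tied to this particular match, so a failed test cannot silently reuse an earlier ISBN. Note that the pattern takes the first ten characters of any longer digit run, so it is a convenience, not a validator.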
(Here, by the way, are two tips for Radio UserLand folks who want to include JavaScript in items and stories. First, remove all blank lines from your script, because the Radio formatter will turn these into <p> tags that will break the script. Second, backslash-escape all instances of //--which if it occurs nowhere else, will be found before the closing end-comment tag. Radio's not-very-discriminating URL auto-activator is triggered by an unescaped //--like this one: //.)
Next, I took another look at the service lists. The first one came from Innovative's customer page, since withdrawn. The others I found by Googling for URL signatures. But I had been meaning to dig into the Libdex lists that a Palo Alto librarian, Martha Walters, referred me to. That turned out to be a fairly straightforward text-mining exercise which yielded, for Innovative and Voyager libraries in particular, greatly expanded lists with much more descriptive library names--and international coverage. Some of the many newly-added libraries:
Hong Kong - Kowloon - City University of Hong Kong
Scotland - St Andrews - University of St Andrews
Wales - Bangor - University of Wales Bangor and North East Wales Institute
Finland - Helsinki - Helsinki University
Puerto Rico - Gurabo - Universidad del Turabo
Scotland - Edinburgh - Edinburgh University
Because the Libdex catalog uses an extremely regular HTML format, it was not hard to reinterpret the HTML as a directory of services. But it wasn't as easy as it could have been, either. On the Backweave blog, Jeff Chan wonders whether Mark Pilgrim's use of the CITE tag is really an improvement over raw text mining. And Jeff mentions my report on Sergey Brin's talk at the InfoWorld conference, where I quote him as saying:
Look, putting angle brackets around things is not a technology, by itself. I'd rather make progress by having computers understand what humans write, than by forcing humans to write in ways computers can understand.
This isn't an either/or proposition. Like Mark, I strongly recommend exploiting to the hilt every scrap of latent semantic potential that exists within HTML. Like Jeff, I strongly recommend sharpening your text-mining skills because semantic markup, in whatever form, will never capture the totality of what can be usefully repurposed.
I guess I'm an extreme anti-extremist.
The 115 columns I wrote for BYTE.com are now restored to the public Web. I took this step reluctantly, and would have preferred that the original namespace remain intact, but so be it. Those columns that have continuing value can now weave themselves back into the fabric of the Web.
This exercise was another chance to experiment with Creative Commons licensing, which had raised some questions. In the case of these columns, I chose the Attribution-NoDerivs-NonCommercial 1.0 option, following the logic expressed by Denise Howell (via Scripting News).
Based on comments, I've also rethought my use of the CC license for LibraryLookup. My thinking on this was quite badly muddled, I'm afraid, mixing patent and copyright issues. As Matt Brubeck pointed out, a copyright has no bearing on patents, but publication alone is a hedge against potential frivolous use.
In the end, I concluded that LibraryLookup was a poor test case for the application of CC licensing to software. So I switched to the more basic Attribution license. I spent quite a while staring at the screen before I decided what to write in the Description metadata field. Here is what I finally said: A performance, expressed in text, data, and code.
Is it software? Phil Wainewright has a great essay on his Loosely Coupled weblog today: Software, Jim, but not as we know it.
Is it software? asks Dave. That's such a great question! From the moment I first saw an HTML form on a Web page, it was clear that boundaries were about to blur. Web pages are both documents and programs. Web sites are both publications and applications. URLs are both phrases and function calls. Text is code, code is data, data is text.
The renewed understanding of documents and URLs in the SOAP community, over the past year, is an appreciation of this fundamental intertwingularity. Joshua Allen's terrific recent essay, Naked XML, translates into practical terms:
I have a litmus test of sorts that I use to determine if someone has "got it". I show them an XPath like "//contact[.//fax]" and watch their faces. Of the people who understand what it does, most will have no reaction, and most of the rest (the experts) will raise their brows skeptically and say "only a stupid person would write such an inefficient query!". There are yet precious few who exclaim "that is how things should be!" as their faces light up.
The lesson, of course, is that real-world information is chaotic. In any but the smallest "proof of concept" systems, the best that one can hope for is to be able to recognize small pockets of structure within a sea of otherwise unstructured information.
[Better Living through Software]
Now, take a look at Jonnosan's geographical service browser. Note, in particular, this feature:
Let's think about why not. Consider this query, which leads to a status page containing:
<TD>
On Shelf
</td>
It would be great, of course, if all 117,418 libraries in the U.S. were to offer comprehensive XML APIs. I'm optimistic (or foolish) enough to think that I might even live to see the day. Meanwhile, though, suppose this status page were instead merely well-formed HTML or XHTML, with structural cues, like so:
<td class="availability">
On Shelf
</td>
There's a nice little "pocket of structure within a sea of otherwise unstructured information."
Multiply by 117,418. It adds up.
Is it software? Yes.
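To make the payoff of that one class attribute concrete, here is a rough sketch of what a client could do with it. The function name findAvailability and the sample page string are assumptions for illustration; no real OPAC is known to emit this markup. A regular expression stands in for a proper parser to keep the example short, and in a browser a DOM query would be the sturdier choice.

```javascript
// Sketch: read the text of the first <td class="availability"> cell in a page.
// findAvailability is a hypothetical helper; the markup is the imagined status page above.
function findAvailability(html) {
  // [\s\S]*? matches across line breaks and stops at the first closing </td>.
  const match = /<td\s+class="availability"\s*>([\s\S]*?)<\/td>/i.exec(html);
  return match ? match[1].trim() : null;
}

const statusPage =
  '<table><tr><td>Call number</td><td class="availability">\nOn Shelf\n</td></tr></table>';
console.log(findAvailability(statusPage)); // On Shelf
```

Without the class attribute, the same client would have to guess which cell holds the status from its position or its wording, and that guess would differ from one library to the next.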
Thanks to Andrew Mutch, the LibraryLookup project has added support for a fourth vendor of library software, Sirsi/DRA. The Google technique for service discovery turned up about fifty of these systems. But when Martha Walters showed me the master list of vendors, I remembered Will Cox's number--117,418 libraries in the U.S. alone.
Googling remains a useful way to discover services, but it only finds a fraction of the four supported systems, and there are many still unsupported. So here's a complementary approach: Build your own bookmarklet.
The idea here is twofold. First, if your library uses one of the supported systems, but isn't listed, you can just generate the bookmarklet you'll need. Second, it provides a framework that can easily include more systems, as people discover and report the URL patterns that can drive them.
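A generator along these lines can be sketched in a few lines. Everything specific here is an assumption: the name buildBookmarklet, the {ISBN} placeholder convention, and the example path are made up for illustration and are not the URL pattern of any particular vendor's OPAC.

```javascript
// Hypothetical sketch of a bookmarklet generator. The {ISBN} placeholder and the
// example path below are illustrative assumptions, not a real OPAC URL pattern.
function buildBookmarklet(baseUrl, pathTemplate) {
  // Drop a trailing slash so base and path join cleanly.
  const target = baseUrl.replace(/\/$/, '') + pathTemplate;
  // Body of the bookmarklet: find an ISBN in the current URL, then open the lookup.
  // Every variable is declared with var, since undeclared ones broke Mozilla.
  const body =
    "var m=/([\\/-]|isbn=)(\\d{9}[\\dX])/i.exec(location.href);" +
    "if(m){window.open(" + JSON.stringify(target) + ".replace('{ISBN}',m[2]));}" +
    "else{alert('No ISBN found in this URL');}";
  // Encode spaces as %20, following the convention of the bootloader shown earlier.
  return ('javascript:void((function(){' + body + '})())').replace(/ /g, '%20');
}

console.log(buildBookmarklet('http://your.library.baseurl', '/search?isbn={ISBN}'));
```

Keeping the vendor-specific part in a single template string is what would let the list of supported systems grow: a newly reported URL pattern becomes one more template, with no change to the generator.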