Wednesday, October 3, 2007

BBC NEWS | Technology | Spam weapon helps preserve books

A fascinating article explaining how the Internet Archive's Million Book project takes the words that its OCR software can't recognise and reuses them as the spam-beating test words that you have to type correctly to log into systems.

It's almost as good as the National Library of Vienna, which digitised the card catalogue to its print collection but had only a couple of key index fields typed in. Now, when you order a photo (and pay to have it digitised on demand), you also complete their catalogue record for them. Talk about 'user-generated content'!
