cn969126a4c2tbm wrote on May 27, 02:31:
please learn more about OCR
I find it astounding that someone running a blog on archival issues is so unfamiliar with OCR issues. Google is likely using the best Fraktur recognition they can buy. Still, there is no OCR that can recognize a modern, cleanly scanned book without error, let alone a 19th century Fraktur book captured with Google's processes.
Besides, any scanning effort needs to prioritize, and compared to living scripts with hundreds of millions of users, Fraktur is probably low priority.
Finally, metadata in libraries, digital or otherwise, is full of errors. Without manually checking every book, there is simply no way to be reasonably sure that a book is out of copyright. That 1899 book might well be a 1973 best seller.
Nevertheless, UMich is making an effort, if you bother reading their "Help" file: "Please use the feedback form at the bottom of each page in the Pageturner application to let us know if our records are incorrectly restricting access to an item. For more information, see the University of Michigan Library Access and Use Policy."
KlausGraf replied on May 27, 02:43:
please learn more on this weblog
This was an entry from September 2006; it is partly outdated. I think I know enough about OCR and metadata. Feel free to leave this weblog if you don't like it.