
Google's copyright paranoia blocks non-US citizens from viewing the full text of books which are PUBLIC DOMAIN worldwide. There has been no change in the policy I have criticized at

Now UMich is blocking US citizens in the same way from viewing books which are PD in the US. See the example at

Update: I have checked with a US proxy: UMich has the same IP rights management as Google. US citizens can see the full text of Geibel at

End of Update

Geibel died in 1873 (see the UMich catalog entry!) and thus his works are PD in the EU (and worldwide, including Mexico).
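
The term arithmetic behind such a claim can be sketched in a few lines. This is only an illustration, assuming the EU rule of 70 years after the author's death, with the term running to the end of the calendar year. The function name `first_pd_year` is made up for this example, and special cases (joint authorship, posthumous editions, national extensions) are ignored.

```python
# Assumption: life + 70 years, the general EU term of protection.
EU_TERM_YEARS = 70


def first_pd_year(death_year: int, term_years: int = EU_TERM_YEARS) -> int:
    """Return the first calendar year in which a work is public domain.

    Protection lasts until the end of the year death_year + term_years,
    so the work becomes free on 1 January of the following year.
    """
    return death_year + term_years + 1


if __name__ == "__main__":
    # Death year as given in the post for Geibel.
    print(first_pd_year(1873))  # prints 1944
```

Under these assumptions, any author who died in 1873 has been out of copyright in the EU for decades, which is the point being made about Geibel.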

The Chroniken der deutschen Städte are no more freely accessible at UMich than at Google. Hundreds of PD books can be found there which UMich falsely treats as copyrighted.

That Google is doing a very poor scanning job (no book without errors) is well known to all digitization experts.

Its Fraktur OCR is worthless, see

The decision of the UMich Regents NOT to show library patrons the digitized books which UMich believes to be copyrighted makes the whole effort for Fraktur books effectively worthless.

If a German Fraktur book from, say, 1925 (author died 1945) is scanned, it remains protected in the EU until the end of 2015 (70 years after the author's death). We have to wait until that date before there is any benefit from the scanning. "Search only" doesn't make sense if there is no searchable OCR text. Let's have a look at a US edition of Wilhelm Hauff (shown as PD at UMich):

Zcnn menu bit (Iisuen Tieben unub @on (Iistem
geliebl felt tell!, boss bofet et' sienanbgtasgip
(Slusben, unb en if is Rep.
(Sic (mielten jept gmot)cn Slnicgs'nolf. Die Rod)
cutlet, Sic Sen @Jtiilseifcr eon (tines Sbcssmrahcs
(Schalberoff oat' l@neihinpen em(maties ash baum gu.
sun ben (sinufis milpetfeilt baUd, fimrnten oaf cit
@oom mit hem bemein, tat' l@niulcims (Sorbcs
if rem ttfct ge(fmiebems (matte. Utber ben @lbat
beftosh mom olfo sift ben geningflc 3uunti(tl surf r.
Itben tie Sent (,t)ra(es beitusmen?

Can you detect any string worth searching for in this text?

There are thousands of Fraktur books which are OCR'd in this useless way by Google.
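
One rough way to put a number on "worth searching for" is to count how many OCR tokens appear in a list of known words. The sketch below is an illustration under stated assumptions, not a description of how Google or UMich evaluate their OCR: the tiny `KNOWN_WORDS` set stands in for a real German dictionary, `searchable_ratio` is a hypothetical helper, and `CLEAN_SAMPLE` is an invented sentence used only for contrast.

```python
import re

# Assumption: a tiny stand-in for a real German word list.
KNOWN_WORDS = {
    "der", "die", "das", "und", "mit", "den", "dem", "ist", "von",
    "ein", "eine", "nicht", "sich", "auf", "es", "er", "sie", "zu",
    "im", "in",
}

# First two lines of the OCR output quoted in the post.
OCR_SAMPLE = (
    "Zcnn menu bit (Iisuen Tieben unub @on (Iistem\n"
    "geliebl felt tell!, boss bofet et' sienanbgtasgip"
)

# Invented clean sentence, for comparison only.
CLEAN_SAMPLE = "Der Mann ist mit dem Hund in der Stadt"


def tokenize(text: str) -> list[str]:
    """Split text into lowercase runs of letters (digits and symbols dropped)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def searchable_ratio(text: str, vocabulary: set[str] = KNOWN_WORDS) -> float:
    """Return the fraction of tokens that are found in the vocabulary."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in vocabulary)
    return hits / len(tokens)


if __name__ == "__main__":
    print(f"OCR sample:   {searchable_ratio(OCR_SAMPLE):.2f}")
    print(f"clean sample: {searchable_ratio(CLEAN_SAMPLE):.2f}")
```

With a full dictionary instead of this toy list, a ratio near zero for a whole volume would flag it as effectively unsearchable.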

Let's be fair to Google. It's a commercial enterprise, not a welfare institution. But there is no excuse for the ignorant and incompetent UMich librarians who are blocking PD books and ignoring the serious Fraktur problem.
KlausGraf wrote on 2006/09/07 21:33:
Good article on Google and UMich 
biopilz wrote on 2007/03/09 14:17:
re-asked: fraktur-problem...--
> ....librarians...serious Fraktur problem...
I know the problem myself from trying to OCR such a book. I tried it with Adobe Acrobat 4.0 Capture and suppose that, due to the missing appropriate font, it did NOT work (yet)!
Any solution, anyone? Thanks for any answers...
cn969126a4c2tbm wrote on 2008/05/27 02:31:
please learn more about OCR
I find it astounding that someone running a blog on archival issues is so unfamiliar with OCR issues.

Google is likely using the best Fraktur recognition they can buy. Still, there is no OCR that can recognize a modern, cleanly scanned book without error, let alone a 19th century Fraktur book captured with Google's processes.

Nevertheless, any scanning effort needs to prioritize, and compared to living scripts with hundreds of millions of users, Fraktur is probably low priority.

Finally, metadata in libraries, digital or otherwise, is full of errors. Without manually checking every book, there is simply no way to be reasonably sure that a book is out of copyright. That 1899 book might well be a 1973 best seller.

Nevertheless, UMich is making an effort, if you bother reading their "Help" file: "Please use the feedback form at the bottom of each page in the Pageturner application to let us know if our records are incorrectly restricting access to an item. For more information, see the University of Michigan Library Access and Use Policy." 
KlausGraf replied on 2008/05/27 02:43:
please learn more on this weblog
This was an entry from September 2006; it's partly outdated. I think I know enough about OCR and metadata. Feel free to leave this weblog if you don't like it.
