English Corner
KlausGraf - Wednesday, 18 March 2009, 20:06 - Category: English Corner
published on Salon JS Blog ( http://board-js.blogspot.com/2009/03/summary-cologne-archives-collapse-ix.html )
Good News: Cooperation between Cologne Digital Archive (CDA) and City of Cologne (CDA, March 18th 2009 via Archivalia)
"Over the last days, we [CDA] received many enquiries concerning legal issues and the Digital Historical Archive. Even in the press and relevant news groups and mailing lists, this topic has been discussed.
Against this background, the initiators, prometheus e.V. (Cologne) in cooperation with the Department for History (Bonn), and the Cologne Historical Archive will clarify the legal situation in a cooperation contract. This will create a legal basis for the uploaders, the initiators and the CHA.
After a particular interlocution, all persons in charge agreed that this project will be continued under the supervision of the CHA and will be transformed into a 'citizen’s archive' (Buergerarchiv)."
Frank.Schloeffel - Wednesday, 18 March 2009, 19:35 - Category: English Corner
published on Salon Jewish Studies Blog ( http://board-js.blogspot.com/2009/03/orphan-works-and-public-domain.html )
Peter Brantley (Executive Director for the Digital Library Federation) on The orphan monopoly:
"There are a large number of ways that books might fall into orphan status. A quick consultation of Peter Hirtle’s copyright table at Cornell Univ. allows us to see how easy this is. The impact of foreign rights is fiendishly complicated, and even the rules for U.S. publications are baroque; for older works it is a crafty rightsholder indeed who can figure out whether they might retain a claim. As Peter Hirtle observed to me in an email, 'The lengthening copyright terms and the gradual removal of formalities (especially the automatic renewal of works published since 1963) means that works that would have passed into the public domain in the past because the rights owners weren't concerned are still protected. The chances that the rights holders are either unidentifiable or not locatable also goes up.'
[...]
There are rough estimates of around 7 million digitized volumes in GBS [Google Book Search]; subtracting 750,000 newly identified works gives us 6.25 million. Let’s take a guess that there are maybe 1.5 million public domain works (this is not entirely out of the blue, but informed by earlier orphan works studies and reports), leaving 4.75 million titles. That’s a lot of books – about 2/3 of the total. It might be more, it might be less; it is a big number.
[...]
A large number of these orphans are going to be truly public domain books, just like pre-1923 works. However, we may never know that they actually have public domain status due to historically incomplete record keeping, and the lack of a national rights tracking and notification infrastructure." (via Archivalia@Twitter)
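The back-of-the-envelope estimate in the quotation above can be retraced in a few lines. This is only an illustrative sketch in Python: the variable names are invented for the example, and the figures are the rough guesses quoted above, not measured data.

```python
# Rough figures taken from the quotation above (guesses, not measurements).
digitized_volumes = 7_000_000      # estimated volumes in Google Book Search
newly_identified = 750_000         # newly identified works
public_domain_guess = 1_500_000    # guessed number of public domain works

# Subtract step by step, as the quotation does.
remaining = digitized_volumes - newly_identified        # 6.25 million
possible_orphans = remaining - public_domain_guess      # 4.75 million

# Share of the original 7 million that may be orphaned.
share_of_total = possible_orphans / digitized_volumes

print(f"{possible_orphans:,} possible orphans, {share_of_total:.0%} of the total")
```

The share works out to roughly 68 percent, which matches the "about 2/3" in the quotation.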
Recommended:
Article "Public domain" on Wikipedia
Articles in Archivalia ref. to "Public domain" (German)
Wikimedia Commons, a database of 4,120,985 [March 18th, 2009] freely usable media files to which anyone can contribute!
Frank.Schloeffel - Wednesday, 18 March 2009, 19:26 - Category: English Corner
Yesterday was Natasha's birthday. One of the gifts I wanted to give her was a subscription to the University of Pennsylvania library (we live right next to UPenn), so that she would have remote access to all the academic databases (the Free Library of Philadelphia provides remote access to some, but not all, of those databases). Natasha has long lamented her lack of access to scientific journals now that she is no longer in school. Such journals are key both to her work as a sustainable food writer and to her personal enrichment, given her interests and background in biological science. Further, as a long-time graduate student myself, I remember just how useful it was to have remote access to such a treasure trove of academic work. Even now, there have been numerous times during my writing when I have run up against firewalls on JSTOR. So, I figured it was time to get us access to all of this great information.
The problem is, as I discovered, even if you are willing to spend $400 a year for access to the library as an individual, or $800 as part of a corporate account, access to many of the academic databases is still restricted. Unless you have a job with the university, or are enrolled as a student, many of the databases with the best available research are nearly impossible to access. Right now, the only way it seems that we can ever have access to many academic journals is for someone with access to illegally let us borrow their username and password. Nice.
Read more:
http://www.openleft.com/showDiary.do?diaryId=12243
KlausGraf - Tuesday, 17 March 2009, 23:11 - Category: English Corner
Summary VI at Salon Jewish Studies Blog ( http://board-js.blogspot.com/2009/03/summary-cologne-archives-collapse-viii.html )

Head of the Cologne Historical Archive criticized over copyright statement (Koeln.de March 16th 2009 via Archivalia)
The director of the Cologne Historical Archive is now facing criticism over her copyright statement.
Since March 7th, the initiative “Digital Historical Archive Cologne” has offered the opportunity to upload copies of material from the collapsed archive. In this way, scholars can help to restore the documents.
Bettina Schmidt-Czaia, director of the CHA, of all people, has criticized this ambitious project by pointing to copyright issues, and has caused indignation.
“Many scholars who ordered and received copies of our material in the past are now putting them online to provide access”, Schmidt-Czaia told the Cologne Courier. This might violate the “copyrights” in the documents. “It would be better if these copies were handed over to us”, she urged the scholars.
On the internet, this statement met with harsh criticism. In the forum “Archivalia”, among others, it was pointed out that no copyright exists in documents whose authors died more than 70 years ago. Most of the documents are in the public domain and may be passed on to third parties.
“Reconsider willingness to donate”
The fact that the head of the archive clings to bureaucratic issues like copyright in this crisis has caused a great deal of displeasure, especially since the archive relies on the help of active citizens to rescue the documents. At this moment, it is important not to exclude the scholars but to use their interest to reproduce as many documents as possible, as well as to obtain financial support. In the forum, as on Twitter, a call has already appeared to reconsider one's willingness to donate to the archive.
The user-unfriendly charges at the former CHA are also criticized: taking a digital picture of a document with one's own camera cost 2 euros per picture. Users of the archive say that these fees could be the reason why so few digitized copies were made in recent years.
Comment by Klaus Graf (Archivalia): Let us hope the CHA recognizes that the Digital Historical Archive is a support, and that it has been given a great opportunity to become a “Bürgerarchiv” (citizens' archive). Hopefully the responsible body (Unterhaltsträger), the City of Cologne, will understand that insisting on these fees in order to collect a little money is absurd in a situation where it relies on civic support.
Cultural assets are common property; they belong to all of us. We must rap our politicians on the knuckles when their administrations tend to abandon this principle.
My more detailed position on this issue can be read in the article “Cultural possessions must be free to access” (Kulturgut muß frei sein) in the “Kunstchronik”: http://archiv.twoday.net/stories/5254099/
News on the Cologne Historical Archive by Sebastian Post@Twitter from the 61st Westfaelische Archivtag (March 17th-18th 2009 in Detmold) via Archivalia
1. Archival material & finding aids
Many backup films were recovered.
The Cologne Historical Archive (CHA) is asking the German government about the backups of its archival material that are stored in the Barbarastollen. The Barbarastollen is the so-called "Memory of German culture". Here, 825 million images on microfilm are kept safe.
The finding aids of the CHA have been completely rescued.
2. Statement on access to digitized documents from the CHA
The CHA does not object (!) to access via the internet to uploaded digitized material from the Cologne archive's collections, but the CHA has to agree to every initiative. (see also:
3. Needs and hopes
Manpower and support will be needed for a long time. What is certain: the memory of the city is not lost. We are working on the rescue!
Pictures from the Erstversorgungszentrum (EVZ), showing the separation of rubble and archival material (press service city of Cologne March 16th 2009 via Archivalia)

Frank.Schloeffel - Tuesday, 17 March 2009, 20:49 - Category: English Corner
http://www.philological.bham.ac.uk/bibliography/essay.html
WHY I AM GOING ON STRIKE
When I got into the bibliography business, over a decade ago, text-posting was a new thing. Sites posting texts (both html transcripts and photographic reproductions) were first being established. It was a period of initial experimentation, so it was very understandable that each site went its own way according to its managers’ ideas of how such a site ought to be operated, and that every site manager felt free to behave as a law unto himself. The situation was a kind of free-wheeling, “Wild West” one, with no agreed-upon standards or conventions. Eleven years later, the number of text-posting sites, many sponsored by well-established libraries and other institutions, has multiplied and the number of available texts has increased, both to astronomic levels, and the availability of a large number of texts in electronic form has become an important feature of contemporary literary culture. But, to my astonishment, the degree of chaos and anarchy has scarcely decreased. While I can name a number of sites which are superbly managed in the best tradition of librarianship, many others fall short of these standards, sometimes to a jaw-dropping degree. I am going to mention some gross offenses against good practice, all of which militate against users’ interests, and these will no doubt strike some readers as impossibly exaggerated, but I could easily document the reality of each and every one of them. And if you rely on posted texts for your work, gentle reader, I can also assure you that your interests are affected by the failure of posting sites to observe good standards. So this is a subject about which you should care. Although your primary reaction should, of course, be a feeling of great gratitude towards anybody who makes texts freely available to you, when you perceive that you are being victimized by shoddy practices, and that your work is being impeded by them, you should not hesitate to make your displeasure known.
What malfeasances do I have in mind? In the first place, when one begins to visit text-posting sites, it quickly becomes evident that there is nothing remotely like uniformity in their structure and design. Nearly all of them are, to some degree, different and some are downright idiosyncratic. The result is that when one visits a new site, one is confronted with the necessity of figuring out how to navigate it and find what one wants (and this sometimes involves an exasperating waste of time), since some are considerably more “user friendly” than others. I am not urging any rigorous standardization of design, but in my work I have visited hundreds of such sites, and the varying degrees to which site designers adhere to good ergonomic principles is very striking. Some sites are a joy to work with, and one immediately feels at home. In the case of others, one has the feeling of being constantly engaged in a duel of wits with the site designer (and sometimes coming out the loser). Clearly, it would be in readers’ interests if sites developed some kind of norms or guidelines regarding design and structure. It is my suspicion, by the way, that some sites are designed, and some important policy decisions made about their management, by low-level technicians with inadequate supervision by professional librarians. If I am right, this is a sure-fire formula for disastrous results. As a general rule, every text-posting site requires “hands-on” supervision by a senior librarian.
The single most important design principle involves informing the reader of what holdings the site makes available. Although some site managers appear to think that a Search function is by itself sufficient, some means for browsing the site’s holdings is no less vital a necessity than is a catalogue for a traditional library. Ideally, there should be two browsable lists, one of authors and the other of titles. And the availability of this browsing feature needs to be prominently advertised on the welcome page rather than stashed away in some obscure corner of the site, so that it is immediately accessible to the viewer. It is extremely frustrating to imagine that the people who maintain text sites lacking this feature probably maintain some sort of running list of their holdings for internal management purposes, but that it has not entered their heads that they need to share this information with the rest of the world. The absence of any kind of browsing or catalogue feature goes particularly far towards diminishing the usefulness of sites, which contain a huge number of offerings: the larger the number, the more important browsing becomes (imagine the Library of Alexandria without Callimachus’ catalogue, and you’ll have some idea of the condition of Google Books and The Internet Archive).
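As a rough illustration of the two browsable lists described above, the following Python sketch builds an author list and a title list from a handful of records. The sample holdings, the field names, and the example.org addresses are hypothetical; a real site would draw them from its own internal catalogue.

```python
from collections import defaultdict

# Hypothetical sample holdings; a real site would load its own catalogue.
records = [
    {"author": "More, Thomas", "title": "Utopia",
     "url": "https://example.org/texts/0003"},
    {"author": "Erasmus, Desiderius", "title": "Moriae encomium",
     "url": "https://example.org/texts/0001"},
    {"author": "Erasmus, Desiderius", "title": "Colloquia familiaria",
     "url": "https://example.org/texts/0002"},
]


def build_index(records, key):
    """Group records under the given field, sorted alphabetically by heading."""
    index = defaultdict(list)
    for record in records:
        index[record[key]].append(record)
    return sorted(index.items(), key=lambda item: item[0].casefold())


author_list = build_index(records, "author")
title_list = build_index(records, "title")

# Print the author list as a simple browsable page.
for heading, entries in author_list:
    print(heading)
    for entry in entries:
        print(f"    {entry['title']}  <{entry['url']}>")
```

The same function serves both lists, so a site that already keeps a running list of its holdings for internal purposes could publish it in browsable form with very little extra work.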
It is also necessary for site managers to grasp this seemingly self-evident point: as soon as they begin to post texts, people are actually going to read them and use them, and to manage their material in such a way as to respect this fact, making sure that readers are helped rather than hindered. They also need to understand that, when they post texts, they are making certain tacit commitments to their readers, which they are henceforth obliged to honor, and that they can reasonably be accused of unethical conduct if they fail to honor them. And this immediately brings me to the subject of URLs.
There are two ways of presenting a site. The first is to assign a fixed, predictable, and permanent URL to each posted text. The second is to use a Javascript “juke box” technology, so that each time a text is accessed, it is assigned a different and temporary one. The vast superiority of the former method ought at least to be obvious, although to the managers of a discouraging number of sites it is, unfortunately, not. Individual readers are going to want to bookmark links to texts of interest. Scholars may want to cite URLs in their publications. Even more, in view of the ever-rising costs associated with traditional print publication, scholarly publication is destined to shift increasingly to electronic form. And, as soon as academicians begin to publish their research electronically, they almost automatically start to explore the possibilities of hypertext, with the result that direct links supplement or even replace traditional bibliographical references. All of this is facilitated by the assignment of unique URLs to individual texts, but is rendered impossible by “juke box” technology. The assignment of unique URLs to individual texts is, in fact, just as much a feature of good librarianship as the assignment of unique shelfmarks to individual physical holdings in a traditional library.
The key word in the preceding paragraph is “permanent.” Whether they realize this or not, as soon as they assign a URL to a text, the managers of a site enter into a solemn relation of trust with their visitors. It is a strange thing that librarians who would not dream of tampering with, say, the shelfmarks of their manuscript collections (which in some cases have remained undisturbed for centuries), are capable of making arbitrary and capricious changes in the URLs of their electronic postings, although changes in the latter wreak no less damage than changes in the former would. The very best sites advertise the addresses of their postings as PURLs (Permanent URLs), thereby issuing an iron-clad guarantee to visitors that they will remain unchanged. Such sites ought to set the standard for the profession as a whole. When this principle is violated, an important relation of trust with readers is violated. For this I guarantee: as soon as a URL is posted, it will be used, and readers need to be able to rely on its continuing validity.
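One common way to keep public addresses permanent while files move behind the scenes is a lookup table from stable identifiers to current locations. The Python sketch below illustrates that idea only; the class name, the identifiers, and the example.org addresses are invented for the example and do not describe how any particular library implements its PURLs.

```python
class PermanentLinkResolver:
    """Maps stable public identifiers to wherever the file currently lives."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, location):
        # Once issued, an identifier is never reassigned to a different text.
        if identifier in self._locations:
            raise ValueError(f"identifier already issued: {identifier}")
        self._locations[identifier] = location

    def relocate(self, identifier, new_location):
        # Storage may be reorganized; the public identifier stays the same.
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_location

    def resolve(self, identifier):
        # Readers and bibliographers only ever see and cite the identifier.
        return self._locations[identifier]


resolver = PermanentLinkResolver()
resolver.register("text/0001", "https://old-server.example.org/files/a17.pdf")

# The site is rebuilt and the file moves, but citations of "text/0001" still work.
resolver.relocate("text/0001", "https://new-server.example.org/store/2009/a17.pdf")
print(resolver.resolve("text/0001"))
```

The design choice is that only the table changes when a site is reorganized, so links already bookmarked or cited keep resolving, much as a shelfmark stays valid when a book is moved to another room.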
The concept of permanence, of course, goes deeper. Posting a text involves an implicit solemn promise to the reader that the text will stay posted. But on some sites texts can mysteriously disappear without any acknowledgement of their removal. Even entire sites vanish without explanation. Some text-sites are maintained by private individuals, as labors of love. One feels great gratitude and respect for the individuals who maintain such sites, but at the same time one cannot help cringing at how short-lived they are, in all likelihood, destined to be. To speak very much about the issue of the long-term archiving of electronic material would take me too far off-subject, so suffice it to say that no site is very likely to enjoy great longevity if it does not have institutional sponsorship. And once an institution sets up or sponsors a text-posting site, it is, in effect, assuming a responsibility to keep it available on a long-term basis. But I can name a couple of very valuable institution-sponsored sites that suddenly disappeared, to the appreciable detriment of scholarship.
I am highly conscious that, although I am a professional scholar, I am a very amateur librarian who has no business dictating rules to the professionals. But I would be so bold as to insist to librarians that the electronic reproduction of texts, both in html format and as photographic reproductions, has become such an important function performed by modern libraries that the present “Wild West” situation needs to come to an end. Detailed industry-wide uniformity of structure and design may not be necessary or even desirable, but general standards of good procedure and some kind of code of ethical behavior need to be developed and observed by site managers, so that the greatest good can be derived from them, with the least possible harm inflicted. And, clearly, this development needs to be a collective effort. Electronic postings, surely, deserve to be treated with the same systematic care and respect that is shown towards physical holdings as a matter of course. Not being a member of the librarian profession, I have no idea whether the management of text sites is yet formally regarded as a branch of library science, and taught (or even thought about) in the schools that provide instruction in that discipline. If not, it should be, and I respectfully suggest that it is high time that librarians begin talking to each other to develop a set of professional standards and ethics, for the better maintenance of such sites and to guarantee the good progress of the scholarship that depends on them. This will entail the development of some kind of “shame culture” in which errant site managers can be reformed as the result of their peers' disapproval. But the development and observance of such standards is not the exclusive business of librarians. It is the right and responsibility of every scholar who relies on posted texts, and also of the general reading public, to insist that sound managerial practices be developed and followed.
This brings me to my own situation. The dawning realization that the situation I encountered eleven years ago has not fundamentally changed entails a concomitant awareness that I cannot continue working with this bibliography. I was operating according to the assumption that a bibliographical record that was true when created would, over time, remain true, and could be represented as such to readers. Although in the past some relatively minor exceptions to this principle did occur, which I corrected as best I could, I believed that as a general rule it was valid. The fact that, by an act which I regard as a severe breach of faith with its readers, the Gallica site of the Bibliothèque Nationale has changed its URLs, thereby invalidating several thousand entries in the present bibliography, has dramatically brought home to me the fact that, when it comes to maintaining text-posting sites, even the world’s premier libraries cannot be trusted to adhere to fundamental principles of good library science. And trust between libraries, readers, and bibliographers is what it is all about. In the absence of such trust, therefore, continued effort on maintaining this bibliography would clearly be a waste of effort better spent on other projects. I am therefore going “on strike” and will not invest any more time and effort in this bibliography until the situation has materially improved.
http://www.philological.bham.ac.uk/bibliography/
AN ANALYTIC BIBLIOGRAPHY OF ON-LINE NEO-LATIN TEXTS
DANA F. SUTTON
The University of California, Irvine
The enormous profusion of literary texts posted on the World Wide Web will no doubt strike future historians as remarkable and important. But this profusion brings with it an urgent need for many specialized on-line bibliographies. The present one is an analytic bibliography of Latin texts written during the Renaissance and later that are freely available to the general public on the Web (texts posted in access-restricted sites, and Web sites offering electronic texts and digitized photographic reproductions for sale are not included). Only original sites on which texts are posted are listed here, and not mirror sites.
This page was first posted January 1, 1999 and most recently revised March 16, 2009. The reader may be interested to know that it currently contains 29,750 records. I urge all those who are able to suggest additions or corrections to this bibliography, as well as those who post new texts on the Web, to inform me by e-mail, so that this bibliography can be kept accurate and up to date. I take this opportunity to express my gratitude to all the individuals who have supplied me with corrections and new information (I extend especial thanks to Klaus Graf and Tommy Tyrberg, who are both responsible for the addition of many hundreds of bibliographical items to this list).
A few further Neo-Latin on-line texts contained in various lists of such items compiled by others are not included here because an invalid URL address is provided. Over the passage of time, of course, some of the URL addresses given here may be changed or broken. If you become aware of such difficulties, I would be grateful to have them drawn to my attention.
NOTE: in addition to standard abbreviations, in this bibliography the special abbreviation dpr (“digitized photographic reproduction”) is employed; unless otherwise specified, the file in question is in PDF format.
NOTE: Access to post-1864 items on the Google Books and University of Michigan University Library sites appears to be blocked for residents of at least some non-US nations.
NOTE: Two sources of texts listed here, La Biblioteca Virtual de Andalucia, and the Universitat de Valéncia Biblioteca Digital, appear to be in the process of rebuilding their sites, and a number of texts previously posted by them are not currently available. These have therefore been at least temporarily withdrawn from this bibliography, but I would hope that they will eventually be posted once more.
EMERGENCY NOTICE
It has been drawn to my attention that the Gallica site of the Bibliothèque Nationale has, without warning, changed the URLs of its holdings to a new system. The nearly 4000 links to their holdings listed in this bibliography are therefore invalid. At the moment I have no idea of how to cope with this situation, since the new URL scheme is not such that it can be updated in this bibliography by a simple global search-and-replace operation: it appears that each URL would have to be updated manually, which I am unwilling to do. This is, in my opinion, a grave violation of basic principles of library science (no less than if the Bibliothèque Nationale were to alter the shelfmarks of their physical holdings in an equally arbitrary way), and represents a betrayal of the trust of scholars who use their online material. I request that all affected users of this site join me in contacting the Gallica site to protest this decision in the strongest possible terms, using your professional title, if you have one. They may be contacted at gallica2@bnf.fr
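When no search-and-replace rule connects old and new addresses, the links can only be repaired in bulk with an explicit old-to-new concordance. The Python sketch below shows that approach under a loud assumption: that such a concordance exists, which the notice above does not claim. The function name, the record fields, and all URLs are invented for the example and are not Gallica's actual old or new scheme.

```python
def migrate_links(entries, old_to_new):
    """Return (updated_entries, unresolved_urls) without modifying the input."""
    updated, unresolved = [], []
    for entry in entries:
        new_url = old_to_new.get(entry["url"])
        if new_url is None:
            # No mapping known: keep the entry as-is and flag it for manual work.
            unresolved.append(entry["url"])
            updated.append(entry)
        else:
            updated.append({**entry, "url": new_url})
    return updated, unresolved


# Hypothetical bibliography entries and a hypothetical concordance.
entries = [
    {"title": "Text A", "url": "https://library.example.org/old/doc?id=101"},
    {"title": "Text B", "url": "https://library.example.org/old/doc?id=102"},
]
concordance = {
    "https://library.example.org/old/doc?id=101":
        "https://library.example.org/ark/x7k2",
}

updated, unresolved = migrate_links(entries, concordance)
print(f"{len(entries) - len(unresolved)} updated, {len(unresolved)} need manual checking")
```

Entries without a known mapping are reported rather than guessed at, so a bibliographer can see exactly how much manual work remains.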
WHY I AM GOING ON STRIKE
When I got into the bibliography business, over a decade ago, text-posting was a new thing. Sites posting texts (both html transcripts and photographic reproductions) were first being established; it was a period of initial experimentation, so it was very understandable that each site went its own way according to its managers’ ideas of how such a site ought to be operated, and that every site manager felt free to behave as a law unto himself. The situation was a kind of free-wheeling, “Wild West” one, with no agreed-upon standards or conventions. Eleven years later, the number of text-posting sites, many sponsored by well-established libraries and other institutions, has multiplied and the number of available texts has increased, both to astronomic levels, and the availability of a large number of texts in electronic form has become an important feature of contemporary literary culture. But, to my astonishment, the degree of chaos and anarchy has scarcely decreased. While I can name a number of sites which are superbly managed in the best tradition of librarianship, many others fall short of these standards, sometimes to a jaw-dropping degree. I am going to mention some gross offenses against good practice, all of which militate against users’ interests, and these will no doubt strike some readers as impossibly exaggerated, but I could easily document the reality of each and every one of them. And if you rely on posted texts for your work, gentle reader, I can also assure you that your interests are affected by the failure of posting sites to observe good standards. So this is a subject about which you should care. Although your primary reaction should, of course, be a feeling of great gratitude towards anybody who makes texts freely available to you, when you perceive that you are being victimized by shoddy practices, and that your work is being impeded by them, you should not hesitate to make your displeasure known.
What malfeasances do I have in mind? In the first place, when one begins to visit text-posting sites, it quickly becomes evident that there is nothing remotely like uniformity in their structure and design. Nearly all of them are, to some degree, different and some are downright idiosyncratic. The result is that when one visits a new site, one is confronted with the necessity of figuring out how to navigate it and find what one wants (and this sometimes involves an exasperating waste of time), since some are considerably more “user friendly” than others. I am not urging any rigorous standardization of design, but in my work I have visited hundreds of such sites, and the varying degrees to which site designers adhere to good ergonomic principles is very striking. Some sites are a joy to work with, and one immediately feels at home. In the case of others, one has the feeling of being constantly engaged in a duel of wits with the site designer (and sometimes coming out the loser). Clearly, it would be in readers’ interests if sites developed some kind of norms or guidelines regarding design and structure. It is my suspicion, by the way, that some sites are designed, and some important policy decisions made about their management, by low-level technicians with inadequate supervision by professional librarians. If I am right, this is a sure-fire formula for disastrous results. As a general rule, every text-posting site requires “hands-on” supervision by a senior librarian.
The single most important design principle involves informing the reader of what holdings the site makes available. Although some site managers appear to think that a Search function is by itself sufficient, some means for browsing the site’s holdings is no less vital a necessity than is a catalogue for a traditional library. Ideally, there should be two browsable lists, one of authors and the other of titles. And the availability of this browsing feature needs to be prominently advertised on the welcome page rather than stashed away in some obscure corner of the site, so that it is immediately accessible to the viewer. It is extremely frustrating to imagine that the people who maintain text sites lacking this feature probably maintain some sort of running list of their holdings for internal management purposes, but that it has not entered their heads that they need to share this information with the rest of the world. The absence of any kind of browsing or catalogue feature goes particularly far towards diminishing the usefulness of sites that contain a huge number of offerings: the larger the number, the more important browsing becomes (imagine the Library of Alexandria without Callimachus’ catalogue, and you’ll have some idea of the condition of Google Books and The Internet Archive).
It is also necessary for site managers to grasp this seemingly self-evident point: as soon as they begin to post texts, people are actually going to read them and use them. Managers must therefore handle their material in such a way as to respect this fact, making sure that readers are helped rather than hindered. They also need to understand that, when they post texts, they are making certain tacit commitments to their readers, which they are henceforth obliged to honor, and that they can reasonably be accused of unethical conduct if they fail to honor them. And this immediately brings me to the subject of URLs.
There are two ways of presenting a site. The first is to assign a fixed, predictable, and permanent URL to each posted text. The second is to use a JavaScript “juke box” technology, so that each time a text is accessed, it is assigned a different and temporary one. The vast superiority of the former method ought at least to be obvious, although to the managers of a discouraging number of sites it is, unfortunately, not. Individual readers are going to want to bookmark links to texts of interest. Scholars may want to cite URLs in their publications. Even more, in view of the ever-rising costs associated with traditional print publication, scholarly publication is destined to shift increasingly to electronic form. And, as soon as academicians begin to publish their research electronically, they almost automatically start to explore the possibilities of hypertext, with the result that direct links supplement or even replace traditional bibliographical references. All of this is facilitated by the assignment of unique URLs to individual texts, but is rendered impossible by “juke box” technology. The assignment of unique URLs to individual texts is, in fact, just as much a feature of good librarianship as the assignment of unique shelfmarks to individual physical holdings in a traditional library.
The key word in the preceding paragraph is “permanent.” Whether they realize this or not, as soon as they assign a URL to a text, the managers of a site enter into a solemn relation of trust with their visitors. It is a strange thing that librarians who would not dream of tampering with, say, the shelfmarks of their manuscript collections (which in some cases have remained undisturbed for centuries), are capable of making arbitrary and capricious changes in the URLs of their electronic postings, although changes in the latter wreak no less damage than changes in the former. The very best sites advertise the addresses of their postings as PURLs (Permanent URLs), thereby issuing an iron-clad guarantee to visitors that they will remain unchanged. Such sites ought to set the standard for the profession as a whole. When this principle is violated, an important relation of trust with readers is violated. For this I guarantee: as soon as a URL is posted, it will be used, and readers need to be able to rely on its continuing validity.
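One common way of keeping a published address stable is a layer of indirection: the public identifier never changes, and only the record of where the file currently lives is updated. The following is a minimal sketch of that idea, not a description of how any particular library implements its PURLs; the class name `PurlResolver`, the identifier "text/0001", and the example.org addresses are all hypothetical.

```python
class PurlResolver:
    """Maps permanent identifiers to the current storage location of a text."""

    def __init__(self):
        self._targets = {}

    def register(self, identifier, target):
        """Assign an identifier once; an assigned identifier is never reused."""
        if identifier in self._targets:
            raise ValueError(f"identifier already assigned: {identifier}")
        self._targets[identifier] = target

    def move(self, identifier, new_target):
        """Record that the file has moved; the public identifier is untouched."""
        if identifier not in self._targets:
            raise KeyError(identifier)
        self._targets[identifier] = new_target

    def resolve(self, identifier):
        """Return the current location for a permanent identifier."""
        return self._targets[identifier]


resolver = PurlResolver()
resolver.register("text/0001", "http://server-a.example.org/scans/0001.pdf")
# A later reorganization changes the storage location only.
resolver.move("text/0001", "http://server-b.example.org/files/0001.pdf")
```

In this arrangement a bookmark or citation of the identifier survives a reorganization of the site, much as a shelfmark survives the physical rearrangement of a reading room.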
The concept of permanence, of course, goes deeper. Posting a text involves an implicit solemn promise to the reader that the text will stay posted. But on some sites texts can mysteriously disappear without any acknowledgement of their removal. Even entire sites vanish without explanation. Some text-sites are maintained by private individuals, as labors of love. One feels great gratitude and respect for the individuals who maintain such sites, but at the same time one cannot help cringing at how short-lived they are, in all likelihood, destined to be. To speak very much about the issue of the long-term archiving of electronic material would take me too far off-subject, so suffice it to say that no site is very likely to enjoy great longevity if it does not have institutional sponsorship. And once an institution sets up or sponsors a text-posting site, it is, in effect, assuming a responsibility to keep it available on a long-term basis. But I can name a couple of very valuable institution-sponsored sites that suddenly disappeared, to the appreciable detriment of scholarship.
I am highly conscious that, although I am a professional scholar, I am a very amateur librarian who has no business dictating rules to the professionals. But I would be so bold as to insist to librarians that the electronic reproduction of texts, both in html format and as photographic reproductions, has become such an important function performed by modern libraries that the present “Wild West” situation needs to come to an end. Detailed industry-wide uniformity of structure and design may not be necessary or even desirable, but general standards of good procedure and some kind of code of ethical behavior need to be developed and observed by site managers, so that the greatest good can be derived from them, with the least possible harm inflicted. And, clearly, this development needs to be a collective effort. Electronic postings, surely, deserve to be treated with the same systematic care and respect that is shown towards physical holdings as a matter of course. Not being a member of the librarian profession, I have no idea whether the management of text sites is yet formally regarded as a branch of library science, and taught (or even thought about) in the schools that provide instruction in that discipline. If not, it should be, and I respectfully suggest that it is high time that librarians begin talking to each other to develop a set of professional standards and ethics, for the better maintenance of such sites and to guarantee the good progress of the scholarship that depends on them. This will entail the development of some kind of “shame culture” in which errant site managers can be reformed as the result of their peers' disapproval. But the development and observance of such standards are not the exclusive business of librarians. It is the right and responsibility of every scholar who relies on posted texts, and also of the general reading public, to insist that sound managerial practices be developed and followed.
This brings me to my own situation. The dawning realization that the situation I encountered eleven years ago has not fundamentally changed entails a concomitant awareness that I cannot continue working with this bibliography. I was operating according to the assumption that a bibliographical record that was true when created would, over time, remain true, and could be represented as such to readers. Although in the past some relatively minor exceptions to this principle did occur, which I corrected as best I could, I believed that as a general rule it was valid. The fact that, by an act which I regard as a severe breach of faith with its readers, the Gallica site of the Bibliothèque Nationale has changed its URLs, thereby invalidating several thousand entries in the present bibliography, has dramatically brought home to me the fact that, when it comes to maintaining text-posting sites, even the world’s premier libraries cannot be trusted to adhere to fundamental principles of good library science. And trust between libraries, readers, and bibliographers is what it is all about. In the absence of such trust, therefore, continued work on maintaining this bibliography would clearly be a waste of effort better spent on other projects. I am therefore going “on strike” and will not invest any more time and effort in this bibliography until the situation has materially improved.
http://www.philological.bham.ac.uk/bibliography/
AN ANALYTIC BIBLIOGRAPHY OF ON-LINE NEO-LATIN TEXTS
DANA F. SUTTON
The University of California, Irvine
The enormous profusion of literary texts posted on the World Wide Web will no doubt strike future historians as remarkable and important. But this profusion brings with it an urgent need for many specialized on-line bibliographies. The present one is an analytic bibliography of Latin texts written during the Renaissance and later that are freely available to the general public on the Web (texts posted in access-restricted sites, and Web sites offering electronic texts and digitized photographic reproductions for sale are not included). Only original sites on which texts are posted are listed here, and not mirror sites.
This page was first posted January 1, 1999 and most recently revised March 16, 2009. The reader may be interested to know that it currently contains 29,750 records. I urge all those who are able to suggest additions or corrections to this bibliography, as well as those who post new texts on the Web, to inform me by e-mail, so that this bibliography can be kept accurate and up to date. I take this opportunity to express my gratitude to all the individuals who have supplied me with corrections and new information (I extend especial thanks to Klaus Graf and Tommy Tyrberg, who are both responsible for the addition of many hundreds of bibliographical items to this list).
KlausGraf - Tuesday, 17 March 2009, 04:59 - Category: English Corner
Holley, Rose: How Good Can It Get? Analysing and Improving OCR Accuracy in Large Scale Historic Newspaper Digitisation Programs
http://www.dlib.org/dlib/march09/holley/03holley.html
Basic OCR correction by public users was implemented and tested in the prototype search system released to State and Territory Libraries for testing in December 2007. User correction of text was positively received, though most Libraries asked if and how moderation would take place. It was then implemented in the Beta search system (without moderation), which had a soft release to the public without any publicity on 25 July 2008. In the first three months of use (July - October 2008) the public immediately began correcting OCR. We have found it quite hard to monitor what they are doing, how well they are doing it, and how it is affecting the overall quality of the data, since moderation is not yet in place and login to do it is not mandatory (it is optional) at this stage. We also have had difficulties measuring the accuracy of the OCR-corrected text. We have three methods of measuring text correction: number of lines corrected, number of correction "transactions" (i.e., pressing the "save corrections" button), and number of different articles corrected. However, it is questionable how useful any of the three methods are. We are assuming that all correction transactions are to improve text and make it right. No extra text can be added, only existing lines corrected. No text has been deliberately incorrectly changed as far as we are aware.
The results of user activity within the first 12 weeks of the soft launch (without publicity) are that 868 registered users have corrected text and approximately 390 unregistered users (total of 1,200 text correctors). 700,000 lines of text have been corrected within 50,000 articles. The top text corrector has corrected 50,000 lines of text within nearly 2,000 individual articles. Some articles have had corrections added by more than seven users (e.g., articles in the first Australian newspaper the 1803 Sydney Gazette). This particular issue in its entirety has had several different users working on corrections, because it is difficult to read and is an important newspaper.
User feedback returned via surveys, e-mails, phone calls and the "contact us" form has been overwhelmingly positive and interesting. Users did not expect to be able to correct OCR text. Once they discovered they could, they quickly took to the concept and method, and several reported finding correcting the text both addictive and rewarding. Users were actively correcting much more than they or we had expected to correct. In addition, our own users have the potential to achieve a 100% accuracy rate with their knowledge of English, history and context, whereas our contractors are only achieving an accuracy of 99.5% in the title headings.
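The three correction measures named in the excerpt (lines corrected, correction transactions, and distinct articles corrected) can be made concrete with a short sketch. It assumes a hypothetical log in which each record is one press of the "save corrections" button; the record layout and the names `CorrectionTransaction` and `summarize` are illustrative assumptions, not the National Library of Australia's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorrectionTransaction:
    """One hypothetical 'save corrections' event."""
    user: str
    article_id: str
    lines_changed: int


def summarize(transactions):
    """Compute the three activity measures from a list of transactions."""
    return {
        # A line saved twice is counted twice here.
        "lines_corrected": sum(t.lines_changed for t in transactions),
        "transactions": len(transactions),
        "articles_corrected": len({t.article_id for t in transactions}),
    }


log = [
    CorrectionTransaction("user_a", "article-1", 3),
    CorrectionTransaction("user_b", "article-1", 2),
    CorrectionTransaction("user_a", "article-2", 5),
]
print(summarize(log))
```

All three figures count activity rather than correctness: none of them records whether a saved line is now right, which fits the excerpt's own doubt about how useful the measures are.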
See also
Holley, Rose (2009) Many Hands Make Light Work: Public Collaborative Text Correction in Australian Historic Newspapers. ISBN 978-0-642-27694-0. Available at http://www.nla.gov.au/ndp/project_details/documents/ANDP_ManyHands.pdf
Excerpt:
The Australian Newspapers beta service has clearly demonstrated that users want to engage and be involved with full text newspaper data in new and exciting ways. The use of web 2.0 technologies can enable this. Without publicity, ‘how‐to’ tutorials or even a familiar and refined interface or concept, the service still rapidly harnessed an active group of users who are enthusiastically enhancing and improving the data by use of the text correction, tagging and comments functions. Users have demonstrated a willingness to work towards the ‘common good’, to volunteer their time, energy, skill, knowledge and ideas and to be involved long term in a program of national historic significance. The collaborative activity from this new community is enhancing the quality of the data and therefore the accuracy of full‐text searching in a way that the National Library of Australia could never have achieved using its own resources alone.
KlausGraf - Tuesday, 17 March 2009, 02:21 - Category: English Corner
http://blog.librarylaw.com/librarylaw/2009/03/google-books-settlement-at-columbia-part-1.html
http://blog.librarylaw.com/librarylaw/2009/03/google-books-settlement-at-columbia-part-2.html
KlausGraf - Monday, 16 March 2009, 18:24 - Category: English Corner
KlausGraf - Monday, 16 March 2009, 16:56 - Category: English Corner
KlausGraf - Monday, 16 March 2009, 14:41 - Category: English Corner