ELAG 2002

Sharing Italian digital resources: in search of a model


[Sharing Italian digital resources: in search of a model, published with modifications in Semantic web and libraries, 26th Library systems Seminar, Rome, 17-19 April 2002. - Roma: Biblioteca nazionale centrale, 2003, pp. 41-43.]

«...we expect significant developments within next decade in document delivery systems. An increasing percentage of the world’s literature will be stored in a form in which it could be rapidly retrieved [...] Thus vital documents would be displayed on demand» [Wolpert, 1978].

The above quotation is from the proceedings of a Conference held in 1977, The On-line Revolution in Libraries.

Now, in 2002, the dream is coming true, but a lot of work still has to be done to share efforts: on the one hand we have huge public investments in digitization projects; on the other hand, «there are a number of key problems which risk limiting realizing the potential of these resources, whether culturally, socially or economically» [Lund, 2001].

With regard to the Italian situation, one of the key problems is the fragmentary approach to sharing digital (or digitized) resources.

Of course, this is true not only in the digital field. As authoritatively noted, «The National Library Service's plan (SBN = Servizio Bibliotecario Nazionale) springs from need to coordinate library services in Italy, where fragmentation of the bibliographic records and of collections, along with administrative differences among libraries, have always been an obstacle to an efficient library service» [Peruginelli, 1990].

In fact, in order to share digital resources (data), we have to share information about the data (bibliographic records or, if you prefer, metadata for resource discovery).

To access information resources coming from different institutions (not only libraries) we need – roughly speaking – one of the following central services:

  • a central physical catalog (centralized model)
  • a virtual catalog (distributed search model)
  • a central index (a.k.a. search engine model)

The model of the single physical union catalog is far from becoming an obsolete tool. It «complements the emerging distributed search models by offering substantially different functionality, quality, performance, and management characteristics»[iv]. On the other hand, the model of the virtual union catalog (based, for instance, on the Z39.50 protocol) does not carry high central implementation costs, but we have to consider that «the scaling properties of a distributed search system can be quite unattractive when compared to a centralized union catalog: each participating system must be capable of handling the query load that all users of the union system represent, since each search will be sent to each participating system». Another problem of the virtual union catalog model is semantic interoperability: as a recent study found, «The tests indicated a tremendous variability in the implementations of the Z39.50 protocol, to the extent that one might begin to question whether this is an international standard at all»[v].
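The scaling argument quoted above can be made concrete with a little arithmetic. The following is only a minimal sketch: the function names and the figures used (2000 participating systems, 500 searches originating at each) are assumptions chosen for illustration, not measurements from any real network.

```python
def distributed_load_per_site(sites: int, searches_per_site: int) -> int:
    """Queries each participating system must answer when every search
    is sent to every participating system (virtual union catalog)."""
    return sites * searches_per_site


def centralized_load(sites: int, searches_per_site: int) -> dict:
    """Queries answered when all searches go to one physical union catalog:
    the central system takes the whole load, the sites take none."""
    return {"central": sites * searches_per_site, "per_site": 0}


if __name__ == "__main__":
    sites, searches = 2000, 500  # hypothetical figures
    print(distributed_load_per_site(sites, searches))  # 1000000
    print(centralized_load(sites, searches))
```

In the distributed case every one of the participating systems has to be sized for the load of the whole union, which is the point the quotation makes.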

We all know the pros and cons of a central index such as Google, and we also know that an index is «an alphabetical list of names, subjects, etc., with references (usually at the end of a book)», according to the Oxford English Dictionary.

Before discussing the index model, however, I would like to say a few words about the original SBN project (1985). The Italian library network (SBN) is in fact based on a physical union catalog model (even though it is called Index). Over 2000 libraries are members of the Italian library network, but not all the Italian institutions interested in sharing digital resources are SBN libraries.

The original project was not based on a central physical union catalog, but rather on the central index model (hence the name Index). The following are some interesting quotations from a feasibility study carried out in 1985:

– « ... that, in addition to the individual site machines and databases, there would [be] a shared, national database of indexed bibliographic records, each record of the Identity Card type, with associated information, including the location of more complete records, and the location of holdings»

– «This system would act primarily as a directory»

– «It would be consulted by the sites mainly to obtain routing information, that is the name of the site which contains the most up to date version of the record ... »

– «Obtaining the required data itself would be done by directly accessing whichever site possesses the required information»
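The directory behaviour described in these quotations can be sketched as a two-step lookup: the Index is consulted only for routing information, and the data itself is then obtained from a site. This is a minimal illustration, not the SBN design: the class, the function names and the in-memory dictionaries are all hypothetical, and the site codes and record identifier are simply borrowed from the fictional example given later in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    """A short 'Identity Card' entry: a search key plus routing information."""
    key: str
    record_id: str
    sites: list = field(default_factory=list)


# Hypothetical full records, held by the individual sites and not by the Index.
FULL_RECORD = "Negroponte, Nicholas. Being Digital. London : Hodder & Stoughton, 1995."
SITE_CATALOGS = {
    "MI0669": {"MIL0112563": FULL_RECORD},
    "RM0369": {"MIL0112563": FULL_RECORD},
}

# Hypothetical central Index: keys and routing information only.
INDEX = {
    "beid++": IndexEntry(key="beid++", record_id="MIL0112563",
                         sites=["MI0669", "RM0369"]),
}


def route(key: str):
    """Step 1: consult the Index to learn which sites hold the record."""
    entry = INDEX.get(key)
    if entry is None:
        return None
    return entry.record_id, list(entry.sites)


def fetch(key: str):
    """Step 2: obtain the data itself from the first site that has it."""
    routed = route(key)
    if routed is None:
        return None
    record_id, sites = routed
    for site in sites:
        record = SITE_CATALOGS.get(site, {}).get(record_id)
        if record is not None:
            return site, record
    return None
```

For example, `fetch("beid++")` returns the site code `"MI0669"` together with the full record held there, while an unknown key returns `None`.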

Of course, an Index cannot meet all the FRBR objectives, i.e. find, identify, select and obtain. An Index is mainly helpful for the first objective, i.e. to find (locate) all resources sharing the same index entry. With regard to the other objectives, it serves to direct the user to the real catalogs, according to a sort of subsidiarity principle.

However, to evaluate the proposal of an Index, we have to examine:

  • the bibliographic elements included in the Index entries;
  • the way these elements are recorded.

Every index entry in the 1985 SBN proposal was based on eight elements, excluding information on «location of more complete records, and the location of holdings». These elements were automatically extracted from a bibliographic record and normalized using typical information retrieval algorithms (for instance, conversion of special characters to their basic characters, conversion to uppercase, etc.). Four elements (author key, two title keys and date) were in fact based on the core of a bibliographic citation (author, title and publication date). The remaining elements were the identifier, the standard number and three important instantiation elements in coded form (bibliographic level, language and country of publication).
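The normalization and key extraction just described might be sketched as follows. This is an illustration under stated assumptions, not the algorithm of the 1985 proposal: the function names are hypothetical, the 3+1+1+1, 4+1 and 50-character patterns are taken from the fictional example in this paper, and the padding rule and the choice of case folding (lower case here, whereas the text mentions upper case) are assumptions.

```python
import unicodedata


def normalize(text: str) -> str:
    """Reduce special characters to basic ones, drop punctuation, fold case."""
    decomposed = unicodedata.normalize("NFKD", text)
    basic = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    spaced = "".join(ch if ch.isalnum() else " " for ch in basic)
    return " ".join(spaced.lower().split())


def pad(value: str, width: int, filler: str = "+") -> str:
    """Truncate or right-pad a value to a fixed width (assumed convention)."""
    return value[:width].ljust(width, filler)


def derived_key(text: str, pattern: tuple) -> str:
    """Take the first n characters of successive words, e.g. 3+1+1+1."""
    words = normalize(text).split()
    parts = []
    for position, length in enumerate(pattern):
        word = words[position] if position < len(words) else ""
        parts.append(pad(word, length))
    return "".join(parts)


def title_key(title: str, width: int = 50) -> str:
    """First `width` characters of the normalized title, '+' for spaces."""
    return pad(normalize(title).replace(" ", "+"), width)


if __name__ == "__main__":
    print(derived_key("Being Digital", (3, 1, 1, 1)))    # beid++
    print(derived_key("Negroponte, Nicholas", (4, 1)))   # negrn
    print(normalize("Facoltà di scienze politiche"))     # facolta di scienze politiche
```

Whatever conventions are chosen, the keys are only useful for matching if every participating site applies exactly the same rules.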

A hypothetical record of the Identity Card type from the Index of the 1985 SBN proposal would look like this:

Bibliographic record of the Identity Card type

<?xml version="1.0" encoding="UTF-8"?>
<!-- In this fictional example we use an XML-like notation. The Index entry refers to a real
     bibliographic record of the SBN Union Catalog:
     Negroponte, Nicholas. Being Digital. – London : Hodder & Stoughton, 1995.
     In this example we use the "+" character instead of the space character. -->
<Index_proposal_1985>
  <oclc_key schema="3+1+1+1 from the words of the title">beid++</oclc_key>
  <title_key schema="first 50 char of the title">being+digital++++++++++++++++++++++++++++++++++++</title_key>
  <author schema="4+1 from the main entry">negrn+</author>
  <publication_date>1995</publication_date>
  <language>eng</language>
  <country_of_publication>gb</country_of_publication>
  <identifier schema="isbn">0679439196</identifier>
  <identifier schema="sbn record identifier">MIL0112563</identifier>
  <!-- routing information = location where you can find the complete bibliographic record
       and the information on local holdings -->
  <routing_information>MI0669 - Biblioteca della Facolta' di scienze politiche dell'Universita' degli studi di Milano</routing_information>
  <routing_information>RM0369 - Biblioteca ISTAT – Roma</routing_information>
</Index_proposal_1985>

The conclusion of this brief note is left open. The original Index model should be taken into account when sharing Italian digital resources: its structure is still valid, even if the model needs to be updated. For instance, the elements of the revised model could be based on Dublin Core. «In this context, Dublin Core presents itself as a metadata pidgin for digital tourists who must find their way in this linguistically diverse landscape. Its vocabulary is small enough to learn quickly, and its basic pattern is easily grasped. It is well-suited to serve as an auxiliary language for digital libraries»[vi].
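As one possible reading of this suggestion, the elements of the 1985 Index entry could be carried over to Dublin Core through a simple crosswalk. The mapping below is purely an assumption made for illustration and is not part of any SBN specification. Elements for which no mapping is assumed here (country of publication, routing information) are deliberately left out.

```python
# Hypothetical crosswalk from the 1985 Index elements to Dublin Core elements.
CROSSWALK = {
    "title_key": "title",
    "author": "creator",
    "publication_date": "date",
    "language": "language",
    "identifier": "identifier",
}


def to_dublin_core(index_entry: dict) -> dict:
    """Map an Index entry to repeatable Dublin Core elements (lists of values)."""
    dc = {}
    for element, value in index_entry.items():
        target = CROSSWALK.get(element)
        if target is None:
            continue  # no mapping assumed for this element
        values = value if isinstance(value, list) else [value]
        dc.setdefault("dc:" + target, []).extend(values)
    return dc


if __name__ == "__main__":
    entry = {
        "title_key": "being digital",
        "author": "negrn",
        "publication_date": "1995",
        "language": "eng",
        "country_of_publication": "gb",
        "identifier": ["0679439196", "MIL0112563"],
    }
    print(to_dublin_core(entry))
```

Values are kept as lists because Dublin Core elements may be repeated, as the two identifiers in the example entry are.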

What is important is to be aware that an Index is not a (Meta)Catalog:

«He to the index turns, and quickly sees

What pages show the proper remedies»[vii].



References:


  • Wolpert, 1978
    • Wolpert, Samuel A., Potential: reaction, in The on-line revolution in libraries : proceedings of the 1977 Conference in Pittsburgh, Pennsylvania, edited by Allen Kent, Thomas J. Galvin. New York [etc]: M. Dekker, 1978 p. 27.
  • Lund, 2001
  • Peruginelli, 1990
    • Peruginelli, Susanna – Pettenati, Corrado. The National library network in Italy, in European library networks, edited by Karl Wilhelm Neubauer, Esther R. Dyer. Norwood: Ablex, 1990, p. 195.

[iv] Lynch, Clifford A., Building the Infrastructure of Resource Sharing: Union Catalogs, Distributed Search, and Cross-Database Linkage, «Library trends» 45, 3 (Winter 1997), p. 448-461.

[v] Feasibility study for a national union catalog, Final Report 25 April 2001, by Peter Stubley, Rob Bull and Tony Kidd, http://www.uknuc.shef.ac.uk/.

[vi] Baker, Thomas. A grammar of Dublin Core, «D-lib magazine», 6(2000), 10.

[vii] Orlando furioso XXII, 16. This electronic version is based on the edition of the poem published in The Orlando furioso of Ludovico Ariosto, translated by William Stewart Rose (London, 1910). This work is in the public domain: http://sunsite.berkeley.edu/OMACL/Orlando/. The idea for this quotation is taken from: Maltese, Diego. Servizio nazionale e servizio locale, in Servizio bibliotecario nazionale e servizio locale: la realizzazione di Ferrara. – Ferrara : Artstudio, 1988, p. 25.