NISO RP-19

Open Discovery Initiative: Promoting Transparency in Discovery

active, Most Current
Organization: NISO
Publication Date: 1 January 2020
Status: active
Page Count: 74
Scope

In broad terms, ODI focuses primarily on the issues related to the composition of the central index associated with these discovery services and not with the design of user interfaces. The initiative does not seek to intrude into areas of proprietary innovation that distinguish each of the discovery service products.

The arena of index-based discovery spans many different issues, some of which lend themselves to a more open and standard treatment, while others remain in the realm of product development. The Open Discovery Initiative recognizes that even among the issues that might potentially benefit from its attention, some rank at a higher priority and others may need to be addressed through possible follow-up activities.

In Scope

A primary area of interest for ODI involves the arena of content coverage of the discovery services represented in their central index. The content used to create the indexes of a discovery service comes from a range of information providers and content products, such as commercial and nonprofit publishers, universities and other research institutions, and many other types of organizations. The content of interest to ODI includes any materials that libraries consider within their collection, regardless of the business model for acquisition or the type of license, such as commercially restricted or open access.

This initiative aims to address the following questions in the realm of content coverage:

• The quantity of content that a provider makes available to discovery service providers relative to its total offerings

• The form of that content, such as whether it consists of citation-level metadata or if it also includes full-text representations

• Whether the discovery services operate in such a way that the results presented to the user neither favor nor disfavor items from any given content source or material type

• The specific metadata fields provided within metadata records

• The specific metadata fields indexed in the discovery services

• Whether any controlled vocabularies or ontologies are included

• How A&I products relate to discovery services

• How branding of content providers is presented in a discovery service

This initiative aims to address the following technical issues:

• The transfer mechanisms or protocols by which data are delivered from content providers to discovery service creators

• The format in which the records are delivered by content providers to discovery service creators

One area of focus deals with the definitions of the metadata delivered by content providers to discovery services, as well as the data made available to licensed customers. A perceived lack of transparency across these data flows prompted the need to develop definitions of the data points and to clarify what metadata or data elements are made available to which parties and under which conditions. For example, a content provider might allow certain metadata elements to be included in the search index for retrieval purposes, but not allow those elements to be displayed in the final user interface. Conversely, libraries might not understand what elements from which full-text or A&I services are available and in which circumstances. Some elements might be displayed to authenticated users, and some not, but definitions of these distinctions are sometimes vague, if they are described at all.
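The distinction drawn above between elements that may be indexed and elements that may be displayed can be illustrated with a small data model. This is only a sketch under stated assumptions: the class `ElementPolicy`, its field names, the display levels `"all"`, `"authenticated"`, and `"none"`, and the sample elements are all hypothetical and are not defined by the Recommended Practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementPolicy:
    """Hypothetical usage terms a content provider attaches to one metadata element."""
    name: str
    searchable: bool  # may be included in the search index for retrieval
    display: str      # assumed levels: "all", "authenticated", or "none"


def visible_fields(policies, authenticated):
    """Return the names of the elements that may be shown to this user."""
    allowed = {"all", "authenticated"} if authenticated else {"all"}
    return [p.name for p in policies if p.display in allowed]


# Illustrative sample only; not taken from any real provider agreement.
policies = [
    ElementPolicy("title", searchable=True, display="all"),
    ElementPolicy("abstract", searchable=True, display="authenticated"),
    ElementPolicy("subject_terms", searchable=True, display="none"),
]
```

In this sketch, `subject_terms` contributes to retrieval but is never displayed, which mirrors the example in the paragraph above; writing such terms down explicitly is one way the vague distinctions it mentions could be made visible to libraries.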

Another topic concerns factors related to whether or not a discovery service functions with a bias towards certain databases or content products based on business relationships rather than user needs or library preferences. It was deemed important to propose practices that facilitate the presentation of unbiased links to a user following the execution of a search through a discovery service and, in support of this objective, to ensure transparency about discovery service practices.

Out of Scope

This initiative does not address issues related to performance or features of the discovery services, as these are inherently business and design decisions guided by competitive market forces.

Aspects of index-based discovery not deemed within the scope of ODI include:

User interface issues - The user interface components of a discovery service may depend on the content indexes, but are out of scope for this ODI initiative.

Relevancy ranking - The specific methods that a discovery service employs to order items in a search result set may fall within the realm of proprietary technologies used competitively to differentiate commercial offerings.

Further, the demands of relevancy ranking (necessary for satisfying user expectations in keeping pace with open web search applications) require continuous enhancement of supporting technologies and algorithms. As such, it would be both impractical and an impediment to require that service providers continuously publish the highly dynamic and substantially detailed workings of their search relevancy algorithms. Therefore, the ODI Working Group concludes that, while disclosure of the broad aspects of a relevance ranking algorithm is encouraged, there should be no expectation that a discovery service provider would explain in any level of detail the ranking algorithms it applies. (As noted above, however, whether rankings result in the bias of results presented to the end user is within scope.)

APIs exposed by discovery services - Several discovery service providers offer access to their discovery index through an application program interface (API). This is a set of protocols that allows a computer program to query the index and receive search results. Some libraries build their own interfaces to the search index; others use the API to embed data retrieved from the search index into other applications, for example to pull lists of citations into course sites or other web pages.

The ODI Working Group determined that APIs were out of scope for this initial foundational stage of the initiative. This decision was made somewhat reluctantly based on the growing need for libraries to have access to data via API. Because the provision of an API to customers is largely a business decision by the vendor, it was felt the existence of an API could not be suggested as a best practice. Additionally, the desire to recommend standardization of API results would have added significant complexity to the workgroup's recommendations. Thus, the current committee concluded that best practices concerning discovery service APIs should be deferred until a later round of work. See Section 4.2 for recommended next steps.

Protocols for data exchange - In the arena of the technical mechanisms involved in the transfer of data between content providers and discovery services, the ODI concludes that the existing protocols and methodologies previously defined provide adequate options and that it is not necessary to create a new protocol specifically for use within the discovery services ecosystem. There has been much work and standards development in the area of file formats, schemas, naming practices, transport mechanisms, etc.

Purpose

The Open Discovery Initiative (ODI) aims to facilitate progress through exploration of relevant issues and the development of recommended practices for the current generation of library discovery services based on centrally indexed search. The domain of index-based discovery services involves a complex ecosystem of interrelating issues and interests among content providers, libraries, and discovery service creators.

This model of discovery relies on an index populated with metadata, full-text, or other representations of the content items (such as journal articles, book chapters, e-books, research reports, reference sources, images, maps, datasets, audiovisual materials, and other selected material) that a library provides to its users. The content comes from a range of information providers and products, such as commercial and nonprofit publishers, universities and other research institutions, and many other types of organizations. The content of interest to ODI includes any materials that libraries would consider within their collection, regardless of the business model for acquisition or the type of license, such as commercially restricted or open access.

Several major discovery products based on the model of centrally indexed search have been released to the market since early 2009, largely influenced by the Google search model and users' expectations for a single, unified discovery experience. An increasing number of libraries, especially those that serve academic or research institutions, have invested in index-based discovery services. These products serve as one of the interfaces through which the library's patrons gain access to the rapidly growing breadth of information that may be available to them. These discovery services play an increasingly strategic role in the way that libraries provide users with access to their collection, they represent a growing segment of the library technology industry, and they may become a factor in how libraries select content products. These factors draw attention to the discovery services arena for any improvements that might be gained through this Recommended Practice.

To work effectively, discovery services need to be as comprehensive as possible in their content coverage. Libraries expect their uniquely licensed and purchased electronic content to be indexed within their discovery service of choice. Further, they require comprehensive and clear representation of each category of content in the discovery service. Content items not represented in a discovery service present a challenge to libraries in how they might otherwise ensure that these materials are discovered and accessed. Libraries have an interest in knowing whether any content providers are excluded or underrepresented in any given discovery service.

The Open Discovery Initiative aims to facilitate increased transparency in the content coverage of index-based discovery services and to recommend consistent methods of content exchange or other mechanisms. Full transparency will enable libraries to objectively evaluate discovery services and to deal with daily operational issues surrounding these products.

Discovery services depend on the cooperation of content providers with discovery service creators to provide access to metadata or full text of information resources in order to create effective indexes. The inclusion of data in the indexes of the current slate of discovery services is based on private agreements and ad hoc exchange methodologies between information providers and discovery service creators. Index-based discovery can potentially benefit content providers through enhanced exposure of their materials. It also presents some concerns, such as enabling library patrons to bypass the specialized interfaces created by content providers, potentially reducing or eliminating branding and losing control in how content is presented to the end user. And, as libraries' uptake of these services increases, the usage (and perceived value) of publisher products can be greatly influenced by how successfully discovery services drive readers to content providers' assets.

ODI investigated the need for standard protocols for the transfer of data from content providers to discovery service creators. Consistent practices in the exchange and formats of data aim to lower the level of complexity as content flows through this ecosystem, mitigating technical issues that might hinder broader participation by content providers or potential discovery service creators.

Libraries need a clear understanding of the degree of exposure for the content that they have acquired as represented in a discovery service. This understanding is essential as libraries evaluate and select a discovery service, and on an ongoing basis once it is implemented. Libraries require specific information on exactly which articles, databases, and other sources are represented; whether they are indexed in full text, by citations only, or both; and whether the metadata derives from aggregated databases or abstract and indexing (A&I) resources.
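As an illustration of the kind of coverage information described above, the following minimal sketch tallies a coverage list by indexing depth and computes the share of a provider's offerings that is represented. Everything in it is an assumption made for illustration: `CoverageEntry`, its fields, the depth labels, and the sample rows are hypothetical and do not reflect any reporting format specified by ODI.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CoverageEntry:
    """Hypothetical row in a coverage list a library might receive."""
    title: str
    provider: str
    depth: str            # assumed labels: "full_text", "citation", or "both"
    metadata_source: str  # assumed labels: "publisher", "aggregator", or "a_and_i"


def depth_summary(entries):
    """Count entries by how deeply they are indexed."""
    return Counter(e.depth for e in entries)


def coverage_ratio(indexed_count, total_offerings):
    """Share of a provider's total offerings represented in the index."""
    if total_offerings <= 0:
        raise ValueError("total_offerings must be positive")
    return indexed_count / total_offerings


# Made-up sample rows for illustration only.
sample = [
    CoverageEntry("Journal A", "Provider X", "full_text", "publisher"),
    CoverageEntry("Journal B", "Provider X", "citation", "a_and_i"),
    CoverageEntry("Journal C", "Provider Y", "both", "aggregator"),
]
```

A library could apply a summary like this to each candidate service during evaluation and again periodically after implementation, since the paragraph above notes that the need for this understanding is ongoing.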

In the operation of an index-based discovery service, many different factors contribute to how it presents and orders results and how it connects users to content resources. For any given item of content, multiple metadata elements contributed from different content providers may be indexed by the discovery service. For a journal article, for example, the full text might be contributed by the primary publisher, citation data by the provider of an aggregated database, and abstracts or controlled vocabulary terms by yet another provider. Content providers are motivated to contribute to discovery services in order to gain more access from the patrons associated with the libraries that implement the discovery services. It is therefore important to each type of content provider that its contributions are appropriately recognized. If a record contributed by an A&I service, for example, leads to the selection of a full-text resource from another provider, how does the A&I service gain benefit from the discovery transaction? A subgroup of ODI on Fair Linking was established to explore and make recommendations on these issues.

The Open Discovery Initiative recognized and aimed to address perceptions regarding bias and concerns about the possibility of bias in discovery services. Special concerns surround the possibility of bias when discovery services are owned by the same corporate parent as content products or services. Concerns also arise through exclusive arrangements or other business relationships made by a discovery service with a content provider that might introduce bias. Some of these recommended practices were developed with the intent of helping discovery service providers mitigate concerns that exist in the community about conflicts of interest and other relationships that create bias. By explaining the nature of their business connections with related content providers and third parties alike, and affirming the neutrality of their discovery offerings, these services will be positioned to reassure both libraries and content providers about the nature of their practices.

Document History

NISO RP-19
January 1, 2020
Open Discovery Initiative: Promoting Transparency in Discovery
January 1, 2014
Open Discovery Initiative: Promoting Transparency in Discovery
