Hahn, Jim

Disciplines

Cataloging and Metadata
Collection Development and Management
Library and Information Science

Position

Head of Metadata Research

Introduction

Working collaboratively across the Penn Libraries, I am developing a vision for the services, technologies, and policies that enhance discovery of collections, following international standards and best practices for linked data and metadata. This work includes transforming existing library metadata into new schemas and training cataloging and metadata librarians in the creation and use of linked data.

Research Interests

linked data
metadata
recommender systems

Search Results

BF Interlingua: Interoperability among BIBFRAME linked data vocabularies
(2023-01-19) Hahn, Jim
Presentation exploring an interchange process among BIBFRAME linked data vocabularies.
SVDE model interoperability: SVDE and the BIBFRAME interchange structure
(2022-11-08) Hahn, Jim
Provides an overview of a possible interchange structure for BIBFRAME, using the RDF/XML structure from the Library of Congress. The presentation details selected steps for normalizing an SVDE instance into the Library of Congress RDF/XML structure. It concludes with an example of loading normalized SVDE data into the Alma Sandbox at Penn by way of a locally hosted linked data editor, Marva.
Institutional linked data frameworks for collections, discovery, and access
(2023-06-26) Hahn, Jim
Linked data is the language of the web, and is a profoundly important opportunity for libraries to expand their reach. This talk will highlight how University of Alberta and University of Pennsylvania are leveraging key community partnerships (Share-VDE, LD4, PCC, Wikidata) and institutional strategic objectives to drive implementation choices, presenting generalizable approaches to successful linked data strategies in libraries. A general linked data vision takes a holistic approach to library implementation by supporting overall library strategic goals and objectives. Linked data objectives are not a separate or compartmentalized goal; rather, the overarching long-term effects of linked data support library operations. Indeed, as library linked open data have proliferated on the web, linking external datasets with classic sources of library information has led to increased exposure of library resources, collections, and expertise. This talk will contextualize linked data from the perspective of current systems, of extensions that can be made from wherever a library currently stands, and of any level of resource constraint. It is often possible to include linked data in more traditional representations, or to make connections between linked data and more familiar formats. This presentation will introduce examples of these mixed-format or “hybrid” linked data environments, as they are expected to be the most common way in which linked data is used in production in the next few years. For the foreseeable future, linked data strategy is based on experimentation, innovation, and collaboration. This will encompass work in a hybrid environment that will include MARC records, sometimes enhanced with linked data identifiers, as well as more “native” linked data formats.
Marva BIBFRAME editor and Alma: the incorporation of native BIBFRAME descriptions to Alma
(2022-10-21) Hahn, Jim
The Library of Congress Network Development and MARC Standards Office has been engaged in standards redevelopment for BIBFRAME and in the re-engineering of cataloging tools such as the Marva metadata editor for native BIBFRAME description. The code for the Marva BIBFRAME editor is available as open source on GitHub (https://github.com/lcnetdev/marva-frontend; https://github.com/lcnetdev/marva-backend). Using the Marva linked data editor, along with ExLibris catalog APIs that interface with Alma, metadata research at the University of Pennsylvania Libraries produced a reusable data flow such that BIBFRAME in its native Library of Congress structure (BIBFRAME in RDF/XML) can now be posted directly into Alma. This presentation will detail how the incorporation of native BIBFRAME descriptions in Alma was accomplished by way of RESTful web services and Marva configuration. Catalogers at the University of Pennsylvania Libraries are able to use a locally configured and locally hosted Marva system to create BIBFRAME descriptions that post directly to Alma as native BIBFRAME.
A Comparative Evaluation of Linked Data Discovery in the Share-VDE 2.0 Catalog
(2022-06-25) Hahn, Jim
Share-VDE (SVDE) is a library-driven initiative which brings together the bibliographic catalogs and authority files of a community of libraries in an innovative discovery environment based on linked data. The beta release of the SVDE 2.0 (https://www.svde.org) catalog was collaboratively shaped among multiple perspectives and stakeholder groups. A team at the University of Pennsylvania Libraries gathered feedback from library catalogers working in linked data, university faculty, and new undergraduate students in order to understand how linked data supports user tasks promulgated in the IFLA Library Reference Model (IFLA-LRM). Specific user tasks evaluated include ascertaining how library catalogers make use of advanced search functionality provided in the linked data interface. Context-finding tasks included evaluating how Penn catalogers might find a linked data search useful for providing context to their searching or for helping to understand a research area. Specific LRM mapping focused on the LRM Identify user task, particularly disambiguation of similar name results. For comparative results, similar questions were posed to students and faculty. Several targeted questions for faculty addressed which relationships in linked data are useful for future research planning using linked data search. In compiling results of the study, we will describe the linked data functionality and scenarios which the Share-VDE 2.0 discovery system addresses. This session will be particularly useful for those who are looking to understand how new and in-demand systems like linked data discovery support improved user experience outcomes as users navigate and explore collections.
Hybrid linked data approaches in traditional discovery environments using Share-VDE linked data
(2023-07-11) Hahn, Jim
Hybrid linked data approaches for traditional discovery environments improve discovery in contemporary library systems using Share-VDE (SVDE) linked data. Hybrid linked data environments include “traditional” data structures alongside linked data systems and processes. The Share-VDE project (https://svde.org) is a collaborative discovery environment based on linked data. This talk explores several lesser-known and non-intuitive uses of Share-VDE linked data, including discovery integration possibilities and data mining and machine learning processes that target Share-VDE enriched data. As brief records in our integrated library system receive improved cataloging from semi-automated subject indexing, we can improve traditional discovery and findability by better contextualizing the resources with linked data subject headings from FAST. The presentation will include screenshots and sample starter code that builds on Annif with Share-VDE data, among others.
A Comparative Evaluation of the Share-VDE Search System
(Taylor and Francis Group, 2024-07-03) Hahn, Jim; Ahnberg, Kayt; Giusti Serra, Liliana
The Share-VDE search system (https://svde.org) shifts the library discovery paradigm from record-based indexing and retrieval to that of linked data entity exploration. This paper reports results of iterative testing of multiple versions of the Share-VDE interface. The testing included remote user experience (UX) interviews with a total of twenty participants across four rounds of tests spanning two years. The comparison among participants encompassed catalogers, students of all levels, and faculty. Synthesizing IFLA LRM user tasks with interface evaluation methods supported the qualitative inquiry into how linked data systems in general, and BIBFRAME specifically, can support search system objectives.
BIBFRAME instance mining: Toward authoritative publisher entities using association rules
(2020-11-25) Hahn, Jim
With the transition of a shared catalog to BIBFRAME linked data, there is now a pressing need for identifying the canonical Instance for clustering in BIBFRAME. Authoritative publisher entities are a fundamental component of Instance identification. Previous work in this area by OCLC Research (Connaway & Dickey, 2011) proposed a data mining approach for developing an experimental Publisher Name Authority File (PNAF). The OCLC research was able to create profiles for "high-incidence" publishers after data mining and clustering of publishers. As a component of PNAF, Connaway & Dickey were able to provide detailed subject analysis of publishers. This presentation will detail a case study of machine learning methods over a corpus of subjects, main entries, and added entries, used as antecedents in association rules to derive consequent publisher entities. The departure point for the present research into identification of authoritative publisher entities is to focus on clustering, reconciliation, and re-use of ISBN and subfield b of MARC 260, along with the subjects (650 - Subject Added Entry), main entries (1XX - Main Entries), and added entries (710 - Added Entry-Corporate Name), as signals to inform a training corpus for association rule mining, among other machine learning algorithms, libraries, and methods.
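To make the antecedent and consequent framing concrete, the following is a minimal sketch of association rule mining in plain Python. It is not the method or code used in the presentation: the toy records, the field-tagged signal strings, the publisher names, the thresholds, and the function name `mine_publisher_rules` are all hypothetical, and the brute-force enumeration stands in for whatever algorithm or library was actually used.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each record pairs a set of signals (subjects,
# added entries) with the publisher named in the record.
records = [
    ({"650:Linked data", "710:Example Society"}, "Example Press"),
    ({"650:Linked data", "650:Metadata"}, "Example Press"),
    ({"650:Linked data", "710:Example Society"}, "Example Press"),
    ({"650:Metadata"}, "Sample House"),
    ({"650:Metadata", "710:Sample Institute"}, "Sample House"),
]


def mine_publisher_rules(records, min_support=0.2, min_confidence=0.8, max_len=2):
    """Return rules (antecedent, publisher, support, confidence).

    support    = share of all records containing antecedent and publisher
    confidence = share of records containing the antecedent that also
                 name the publisher
    """
    total = len(records)
    antecedent_counts = Counter()
    rule_counts = Counter()
    for signals, publisher in records:
        # Enumerate every signal combination up to max_len as an antecedent.
        for size in range(1, max_len + 1):
            for combo in combinations(sorted(signals), size):
                antecedent_counts[combo] += 1
                rule_counts[(combo, publisher)] += 1

    rules = []
    for (antecedent, publisher), count in rule_counts.items():
        support = count / total
        confidence = count / antecedent_counts[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, publisher, support, confidence))
    # Strongest rules first: by confidence, then support, then antecedent.
    rules.sort(key=lambda rule: (-rule[3], -rule[2], rule[0]))
    return rules


if __name__ == "__main__":
    for antecedent, publisher, support, confidence in mine_publisher_rules(records):
        print(f"{antecedent} -> {publisher} "
              f"(support={support:.2f}, confidence={confidence:.2f})")
```

In this toy corpus the subject "650:Metadata" alone appears with two different publishers, so it does not reach the confidence threshold, while signals that co-occur with a single publisher do. Enumerating all combinations is only practical for small examples; a real corpus would call for a dedicated frequent-pattern algorithm.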
A research agenda for the evaluation of semantic search interfaces
(2020-07-21) Hahn, Jim
For research libraries to move successfully from experimentation to implementation with library linked data and semantic search interfaces, we need to better understand how these systems can best support users' information discovery needs. This presentation outlines a research agenda and methodologies we will use in the University of Pennsylvania Libraries to evaluate, implement, and extend library discovery systems using enriched and linked metadata, including systems used in the LD4 community. Our intended audience is librarians and developers of linked data systems with an interest in discovery. Our research agenda begins with user tasks described in the IFLA Library Reference Model (LRM), but also considers extensions to those basic tasks that linked data-enabled systems support. These extensions include enhanced topical browsing; the discovery of works, people, and topics across multiple information collections; and the selection and delivery of the most appropriate copy of a sought-after work in a multi-library context that takes into account both content and obtainability. Our evaluation methodologies include both quantitative and qualitative analyses. We will begin with analyses of current discovery search logs, and continue with user studies to determine user tasks that are well-supported by our semantic search systems, and to identify gaps in data, services, and interface designs for meeting user needs. We may also prototype extensions to our platforms and metadata and test their effectiveness. Platforms we will use for evaluation include the SHARE-VDE linked data catalog being developed with various library partners, the Blacklight-based catalog developed at Penn for our local library collection and for an experimental Ivy+ shared discovery service, and Penn's Online Books Page catalog and accompanying Forward to Libraries service. 
We hope in our research to articulate the best uses of semantic searching, and the most effective investments in systems, interfaces, and metadata to support discovery in a linked-data environment.
Bibliographic Entities are Described by Sets
(2021-07-26) Hahn, Jim
A set-theoretical frame based on Svenonius's theory of bibliographic entities is the departure point for this short talk on entity description. This talk will briefly show how properties of bibliographic entity descriptions may be identified using a frequent pattern data mining algorithm over targeted sets of existing metadata descriptions. The MARC21 corpus used in this case consisted of clustered sets of publishers and publisher locations from the library MARC21 records found in the Platform for Open Data (POD). POD is a data aggregation project involving member institutions of the IvyPlus Library Confederation and contains seventy million MARC21 records, forty million of which are unique.
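As an illustration of treating descriptions as sets, the sketch below counts itemsets that recur across a handful of descriptions. It is a simplified, brute-force stand-in for a frequent pattern algorithm, not the one used in the talk; the sample descriptions, the `place:` and `publisher:` prefixes, and the function name `frequent_itemsets` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical descriptions: each is a set of publisher and place values.
descriptions = [
    {"publisher:Example Press", "place:Philadelphia"},
    {"publisher:Example Press", "place:Philadelphia"},
    {"publisher:Example Press", "place:New York"},
    {"publisher:Sample House", "place:New York"},
]


def frequent_itemsets(descriptions, min_count=2, max_size=2):
    """Count every itemset up to max_size and keep those seen min_count times."""
    counts = Counter()
    for description in descriptions:
        items = sorted(set(description))  # sort so each itemset has one canonical form
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return {itemset: count for itemset, count in counts.items() if count >= min_count}


if __name__ == "__main__":
    for itemset, count in sorted(frequent_itemsets(descriptions).items()):
        print(itemset, count)
```

Here the pair of "Example Press" and "Philadelphia" recurs and so survives the threshold, which is the kind of co-occurring property set that could characterize a publisher entity.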
