Hahn, Jim

Cataloging and Metadata
Collection Development and Management
Library and Information Science
Research Projects
Organizational Units
Head of Metadata Research
Working collaboratively across the Penn Libraries, I am developing a vision for the services, technologies, and policies that enhance discovery of collections, following international standards and best practices for linked data and metadata. This includes transforming existing library metadata into new schemas and training cataloging and metadata librarians in the creation and use of linked data.
Research Interests
linked data
recommender systems

Search Results

  • Publication
    SVDE model interoperability: SVDE and the BIBFRAME interchange structure
    (2022-11-08) Hahn, Jim
    Provides an overview of a possible interchange structure for BIBFRAME, based on the RDF/XML structure used by the Library of Congress. The presentation details selected steps for normalizing an SVDE instance into the Library of Congress RDF/XML structure. It concludes with an example of loading normalized SVDE data into the Alma Sandbox at Penn by way of a locally hosted linked data editor, Marva.
  • Publication
    BIBFRAME instance mining: Toward authoritative publisher entities using association rules
    (2020-11-25) Hahn, Jim
    With the transition of a shared catalog to BIBFRAME linked data, there is now a pressing need for identifying the canonical Instance for clustering in BIBFRAME. Authoritative publisher entities are a fundamental component of Instance identification. Previous work in this area by OCLC Research (Connaway & Dickey, 2011) proposed a data mining approach for developing an experimental Publisher Name Authority File (PNAF). The OCLC research was able to create profiles for "high-incidence" publishers after data mining and clustering of publishers. As a component of PNAF, Connaway & Dickey were able to provide detailed subject analysis of publishers. This presentation will detail a case study of machine learning methods over a corpus of subjects, main entries, and added entries, used as antecedents in association rules to derive consequent publisher entities. The departure point for the present research into identification of authoritative publisher entities is to focus on clustering, reconciliation, and re-use of ISBN and subfield b of MARC 260, along with the subjects (650 - Subject Added Entry), main entries (1XX - Main Entries), and added entries (710 - Added Entry-Corporate Name), as signals to inform a training corpus for association rule mining, among other machine learning algorithms, libraries, and methods.
  • Publication
    BF Interlingua: Interoperability among BIBFRAME linked data vocabularies
    (2023-01-19) Hahn, Jim
    Presentation exploring an interchange process among BIBFRAME linked data vocabularies.
  • Publication
    A Comparative Evaluation of Linked Data Discovery in the Share-VDE 2.0 Catalog
    (2022-06-25) Hahn, Jim
    Share-VDE (SVDE) is a library-driven initiative which brings together the bibliographic catalogs and authority files of a community of libraries in an innovative discovery environment based on linked data. The beta release of the SVDE 2.0 (https://www.svde.org) catalog was collaboratively shaped among multiple perspectives and stakeholder groups. A team at the University of Pennsylvania Libraries gathered feedback from library catalogers working in linked data, university faculty, and new undergraduate students in order to understand how linked data supports user tasks promulgated in the IFLA Library Reference Model (IFLA-LRM). Specific user tasks evaluated include ascertaining how library catalogers make use of advanced search functionality provided in the linked data interface. Context finding tasks included evaluating how Penn catalogers might find a linked data search useful for providing context to their searching or for helping to understand a research area. Specific LRM mapping focused on the LRM Identify user task; particularly disambiguation of similar name results. For comparative results similar questions are posed to students and faculty. Several targeted questions of faculty included understanding the relationships in linked data that are useful for future research planning using linked data search. In compiling results of the study we will describe the linked data functionality and scenarios which the Share-VDE 2.0 discovery system addresses. This session will be particularly useful for those who are looking to understand how new and in-demand systems like linked data discovery support improved user experience outcomes as users navigate and explore collections.
  • Publication
    Bibliographic Entities are Described by Sets
    (2021-07-26) Hahn, Jim
    A set-theoretical frame based on Svenonius's theory of bibliographic entities is the departure point for this short talk on entity description. This talk will briefly show how properties of bibliographic entity descriptions may be identified using a frequent pattern data mining algorithm over targeted sets of existing metadata descriptions. The MARC21 corpus used in this case consisted of clustered sets of publishers and publisher locations from the library MARC21 records found in the Platform for Open Data (POD). POD is a data aggregation project involving member institutions of the IvyPlus Library Confederation and contains seventy million MARC21 records, forty million of which are unique.
  • Publication
    Hybrid linked data approaches in traditional discovery environments using Share-VDE linked data
    (2023-07-11) Hahn, Jim
    Hybrid linked data approaches for traditional discovery environments improve discovery in contemporary library systems using Share-VDE (SVDE) linked data. Hybrid linked data environments include “traditional” data structures alongside linked data systems and processes. The Share-VDE project (https://svde.org) is a collaborative discovery environment based on linked data. This talk explores several lesser-known and non-intuitive uses of Share-VDE linked data, including discovery integration possibilities and data mining and machine learning processes that target Share-VDE enriched data. As brief records in our integrated library system receive improved cataloging from semi-automated subject indexing, we can improve traditional discovery and findability by better contextualizing the resources with linked data subject headings from FAST. The presentation will include screenshots and sample starter code that builds on Annif with Share-VDE data, among others.
  • Publication
    Share-VDE 2.0: a panel discussion among the Share-VDE working group chairs
    (2021-07-21) Hahn, Jim
    This panel will convene a diverse group of linked data professionals who serve as chairs of the Share-VDE working groups. The working groups include an Advisory Council (AC), Authority-Identifier Management Services (AIMS), Cluster Knowledge Base Editor (CKB), Sapientia Entity Identification (SEI), and a UX/UI group. The overall effect of combining their focus areas can be seen in the new Share-VDE 2.0 platform. Panelists will discuss how Share-VDE 2.0 implements an interoperable ecosystem of linked data structures and projects (e.g. LD4P/Sinopia) in part due to the information modeling that was completed by the SEI working group which sought to reference and implement interoperable BIBFRAME entity models. To manage such models a design of J.Cricket with the CKB editor WG was completed while incorporating services for the critical tasks of authority control with the AIMS WG. Panelists have contributed to the significant revision and enhancement of SVDE infrastructure including support for a re-visioning of the front end discovery interface that presents a next generation linked data discovery system, Share-VDE.
  • Publication
    A research agenda for the evaluation of semantic search interfaces
    (2020-07-21) Hahn, Jim
    For research libraries to move successfully from experimentation to implementation with library linked data and semantic search interfaces, we need to better understand how these systems can best support users' information discovery needs. This presentation outlines a research agenda and methodologies we will use in the University of Pennsylvania Libraries to evaluate, implement, and extend library discovery systems using enriched and linked metadata, including systems used in the LD4 community. Our intended audience is librarians and developers of linked data systems with an interest in discovery. Our research agenda begins with user tasks described in the IFLA Library Reference Model (LRM), but also considers extensions to those basic tasks that linked data-enabled systems support. These extensions include enhanced topical browsing; the discovery of works, people, and topics across multiple information collections; and the selection and delivery of the most appropriate copy of a sought-after work in a multi-library context that takes into account both content and obtainability. Our evaluation methodologies include both quantitative and qualitative analyses. We will begin with analyses of current discovery search logs, and continue with user studies to determine user tasks that are well-supported by our semantic search systems, and to identify gaps in data, services, and interface designs for meeting user needs. We may also prototype extensions to our platforms and metadata and test their effectiveness. Platforms we will use for evaluation include the SHARE-VDE linked data catalog being developed with various library partners, the Blacklight-based catalog developed at Penn for our local library collection and for an experimental Ivy+ shared discovery service, and Penn's Online Books Page catalog and accompanying Forward to Libraries service. 
    We hope in our research to articulate the best uses of semantic searching, and the most effective investments in systems, interfaces, and metadata to support discovery in a linked-data environment.
  • Publication
    Institutional linked data frameworks for collections, discovery, and access
    (2023-06-26) Hahn, Jim
    Linked data is the language of the web, and is a profoundly important opportunity for libraries to expand their reach. This talk will highlight how University of Alberta and University of Pennsylvania are leveraging key community partnerships (Share-VDE, LD4, PCC, Wikidata) and institutional strategic objectives to drive implementation choices, presenting generalizable approaches to successful linked data strategies in libraries. A general linked data vision takes a holistic approach to library implementation by supporting overall library strategic goals and objectives. Linked data objectives are not a separate or compartmentalized goal; rather, the overarching long-term effects of linked data support library operations. Indeed, as library linked open data have proliferated on the web, linking external datasets with classic sources of library information has led to increased exposure of library resources, collections, and expertise. This talk will contextualize linked data from the perspective of current systems, of extensions that can be made from wherever a library currently stands, and of any level of resource constraint. It is often possible to include linked data in more traditional representations, or to make connections between linked data and more familiar formats. This presentation will introduce examples of these mixed-format or “hybrid” linked data environments, as they are expected to be the most common way in which linked data is used in production in the next few years. For the foreseeable future, linked data strategy is based on experimentation, innovation, and collaboration. This will encompass work in a hybrid environment that will include MARC records, sometimes enhanced with linked data identifiers, as well as more “native” linked data formats.
  • Publication
    Marva BIBFRAME editor and Alma: the incorporation of native BIBFRAME descriptions to Alma
    (2022-10-21) Hahn, Jim
    The Library of Congress Network Development and MARC Standards Office has been engaged in standards redevelopment for BIBFRAME and in the re-engineering of cataloging tools such as the Marva metadata editor for native BIBFRAME description. The code for the Marva BIBFRAME editor is available as open source on GitHub (https://github.com/lcnetdev/marva-frontend; https://github.com/lcnetdev/marva-backend). Using the Marva linked data editor, along with Ex Libris catalog APIs that interface with Alma, metadata research at the University of Pennsylvania Libraries produced a reusable data flow such that BIBFRAME in its native Library of Congress structure (BIBFRAME in RDF/XML) can now be posted directly into Alma. This presentation will detail how the incorporation of native BIBFRAME descriptions in Alma was accomplished by way of RESTful web services and Marva configuration. Catalogers at the University of Pennsylvania Libraries are able to use a locally configured and locally hosted Marva system to create BIBFRAME descriptions that post directly to Alma as native BIBFRAME.
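
Illustrative sketch (not taken from any of the presentations): the "BIBFRAME instance mining" abstract above describes using subjects, main entries, and added entries as antecedents in association rules whose consequent is a publisher entity. The Python below is a minimal, standard-library illustration of that idea under stated assumptions. The toy records, the "650:"/"710:" signal labels, the publisher names, the function name mine_publisher_rules, and the thresholds are all hypothetical. A real study would more likely use a dedicated mining library over a full MARC corpus.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each record is (set of antecedent signals, publisher).
# Signals imitate MARC-derived features (650 subjects, 710 corporate added entries).
RECORDS = [
    ({"650:Linguistics", "710:Linguistic Society"}, "Publisher A"),
    ({"650:Linguistics", "710:Linguistic Society"}, "Publisher A"),
    ({"650:Linguistics", "650:Grammar"}, "Publisher A"),
    ({"650:Linguistics", "650:Grammar"}, "Publisher B"),
    ({"650:Botany"}, "Publisher B"),
]


def mine_publisher_rules(records, min_support=0.3, min_confidence=0.6, max_len=2):
    """Return rules (antecedent, publisher, support, confidence).

    support    = share of all records containing antecedent AND publisher
    confidence = share of records containing the antecedent that have the publisher
    """
    n = len(records)
    antecedent_counts = Counter()  # how often each signal combination occurs
    rule_counts = Counter()        # how often it co-occurs with a given publisher

    for signals, publisher in records:
        for size in range(1, max_len + 1):
            for combo in combinations(sorted(signals), size):
                antecedent_counts[combo] += 1
                rule_counts[(combo, publisher)] += 1

    rules = []
    for (combo, publisher), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, publisher, support, confidence))

    # Strongest rules first; ties are broken deterministically by antecedent.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


if __name__ == "__main__":
    for antecedent, publisher, support, confidence in mine_publisher_rules(RECORDS):
        print(f"{' + '.join(antecedent)} => {publisher} "
              f"(support={support:.2f}, confidence={confidence:.2f})")
```

This sketch enumerates every signal combination up to max_len, which is only practical for small inputs. Algorithms such as Apriori or FP-Growth prune that search space and would be the usual choice at catalog scale.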