Penn Libraries

The Penn Libraries network includes 19 physical libraries, recognized for their collections, and a digital library known for innovation and richness of content. Through exhibitions and lectures, and through the acquisition and preservation of literary and artistic artifacts, the Penn Libraries documents a wealth of social and historical periods, bringing scholarship to life at the University and in the various communities it serves.



Search results

  • Publication
    A Comparative Evaluation of the Share-VDE Search System
    (Taylor and Francis Group, 2024-07-03) Hahn, Jim; Ahnberg, Kayt; Giusti Serra, Liliana
    The Share-VDE search system shifts the library discovery paradigm from record-based indexing and retrieval to that of linked data entity exploration. This paper reports results of iterative testing of multiple versions of the Share-VDE interface. The testing included remote user experience (UX) interviews with a total of twenty participants across four rounds of tests spanning two years. The comparison among participants encompassed catalogers, students of all levels, and faculty. Synthesizing IFLA LRM user tasks with interface evaluation methods supported the qualitative inquiry into how linked data systems in general, and BIBFRAME specifically, can support search system objectives.
  • Publication
    Sociotechnical Automation Science: A Case Study in Developing and Augmenting an Ensemble Neural Network with Multiple LLMs for Subject Cataloging at the Penn Libraries
    (2024-06-26) Hahn, Jim
    The sociotechnical aspects of automation play a crucial role in the development of machine learning systems. Through deep collaboration with cataloging professionals at the Penn Libraries, we have created a set of subject indexing algorithms that are ensembled into a neural network. Librarians have evaluated multiple rounds of the algorithm outputs. By identifying the failure points in the neural network-based subject assignment process, we incorporated LLM tasks such as evaluating search result relevance, summarizing search results, and assessing topical assignments of synthetic summaries. Implementing LLM tasks draws on the linguistic strengths of LLMs, rather than world knowledge. The data processing is integrated into an Apache Airflow pipeline, allowing librarians to input an Excel file, which begins the workflow for generating candidate subject descriptions. These machine learning outputs are poised for a pilot test in production systems this summer.
  • Publication
    Penn Library's LJS 494 - [Marʼeh ha-ofanim] ... [etc.]. (Video Orientation)
    Porter, Dot
    Video Orientation to the University of Pennsylvania Library's LJS 494, a Hebrew translation of a fundamental treatise on medieval astronomy and cosmology that describes and illustrates the Ptolemaic model of a spherical earth divided into climatic zones at the center of the concentric spheres of the universe. Followed by Ruaḥ ha-ḥen, a 13th-century philosophical work that was a popular introduction to science, here attributed to Yehudah ibn Tibon. It has also been attributed to Jacob ben Abba Mari ben Samson Anatoli and Zeraḥyah ha-Yeṿani. Occasional marginal notes. Final page contains Hebrew notes and pen trials by various hands (f. 22v). Written in northern Italy in the second quarter of the 15th century (based on watermark information). Record on Franklin, with link to a digital copy: Record on Internet Archive, with a link to PDF:
  • Publication
    Penn Library's LJS 495 - [Kharīdat al-ʻajāʼib wa farīḍat ...] (Video Orientation)
    Porter, Dot
    Video Orientation to the University of Pennsylvania Library's LJS 495, a cosmography containing a compendium of place names, seas, and mountains; information on flora and fauna; and a brief explanation of the game of chess. This text has also been attributed to the 14th-century author Zayn al-Dīn ʻUmar ibn al-Muẓaffar ibn al-Wardī. The item is undated, though it was possibly produced in the mid-to-late 15th or early 16th century. The colophon has been partially pasted over.
  • Publication
    FAIR Assessment Checklist for Data Repositories
    (University of Pennsylvania, 2024-01-25) Phegley, Lauren
    This assessment checklist is intended to support data repository managers who want to evaluate their repositories' FAIR-enabling practices. The FAIR checklist is provided as a guide to evaluating current implementation and future actions to make a repository FAIR-enabling. The intention of this checklist is to allow for honest evaluation of concrete ways to be FAIR-enabling, rather than admonishment for lack of adoption.
  • Publication
    Data Dictionary Blank Template
    (2023-10) Phegley, Lauren
    This is a blank data dictionary template intended to assist researchers with documenting the variables, structure, content, and layout of their datasets. A good data dictionary has enough information about each variable for it to be self-explanatory and interpreted properly by someone outside of the original research group. There are two different file types available for the data dictionary: an Excel file (.xlsx) and a .csv file. The Excel file has both the template and the field descriptions on different sheets, while the .csv template and field descriptions are separated into two CSV files, because CSV files do not allow for multiple sheets in one file. The template section provides you with commonly required columns that are necessary to fully define your data. The field descriptions section is where you define the column headers and possible values that can be entered. There is an example in the first row that can be deleted so that you can enter your own data. This template is built off of the Ag Data Commons "Data Dictionary - Blank Template" from the United States Department of Agriculture [no longer accessible online as of 2023-12-18].
  • Publication
    Audiovisual Data Curation Primer Presentation
    (2023-12-14) Phegley, Lauren
    This presentation was given as part of the Data Curation Network's Primer Webinar held on 2023-12-14. The authors presented the highlights of our Audiovisual Data Curation Primer, which is a concise, peer-reviewed resource designed to provide support for data curators in learning about audiovisual files. The full primer is openly available.
  • Publication
    Index to James G. Spady's contributions to the Philadelphia New Observer, 1985-2006
    (University of Pennsylvania Libraries, 2023) Xie, Zhangyang
    An index to pieces written by the writer, activist, journalist, and historian James G. Spady (1944-2020) in the Philadelphia New Observer newspaper which are present as clippings in the Spady Papers at the University of Pennsylvania (UPenn Ms. Coll. 1509). This index was compiled during 2023 and contains more than 1,700 articles written by Spady indexed under several key terms, many of which deal with Philadelphia arts, culture, and African American history. While not an exhaustive list of everything Spady wrote for the Philadelphia New Observer, the index is based on clippings made and selected by Spady and thus offers a particularly valuable insight into Spady's professional work, academic interests, community engagement, and philosophy of life.
  • Publication
    The Diary of Ray Evans, 1939-1945
    (University of Pennsylvania Libraries, 2015) Evans, Ray, 1915-2007
    The papers of Hollywood lyricist and University of Pennsylvania alumnus Ray Evans (Wharton class of 1936) were donated to the Kislak Center for Special Collections, Rare Books and Manuscripts in 2011 by the Ray and Wyn Ritchie Evans Foundation. Among the treasures in the collection is the songster’s diary, kept during the war years of 1939-1945. Written in pen on looseleaf notebook paper and bound in a small three-ring binder, it may have been one of several diaries kept by Evans, but today it is the only one that survives. As a means to encourage exploration of these formative years, Evans’ lyrics and music in general, and his collection of personal and professional papers here at Penn, we have transcribed and digitized the diary, presenting it as an online flip book. Images of the diary pages and a transcription for easier reading are presented side by side. The transcription here was created by John F. Anderies and first published to the web in 2015. The fully digitized diary is also available online. The contents of this diary are copyright The Ray & Wyn Ritchie Evans Foundation.
  • Publication
    BIBFRAME instance mining: Toward authoritative publisher entities using association rules
    (2020-11-25) Hahn, Jim
    With the transition of a shared catalog to BIBFRAME linked data, there is now a pressing need for identifying the canonical Instance for clustering in BIBFRAME. A fundamental component of Instance identification is the use of authoritative publisher entities. Previous work in this area by OCLC research (Connaway & Dickey, 2011) proposed a data mining approach for developing an experimental Publisher Name Authority File (PNAF). The OCLC research was able to create profiles for "high-incidence" publishers after data mining and clustering of publishers. As a component of PNAF, Connaway & Dickey were able to provide detailed subject analysis of publishers. This presentation will detail a case study of machine learning methods over a corpus of subjects, main entries, and added entries, as antecedents into association rules to derive consequent publisher entities. The departure point for the present research into identification of authoritative publisher entities is to focus on clustering, reconciliation and re-use of ISBN and subfield b of MARC 260 along with the subjects (650 - Subject Added Entry), main entries (1XX - Main Entries) and added entries (710 - Added Entry-Corporate Name) as signals to inform a training corpus into association rule mining, among other machine learning algorithms, libraries, and methods.