Scholarship at Penn Libraries

The scholarship published in this collection was authored by University of Pennsylvania Libraries staff.



  • Publication
    Defensive Design: Developing a System-Agnostic Repository for Sustainable Long-Term Preservation
    (2018-06-04) Lynch, Kate; Morton-Owens, Emily
    Colenda, the University of Pennsylvania Libraries’ digital repository, was designed to promote long-term preservation. Its infrastructure comprises components selected to concentrate on factors that are of the most importance and that pose the greatest risks for long-term preservation of digital assets: safe file storage, the ability to track changes to objects over time, mechanisms for object management and discoverability, and migration paths that guarantee that objects can be safely migrated to new software and new versions of existing systems while preventing data loss. Favoring a pluggable architecture and preservation of software-agnostic representations of objects in order to keep future repository development plans flexible and open, our approach minimizes the risk of data loss in the long term and has allowed us to design a system in which the right tools for the task are always an option. In this paper, we will enumerate the risks and concerns influencing our design decisions and show how our approach addresses them while retaining a connection to the central open-source projects of the community, Fedora and Samvera, that make up significant portions of our stack.
  • Publication
    SVDE model interoperability: SVDE and the BIBFRAME interchange structure
    (2022-11-08) Hahn, Jim
    Provides an overview of a possible interchange structure for BIBFRAME based on RDF/XML from the Library of Congress. The presentation details selected normalization steps of an SVDE instance into the RDF/XML Library of Congress structure. The presentation concludes with an example of loading SVDE normalized data into the Alma Sandbox at Penn by way of a locally hosted linked data editor, Marva.
  • Publication
    Constructing the Magazine of Early American Datasets (MEAD): An Invitation to Share and Use Data About Early America
    (2016-01-01) Smith, Billy G; Okrent, Nicholas E; Schocket, Andrew M; Wipperman, Sarah L
  • Publication
    Common Knowledge: Epistemology and the Beginnings of Copyright Law
    (2016-03-01) Enderle, Jonathan Scott
    Literary critics’ engagement with copyright law has often emphasized ontological questions about the relation between idealized texts and their material embodiments. This essay turns toward a different set of questions—about the role of texts in the communication of knowledge. Developing an alternative intellectual genealogy of copyright law grounded in the eighteenth-century contest between innatism and empiricism, I argue that jurists like William Blackstone and poets like Edward Young drew on Locke’s theories of ideas to articulate a new understanding of writing as uncommunicative expression. Innatists understood texts as tools that could enable transparent communication through a shared stock of innate ideas, but by denying the existence of innate ideas empiricists called the possibility of communication into question. And in their arguments for perpetual copyright protection, eighteenth-century jurists and pamphleteers pushed empiricism to its extreme, linking literary and economic value to the least communicative aspects of a text.
  • Publication
    Copyright and Provenance: Some Practical Problems
    (2007-12-01) Mark Ockerbloom, John
    Copyright clearance is an increasingly complex and expensive impediment to the digitization and reuse of information. Clearing copyright issues in a reliable and cost-effective manner for works created in the last 100 years can involve establishing complex provenance chains for the works, their copyrights, and their licenses. This paper gives an overview of some of the practical provenance-related issues and challenges in clearing copyrights at large scale, and discusses efforts to more efficiently gather and share information and its copyright provenance.
  • Publication
    Review of Charles Berlin, Harvard Judaica: A History and Description of the Judaica Collection in the Harvard College Library
    (2006-12-01) Kiron, Arthur
    The history of Harvard University’s sui generis Judaica Division and its collections are summarized with principled clarity and remarkable reserve in this new publication of the Harvard College Library. The author, Charles Berlin, the Lee M. Friedman Bibliographer in Judaica in the Harvard College Library and Head of its Judaica Division, is uniquely qualified to tell this story. Indeed, he is the author not only of this volume, but also of much of the recent history it recounts between its elegant, gold-embossed, yet understated hardbound covers. Given the extraordinary scope of Berlin’s achievements, this reviewer must pause to acknowledge his 42 years of contributions, which are recorded here
  • Publication
    Review of Cleaver, Laura, Illuminated History Books in the Anglo-Norman World, 1066–1272
    (2022-09-01) Mulhern, Edith
    The book reviewed argues that the meaning and context of Anglo-Norman histories cannot be fully understood without attention to the visual layout of the entire page. By comparing extant copies of the same texts, the book considers what factors influence the mise-en-page, and how these manuscripts circulated and were amended through time.
  • Publication
    BIBFRAME instance mining: Toward authoritative publisher entities using association rules
    (2020-11-25) Hahn, Jim
    With the transition of a shared catalog to BIBFRAME linked data, there is now a pressing need for identifying the canonical Instance for clustering in BIBFRAME. Authoritative publisher entities are a fundamental component of Instance identification. Previous work in this area by OCLC Research (Connaway & Dickey, 2011) proposed a data mining approach for developing an experimental Publisher Name Authority File (PNAF). The OCLC research was able to create profiles for "high-incidence" publishers after data mining and clustering of publishers. As a component of PNAF, Connaway & Dickey were able to provide detailed subject analysis of publishers. This presentation will detail a case study of machine learning methods over a corpus of subjects, main entries, and added entries, used as antecedents in association rules to derive consequent publisher entities. The departure point for the present research into identification of authoritative publisher entities is to focus on clustering, reconciliation and re-use of ISBN and subfield b of MARC 260 along with the subjects (650 - Subject Added Entry), main entries (1XX - Main Entries) and added entries (710 - Added Entry-Corporate Name) as signals to inform a training corpus for association rule mining, among other machine learning algorithms, libraries, and methods.
  • Publication
    Uncomplicating the business of repositories
    (2019-06-11) Lynch, Kate; Morton-Owens, Emily
    In this presentation, we discuss how our library runs our repository in production to meet the needs of our “business” as efficiently as possible. We have an interest in limiting the number of digital platforms we manage, for the purposes of sustainability and efficiency, but we must also consider how well a general platform can meet specific user needs. A governance group of administrators, in conference with stakeholders and developers, seeks to find the best way to accommodate each collection or functional need, with an eye to minimizing technical complexity, offering stakeholders self-serve options when possible, and maintaining a single canonical copy of each object. We will present some case studies of how material has been handled in our developing digital ecosystem, where preservation and access sometimes present conflicting priorities. We are exploring how our repository can best evolve to support our aims of making data and documents freely available.