Using eBPF Hooks to Profile Linux File System Activity Across Benchmarking Workloads

Penn collection
Interdisciplinary Centers, Units and Projects::Center for Undergraduate Research and Fellowships (CURF)::Fall Research Expo
Discipline
Computer Sciences
Subject
Operating Systems
Kernel Benchmarking
License
author or copyright holder retaining all copyrights in the submitted work
Copyright date
2025-09-10
Author
Goyal, Dhruv
Angel, Sebastian
Contributor
Duarte, Phillip
Tian, Tony
Tewari, Aditya
Abstract

Modern operating systems must balance performance and adaptability when managing diverse application workloads, yet their file system behavior relies on fixed policies that may not make optimal choices for dynamic workloads. This project investigates Linux file system activity by inserting eBPF probes into vfs_read and vfs_write, two central functions that dispatch user-level I/O requests. Using KernMLOps, a standardized benchmarking and instrumentation framework, we extended support to attach probes not only at function entry but also within specific branches, enabling precise distinction between legacy .read/.write paths and newer .read_iter/.write_iter paths. Experiments were conducted on CloudLab nodes with Linux kernel 6.6.42 across representative workloads, including Redis and Fio. Collected syscall metadata—PID/TID, buffer size, return values, and path usage—was analyzed with polars and matplotlib. Results revealed distinct workload “signatures”: Redis exhibited mixed path usage with noisy buffer distributions, while Fio showed structured, iterator-dominant access patterns. These differences highlight how workloads leave identifiable syscall traces, providing ML-ready features for adaptive OS policies. By characterizing I/O behavior in this way, we lay the foundation for data-driven optimizations in caching, batching, and scheduling, advancing the broader LDOS goal of using machine learning to optimize operating systems.
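
As a rough illustration of how per-call metadata could be turned into the workload "signatures" described above, the sketch below aggregates a handful of made-up records into per-workload features. It is not the project's pipeline: the record layout, the field names (`workload`, `func`, `path`, `size`, `ret`), the workload labels, and every value are assumptions chosen for the example, and it uses only the Python standard library rather than polars.

```python
from collections import Counter, defaultdict
from statistics import median

# Hypothetical records, one per probed vfs_read/vfs_write call.
# Field names and values are illustrative assumptions, not project data.
RECORDS = [
    {"workload": "workload_a", "func": "vfs_read",  "path": "read",       "size": 16384, "ret": 52},
    {"workload": "workload_a", "func": "vfs_write", "path": "write_iter", "size": 5,     "ret": 5},
    {"workload": "workload_a", "func": "vfs_read",  "path": "read_iter",  "size": 16384, "ret": 311},
    {"workload": "workload_b", "func": "vfs_read",  "path": "read_iter",  "size": 4096,  "ret": 4096},
    {"workload": "workload_b", "func": "vfs_write", "path": "write_iter", "size": 4096,  "ret": 4096},
    {"workload": "workload_b", "func": "vfs_read",  "path": "read_iter",  "size": 4096,  "ret": 4096},
]

# Branches that dispatch through the newer iterator-based file operations.
ITER_PATHS = {"read_iter", "write_iter"}


def summarize(records):
    """Return a dict mapping each workload to a few simple signature features."""
    by_workload = defaultdict(list)
    for rec in records:
        by_workload[rec["workload"]].append(rec)

    features = {}
    for workload, recs in by_workload.items():
        path_counts = Counter(r["path"] for r in recs)
        iter_calls = sum(n for p, n in path_counts.items() if p in ITER_PATHS)
        sizes = [r["size"] for r in recs]
        features[workload] = {
            "calls": len(recs),
            # Share of calls that took a .read_iter/.write_iter branch.
            "iter_fraction": iter_calls / len(recs),
            "median_size": median(sizes),
            # Crude proxy for how spread out the buffer sizes are.
            "distinct_sizes": len(set(sizes)),
        }
    return features


if __name__ == "__main__":
    for name, feats in sorted(summarize(RECORDS).items()):
        print(name, feats)
```

Features of this kind (iterator-path share, typical buffer size, spread of buffer sizes) are one plausible form the "ML-ready features" could take; the actual feature set used in the project is not stated in this record.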

Date of presentation
2025-09-15
Comments
This project was supported by the Penn Undergraduate Research Mentoring (PURM) program.