Martin, Milo

Search Results

  • Publication
    Improved Sequence-Based Speculation Techniques for Implementing Memory Consistency
    (2008-05-27) Blundell, Colin; Martin, Milo M.K.; Wenisch, Tom
    This work presents BMW, a new design for speculative implementations of memory consistency models in shared-memory multiprocessors. BMW obtains the same performance as prior proposals, but achieves this performance while avoiding several undesirable attributes of prior proposals: non-scalable structures, per-word valid bits in the data cache, modifications to the cache coherence protocol, and global arbitration. BMW uses a read and write bit per cache block and a standard invalidation-based cache coherence protocol to perform conflict detection while speculating. While speculating, stores to blocks not in the cache are placed into a coalescing store buffer until those misses return. Stores are written speculatively to the primary cache, and non-speculative state is maintained by cleaning dirty blocks before they are written speculatively. Speculative blocks are invalidated on abort and marked as non-speculative on commit. This organization allows for fast, local commits while avoiding a non-scalable store queue.
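A minimal software sketch of the bookkeeping described in this abstract, assuming a toy cache model: one read bit and one write bit per block, cleaning of dirty blocks before the first speculative write, invalidation of speculatively written blocks on abort, and bit clearing on commit. The class and method names (`SpeculativeCache`, `external_invalidate`, and so on) are hypothetical, misses are filled instantly (so the coalescing store buffer is omitted), and an invalidated dirty block is simply written back to memory. It illustrates the idea only and is not the authors' hardware design.

```python
class Block:
    def __init__(self, data):
        self.data = data
        self.dirty = False         # differs from the next level of memory
        self.spec_read = False     # read while speculating
        self.spec_written = False  # written while speculating


class SpeculativeCache:
    """Toy model; `memory` is a dict standing in for the next cache level."""

    def __init__(self, memory):
        self.memory = memory
        self.blocks = {}
        self.speculating = False

    def _fetch(self, addr):
        # Assumption: misses complete instantly (no store buffer modeled).
        if addr not in self.blocks:
            self.blocks[addr] = Block(self.memory[addr])
        return self.blocks[addr]

    def begin(self):
        self.speculating = True

    def load(self, addr):
        blk = self._fetch(addr)
        if self.speculating:
            blk.spec_read = True
        return blk.data

    def store(self, addr, value):
        blk = self._fetch(addr)
        if self.speculating:
            if blk.dirty and not blk.spec_written:
                # Clean the block so non-speculative state survives an abort.
                self.memory[addr] = blk.data
                blk.dirty = False
            blk.spec_written = True
        else:
            blk.dirty = True
        blk.data = value

    def commit(self):
        # Local commit: speculative blocks simply become non-speculative.
        for blk in self.blocks.values():
            if blk.spec_written:
                blk.dirty = True
            blk.spec_read = blk.spec_written = False
        self.speculating = False

    def abort(self):
        # Discard speculatively written blocks; clear remaining read bits.
        for addr in [a for a, b in self.blocks.items() if b.spec_written]:
            del self.blocks[addr]
        for blk in self.blocks.values():
            blk.spec_read = False
        self.speculating = False

    def external_invalidate(self, addr):
        """Coherence invalidation; returns True if it caused an abort."""
        blk = self.blocks.get(addr)
        if blk is None:
            return False
        conflict = blk.spec_read or blk.spec_written
        if conflict:
            self.abort()
        if addr in self.blocks:
            evicted = self.blocks.pop(addr)
            if evicted.dirty:
                self.memory[addr] = evicted.data
        return conflict
```

In this sketch, a conflicting invalidation during speculation rolls the block back to the value that was cleaned to memory before the first speculative write.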
  • Publication
    Litmus Tests for Comparing Memory Consistency Models: How Long Do They Need to Be?
    (2011-01-01) Alur, Rajeev; Mador-Haim, Sela; Martin, Milo
    Even though the general problem of comparing two memory models is infeasible, in this paper we show that checking the equivalence of two memory models becomes feasible when we consider a more restricted class of memory models. We define a class of memory models that is expressive enough to include most known hardware memory models, and we establish a bound of two threads and no more than six memory access instructions for contrasting litmus tests in this class of models. Thus, we can compare memory models in this class by checking a small number of litmus tests. We build a tool for comparing memory models based on this theorem and use the tool to explore and map the space of this class of models.
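To illustrate what a contrasting litmus test looks like, the sketch below enumerates the outcomes of the classic store-buffering test (two threads, four memory accesses) under sequential consistency and under a deliberately simplified relaxed model in which a store may be reordered after a later load to a different address. The relaxed model, the operation encoding, and all function names are assumptions made for this example; they are not the class of models or the tool described in the paper.

```python
# Each operation is (kind, address, argument):
#   ("st", addr, value) stores value; ("ld", addr, reg) loads into reg.
T0 = [("st", "x", 1), ("ld", "y", "r0")]
T1 = [("st", "y", 1), ("ld", "x", "r1")]


def interleavings(a, b):
    """Yield every interleaving of a and b that preserves each one's order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(order):
    """Execute one total order against a single shared memory."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for kind, addr, arg in order:
        if kind == "st":
            mem[addr] = arg
        else:
            regs[arg] = mem[addr]
    return (regs["r0"], regs["r1"])


def reorderings(thread):
    """Program orders allowed for a two-operation thread when a store may
    be delayed past a later load to a different address (assumption)."""
    variants = [list(thread)]
    first, second = thread
    if first[0] == "st" and second[0] == "ld" and first[1] != second[1]:
        variants.append([second, first])
    return variants


def sc_outcomes():
    return {run(order) for order in interleavings(T0, T1)}


def relaxed_outcomes():
    return {
        run(order)
        for v0 in reorderings(T0)
        for v1 in reorderings(T1)
        for order in interleavings(v0, v1)
    }


if __name__ == "__main__":
    # The outcome r0 = 0, r1 = 0 separates the two models.
    print(sorted(relaxed_outcomes() - sc_outcomes()))
```

The single outcome that the relaxed model allows and sequential consistency forbids is what makes this test "contrasting" for the two models.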
  • Publication
    Protocol Design With Concolic Snippets
    (2012-01-01) Alur, Rajeev; Deshmukh, Jyotirmoy; Martin, Milo; Mador-Haim, Sela; Raghavan, Arun; Udupa, Abhishek
    With the maturing of computer-aided verification technology, there is an emerging opportunity to develop design tools that can transform the way systems are designed. In this paper, we propose a new way to specify protocols using concolic snippets, that is, sample execution fragments that contain both concrete and symbolic values. While the purely symbolic extreme is simply an alternative representation of the traditional communicating extended finite-state machines, and the purely concrete extreme is an instantiation of the "programming by examples" paradigm, our specification language allows the designer to specify the desired protocol using a mixture of symbolic state machines and concrete scenarios. Our synthesis engine generalizes the snippets into a transition function, which is then analyzed using a model checker with respect to high-level temporal-logic correctness requirements. We describe a prototype implementation for design of cache coherence protocols built using (1) a straightforward enumeration of all expressions for transition functions, (2) a check for consistency with respect to concolic snippets using the SMT solver CVC3, and (3) a check for correctness using the model checker Murφ. We discuss our experience in designing classical cache coherence protocols using the proposed methodology.
  • Publication
    Unrestricted Transactional Memory: Supporting I/O and System Calls Within Transactions
    (2006-05-01) Lewis, E. Christopher; Blundell, Colin; Martin, Milo
    Hardware transactional memory has great potential to simplify the creation of correct and efficient multithreaded programs, enabling programmers to exploit the soon-to-be-ubiquitous multi-core designs. Transactions are simply segments of code that are guaranteed to execute without interference from other concurrently-executing threads. The hardware executes transactions in parallel, ensuring non-interference via abort/rollback/restart when conflicts are detected. Transactions thus provide both a simple programming interface and a highly-concurrent implementation that serializes only on data conflicts. A progression of recent work has broadened the utility of transactional memory by lifting the bound on the size and duration of transactions, called unbounded transactions. Nevertheless, two key challenges remain: (i) I/O and system calls cannot appear in transactions and (ii) existing unbounded transactional memory proposals require complex implementations. We describe a system for fully unrestricted transactions (i.e., they can contain I/O and system calls in addition to being unbounded in size and duration). We achieve this via two modes of transaction execution: restricted (which limits transaction size, duration, and content but is highly concurrent) and unrestricted (which is unbounded and can contain I/O and system calls but has limited concurrency because there can be only one unrestricted transaction executing at a time). Transactions transition to unrestricted mode only when necessary. We introduce unoptimized and optimized implementations in order to balance performance and design complexity.
  • Publication
    SoftBound: Highly Compatible and Complete Spatial Memory Safety for C
    (2009-01-01) Nagarakatte, Santosh; Martin, Milo; Zhao, Jianzhou; Zdancewic, Stephan A
    The serious bugs and security vulnerabilities facilitated by C/C++’s lack of bounds checking are well known. Yet, C and C++ remain in widespread use. Unfortunately, C’s arbitrary pointer arithmetic, conflation of pointers and arrays, and programmer-visible memory layout make retrofitting C/C++ with spatial safety guarantees extremely challenging. Existing approaches suffer from incompleteness, have high runtime overhead, or require non-trivial changes to the C source code. Thus far, these deficiencies have prevented widespread adoption of such techniques. This paper proposes SoftBound, a compile-time transformation for enforcing complete spatial safety of C. SoftBound records base and bound information for every pointer as disjoint metadata. This decoupling enables SoftBound to provide complete spatial safety while requiring no changes to C source code. Moreover, SoftBound performs metadata manipulation only when loading or storing pointer values. A formal proof shows this is sufficient to provide complete spatial safety even in the presence of wild casts. SoftBound’s full checking mode provides complete spatial violation detection. To further reduce overheads, SoftBound has a store-only checking mode that successfully detects all the security vulnerabilities in a test suite while adding 15% or less overhead to half of the benchmarks.
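The idea of disjoint base-and-bound metadata can be sketched with a toy memory model. In the example below, pointers are plain integer addresses, bounds travel alongside them as `(base, bound)` pairs, and the side table `meta` is consulted only when a pointer value is itself stored to or loaded from memory. The `Heap` class, its method names, and the bump allocator are hypothetical and written in Python for illustration; the real system is a compile-time transformation of C programs, which this sketch does not reproduce.

```python
class SpatialViolation(Exception):
    """Raised when an access falls outside the pointer's [base, bound)."""


class Heap:
    def __init__(self, size):
        self.mem = [0] * size
        self.next_free = 0
        # Disjoint metadata: address of a pointer-holding slot -> (base, bound).
        # It lives apart from `mem`, so the data layout is unchanged.
        self.meta = {}

    def malloc(self, n):
        """Bump allocator; returns the pointer and its (base, bound) pair."""
        base = self.next_free
        self.next_free += n
        return base, (base, base + n)

    def check(self, ptr, bounds):
        base, bound = bounds
        if ptr < base or ptr >= bound:
            raise SpatialViolation(f"address {ptr} outside [{base}, {bound})")

    def load(self, ptr, bounds):
        self.check(ptr, bounds)
        return self.mem[ptr]

    def store(self, ptr, bounds, value):
        self.check(ptr, bounds)
        self.mem[ptr] = value

    def store_ptr(self, slot, slot_bounds, ptr, ptr_bounds):
        """Store a pointer value; its metadata goes to the side table."""
        self.check(slot, slot_bounds)
        self.mem[slot] = ptr
        self.meta[slot] = ptr_bounds

    def load_ptr(self, slot, slot_bounds):
        """Load a pointer value together with its recorded metadata.
        A slot with no recorded metadata yields empty bounds, so any
        dereference through the result fails the check."""
        self.check(slot, slot_bounds)
        return self.mem[slot], self.meta.get(slot, (0, 0))


if __name__ == "__main__":
    heap = Heap(64)
    buf, buf_bounds = heap.malloc(4)
    heap.store(buf + 3, buf_bounds, 7)      # last valid element
    try:
        heap.store(buf + 4, buf_bounds, 9)  # one past the end
    except SpatialViolation as err:
        print("caught:", err)
```

Pointer arithmetic (`buf + 3`) changes only the address; the bounds are inherited unchanged, which is why the out-of-bounds store is caught at the access rather than at the arithmetic.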
  • Publication
    Core Ironclad
    (2013-01-01) Osera, Peter-Michael; Eisenberg, Richard A.; DeLozier, Christian; Nagarakatte, Santosh; Martin, Milo; Zdancewic, Stephan A
    Core Ironclad is a core calculus that models the salient features of Ironclad C++, a library-augmented type-safe subset of C++. We give an overview of the language including its definition and key design points. We then prove type safety for the language and use that result to show that the pointer lifetime invariant, a key property of Ironclad C++, holds within the system.
  • Publication
    RETCON: Transactional Repair Without Replay
    (2009-11-25) Blundell, Colin; Martin, Milo; Raghavan, Arun
    Over the past decade, there has been a surge of academic and industrial interest in optimistic concurrency, i.e., the speculative parallel execution of code regions (transactions or critical sections) with the semantics of isolation from one another. This work analyzes bottlenecks to the scalability of workloads that use optimistic concurrency. We find that one common source of performance degradation is updates to auxiliary program data in otherwise non-conflicting operations, e.g., reference count updates on shared object reads and hashtable size field increments on inserts of different elements. To eliminate the performance impact of conflicts on such auxiliary data, this work proposes RETCON, a hardware mechanism that tracks the relationship between input and output values symbolically and uses this symbolic information to transparently repair the output state of a transaction at commit. RETCON is inspired by instruction replay-based mechanisms but exploits simplifying properties of the nature of computations on auxiliary data to perform repair without replay. Our experiments show that RETCON provides significant speedups for workloads that exhibit conflicts on auxiliary data, including transforming a transactionalized version of the reference Python interpreter from a workload that exhibits no scaling to one that exhibits near-linear scaling on 32 cores.
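A loose software analogy of repair at commit, assuming the only tracked computation is addition on a shared counter: each transaction records its output symbolically as "input plus delta" and records the outcome of any branch on that value. At commit, the output is recomputed from the current input, and the transaction aborts only if a recorded branch would now go the other way. The `Transaction` class and its methods are hypothetical names for this sketch; RETCON itself is a hardware mechanism and is not reproduced here.

```python
class Transaction:
    def __init__(self, shared):
        self.shared = shared   # dict of shared auxiliary values
        self.snapshot = {}     # input value seen at first access
        self.delta = {}        # symbolic output: input + delta
        self.branches = []     # (key, predicate, delta_at_branch, outcome)

    def _track(self, key):
        if key not in self.snapshot:
            self.snapshot[key] = self.shared[key]
            self.delta[key] = 0

    def add(self, key, n):
        """Record `key += n` symbolically instead of writing shared state."""
        self._track(key)
        self.delta[key] += n

    def branch(self, key, pred):
        """Evaluate a branch condition on the tracked value and remember
        which way it went, so commit can confirm it is still valid."""
        self._track(key)
        d = self.delta[key]
        outcome = pred(self.snapshot[key] + d)
        self.branches.append((key, pred, d, outcome))
        return outcome

    def commit(self):
        """Repair outputs from current inputs; abort (return False) only
        if some recorded branch would now take the other direction."""
        for key, pred, d, outcome in self.branches:
            if pred(self.shared[key] + d) != outcome:
                return False
        for key, d in self.delta.items():
            self.shared[key] += d
        return True


if __name__ == "__main__":
    shared = {"refcount": 1}
    t1, t2 = Transaction(shared), Transaction(shared)
    for t in (t1, t2):
        if t.branch("refcount", lambda v: v > 0):
            t.add("refcount", 1)
    # Both transactions read refcount == 1, yet both commit without replay.
    print(t1.commit(), t2.commit(), shared["refcount"])
```

Both transactions observed the same stale input, yet both commit because their increments are repaired against the current value and the branch condition still holds.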
  • Publication
    Ironclad C++: A Library-Augmented Type-Safe Subset of C++
    (2013-03-28) DeLozier, Christian; Eisenberg, Richard A.; Nagarakatte, Santosh; Osera, Peter-Michael; Martin, Milo; Zdancewic, Stephan A
    C++ remains a widely used programming language, despite retaining many unsafe features from C. These unsafe features often lead to violations of type and memory safety, which manifest as buffer overflows, use-after-free vulnerabilities, or abstraction violations. Malicious attackers are able to exploit such violations to compromise application and system security. This paper introduces Ironclad C++, an approach to bring the benefits of type and memory safety to C++. Ironclad C++ is, in essence, a library-augmented type-safe subset of C++. All Ironclad C++ programs are valid C++ programs, and thus Ironclad C++ programs can be compiled using standard, off-the-shelf C++ compilers. However, not all valid C++ programs are valid Ironclad C++ programs. To determine whether or not a C++ program is a valid Ironclad C++ program, Ironclad C++ uses a syntactic source code validator that statically prevents the use of unsafe C++ features. For properties that are difficult to check statically, Ironclad C++ applies dynamic checking to enforce memory safety using templated smart pointer classes. Drawing from years of research on enforcing memory safety, Ironclad C++ utilizes and improves upon prior techniques to significantly reduce the overhead of enforcing memory safety in C++. To demonstrate the effectiveness of this approach, we translate (with the assistance of a semi-automatic refactoring tool) and test a set of performance benchmarks, multiple bug-detection suites, and the open-source database leveldb. These benchmarks incur a performance overhead of 12% on average as compared to the unsafe original C++ code, which is small compared to prior approaches for providing comprehensive memory safety in C and C++.
  • Publication
    Token Coherence: A New Framework for Shared-Memory Multiprocessors
    (2003-11-01) Martin, Milo; Hill, Mark D; Wood, David A
    Commercial workload and technology trends are pushing existing shared-memory multiprocessor coherence protocols in divergent directions. Token Coherence provides a framework for new coherence protocols that can reconcile these opposing trends.
  • Publication
    Adding Token Counting to Directory-Based Cache Coherence
    (2008-06-04) Raghavan, Arun; Blundell, Colin; Martin, Milo M.K.
    The coherence protocol is a first-order design concern in multicore designs. Directory protocols are naturally scalable, as they place no restrictions on the interconnect and have minimal bandwidth requirements; however, this scalability comes at the cost of increased sharing latency due to indirection. In contrast, broadcast-based systems such as snooping protocols and token coherence reduce latency of sharing misses by sending requests directly to other processors. Unfortunately, their reliance on totally ordered interconnects and/or broadcast limits their scalability. This work introduces PATCH (Predictive/Adaptive Token Counting Hybrid), a coherence protocol that provides the scalability of directory protocols while opportunistically using available bandwidth to reduce sharing latency. PATCH extends a standard directory protocol to track tokens and use token counting rules for enforcing coherence permissions. Token counting allows PATCH to support direct requests on an unordered interconnect, while a novel mechanism called token tenure uses local processor timeouts and the directory’s per-block point of ordering at the home node to guarantee forward progress without relying on broadcast. PATCH makes three main contributions. First, PATCH uses direct request prioritization to match the performance of broadcast-based protocols without restricting scalability. Second, PATCH introduces token tenure, which provides broadcast-free forward progress for token counting protocols. Finally, PATCH provides greater scalability than directory protocols when using inexact encodings of sharers because only processors holding tokens need to acknowledge requests. Overall, PATCH is a “one-size-fits-all” coherence protocol that dynamically adapts to work well for small systems, large systems, and anywhere in between.
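The token counting rules mentioned in this abstract are not spelled out in this listing. The sketch below assumes the rules as commonly stated for token coherence: each block has a fixed number of tokens, a processor may read the block while holding at least one token, and may write it only while holding all of them. The `TokenBlock` class, the `"memory"` holder, and the method names are hypothetical; token tenure, the directory, and message passing are not modeled.

```python
class TokenBlock:
    """Token bookkeeping for a single cache block (toy model)."""

    def __init__(self, processors, total_tokens):
        self.total = total_tokens
        # Initially, memory holds every token for the block.
        self.tokens = {p: 0 for p in processors}
        self.tokens["memory"] = total_tokens

    def can_read(self, holder):
        # Assumed rule: reading requires at least one token.
        return self.tokens[holder] >= 1

    def can_write(self, holder):
        # Assumed rule: writing requires all tokens, so no other holder
        # can be reading the block at the same time.
        return self.tokens[holder] == self.total

    def transfer(self, src, dst, n):
        """Move n tokens; tokens are never created or destroyed."""
        if n < 0 or n > self.tokens[src]:
            raise ValueError("cannot transfer tokens that are not held")
        self.tokens[src] -= n
        self.tokens[dst] += n
        assert sum(self.tokens.values()) == self.total


if __name__ == "__main__":
    blk = TokenBlock(["P0", "P1"], total_tokens=2)
    blk.transfer("memory", "P0", 1)
    blk.transfer("memory", "P1", 1)
    print(blk.can_read("P0"), blk.can_write("P0"))  # two readers share
    blk.transfer("P1", "P0", 1)
    print(blk.can_write("P0"), blk.can_read("P1"))  # P0 is sole writer
```

Because tokens are conserved, a writer holding all of them is guaranteed that no reader holds any, regardless of the order in which transfer messages arrive.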