<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0">
<channel>
<title>Technical Reports (CIS)</title>
<copyright>Copyright (c) 2013 University of Pennsylvania All rights reserved.</copyright>
<link>http://repository.upenn.edu/cis_reports</link>
<description>Recent documents in Technical Reports (CIS)</description>
<language>en-us</language>
<lastBuildDate>Wed, 22 May 2013 14:20:17 PDT</lastBuildDate>
<ttl>3600</ttl>








<item>
<title>Ironclad C++:  A Library-Augmented Type-Safe Subset of C++</title>
<link>http://repository.upenn.edu/cis_reports/982</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/982</guid>
<pubDate>Thu, 09 May 2013 07:38:01 PDT</pubDate>
<description>
	<![CDATA[
	<p>C++ remains a widely used programming language, despite retaining many unsafe features from C. These unsafe features often lead to violations of type and memory safety, which manifest as buffer overflows, use-after-free vulnerabilities, or abstraction violations. Malicious attackers are able to exploit such violations to compromise application and system security. This paper introduces Ironclad C++, an approach to bring the benefits of type and memory safety to C++. Ironclad C++ is, in essence, a library-augmented type-safe subset of C++. All Ironclad C++ programs are valid C++ programs, and thus Ironclad C++ programs can be compiled using standard, off-the-shelf C++ compilers. However, not all valid C++ programs are valid Ironclad C++ programs. To determine whether or not a C++ program is a valid Ironclad C++ program, Ironclad C++ uses a syntactic source code validator that statically prevents the use of unsafe C++ features. For properties that are difficult to check statically, Ironclad C++ applies dynamic checking to enforce memory safety using templated smart pointer classes. Drawing from years of research on enforcing memory safety, Ironclad C++ utilizes and improves upon prior techniques to significantly reduce the overhead of enforcing memory safety in C++. To demonstrate the effectiveness of this approach, we translate (with the assistance of a semi-automatic refactoring tool) and test a set of performance benchmarks, multiple bug-detection suites, and the open-source database leveldb. These benchmarks incur a performance overhead of 12% on average as compared to the unsafe original C++ code, which is small compared to prior approaches for providing comprehensive memory safety in C and C++.</p>

	]]>
</description>

<author>Christian DeLozier et al.</author>


</item>






<item>
<title>Network Intrusion Detection and Mitigation against Denial of Service Attack</title>
<link>http://repository.upenn.edu/cis_reports/981</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/981</guid>
<pubDate>Mon, 06 May 2013 11:14:54 PDT</pubDate>
<description>
	<![CDATA[
	<p>The growing use of Internet services in the past few years has facilitated an increase in denial of service (DoS) attacks. Despite the best preventative measures, DoS attacks have been successfully carried out against high-profile organizations and enterprises, including those that took down Chase, BOA, PNC and other major US banks in September 2009, which reveals the vulnerability of even well-equipped networks. These widespread attacks have resulted in significant loss of service, money, and reputation for organizations, calling for a practical and efficient solution to DoS attack detection and mitigation.</p>
<p>DoS attack detection and mitigation strengthens the robustness and security of a network or computer system by monitoring system activities for suspicious behaviors or policy violations, providing forensic information about the attack, and taking defensive measures to reduce the impact on the system. In general, attacks can be detected by (1) matching observed network traffic with patterns of known attacks; (2) looking for deviation of traffic behavior from the established profile; and (3) training a classifier from a labeled dataset of attacks to classify incoming traffic. Once an attack is identified, the suspicious traffic can be blocked or rate-limited.</p>
<p>In this presentation, we present a taxonomy of DoS attack detection and mitigation techniques, followed by a description of four representative systems (Snort, PHAD, MADAM, and MULTOPS). We conclude with a discussion of their pros/cons as well as challenges for future work.</p>

	]]>
</description>

<author>Dong Lin</author>


</item>






<item>
<title>Is a Rigorous Agile Methodology the Best Development Strategy for Small Scale Tech Startups?</title>
<link>http://repository.upenn.edu/cis_reports/980</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/980</guid>
<pubDate>Wed, 13 Feb 2013 08:25:47 PST</pubDate>
<description>
	<![CDATA[
	<p>Recently, Agile development processes have become popular in the software development community, and have been shown to be effective in large organizations. However, given that the communication and cooperation dynamics in startup companies are very different from those of larger, more established companies, and the fact that the initial focus of a startup might be significantly different from its ultimate goal, it is questionable whether a rigid process model that works for larger companies is appropriate in tackling the problems faced by a startup. When we scale down even further and observe the small scale startup with only a few members, many of the same problems that Agile methodology sets out to solve do not even exist. Then, for a small scale startup, is it still worth putting the resources into establishing a process model? Do the benefits of adopting an Agile methodology outweigh the opportunity cost of spending the resources elsewhere? This paper examines the advantages and disadvantages of adopting an Agile methodology in a small scale tech startup and compares it to other process models, such as the Waterfall model and Lean Startup. In determining whether a rigorous Agile methodology is the best development strategy for small scale tech startups, we consider the metrics of cost, time, quality, and scope in light of the particular needs of small startup organizations, and present a case study of a company that has needed to answer this very question.</p>

	]]>
</description>

<author>Alex Yau et al.</author>


</item>






<item>
<title>Automatic Test Case Generation and Test Suite Reduction for Closed-Loop Controller Software</title>
<link>http://repository.upenn.edu/cis_reports/979</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/979</guid>
<pubDate>Wed, 13 Feb 2013 08:25:42 PST</pubDate>
<description>
	<![CDATA[
	<p>Domains such as embedded systems, medical devices, process automation, etc. make use of controller software to make important decisions that can affect people’s lives and well-being. Although safety-focused processes such as model-driven development can be used to assure a certain degree of quality in these applications, ultimately software testing still remains the primary mechanism by which faults are detected. However, a variety of challenges arises in identifying test cases for controller software, particularly in closed-loop systems that incorporate feedback from the entity being controlled, potentially leading to exponential growth in the number of paths through the code and difficulty in identifying sequences of inputs to put the application into the desired states for testing. In this paper, we present an approach to efficiently generating a set of test cases that will cover all reachable states in closed-loop controller software, describe how it is possible to reduce the number of test cases without losing any coverage of states, and present evidence that, compared to other approaches, the technique significantly reduces the number of test cases (down to less than 1% in our experiments) needed to achieve the same level of coverage, with almost no negative effects on the test suite’s fault-finding capabilities.</p>

	]]>
</description>

<author>Christian Murphy et al.</author>


</item>






<item>
<title>CleanURL: A Privacy Aware Link Shortener</title>
<link>http://repository.upenn.edu/cis_reports/978</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/978</guid>
<pubDate>Wed, 16 Jan 2013 11:58:06 PST</pubDate>
<description>
	<![CDATA[
	<p>When URLs containing application parameters are posted in public settings, privacy can be compromised if those arguments contain personal or tracking data. To this end, we describe a privacy-aware link shortening service that attempts to strip sensitive and non-essential parameters based on difference algorithms and human feedback. Our implementation, CleanURL, allows users to validate our automated logic and provides them with intuition about how these otherwise opaque arguments function. Finally, we apply CleanURL over a large Twitter URL corpus to measure the prevalence of such privacy leaks and further motivate our tool.</p>

	]]>
</description>

<author>Daniel Kim et al.</author>


</item>






<item>
<title>Egress Online: Towards leveraging massively, multiplayer environments for evacuation studies</title>
<link>http://repository.upenn.edu/cis_reports/977</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/977</guid>
<pubDate>Mon, 26 Nov 2012 08:53:05 PST</pubDate>
<description>
	<![CDATA[
	<p>Large datasets of real human behaviors are of huge benefit across numerous domains, including evacuation safety, urban planning, marketing, and ergonomics. However, because large-scale experiments involving real human subjects are expensive and prohibitively difficult to organize, such datasets are scarce. Thus, in this paper, we propose the use of massively multiplayer online (MMO) communities as an inexpensive and innovative way to capture datasets of large numbers of people under different conditions. We describe our implementation of an online data collection system, based on games, inside the popular massively multiplayer online environment of Second Life. We evaluate the use of this system for performing evacuation experiments using a mix of Second Life residents and players recruited on campus. Our system was able to draw online participants, support data collection needs, and provide potential insights into high-level evacuation behaviors such as the choices of exit, effects of building debris, and the use-patterns of a building. Through experiments performed using our system, we found that Second Life residents found the game controls and environment to be significantly more compelling than lab participants did; that players unfamiliar with our office building tended to evacuate primarily via the front entrance; and that in-game debris significantly increased the numbers of participants who failed to exit a building safely.</p>

	]]>
</description>

<author>Aline Normoyle et al.</author>


</item>






<item>
<title>Labeling Workflow Views with Fine-Grained Dependencies</title>
<link>http://repository.upenn.edu/cis_reports/976</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/976</guid>
<pubDate>Mon, 26 Nov 2012 06:54:35 PST</pubDate>
<description>
	<![CDATA[
	<p>This paper considers the problem of efficiently answering reachability queries over views of provenance graphs, derived from executions of workflows that may include recursion. Such views include composite modules and model fine-grained dependencies between module inputs and outputs. A novel view-adaptive dynamic labeling scheme is developed for efficient query evaluation, in which view specifications are labeled statically (i.e., as they are created) and data items are labeled dynamically as they are produced during a workflow execution. Although the combination of fine-grained dependencies and recursive workflows entails, in general, long (linear-size) data labels, we show that for a large natural class of workflows and views, labels are compact (logarithmic-size) and reachability queries can be evaluated in constant time. Experimental results demonstrate the benefit of this approach over the state-of-the-art technique when applied for labeling multiple views.</p>

	]]>
</description>

<author>Zhuowei Bao et al.</author>


</item>






<item>
<title>Metric Learning for Graph-based Domain Adaptation</title>
<link>http://repository.upenn.edu/cis_reports/975</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/975</guid>
<pubDate>Wed, 31 Oct 2012 14:21:42 PDT</pubDate>
<description>
	<![CDATA[
	<p>In many domain adaptation formulations, one is assumed to have a large amount of unlabeled data from the domain of interest (target domain), some portion of which may be labeled, and a large amount of labeled data from other domains, a.k.a. source domain(s). Motivated by the fact that labeled data is hard to obtain in any domain, we design algorithms for the settings in which there exists a large amount of unlabeled data from all domains, a small portion of which may be labeled.</p>
<p>We build on recent advances in graph-based semi-supervised learning and supervised metric learning. Given all instances, labeled and unlabeled, from all domains, we build a large similarity graph between them, where an edge exists between two instances if they are close according to some metric. Instead of using a predefined metric, as is commonly done, we feed the labeled instances into metric-learning algorithms and (re)construct a data-dependent metric, which is used to construct the graph. We employ different types of edges depending on the domain-identity of the two vertices each edge touches, and learn the weights of each edge.</p>
<p>We provide extensive empirical evidence demonstrating that our approach leads to a significant reduction in classification error across domains, and evaluate the contribution of each resource: labeled and unlabeled data of the various domains.</p>

	]]>
</description>

<author>Paramveer S. Dhillon et al.</author>


</item>






<item>
<title>Emerging Scholars Program—a PLTL-CS Program that Increases Recruitment and Retention of Women in the Major</title>
<link>http://repository.upenn.edu/cis_reports/974</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/974</guid>
<pubDate>Thu, 20 Sep 2012 12:30:35 PDT</pubDate>
<description>
	<![CDATA[
	<p>The Emerging Scholars Program (ESP) in Computer Science is a Peer Led Team Learning (PLTL) approach to bringing undergraduates new to the discipline together with peer mentors to work on computational problems, and to expose them to the broad array of disciplines within computer science. This program demonstrates that computer science is necessarily a collaborative activity that focuses more on problem solving and algorithmic thinking than on programming. In spring 2012 the computer science department at an urban research university completed the 9th iteration of ESP, with 104 women and 36 men completing the program. Our evaluation data indicates that ESP increased enrollment in the computer science major. 47% of students who took ESP along with the introduction to computer programming course at the university study site during this study majored in computer science. In addition, survey results indicated that a large majority of students intended to take another computer science course, were enthusiastic about the program, and found the workshop topics exciting and engaging. Participants reported that they learned more about computer science in ESP, and would recommend ESP to others.</p>

	]]>
</description>

<author>Rita M. Powell et al.</author>


</item>






<item>
<title>Secure Time-Aware Provenance For Distributed Systems</title>
<link>http://repository.upenn.edu/cis_reports/973</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/973</guid>
<pubDate>Wed, 25 Jul 2012 13:45:41 PDT</pubDate>
<description>
	<![CDATA[
	<p>Operators of distributed systems often find themselves needing to answer forensic questions, to perform a variety of managerial tasks including fault detection, system debugging, accountability enforcement, and attack analysis. In this dissertation, we present Secure Time-Aware Provenance (STAP), a novel approach that provides the fundamental functionality required to answer such forensic questions – the capability to “explain” the existence (or change) of a certain distributed system state at a given time in a potentially adversarial environment.</p>
<p>This dissertation makes the following contributions. First, we propose the STAP model, to explicitly represent time and state changes. The STAP model allows consistent and complete explanations of system state (and changes) in dynamic environments. Second, we show that it is both possible and practical to efficiently and scalably maintain and query provenance in a distributed fashion, where provenance maintenance and querying are modeled as recursive continuous queries over distributed relations. Third, we present security extensions that allow operators to reliably query provenance information in adversarial environments. Our extensions incorporate tamper-evident properties that guarantee eventual detection of compromised nodes that lie or falsely implicate correct nodes. Finally, the proposed research results in a proof-of-concept prototype, which includes a declarative query language for specifying a range of useful provenance queries, an interactive exploration tool, and a distributed provenance engine for operators to conduct analysis of their distributed systems. We discuss the applicability of this tool in several use cases, including Internet routing, overlay routing, and cloud data processing.</p>

	]]>
</description>

<author>Wenchao Zhou</author>


</item>






<item>
<title>Reduction-based Security Analysis of Internet Routing Protocols</title>
<link>http://repository.upenn.edu/cis_reports/972</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/972</guid>
<pubDate>Wed, 25 Jul 2012 13:45:30 PDT</pubDate>
<description>
	<![CDATA[
	<p>In recent years, there has been strong interest in the networking community in designing new Internet architectures that provide strong security guarantees. However, none of these proposals backs its security claims with formal analysis. In this paper, we use a reduction-based approach to prove the route authenticity property in secure routing protocols. This property requires that routes accepted and announced by honest nodes in the network are not tampered with by the adversary. We focus on protocols that rely on layered signatures to provide security: each route announcement is associated with a list of signatures attesting to the authenticity of its subpaths. Our approach combines manual proofs with automated analysis. We define several reduction steps to reduce proving route authenticity properties to simple checks that can be done automatically by the tool ProVerif. We show that our analysis is correct with respect to the trace semantics of the routing protocols.</p>

	]]>
</description>

<author>Chen Chen et al.</author>


</item>






<item>
<title>The Impact of an Agile Methodology on Software Development Costs</title>
<link>http://repository.upenn.edu/cis_reports/971</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/971</guid>
<pubDate>Wed, 25 Apr 2012 08:55:50 PDT</pubDate>
<description>
	<![CDATA[
	<p>With the emergence of the Internet, software development has become an integral part of almost every facet of business today. Because consumers have a mounting demand for immediacy and convenience, companies are pressured to add web-based services to their product offerings. Therefore, an increasing number of resources are being allocated to the development of profitable software to meet customer needs. Because companies desire to maximize their profits, an efficient allocation of these resources is necessary to minimize costs. This can be achieved by implementing a process model that best converts their resources to quality products.</p>
<p>Agile software development is a relatively new framework aimed at reducing risk and production costs. It is based on iterative development and continuous feedback from all stakeholders throughout the development cycle. The switch to an agile process model from a traditional waterfall process model can reduce the risk associated with producing a large-scale software application by decreasing lead times and increasing team morale and productivity. My literature review and initial findings suggest that firms across industries can benefit from incorporating some degree of agility in their development process.</p>

	]]>
</description>

<author>Kristin Fergis</author>


</item>






<item>
<title>Processing Data-intensive Workflows in the Cloud</title>
<link>http://repository.upenn.edu/cis_reports/970</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/970</guid>
<pubDate>Fri, 20 Apr 2012 12:05:54 PDT</pubDate>
<description>
	<![CDATA[
	<p>In recent years, large-scale data analysis has become critical to the success of the modern enterprise. Meanwhile, with the emergence of cloud computing, companies are attracted to move their data analytics tasks to the cloud due to its flexible, on-demand resource usage and pay-as-you-go pricing model. MapReduce has been widely recognized as an important tool for performing large-scale data analysis in the cloud. It provides a simple and fault-tolerant framework for users to process data-intensive analytics tasks in parallel across different physical machines. In this report, we survey alternative implementations of MapReduce, contrasting batch-oriented and pipelined execution models, and study how these models impact response times, completion time and robustness. Next, we present three optimization strategies for MapReduce-style workflows, including (1) scan sharing across MapReduce programs, (2) workflow optimizations aimed at reducing intermediate data, and (3) scheduling policies that map workflow tasks to different machines in order to minimize completion times and monetary costs. We conclude with a brief comparison across these optimization strategies, and discuss their pros/cons as well as performance implications of using more than one optimization strategy at a time. University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-12-07.</p>

	]]>
</description>

<author>Zhuoyao Zhang</author>


</item>






<item>
<title>Dependent Interoperability</title>
<link>http://repository.upenn.edu/cis_reports/969</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/969</guid>
<pubDate>Thu, 29 Mar 2012 09:39:24 PDT</pubDate>
<description>
	<![CDATA[
	<p>In this paper we study the problem of interoperability – combining constructs from two separate programming languages within one program – in the case where one of the two languages is dependently typed and the other is simply typed. We present a core calculus called SD, which combines dependently- and simply-typed sub-languages and supports user-defined (dependent) datatypes, among other standard features. SD has “boundary terms” that mediate the interaction between the two sub-languages. The operational semantics of SD demonstrates how the necessary dynamic checks, which must be done when passing a value from the simply-typed world to the dependently-typed world, can be extracted from the dependent type constructors themselves, modulo user-defined functions for marshaling values across the boundary. We establish type-safety and other meta-theoretic properties of SD, and contrast this approach to others in the literature.</p>

	]]>
</description>

<author>Peter-Michael Osera et al.</author>


</item>






<item>
<title>Motion Primitive-Based Graph Planning for Mobile Manipulation with Closed-Chain Systems</title>
<link>http://repository.upenn.edu/cis_reports/968</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/968</guid>
<pubDate>Wed, 28 Mar 2012 11:30:31 PDT</pubDate>
<description>
	<![CDATA[
	<p>Motion primitive-based (lattice-based) graphs have been used extensively in navigation, but application to high-dimensional state-spaces has remained limited by computational complexity. In this work, we show how these graphs can be applied to mobile manipulation. The formation of closed chains in tasks that involve contacts with the environment may reduce the number of available degrees of freedom but add complexity in terms of constraints in the high-dimensional state space. We propose a novel method to reduce dimensionality by abstracting away the constraints associated with closed-chain systems. Proofs are introduced for the application to graph-search and its theoretical guarantees of optimality. The dimensionality-reduction is done in a manner that enables finding optimal solutions to low-dimensional problems which map to correspondingly optimal full-dimensional solutions. We demonstrate the usefulness of our method with simulation results; we apply our approach to moving an object in 2D using a mobile manipulation platform with a planar arm.</p>

	]]>
</description>

<author>Steven R. Gray et al.</author>


</item>






<item>
<title>High-Level Model Extraction via Symbolic Execution</title>
<link>http://repository.upenn.edu/cis_reports/967</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/967</guid>
<pubDate>Wed, 18 Jan 2012 07:49:01 PST</pubDate>
<description>
	<![CDATA[
	<p>We study the problem of extracting high-level state machine models from software source code. Our target domain is GUI-driven applications for small hand-held devices such as cell phones and PDAs. In such systems, a natural high-level model is captured by a state machine, where states are GUI screens and button/menu item tappings are actions that trigger transitions between states. The paper presents a symbolic execution technique that allows us to identify states and transitions from the application source code. We discuss an implementation of this technique that operates on a large subset of the C# language and apply it as a case study to the subsystem of a decision support tool for medical diagnosis.</p>

	]]>
</description>

<author>Shaohui Wang et al.</author>


</item>






<item>
<title>Reduction-based Formal Analysis of BGP Instances</title>
<link>http://repository.upenn.edu/cis_reports/966</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/966</guid>
<pubDate>Thu, 12 Jan 2012 13:12:37 PST</pubDate>
<description>
	<![CDATA[
	<p>Today’s Internet interdomain routing protocol, the Border Gateway Protocol (BGP), is increasingly complicated and fragile due to policy misconfigurations by individual autonomous systems (ASes). These misconfigurations are often difficult to manually diagnose beyond a small number of nodes due to the state explosion problem. To aid the diagnosis of potential anomalies, researchers have developed various formal models and analysis tools. However, these techniques do not scale well or do not cover the full set of anomalies. Current techniques use oversimplified BGP models that capture anomalies either within or across ASes, but not the interactions between the two. To address these limitations, we propose a novel approach that reduces network size prior to analysis, while preserving crucial BGP correctness properties. Using Maude, we have developed a toolkit that takes as input a network instance consisting of ASes and their policy configurations, and then performs formal analysis on the reduced instance for safety (protocol convergence). Our results show that our reduction-based analysis allows us to analyze significantly larger network instances at low reduction overhead.</p>

	]]>
</description>

<author>Anduo Wang et al.</author>


</item>






<item>
<title>Experiences in Teaching an Educational User-Level Operating Systems Implementation Project</title>
<link>http://repository.upenn.edu/cis_reports/965</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/965</guid>
<pubDate>Thu, 12 Jan 2012 09:54:58 PST</pubDate>
<description>
	<![CDATA[
	<p>The importance of a comprehensive implementation component for undergraduate Operating Systems (OS) courses cannot be overstated. Students not only develop deep insight and understanding of OS fundamentals, but they also learn key software engineering skills that only a large development project, such as implementing an OS, can teach. There are clear benefits to traditional OS projects where students program or alter real (Linux) kernel source or extend educational OS implementations; however, in our experience, bootstrapping such a project is a huge undertaking that may not be accessible in many classrooms. In this paper, we describe a different approach to the OS implementation assignment: A user-level Operating System simulation based on UNIX preemptive signaling and threading constructs called ucontext. We believe that this variation of the implementation assignment provides many of the same educational benefits as traditional low-level projects without many of the expensive start-up costs. This project has been taught for a number of years at the University of Pennsylvania and was recently overhauled for the Fall 2011 semester. This paper describes the current version of the project and our experiences teaching it to a class of 54 students.</p>

	]]>
</description>

<author>Adam J. Aviv et al.</author>


</item>






<item>
<title>Compositional Analysis of Multi-Mode Systems</title>
<link>http://repository.upenn.edu/cis_reports/964</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/964</guid>
<pubDate>Thu, 22 Dec 2011 08:34:28 PST</pubDate>
<description>
	<![CDATA[
	<p>The paper presents a model for multi-mode real-time applications and develops new techniques for the compositional analysis of systems that contain multiple such applications. An algorithm for constructing an interface for a single multi-mode application is presented. Then, a method for computing an interface of a composite application is presented, which uses only the interfaces of constituent applications. A case study of an adaptive streaming system demonstrates that multi-mode analysis offers more precise results compared to a unimodal worst-case analysis.</p>

	]]>
</description>

<author>Linh T.X. Phan et al.</author>


</item>






<item>
<title>An Optimal Labeling Scheme for Workflow Provenance Using Skeleton Labels</title>
<link>http://repository.upenn.edu/cis_reports/963</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/963</guid>
<pubDate>Thu, 22 Dec 2011 08:34:19 PST</pubDate>
<description>
	<![CDATA[
	<p>We develop a compact and efficient reachability labeling scheme for answering provenance queries on workflow runs that conform to a given specification. Even though a workflow run can be structurally more complex and can be arbitrarily larger than the specification due to fork (parallel) and loop executions, we show that a compact reachability labeling for a run can be efficiently computed using the fact that it originates from a fixed specification. Our labeling scheme is optimal in the sense that it uses labels of logarithmic length, runs in linear time, and answers any reachability query in constant time. Our approach is based on using the reachability labeling for the specification as an effective skeleton for designing the reachability labeling for workflow runs. We also demonstrate empirically the effectiveness of our skeleton-based labeling approach.</p>

	]]>
</description>

<author>Zhuowei Bao et al.</author>


</item>





</channel>
</rss>
