<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0">
<channel>
<title>Technical Reports (CIS)</title>
<copyright>Copyright (c) 2012 University of Pennsylvania All rights reserved.</copyright>
<link>http://repository.upenn.edu/cis_reports</link>
<description>Recent documents in Technical Reports (CIS)</description>
<language>en-us</language>
<lastBuildDate>Fri, 20 Jan 2012 01:47:25 PST</lastBuildDate>
<ttl>3600</ttl>


	
		
	







<item>
<title>High-Level Model Extraction via Symbolic Execution</title>
<link>http://repository.upenn.edu/cis_reports/967</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/967</guid>
<pubDate>Wed, 18 Jan 2012 07:49:01 PST</pubDate>
<description>
	<![CDATA[
	<p>We study the problem of extracting high-level state machine models from software source code. Our target domain is GUI-driven applications for small hand-held devices such as cell phones and PDAs. In such systems, a natural high-level model is captured by a state machine, where states are GUI screens and button/menu item tappings are actions that trigger transitions between states. The paper presents a symbolic execution technique that allows us to identify states and transitions from the application source code. We discuss an implementation of this technique that operates on a large subset of the C# language and apply it, as a case study, to a subsystem of a decision support tool for medical diagnosis.</p>

	]]>
</description>

<author>Shaohui Wang et al.</author>


</item>






<item>
<title>Reduction-based Formal Analysis of BGP Instances</title>
<link>http://repository.upenn.edu/cis_reports/966</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/966</guid>
<pubDate>Thu, 12 Jan 2012 13:12:37 PST</pubDate>
<description>
	<![CDATA[
	<p>Today’s Internet interdomain routing protocol, the Border Gateway Protocol (BGP), is increasingly complicated and fragile due to policy misconfigurations by individual autonomous systems (ASes). These misconfigurations are often difficult to diagnose manually beyond a small number of nodes due to the state explosion problem. To aid the diagnosis of potential anomalies, researchers have developed various formal models and analysis tools. However, these techniques do not scale well or do not cover the full set of anomalies. Current techniques use oversimplified BGP models that capture anomalies either within or across ASes, but not the interactions between the two. To address these limitations, we propose a novel approach that reduces network size prior to analysis, while preserving crucial BGP correctness properties. Using Maude, we have developed a toolkit that takes as input a network instance consisting of ASes and their policy configurations, and then performs formal analysis on the reduced instance for safety (protocol convergence). Our results show that our reduction-based analysis allows us to analyze significantly larger network instances at low reduction overhead.</p>

	]]>
</description>

<author>Anduo Wang et al.</author>


</item>






<item>
<title>Experiences in Teaching an Educational User-Level Operating Systems Implementation Project</title>
<link>http://repository.upenn.edu/cis_reports/965</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/965</guid>
<pubDate>Thu, 12 Jan 2012 09:54:58 PST</pubDate>
<description>
	<![CDATA[
	<p>The importance of a comprehensive implementation component for undergraduate Operating Systems (OS) courses cannot be overstated. Students not only develop deep insight into and understanding of OS fundamentals, but they also learn key software engineering skills that only a large development project, such as implementing an OS, can teach. There are clear benefits to traditional OS projects where students program or alter real (Linux) kernel source or extend educational OS implementations; however, in our experience, bootstrapping such a project is a huge undertaking that may not be accessible in many classrooms. In this paper, we describe a different approach to the OS implementation assignment: a user-level Operating System simulation based on UNIX preemptive signaling and threading constructs called ucontext. We believe that this variation of the implementation assignment provides many of the same educational benefits as traditional low-level projects without many of the expensive start-up costs. This project has been taught for a number of years at the University of Pennsylvania and was recently overhauled for the Fall 2011 semester. This paper describes the current version of the project and our experiences teaching it to a class of 54 students.</p>

	]]>
</description>

<author>Adam J. Aviv et al.</author>


</item>






<item>
<title>Compositional Analysis of Multi-Mode Systems</title>
<link>http://repository.upenn.edu/cis_reports/964</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/964</guid>
<pubDate>Thu, 22 Dec 2011 08:34:28 PST</pubDate>
<description>
	<![CDATA[
	<p><strong>The paper presents a model for multi-mode real-time applications and develops new techniques for the compositional analysis of systems that contain multiple such applications. An algorithm for constructing an interface for a single multi-mode application is presented. Then, a method is presented for computing the interface of a composite application using only the interfaces of its constituent applications. A case study of an adaptive streaming system demonstrates that multi-mode analysis offers more precise results than a unimodal worst-case analysis.</strong></p>

	]]>
</description>

<author>Linh T.X. Phan et al.</author>


</item>






<item>
<title>An Optimal Labeling Scheme for Workflow Provenance Using Skeleton Labels</title>
<link>http://repository.upenn.edu/cis_reports/963</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/963</guid>
<pubDate>Thu, 22 Dec 2011 08:34:19 PST</pubDate>
<description>
	<![CDATA[
	<p>We develop a compact and efficient reachability labeling scheme for answering provenance queries on workflow runs that conform to a given specification. Even though a workflow run can be structurally more complex and can be arbitrarily larger than the specification due to fork (parallel) and loop executions, we show that a compact reachability labeling for a run can be efficiently computed using the fact that it originates from a fixed specification. Our labeling scheme is optimal in the sense that it uses labels of logarithmic length, runs in linear time, and answers any reachability query in constant time. Our approach is based on using the reachability labeling for the specification as an effective skeleton for designing the reachability labeling for workflow runs. We also demonstrate empirically the effectiveness of our skeleton-based labeling approach.</p>

	]]>
</description>

<author>Zhuowei Bao et al.</author>


</item>






<item>
<title>Labeling Recursive Workflow Executions On-the-Fly</title>
<link>http://repository.upenn.edu/cis_reports/962</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/962</guid>
<pubDate>Thu, 22 Dec 2011 08:34:11 PST</pubDate>
<description>
	<![CDATA[
	<p>This paper presents a compact labeling scheme for answering reachability queries over workflow executions. In contrast to previous work, our scheme allows nodes (processes and data) in the execution graph to be labeled on-the-fly, i.e., in a dynamic fashion. In this way, reachability queries can be answered as soon as the relevant data is produced. We first show that, in general, for workflows that contain recursion, dynamic labeling of executions requires long (linear-size) labels. Fortunately, most real-life scientific workflows are linear recursive, and for this natural class we show that dynamic, yet compact (logarithmic-size) labeling is possible. Moreover, our scheme labels the executions in linear time, and answers any reachability query in constant time. We also show that linear recursive workflows are, in some sense, the largest class of workflows that allow compact, dynamic labeling schemes. Interestingly, the empirical evaluation, performed over both real and synthetic workflows, shows that our proposed dynamic scheme outperforms the state-of-the-art static scheme for large executions, and creates labels that are shorter by a factor of almost 3.</p>

	]]>
</description>

<author>Zhuowei Bao et al.</author>


</item>






<item>
<title>Differencing Provenance in Scientific Workflows</title>
<link>http://repository.upenn.edu/cis_reports/961</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/961</guid>
<pubDate>Thu, 22 Dec 2011 08:34:04 PST</pubDate>
<description>
	<![CDATA[
	<p>Scientific workflow management systems are increasingly providing the ability to manage and query the provenance of data products. However, the problem of differencing the provenance of two data products produced by executions of the same specification has not been adequately addressed. Although this problem is NP-hard for general workflow specifications, an analysis of real scientific (and business) workflows shows that their specifications can be captured as series-parallel graphs overlaid with well-nested forking and looping. For this natural restriction, we present efficient, polynomial-time algorithms for differencing executions of the same specification and thereby understanding the difference in the provenance of their data products. We then describe a prototype called PDiffView built around our differencing algorithm. Experimental results demonstrate the scalability of our approach using collected, real workflows and increasingly complex runs.</p>

	]]>
</description>

<author>Zhuowei Bao et al.</author>


</item>






<item>
<title>Structure from Motion with Directional Correspondence for Visual Odometry</title>
<link>http://repository.upenn.edu/cis_reports/960</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/960</guid>
<pubDate>Thu, 22 Dec 2011 08:33:56 PST</pubDate>
<description>
	<![CDATA[
	<p>This report presents two efficient solutions to the two-view, relative pose problem from three image point correspondences and one common reference direction. This three-plus-one problem can be used either as a substitute for the classic five-point algorithm using a vanishing point for the reference direction, or to make use of an inertial measurement unit commonly available on robots and mobile devices, where the gravity vector becomes the reference direction. We provide a simple closed-form solution and a solution based on techniques from algebraic geometry and investigate numerical and computational advantages of each approach. In a set of real experiments, we demonstrate the power of our approach by comparing it to the five-point method in a hypothesize-and-test visual odometry setting.</p>

	]]>
</description>

<author>Oleg Naroditsky et al.</author>


</item>






<item>
<title>Unsupervised Models of Text Structure</title>
<link>http://repository.upenn.edu/cis_reports/959</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/959</guid>
<pubDate>Thu, 22 Dec 2011 08:33:48 PST</pubDate>
<description>
	<![CDATA[
	<p>Models of text structure are necessary for applications that generate text. These models provide information about what content fits together and how to organize the content as coherent text. In some domains such as newswire, biographies and stories for children, texts tend to have similar content and structure. Such regularities have allowed the development of unsupervised methods to learn text structure using human-written examples from such domains. We survey some of the recently proposed approaches in this area and review their use in different text generation tasks.</p>
<p>First, we consider approaches with a focus on computational semantics. We review work aiming to discover patterns of related events from news articles and children’s stories. We consider one application of such knowledge: an automatic story-telling system.</p>
<p>Next, we move to methods which focus on coherence and organization. We describe these in the context of two generation tasks: sentence ordering and the creation of long articles. In view of the sentence ordering problem, we survey approaches targeted at learning properties of coherent transitions between adjacent sentences in texts. Then, we consider the generation of long biographical descriptions. Here we survey recent work on automatically generating such articles using higher-level patterns in text structure such as subtopics and their organization.</p>

	]]>
</description>

<author>Annie Louis</author>


</item>






<item>
<title>Conditional entropies as over-segmentation and under-segmentation metrics for multi-part image segmentation</title>
<link>http://repository.upenn.edu/cis_reports/958</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/958</guid>
<pubDate>Thu, 22 Dec 2011 08:33:40 PST</pubDate>
<description>
	<![CDATA[
	<p>In this paper, we define two conditional entropy measures for performance evaluation of general image segmentation. Given a segmentation label map and a ground truth label map, our measures describe their compatibility in two ways. The first one is the conditional entropy of the segmentation given the ground truth, which indicates the over-segmentation rate. The second one is that of the ground truth given the segmentation, which indicates the under-segmentation rate. The two conditional entropies indicate the trade-off between smaller and larger granularities, like false positive rate and false negative rate in ROC, and precision and recall in a PR curve. Our measures are easy to implement, involve no threshold or other parameter, and have a very intuitive explanation and many good theoretical properties, e.g., good bounds, monotonicity, continuity. Experiments show that our measures work well on the Berkeley Image Segmentation Benchmark using three segmentation algorithms: Efficient Graph-Based segmentation, Mean Shift and Normalized Cut. We also give an asymmetric similarity measure based on the two entropies and compare it with Variation of Information. The comparison revealed that our method has advantages in many situations. We also checked the coarse-to-fine compatibility of segmentation results with changing parameters and ground truths from different annotators.</p>

	]]>
</description>

<author>Haifeng Gong et al.</author>


</item>






<item>
<title>Steganographic Timing Channels</title>
<link>http://repository.upenn.edu/cis_reports/957</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/957</guid>
<pubDate>Thu, 22 Dec 2011 08:33:32 PST</pubDate>
<description>
	<![CDATA[
	<p>This paper describes steganographic timing channels that use cryptographic primitives to hide the presence of covert channels in the timing of network traffic. We have identified two key properties for steganographic timing channels: (1) the parameters of the scheme should be cryptographically keyed, and (2) the distribution of input timings should be indistinguishable from output timings. These properties are necessary (although we make no claim they are sufficient) for the undetectability of a steganographic timing channel. Without them, the contents of the channel can be read and observed by unauthorized persons, and the presence of the channel is trivially exposed by noticing large changes in timing distributions – a previously proposed methodology for covert channel detection. Our steganographic timing scheme meets the secrecy requirement by employing cryptographic keys, and we achieve a restricted form of input/output distribution parity. Under certain distributions, our scheme conforms to a uniformness property: input timings that are uniformly distributed modulo a timing window are indistinguishable from output timings, measured under the same modulo. We also demonstrate that our scheme is practical under real network conditions, and finally present an empirical study of its covertness using the first-order entropy metric, as suggested by Gianvecchio and Wang [8], which is currently the best published practical detection heuristic for timing channels.</p>

	]]>
</description>

<author>Adam Aviv et al.</author>


</item>






<item>
<title>PUMA: Policy-based Unified Multi-radio Architecture for Agile Mesh Networking</title>
<link>http://repository.upenn.edu/cis_reports/956</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/956</guid>
<pubDate>Tue, 25 Oct 2011 14:38:59 PDT</pubDate>
<description>
	<![CDATA[
	<p>This paper presents the design and implementation of <em>PUMA</em>, a declarative constraint-solving platform for policy-based routing and channel selection in multi-radio wireless mesh networks. In PUMA, users formulate channel selection policies as optimization goals and constraints that are concisely declared using the <em>PawLog</em> declarative language. To efficiently execute <em>PawLog</em> programs in a distributed setting, PUMA integrates a high-performance constraint solver with a declarative networking engine. We demonstrate the capabilities of PUMA in defining distributed protocols that cross-optimize across channel selection and routing. We have developed a prototype of the PUMA system that we extensively evaluated in simulations and on the ORBIT testbed. Our experimental results demonstrate that PUMA can flexibly and efficiently implement a variety of centralized and distributed channel selection protocols that result in significantly higher throughput compared to single-channel and identical-channel assignment solutions.</p>

	]]>
</description>

<author>Changbin Liu et al.</author>


</item>






<item>
<title>Realizing Compositional Scheduling through Virtualization</title>
<link>http://repository.upenn.edu/cis_reports/955</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/955</guid>
<pubDate>Tue, 26 Jul 2011 06:06:26 PDT</pubDate>
<description>
	<![CDATA[
	<p>We present a co-designed scheduling framework and platform architecture that support compositional scheduling of real-time systems. The architecture is built on the Xen virtualization platform and relies on compositional scheduling theory that uses periodic resource models as component interfaces. We implement resource models as periodic servers and consider enhancements to periodic server design that significantly improve response times of tasks and resource utilization in the system while preserving theoretical schedulability results. We present an extensive evaluation of our implementation using workloads from an avionics case study as well as synthetic ones.</p>

	]]>
</description>

<author>Jaewoo Lee et al.</author>


</item>






<item>
<title>FSR: Formal Analysis and Implementation Toolkit for Safe Inter-domain Routing</title>
<link>http://repository.upenn.edu/cis_reports/954</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/954</guid>
<pubDate>Thu, 26 May 2011 12:02:51 PDT</pubDate>
<description>
	<![CDATA[
	<p>Inter-domain routing stitches the disparate parts of the Internet together, making protocol stability a critical issue to both researchers and practitioners. Yet, researchers create safety proofs and counter-examples by hand, and build simulators and prototypes to explore protocol dynamics. Similarly, network operators analyze their router configurations manually, or using home-grown tools. In this paper, we present a comprehensive toolkit for analyzing and implementing routing policies, ranging from high-level guidelines to specific router configurations. Our Formally Safe Routing (<em>FSR</em>) toolkit performs all of these functions from the same algebraic representation of routing policy. We show that routing algebra has a natural translation to both <em>integer constraints</em> (to perform safety analysis with SMT solvers) and <em>declarative programs</em> (to generate distributed implementations). Our extensive experiments with realistic topologies and policies show how <em>FSR</em> can detect problems in an AS's iBGP configuration, prove sufficient conditions for BGP safety, and empirically evaluate convergence time.</p>

	]]>
</description>

<author>Anduo Wang et al.</author>


</item>






<item>
<title>Tracking by Planning</title>
<link>http://repository.upenn.edu/cis_reports/953</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/953</guid>
<pubDate>Thu, 14 Apr 2011 08:18:22 PDT</pubDate>
<description>
	<![CDATA[
	<p>We introduce a method for tracking multiple people in a cluttered street scene. We use global context to address the challenge of long occlusion by endowing each tracked object with a planning agent. This planner uses the context of the street scene, people, and other moving objects to reason about pedestrians’ intended behavior for tracking under occlusion and ambiguity.</p>
<p>We extract short but robust trajectories called tracklets by tracking people with a simple appearance model. We formulate the tracking problem as a batch-mode optimization, linking tracklets into paths, each supported by evidence from an agent’s goal-directed behavior and by partial image matching along the trajectory gap. We propose a global criterion for consistent linking of the tracklets with planning that can correct local ambiguity in linking. We test our algorithm in a challenging real-world setting, where we automatically estimate scene context and intended goals, then track multiple people from a moving camera.</p>

	]]>
</description>

<author>Haifeng Gong et al.</author>


</item>






<item>
<title>Analyzing BGP Instances in Maude</title>
<link>http://repository.upenn.edu/cis_reports/952</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/952</guid>
<pubDate>Fri, 08 Apr 2011 07:00:11 PDT</pubDate>
<description>
	<![CDATA[
	<p>Analyzing Border Gateway Protocol (BGP) instances is a crucial step in the design and implementation of safe BGP systems. Today, the analysis is a manual and tedious process. Researchers study the instances by manually constructing execution sequences, hoping to either identify an oscillation or show that the instance is safe by exhaustively examining all possible sequences. We propose to automate the analysis by using Maude, a tool based on rewriting logic. We have developed a library specifying a generalized path vector protocol, and methods to instantiate the library with customized routing policies. Protocols can be analyzed automatically by Maude, once users provide specifications of the network topology and routing policies. Using our Maude library, protocols or policies can be easily specified and checked for problems. To validate our approach, we performed safety analysis of well-known BGP instances and actual routing configurations.</p>

	]]>
</description>

<author>Anduo Wang et al.</author>


</item>






<item>
<title>General versus specific sentences: automatic identification and application to analysis of news summaries</title>
<link>http://repository.upenn.edu/cis_reports/951</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/951</guid>
<pubDate>Mon, 21 Feb 2011 06:04:28 PST</pubDate>
<description>
	<![CDATA[
	<p>In this paper, we introduce the task of identifying general and specific sentences in news articles. Instead of embarking on a new annotation effort to obtain data for the task, we explore the possibility of leveraging existing large corpora annotated with discourse information to train a classifier. We introduce several classes of features that capture lexical and syntactic information, as well as word specificity and polarity. We then use the classifier to analyze the distribution of general and specific sentences in human and machine summaries of news articles. We discover that while all types of summaries tend to be more specific than the original documents, human abstracts contain a more balanced mix of general and specific sentences but automatic summaries are overwhelmingly specific. Our findings give strong evidence for the need for a new task in (abstractive) summarization: identification and generation of general sentences.</p>

	]]>
</description>

<author>Annie Louis et al.</author>


</item>






<item>
<title>Maintaining Distributed Recursive Views Incrementally</title>
<link>http://repository.upenn.edu/cis_reports/950</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/950</guid>
<pubDate>Tue, 08 Feb 2011 07:28:18 PST</pubDate>
<description>
	<![CDATA[
	<p>Distributed logic programming languages, which allow both facts and programs to be distributed among different nodes in a network, have recently been proposed and used to declaratively program a wide range of distributed systems, such as network protocols and multi-agent systems. However, the distributed nature of the underlying systems poses serious challenges to developing efficient and correct algorithms for evaluating these programs. This paper proposes an efficient asynchronous algorithm to incrementally compute the changes to the states in response to insertions and deletions of base facts. Our algorithm is formally proven to be correct in the presence of message reordering in the system. To our knowledge, this is the first formal proof of correctness for such an algorithm.</p>

	]]>
</description>

<author>Vivek Nigam et al.</author>


</item>






<item>
<title>On Effective Testing of Health Care Simulation Software</title>
<link>http://repository.upenn.edu/cis_reports/949</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/949</guid>
<pubDate>Mon, 07 Feb 2011 07:37:45 PST</pubDate>
<description>
	<![CDATA[
	<p>Health care professionals rely on software to simulate anatomical and physiological elements of the human body for purposes of training, prototyping, and decision making. Software can also be used to simulate medical processes and protocols to measure cost effectiveness and resource utilization. Whereas much of the software engineering research into simulation software focuses on validation (determining that the simulation accurately models real-world activity), to date there has been little investigation into the testing of simulation software itself, that is, the ability to effectively search for errors in the implementation. This is particularly challenging because often there is no test oracle to indicate whether the results of the simulation are correct. In this paper, we present an approach to systematically testing simulation software in the absence of test oracles, and evaluate the effectiveness of the technique.</p>

	]]>
</description>

<author>Christian Murphy et al.</author>


</item>






<item>
<title>Litmus Tests for Comparing Memory Consistency Models: How Long Do They Need to Be?</title>
<link>http://repository.upenn.edu/cis_reports/948</link>
<guid isPermaLink="true">http://repository.upenn.edu/cis_reports/948</guid>
<pubDate>Wed, 19 Jan 2011 12:48:22 PST</pubDate>
<description>
	<![CDATA[
	<p>Even though the general problem of comparing two memory models is infeasible, in this paper we show that checking the equivalence of two memory models becomes feasible when we consider a more restricted class of memory models. We define a class of memory models that is expressive enough to include most known hardware memory models, and we establish a bound of two threads and no more than six memory access instructions for contrasting litmus tests in this class of models. Thus, we can compare memory models in this class by checking a small number of litmus tests. We build a tool for comparing memory models based on this theorem and use the tool to explore and map the space of this class of models.</p>

	]]>
</description>

<author>Sela Mador-Haim et al.</author>


</item>





</channel>
</rss>

