Search results

  • Publication
    Lessons Learned from a PLTL-CS Program
    (2011-01-01) Murphy, Christian; Powell, Rita; Parton, Kristen; Cannon, Adam
    The Peer-Led Team Learning (PLTL) approach has previously been shown to be effective in recruiting and retaining students, particularly under-represented students, in undergraduate introductory CS courses. In PLTL, small groups of students are led by an undergraduate peer and work together to solve problems related to CS. At Columbia University, the Columbia Emerging Scholars Program has used PLTL in an effort to increase enrollment in CS courses beyond the introductory level, and to increase the number of students who select Computer Science as their major, by demonstrating that CS is necessarily a collaborative activity that focuses more on problem solving and algorithmic thinking than on programming. Over the past five semesters, 68 students have completed the program, and preliminary results indicate that this program has had a positive effect on increasing participation in the major. This paper discusses our experiences of building and expanding the Columbia Emerging Scholars program, and addresses such topics as recruiting, training, scheduling, student behavior, and evaluation. We expect that this paper will provide a valuable set of lessons learned to other educators who seek to launch or grow a PLTL program at their institution as well.
  • Publication
    Processing Data-Intensive Workflows in the Cloud
    (2012-01-01) Zhang, Zhuoyao
    In recent years, large-scale data analysis has become critical to the success of modern enterprises. Meanwhile, with the emergence of cloud computing, companies are attracted to move their data analytics tasks to the cloud due to its flexible, on-demand resource usage and pay-as-you-go pricing model. MapReduce has been widely recognized as an important tool for performing large-scale data analysis in the cloud. It provides a simple and fault-tolerant framework for users to process data-intensive analytics tasks in parallel across different physical machines. In this report, we survey alternative implementations of MapReduce, contrasting batch-oriented and pipelined execution models, and study how these models impact response times, completion time and robustness. Next, we present three optimization strategies for MapReduce-style workflows, including (1) scan sharing across MapReduce programs, (2) workflow optimizations aimed at reducing intermediate data, and (3) scheduling policies that map workflow tasks to different machines in order to minimize completion times and monetary costs. We conclude with a brief comparison across these optimization strategies, and discuss their pros and cons as well as the performance implications of using more than one optimization strategy at a time. University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-12-07.
  • Publication
    Lessons Learned From a PLTL-CS Program
    (2010-01-01)
    The Peer-Led Team Learning (PLTL) approach has previously been shown to be effective in recruiting and retaining students, particularly under-represented students, in undergraduate introductory CS courses. In PLTL, small groups of students are led by an undergraduate peer and work together to solve problems related to CS. At Columbia University, the Columbia Emerging Scholars Program has used PLTL in an effort to increase enrollment in CS courses beyond the introductory level, and to increase the number of students who select Computer Science as their major, by demonstrating that CS is necessarily a collaborative activity that focuses more on problem solving and algorithmic thinking than on programming. Over the past five semesters, 68 students have completed the program, and preliminary results indicate that this program has had a positive effect on increasing participation in the major. This paper discusses our experiences of building and expanding the Columbia Emerging Scholars program, and addresses such topics as recruiting, training, scheduling, student behavior, and evaluation. We expect that this paper will provide a valuable set of lessons learned to other educators who seek to launch or grow a PLTL program at their institution as well.
  • Publication
    Maintaining Distributed Recursive Views Incrementally
    (2011-01-01) Nigam, Vivek; Loo, Boon Thau; Jia, Limin; Scedrov, Andre
    Distributed logic programming languages that allow both facts and programs to be distributed among different nodes in a network have recently been proposed and used to declaratively program a wide range of distributed systems, such as network protocols and multi-agent systems. However, the distributed nature of the underlying systems poses serious challenges to developing efficient and correct algorithms for evaluating these programs. This paper proposes an efficient asynchronous algorithm to incrementally compute the changes to the states in response to insertions and deletions of base facts. Our algorithm is formally proven to be correct in the presence of message reordering in the system. To our knowledge, this is the first formal proof of correctness for such an algorithm.
  • Publication
    Fault Management in Distributed Systems
    (2010-01-05) Zhou, Wenchao
    In the past decade, distributed systems have rapidly evolved, from simple client/server applications in local area networks to Internet-scale peer-to-peer networks and large-scale cloud platforms deployed on tens of thousands of nodes across multiple administrative domains and geographical areas. Despite their growing popularity, designing and implementing distributed systems remains challenging, due to their ever-increasing scale and the complexity and unpredictability of system executions. Fault management strengthens the robustness and security of distributed systems by detecting malfunctions or violations of desired properties, diagnosing the root causes, and maintaining verifiable evidence to demonstrate the diagnosis results. While its importance is well recognized, fault management in distributed systems is notoriously difficult. To address the problem, various mechanisms and systems have been proposed in the past few years. In this report, we present a survey of these mechanisms and systems, and taxonomize them according to the techniques adopted and their application domains. Based on four representative systems (Pip, Friday, PeerReview and TrInc), we discuss various aspects of fault management, including fault detection, fault diagnosis and evidence generation. Their strengths, limitations and application domains are evaluated and compared in detail.
  • Publication
    Spectrum Sharing in Dynamic Spectrum Access Networks: WPE-II Written Report
    (2009-06-22) Liu, Changbin
    A study by the Federal Communications Commission shows that most of the spectrum in current wireless networks is unused most of the time, while some spectrum is heavily used. Recently, dynamic spectrum access (DSA) has been proposed to solve this spectrum inefficiency problem by allowing users to opportunistically access unused spectrum. One important question in DSA is how to efficiently share spectrum among users so that spectrum utilization can be increased and wireless interference can be reduced. Spectrum sharing can be formalized as a graph coloring problem. In this report we focus on surveying spectrum sharing techniques in DSA networks and present four representative techniques in different taxonomy domains, including centralized, distributed with/without common control channel, and a real case study of DSA networks: DARPA neXt Generation (XG) radios. Their strengths and limitations are evaluated and compared in detail. Finally, we discuss the challenges in current spectrum sharing research and possible future directions.
  • Publication
    Declarative Network Verification
    (2008-01-01) Wang, Anduo; Loo, Boon Thau; Basu, Prithwish; Sokolsky, Oleg
    In this paper, we present our initial design and implementation of a declarative network verifier (DNV). DNV utilizes theorem proving, a well-established verification technique where logic-based axioms that automatically capture network semantics are generated, and a user-driven proof process is used to establish network correctness properties. DNV takes as input declarative networking specifications written in the Network Datalog (NDlog) query language, and maps them automatically into logical axioms that can be directly used in existing theorem provers to validate protocol correctness. DNV is a significant improvement over existing uses of theorem proving, which typically require several man-months to construct the system specifications. Moreover, NDlog, a high-level specification language whose semantics are precisely compiled into DNV without loss, can be directly executed as an implementation, hence bridging specification, verification, and implementation. To validate the use of DNV, we present case studies using DNV in conjunction with the PVS theorem prover to verify routing protocols, including eventual properties of protocols in dynamic settings.
  • Publication
    Clifford Algebras, Clifford Groups, and a Generalization of the Quaternions: The Pin and Spin Groups
    (2013-11-09) Gallier, Jean H
    One of the main goals of these notes is to explain how rotations in R^n are induced by the action of a certain group, Spin(n), on R^n, in a way that generalizes the action of the unit complex numbers, U(1), on R^2, and the action of the unit quaternions, SU(2), on R^3 (i.e., the action is defined in terms of multiplication in a larger algebra containing both the group Spin(n) and R^n). The group Spin(n), called a spinor group, is defined as a certain subgroup of units of an algebra, Cl_n, the Clifford algebra associated with R^n. Since the spinor groups are certain well-chosen subgroups of units of Clifford algebras, it is necessary to investigate Clifford algebras to get a firm understanding of spinor groups. These notes provide a tutorial on Clifford algebras and the groups Spin and Pin, including a study of the structure of the Clifford algebra Cl_{p,q} associated with a nondegenerate symmetric bilinear form of signature (p, q) and culminating in the beautiful "8-periodicity theorem" of Elie Cartan and Raoul Bott (with proofs).
  • Publication
    A Clustering Coefficient Network Formation Game
    (2011-10-04) Brautbar, Michael; Kearns, Michael J
    Social and other networks have been shown empirically to exhibit high edge clustering: that is, the density of local neighborhoods, as measured by the clustering coefficient, is often much larger than the overall edge density of the network. In social networks, a desire for tight-knit circles of friendships (the colloquial "social clique") is often cited as the primary driver of such structure. We introduce and analyze a new network formation game in which rational players must balance edge purchases with a desire to maximize their own clustering coefficient. Our results include the following:
    - Construction of a number of specific families of equilibrium networks, including ones showing that equilibria can have rather general binary tree-like structure, including highly asymmetric binary trees. This is in contrast to other network formation games that yield only symmetric equilibrium networks. Our equilibria also include ones with large or small diameter, and ones with wide variance of degrees.
    - A general characterization of (non-degenerate) equilibrium networks, showing that such networks are always sparse and paid for by low-degree vertices, whereas high-degree "free riders" always have low utility.
    - A proof that for edge cost α ≥ 1/2 the Price of Anarchy grows linearly with the population size n, while for edge cost less than 1/2 the Price of Anarchy of the formation game is bounded by a constant depending only on α, and independent of n. Moreover, an explicit upper bound is constructed when the edge cost is a simple rational (small numerator) less than 1/2.
    - A proof that for edge cost less than 1/2 the average vertex clustering coefficient grows at least as fast as a function depending only on α, while the overall edge density goes to zero at a rate inversely proportional to the number of vertices in the network.
    - Results establishing the intractability of even weakly approximating best response computations.
    Several of our results hold even for weaker notions of equilibrium, such as those based on link stability.
  • Publication
    Binders Unbound
    (2011-09-19) Weirich, Stephanie; Yorgey, Brent A; Sheard, Tim
    Implementors of compilers, program refactorers, theorem provers, proof checkers, and other systems that manipulate syntax know that dealing with name binding is difficult to do well. Operations such as α-equivalence and capture-avoiding substitution seem simple, yet subtle bugs often go undetected. Furthermore, their implementations are tedious, requiring boilerplate code that must be updated whenever the object language definition changes. Many researchers have therefore sought to specify binding syntax declaratively, so that tools can correctly handle the details behind the scenes. This idea has been the inspiration for many new systems (such as Beluga, Delphin, FreshML, FreshOCaml, Cαml, FreshLib, and Ott), but there is still room for improvement in expressivity, simplicity and convenience. In this paper, we present a new domain-specific language, UNBOUND, for specifying binding structure. Our language is particularly expressive: it supports multiple atom types, pattern binders, type annotations, recursive binders, and nested binding (necessary for telescopes, a feature found in dependently-typed languages). However, our specification language is also simple, consisting of just five basic combinators. We provide a formal semantics for this language derived from a locally nameless representation and prove that it satisfies a number of desirable properties. We also present an implementation of our binding specification language as a GHC Haskell library implementing an embedded domain-specific language (EDSL). By using Haskell type constructors to represent binding combinators, we implement the EDSL succinctly using datatype-generic programming. Our implementation supports a number of features necessary for practical programming, including flexibility in the treatment of user-defined types, best-effort name preservation (for error messages), and integration with Haskell's monad transformer library.