Khanna, Sanjeev

Search Results

Now showing 1 - 10 of 58
  • Publication
    Selection with Monotone Comparison Costs
    (2003-01-12) Kannan, Sampath; Khanna, Sanjeev
    We consider the problem of selecting the r-th smallest element from a list of n elements under a model where the comparisons may have different costs depending on the elements being compared. This model was introduced by [3] and is realistic in the context of comparisons between complex objects. An important special case of this general cost model is one where the comparison costs are monotone in the sizes of the elements being compared. This monotone cost model covers most "natural" cost models that arise, and the selection problem turns out to be the most challenging one among the usual problems for comparison-based algorithms. We present an O(log² n)-competitive algorithm for selection under the monotone cost model. This is in contrast to an Ω(n) lower bound that is known for arbitrary comparison costs. We also consider selection under a special case of monotone costs, the min model, where the cost of comparing two elements is the minimum of the sizes. We give a randomized O(1)-competitive algorithm for the min model.
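    As an illustration of the cost model only (not of the paper's competitive algorithm), the sketch below instruments an ordinary randomized quickselect with the "min" comparison cost, in which comparing two elements costs the smaller of their sizes; the function name and the assumption of distinct positive sizes are ours.

        import random

        def select_with_min_costs(items, r):
            """Return the r-th smallest item (1-indexed) and the total cost paid,
            where comparing x and y costs min(x, y). Assumes distinct sizes."""
            cost = 0

            def less_than(x, y):
                nonlocal cost
                cost += min(x, y)                  # the "min" comparison-cost model
                return x < y

            def quickselect(arr, k):
                pivot = random.choice(arr)
                lo, hi = [], []
                for x in arr:
                    if x != pivot:
                        (lo if less_than(x, pivot) else hi).append(x)
                if k <= len(lo):
                    return quickselect(lo, k)
                if k == len(lo) + 1:
                    return pivot
                return quickselect(hi, k - len(lo) - 1)

            return quickselect(list(items), r), cost

        sizes = random.sample(range(1, 10_000), 100)
        value, paid = select_with_min_costs(sizes, r=10)
        print(value, paid)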
  • Publication
    Improved Approximation Results for Stochastic Knapsack Problems
    (2011-01-01) Bhalgat, Anand; Goel, Ashish; Khanna, Sanjeev
    In the stochastic knapsack problem, we are given a set of items, each associated with a probability distribution on sizes and a profit, and a knapsack of unit capacity. The size of an item is revealed as soon as it is inserted into the knapsack, and the goal is to design a policy that maximizes the expected profit of items that are successfully inserted into the knapsack. The stochastic knapsack problem is a natural generalization of the classical knapsack problem, and arises in many applications, including bandwidth allocation, budgeted learning, and scheduling. An adaptive policy for stochastic knapsack specifies the next item to be inserted based on the observed sizes of the items inserted thus far. An adaptive policy can have an exponentially large explicit description and is known to be PSPACE-hard to compute. The best known approximation for this problem is a (3 + ε)-approximation for any ε > 0. Our first main result is a relaxed PTAS (polynomial-time approximation scheme) for the adaptive policy; that is, for any ε > 0, we present a poly-time computable (1+ε)-approximate adaptive policy when the knapsack capacity is relaxed to 1+ε. At a high level, the proof is based on transforming an arbitrary collection of item size distributions into canonical item size distributions that admit a compact description. We then establish a coupling that shows a (1+ε)-approximation can be achieved for the original problem by a canonical policy that makes decisions at each step by observing events drawn from the sample space of canonical size distributions. Finally, we give a mechanism for approximating the optimal canonical policy. Our second main result is an (8/3 + ε)-approximate adaptive policy for any ε > 0 without relaxing the knapsack capacity, improving the earlier (3+ε)-approximation result. Interestingly, we obtain this result by using the PTAS described above. We establish an existential result that the optimal policy for the knapsack with capacity 1 can be folded to get a policy with expected profit 3·OPT/8 for a knapsack with capacity 1-ε, with capacity relaxed to 1 only for the first item inserted. We then use our PTAS result to compute a (1 + ε)-approximation to such a policy. Our techniques also yield a relaxed PTAS for non-adaptive policies. Finally, we show that our ideas can be extended to yield improved approximation guarantees for multidimensional and fixed-set variants of the stochastic knapsack problem.
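    To make the notion of an adaptive policy concrete, here is a toy simulation (our own construction, not the (1+ε)- or (8/3+ε)-approximate policies of the paper): each item's size is drawn from its distribution only when the item is inserted, and the decision rule may consult the capacity remaining so far.

        import random

        def adaptive_rule(available, remaining):
            """Toy decision rule: among the remaining items, prefer the best
            profit-per-expected-size ratio whose mean size still fits; otherwise
            fall back to the best ratio overall."""
            fitting = [it for it in available if it[1] <= remaining]
            pool = fitting if fitting else available
            return max(pool, key=lambda it: it[0] / it[1])

        def simulate(items, rule, capacity=1.0, trials=20_000):
            """items: list of (profit, mean_size, size_sampler). Profit counts only
            for items that fit; insertion stops at the first overflow."""
            total = 0.0
            for _ in range(trials):
                available, remaining, gained = list(items), capacity, 0.0
                while available and remaining > 0:
                    item = rule(available, remaining)       # adaptivity happens here
                    available.remove(item)
                    profit, _, sampler = item
                    size = sampler()                        # size revealed on insertion
                    if size <= remaining:
                        gained += profit
                        remaining -= size
                    else:
                        break
                total += gained
            return total / trials

        items = [
            (1.0, 0.50, lambda: random.uniform(0.0, 1.0)),
            (0.6, 0.30, lambda: random.choice([0.1, 0.5])),
            (0.4, 0.20, lambda: 0.2),
        ]
        print(simulate(items, adaptive_rule))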
  • Publication
    A PTAS for Minimizing Average Weighted Completion Time With Release Dates on Uniformly Related Machines
    (2000-01-01) Chekuri, Chandra; Khanna, Sanjeev
    A classical scheduling problem is to find schedules that minimize the average weighted completion time of jobs with release dates. When multiple machines are available, the machine environments may range from identical machines (the processing time required by a job is invariant across the machines) at one end, to unrelated machines (the processing time required by a job on any machine is an arbitrary function of the specific machine) at the other end of the spectrum. While the problem is strongly NP-hard even in the case of a single machine, constant-factor approximation algorithms have been known for even the most general machine environment of unrelated machines. Recently, a polynomial-time approximation scheme (PTAS) was discovered for the case of identical parallel machines [1]. In contrast, it is known that this problem is MAX SNP-hard for unrelated machines [10]. An important open problem is to determine the approximability of the intermediate case of uniformly related machines, where each machine i has a speed s_i and it takes p/s_i time to execute a job of processing size p. In this paper, we resolve this problem by obtaining a PTAS for the problem. This improves the earlier known ratio of (2 + ε) for the problem.
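    The objective and the uniformly-related-machine model are easy to state in code; the scorer below (our own illustration, not the PTAS) evaluates a given non-preemptive schedule, charging a job of size p on machine i a processing time of p/s_i and making it wait for its release date.

        def avg_weighted_completion_time(schedules, speeds):
            """schedules[i]: ordered (release, size, weight) jobs run on machine i,
            back to back and without preemption; speeds[i] is machine i's speed."""
            total, count = 0.0, 0
            for jobs, speed in zip(schedules, speeds):
                t = 0.0
                for release, size, weight in jobs:
                    t = max(t, release) + size / speed   # wait for release, then run
                    total += weight * t
                    count += 1
            return total / count if count else 0.0

        # Two machines with speeds 1 and 2; the same job size runs twice as fast
        # on the second machine.
        schedules = [
            [(0.0, 4.0, 1.0)],                   # completes at 4.0
            [(0.0, 4.0, 2.0), (1.0, 2.0, 1.0)],  # completes at 2.0 and 3.0
        ]
        print(avg_weighted_completion_time(schedules, speeds=[1.0, 2.0]))  # (4+4+3)/3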
  • Publication
    Perfect Matchings in O(n log n) Time in Regular Bipartite Graphs
    (2010-06-01) Goel, Ashish; Kapralov, Michael; Khanna, Sanjeev
    In this paper we consider the well-studied problem of finding a perfect matching in a d-regular bipartite graph on 2n nodes with m = nd edges. The best-known algorithm for general bipartite graphs (due to Hopcroft and Karp) takes time O(m√n). In regular bipartite graphs, however, a matching is known to be computable in O(m) time (due to Cole, Ost, and Schirra). In a recent line of work by Goel, Kapralov, and Khanna, the O(m) time bound was improved first to Õ(min{m, n^2.5/d}) and then to Õ(min{m, n²/d}). In this paper, we give a randomized algorithm that finds a perfect matching in a d-regular bipartite graph and runs in O(n log n) time (both in expectation and with high probability). The algorithm performs an appropriately truncated alternating random walk to successively find augmenting paths. Our algorithm may be viewed as using adaptive uniform sampling, and is thus able to bypass the limitations of (nonadaptive) uniform sampling established in earlier work. Our techniques also give an algorithm that successively finds a matching in the support of a doubly stochastic matrix in expected time O(n log² n), with O(m) pre-processing time; this gives a simple O(m + mn log² n) time algorithm for finding the Birkhoff-von Neumann decomposition of a doubly stochastic matrix. We show that randomization is crucial for obtaining o(nd) time algorithms by establishing an Ω(nd) lower bound for deterministic algorithms. We also show that there does not exist a randomized algorithm that finds a matching in a regular bipartite multigraph and takes o(n log n) time with high probability.
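    The core mechanic of random-walk augmentation can be sketched in a few lines; the routine below (our simplification, without the truncation schedule and restarts that give the O(n log n) bound) walks forward along a random edge and backward along matched edges until it hits a free right vertex, then flips the alternating path obtained by keeping only the last departure from each left vertex.

        import random

        def regular_bipartite_matching(adj, max_walk=100_000):
            """adj[u] lists the right-neighbours of left vertex u in a d-regular
            bipartite graph with left and right vertices 0..n-1. Returns a dict
            mapping each right vertex to its matched left vertex."""
            n = len(adj)
            match_r, match_l = {}, {}                 # right -> left, left -> right
            for start in range(n):
                while start not in match_l:           # retry until `start` is matched
                    u, path = start, []
                    for _ in range(max_walk):
                        v = random.choice(adj[u])     # random edge forward
                        path.append((u, v))
                        if v not in match_r:          # free right vertex: augment
                            last = {}
                            for a, b in path:         # keep only the last departure
                                last[a] = b           # from each left vertex
                            x = start
                            while True:               # flip the alternating path
                                y = last[x]
                                prev = match_r.get(y)
                                match_l[x], match_r[y] = y, x
                                if prev is None:
                                    break
                                x = prev
                            break
                        u = match_r[v]                # back along the matched edge
            return match_r

        n = 6
        adj = [[i, (i + 1) % n] for i in range(n)]    # a 2-regular bipartite graph
        print(regular_bipartite_matching(adj))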
  • Publication
    Reconstructing Strings from Random Traces
    (2004-01-11) Kannan, Sampath; Batu, Tugkan; Khanna, Sanjeev; McGregor, Andrew
    We are given a collection of m random subsequences (traces) of a string t of length n, where each trace is obtained by deleting each bit in the string with probability q. Our goal is to exactly reconstruct the string t from these observed traces. We initiate here a study of deletion rates for which we can successfully reconstruct the original string using a small number of samples. We investigate a simple reconstruction algorithm called Bitwise Majority Alignment that uses majority voting (with suitable shifts) to determine each bit of the original string. We show that for random strings t, we can reconstruct the original string (w.h.p.) for q = O(1/log n) using only O(log n) samples. For arbitrary strings t, we show that a simple modification of Bitwise Majority Alignment reconstructs a string that has identical structure to the original string (w.h.p.) for q = O(1/n^{1/2+ε}) using O(1) samples. In this case, using O(n log n) samples, we can reconstruct the original string exactly. Our setting can be viewed as the study of an idealized biological evolutionary process where the only possible mutations are random deletions. Our goal is to understand at what mutation rates a small number of observed samples can be correctly aligned to reconstruct the parent string. In the process of establishing these results, we show that Bitwise Majority Alignment has an interesting self-correcting property whereby local distortions in the traces do not generate errors in the reconstruction and eventually get corrected.
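    The voting rule is simple enough to state directly in code. The sketch below follows the description above (one cursor per trace; emit the majority bit and advance only the agreeing cursors) but is our own simplified rendering, and the deletion-channel simulator and parameter choices are illustrative.

        import random

        def bitwise_majority_alignment(traces, n):
            """Reconstruct an n-bit string from traces produced by a deletion channel.
            One cursor per trace; at each output position the majority bit under the
            cursors is emitted, and only cursors that agree with it advance."""
            cursors = [0] * len(traces)
            out = []
            for _ in range(n):
                votes = [t[c] for t, c in zip(traces, cursors) if c < len(t)]
                bit = '1' if votes.count('1') * 2 > len(votes) else '0'
                out.append(bit)
                for i, (t, c) in enumerate(zip(traces, cursors)):
                    if c < len(t) and t[c] == bit:
                        cursors[i] += 1
            return ''.join(out)

        def deletion_channel(s, q):
            """Delete each bit of s independently with probability q."""
            return ''.join(b for b in s if random.random() > q)

        original = ''.join(random.choice('01') for _ in range(200))
        traces = [deletion_channel(original, q=0.05) for _ in range(30)]
        print(bitwise_majority_alignment(traces, len(original)) == original)  # usually True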
  • Publication
    Algorithms for the Generalized Sorting Problem
    (2011-10-01) Kannan, Sampath; Huang, Zhiyi; Khanna, Sanjeev
    We study the generalized sorting problem where we are given a set of n elements to be sorted, but only a subset of all possible pairwise element comparisons is allowed. The goal is to determine the sorted order using the smallest possible number of allowed comparisons. The generalized sorting problem may be equivalently viewed as follows. Given an undirected graph G(V, E) where V is the set of elements to be sorted and E defines the set of allowed comparisons, adaptively find the smallest subset E' ⊆ E of edges to probe such that the directed graph induced by E' contains a Hamiltonian path. When G is a complete graph, we get the standard sorting problem, and it is well known that Θ(n log n) comparisons are necessary and sufficient. An extensively studied special case of the generalized sorting problem is the nuts and bolts problem, where the allowed comparison graph is a complete bipartite graph between two equal-size sets. It is known that for this special case also, there is a deterministic algorithm that sorts using Θ(n log n) comparisons. However, when the allowed comparison graph is arbitrary, to our knowledge, no bound better than the trivial O(n²) bound is known. Our main result is a randomized algorithm that sorts any allowed comparison graph using Õ(n^{3/2}) comparisons with high probability (provided the input is sortable). We also study the sorting problem in randomly generated allowed comparison graphs, and show that when the edge probability is p, Õ(min{n/p², n^{3/2}√p}) comparisons suffice on average to sort.
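    For concreteness, here is the trivial baseline against which the Õ(n^{3/2}) result is measured (our own sketch, not the paper's algorithm): probe every allowed edge, orient it by the comparison outcome, and check that the resulting directed graph has a Hamiltonian path by topologically sorting it.

        import random
        from graphlib import TopologicalSorter

        def sort_by_probing_everything(n, allowed_edges, probe):
            """allowed_edges: unordered pairs (u, v) that may be compared; probe(u, v)
            returns True iff u < v, and each call counts as one comparison."""
            preds = {v: set() for v in range(n)}      # v -> probed predecessors of v
            for u, v in allowed_edges:
                if probe(u, v):
                    preds[v].add(u)                   # edge oriented u -> v
                else:
                    preds[u].add(v)
            order = list(TopologicalSorter(preds).static_order())
            # Sortable iff consecutive elements of the order are joined by a probed
            # edge, i.e. the oriented graph contains a Hamiltonian path.
            for a, b in zip(order, order[1:]):
                if a not in preds[b]:
                    raise ValueError("allowed comparisons do not determine the order")
            return order

        hidden = random.sample(range(1000), 8)        # hidden total order on 8 items
        edges = [(i, j) for i in range(8) for j in range(i + 1, 8)]
        print(sort_by_probing_everything(8, edges, lambda u, v: hidden[u] < hidden[v]))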
  • Publication
    Approximation Schemes for Preemptive Weighted Flow Time
    (2002-05-19) Chekuri, Chandra; Khanna, Sanjeev
    We present the first approximation schemes for minimizing weighted flow time on a single machine with preemption. Our first result is an algorithm that computes a (1 + ε)-approximate solution for any instance of weighted flow time in n^{O(ln W ln P/ε³)} time; here P is the ratio of maximum job processing time to minimum job processing time, and W is the ratio of maximum job weight to minimum job weight. This result directly gives a quasi-PTAS for weighted flow time when P and W are poly-bounded, and a PTAS when they are both O(1). We strengthen the former result to show that in order to get a quasi-PTAS it suffices to have just one of P and W be poly-bounded. Our result provides strong evidence for the hypothesis that the weighted flow time problem has a PTAS. We note that the problem is strongly NP-hard even when P and W are O(1). We next consider two important special cases of weighted flow time, namely, when P is O(1) and W is arbitrary, and when the weight of a job is the inverse of its processing time, referred to as the stretch metric. For both of the above special cases we obtain a (1 + ε)-approximation for any ε > 0 by using a randomized partitioning scheme to reduce an arbitrary instance to several instances, all of which have P and W bounded by a constant that depends only on ε.
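    The quantities in this abstract are easy to pin down in code; the scorer below (our illustration, not the approximation scheme) evaluates the total weighted flow time of a given preemptive single-machine schedule with unit time slots, and reports the ratios P and W for the instance.

        def weighted_flow_time(jobs, slots):
            """jobs: name -> (release, processing_time, weight), all integers.
            slots[t] names the job run in time slot [t, t+1), or None if idle."""
            remaining = {name: p for name, (_, p, _) in jobs.items()}
            completion = {}
            for t, name in enumerate(slots):
                if name is None:
                    continue
                assert t >= jobs[name][0], "job scheduled before its release date"
                remaining[name] -= 1
                if remaining[name] == 0:
                    completion[name] = t + 1
            assert all(r == 0 for r in remaining.values()), "a job did not finish"
            return sum(w * (completion[name] - r) for name, (r, _, w) in jobs.items())

        jobs = {"a": (0, 3, 2), "b": (1, 1, 5)}                  # b preempts a
        print(weighted_flow_time(jobs, ["a", "b", "a", "a"]))    # 5*(2-1) + 2*(4-0) = 13
        ptimes = [p for _, p, _ in jobs.values()]
        weights = [w for _, _, w in jobs.values()]
        print(max(ptimes) / min(ptimes), max(weights) / min(weights))   # P = 3, W = 2.5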
  • Publication
    On Broadcast Disk Paging
    (1999-03-13) Khanna, Sanjeev; Liberatore, Vincenzo
    Broadcast disks are an emerging paradigm for massive data dissemination. In a broadcast disk, data is divided into n equal-sized pages, and pages are broadcast in a round-robin fashion by a server. Broadcast disks are effective because many clients can simultaneously retrieve any transmitted data. Paging is used by the clients to improve performance, much as in virtual memory systems. However, paging on broadcast disks differs from virtual memory paging in at least two fundamental aspects: (i) a page fault in the broadcast disk model has a variable cost that depends on the requested page as well as the current state of the broadcast, and (ii) prefetching is both natural and a provably essential mechanism for achieving significantly better competitive ratios in broadcast disk paging. In this paper, we design a deterministic algorithm that uses prefetching to achieve an O(n log k) competitive ratio for the broadcast disk paging problem, where k denotes the size of the client's cache. We also show a matching lower bound of Ω(n log k) that applies even when the adversary is not allowed to use prefetching. In contrast, we show that when prefetching is not allowed, no deterministic online algorithm can achieve a competitive ratio better than Ω(nk). Moreover, we show a lower bound of Ω(n log k) on the competitive ratio achievable by any nonprefetching randomized algorithm against an oblivious adversary. These lower bounds are trivially matched from above by known results about deterministic and randomized marking algorithms for paging. An interpretation of our results is that in broadcast disk paging, prefetching is a perfect substitute for randomization.
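    The variable fault cost is the distinctive feature of the model, and it is easy to simulate; the client below (our own toy, using plain LRU and no prefetching, hence not the paper's algorithm) simply waits for the requested page to come around on the broadcast and charges that wait as the fault cost.

        from collections import OrderedDict

        def broadcast_lru_wait(n, k, requests):
            """n pages broadcast round-robin (page t % n at time t), client cache of
            size k; returns total time spent waiting on faults under LRU."""
            cache = OrderedDict()                 # pages in LRU order
            clock, waited = 0, 0
            for page in requests:
                if page in cache:
                    cache.move_to_end(page)       # hit: no waiting
                    continue
                wait = (page - clock) % n         # time until `page` is broadcast next
                clock += wait
                waited += wait
                cache[page] = None
                if len(cache) > k:
                    cache.popitem(last=False)     # evict the least recently used page
            return waited

        print(broadcast_lru_wait(n=8, k=2, requests=[0, 5, 0, 5, 7, 1]))   # prints 9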
  • Publication
    Archiving Scientific Data
    (2002-06-04) Buneman, Peter; Khanna, Sanjeev; Tajima, Keishi; Tan, Wang-Chiew
    We present an archiving technique for hierarchical data with key structure. Our approach is based on the notion of timestamps whereby an element appearing in multiple versions of the database is stored only once along with a compact description of versions in which it appears. The basic idea of timestamping was discovered by Driscoll et al. in the context of persistent data structures where one wishes to track the sequences of changes made to a data structure. We extend this idea to develop an archiving tool for XML data that is capable of providing meaningful change descriptions and can also efficiently support a variety of basic functions concerning the evolution of data such as retrieval of any specific version from the archive and querying the temporal history of any element. This is in contrast to diff-based approaches where such operations may require undoing a large number of changes or significant reasoning with the deltas. Surprisingly, our archiving technique does not incur any significant space overhead when contrasted with other approaches. Our experimental results support this and also show that the compacted archive file interacts well with other compression techniques. Finally, another useful property of our approach is that the resulting archive is also in XML and hence can directly leverage existing XML tools.
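    The timestamping idea can be sketched with ordinary dictionaries standing in for keyed XML elements (the data model, helper names, and the use of explicit version sets rather than compact ranges are simplifications of ours): each node is stored once and annotated with the versions in which it appears, and any version can be read back out of the single merged archive.

        def add_version(archive, tree, version):
            """tree: key -> (value, children-dict). The archive mirrors that shape,
            but each node is (values-by-version, version-set, children-archive)."""
            for key, (value, children) in tree.items():
                if key not in archive:
                    archive[key] = ({}, set(), {})
                values, versions, child_archive = archive[key]
                values[version] = value          # the node's value as of this version
                versions.add(version)            # "timestamp": versions containing key
                add_version(child_archive, children, version)
            return archive

        def snapshot(archive, version):
            """Reconstruct a single version from the merged archive."""
            return {key: (values[version], snapshot(children, version))
                    for key, (values, versions, children) in archive.items()
                    if version in versions}

        v1 = {"paper": ("draft", {"author": ("Khanna", {})})}
        v2 = {"paper": ("final", {"author": ("Khanna", {}), "year": ("2002", {})})}
        archive = add_version(add_version({}, v1, 1), v2, 2)
        print(snapshot(archive, 1) == v1)        # True: version 1 is recoverable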
  • Publication
    Why and Where: A Characterization of Data Provenance
    (2001-01-01) Buneman, Peter; Khanna, Sanjeev; Tan, Wang-Chiew
    With the proliferation of database views and curated databases, the issue of data provenance - where a piece of data came from and the process by which it arrived in the database - is becoming increasingly important, especially in scientific databases where understanding provenance is crucial to the accuracy and currency of data. In this paper we describe an approach to computing provenance when the data of interest has been created by a database query. We adopt a syntactic approach and present results for a general data model that applies to relational databases as well as to hierarchical data such as XML. A novel aspect of our work is a distinction between "why" provenance (which refers to the source data that had some influence on the existence of the data) and "where" provenance (which refers to the location(s) in the source databases from which the data was extracted).
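    The why/where distinction can be illustrated on a single select-project query over a flat relation (the relation, field names, and output format below are invented for illustration; the paper's model is more general and hierarchical): why-provenance records which source tuples account for an output tuple, while where-provenance records the exact source cell each output value was copied from.

        def select_project(rows, predicate, columns):
            """Run a select-project query over `rows` (a list of dicts) and attach
            why- and where-provenance to every output tuple."""
            results = []
            for i, row in enumerate(rows):
                if predicate(row):
                    results.append({
                        "tuple": {c: row[c] for c in columns},
                        "why": {i},                                  # contributing source tuples
                        "where": {c: (i, c) for c in columns},       # (source row, source field)
                    })
            return results

        employees = [
            {"name": "Ada", "dept": "CS", "salary": 100},
            {"name": "Bob", "dept": "EE", "salary": 90},
        ]
        for out in select_project(employees, lambda t: t["dept"] == "CS", ["name", "salary"]):
            print(out)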