Data Management

Oceans of data are generated every day by businesses and enterprises, and all of it must be prioritized, analyzed, and safeguarded with the right architecture, tools, policies, and procedures. TechRepublic provides the resources you need.

  • White Papers // Aug 2009

    Laconic Schema Mappings: Computing the Core With SQL Queries

    A schema mapping is a declarative specification of the relationship between instances of a source schema and a target schema. The data exchange (or data translation) problem asks: given an instance over the source schema, materialize an instance (or solution) over the target schema that satisfies the schema mapping. In...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Streams on Wires - A Query Compiler for FPGAs

    Taking advantage of many-core, heterogeneous hardware for data processing tasks is a difficult problem. In this paper, the authors consider the use of FPGAs for data stream processing as co-processors in many-core architectures. The authors present Glacier, a component library and compositional compiler that transforms continuous queries into logic circuits...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    XPEDIA: XML Processing for Data Integration

    Data Integration engines increasingly need to provide sophisticated processing options for XML data. In the past, it was adequate for these engines to support basic shredding and XML generation capabilities. However, with the steady growth of XML in applications and databases, integration platforms need to provide more direct operations on...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Stop Word and Related Problems in Web Interface Integration

    The goal of recent research projects on integrating Web databases has been to enable uniform access to the large amount of data behind query interfaces. Among the tasks addressed are: source discovery, query interface extraction, schema matching, etc. There are also a number of tasks that are commonly ignored or...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    An Evaluation of Checkpoint Recovery for Massively Multi-Player Online Games

    Massively Multiplayer Online games (MMOs) have emerged as an exciting new class of applications for database technology. MMOs simulate long-lived, interactive virtual worlds, which proceed by applying updates in frames or ticks, typically at 30 or 60 Hz. In order to sustain the resulting high update rates of such games,...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Probabilistic Histograms for Probabilistic Data

    There is a growing realization that modern DataBase Management Systems (DBMSs) must be able to manage data that contains uncertainties that are represented in the form of probabilistic relations. Consequently, the design of each core DBMS component must be revisited in the presence of uncertain and probabilistic information. In this...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Discovering Relative Importance of Skyline Attributes

    Querying databases with preferences is an important research problem. Among various approaches to querying with preferences, the skyline framework is one of the most popular. A well known deficiency of that framework is that all attributes are of the same importance in skyline preference relations. Consequently, the size of the...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Truth Discovery and Copying Detection in a Dynamic World

    Modern information management applications often require integrating data from a variety of data sources, some of which may copy or buy data from other sources. When these data sources model a dynamically changing world (e.g., people's contact information changes over time, restaurants open and go out of business), sources often...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    GConnect: A Connectivity Index for Massive Disk-Resident Graphs

    The problem of connectivity is an extremely important one in the context of massive graphs. In many large communication networks, social networks and other graphs, it is desirable to determine the minimum-cut between any pair of nodes. The problem is well solved in the classical literature, since it is related...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    iNextCube: Information Network-Enhanced Text Cube

    Nowadays, most business, administration, and/or scientific databases contain both structured attributes and text attributes. The authors call a database that consists of both multi-dimensional structured data and narrative text data a multidimensional text database. Searching, OLAP, and mining such databases pose many research challenges. To enhance the power of data...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Modeling and Querying Possible Repairs in Duplicate Detection

    One of the most prominent data quality problems is the existence of duplicate records. Current duplicate elimination procedures usually produce one clean instance (repair) of the input data, by carefully choosing the parameters of the duplicate detection algorithms. Finding the right parameter settings can be hard, and in many cases,...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Column-Oriented Database Systems

    Column-oriented database systems (column-stores) have attracted a lot of attention in the past few years. Column-stores, in a nutshell, store each database table column separately, with attribute values belonging to the same column stored contiguously, compressed, and densely packed, as opposed to traditional database systems that store entire records (rows)...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    DataCell: Building a Data Stream Engine on Top of a Relational Database Kernel

    Stream applications gained significant popularity in recent years, which led to the development of specialized data stream engines. They often have been designed from scratch and are tuned towards the specific requirements posed by their initial target applications, e.g., network monitoring and financial services. However, this also meant that they lack...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Full-Fidelity Flexible Object-Oriented XML Access

    Developers need to programmatically access persistent XML data. Object-oriented access is often the preferred method. Translating XML data into objects or vice-versa is a hard problem due to the data model mismatch and the difficulty of query translation. The authors propose a framework that addresses this problem by transforming object-based...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Cooperative Update Exchange in the Youtopia System

    Youtopia is a platform for collaborative management and integration of relational data. At the heart of Youtopia is an update exchange abstraction: changes to the data propagate through the system to satisfy user-specified mappings. The authors present a novel change propagation model that combines a deterministic chase with human intervention....

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Workload Aware Indexing of Continuously Moving Objects

    The increased deployment of sensors and data communication networks yields data management workloads with update loads that are intense, skewed, and highly bursty. Query loads resulting from location-based services are expected to exhibit similar characteristics. In such environments, index structures can easily become performance bottlenecks. The authors address the need...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Tagging Stream Data for Rich Real-Time Services

    In recent years, data streams have become ubiquitous as technology is improving and the prices of portable devices are falling, e.g., sensor networks, location-based services. Most data streams transmit only data tuples based on which continuous queries are evaluated. In this paper, the authors propose to enrich data streams with...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Effectively Indexing Uncertain Moving Objects for Predictive Queries

    Moving object indexing and query processing is a well studied research topic, with applications in areas such as intelligent transport systems and location-based services. While much existing work explicitly or implicitly assumes a deterministic object movement model, real-world objects often move in more complex and stochastic ways. This paper investigates...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Query Mesh: Multi-Route Query Processing Technology

    The authors propose to demonstrate a practical alternative approach to the current state-of-the-art query processing techniques, called the "Query Mesh" (or QM, for short). The main idea of QM is to compute multiple routes (i.e., query plans), each designed for a particular subset of data with distinct statistical properties. Based...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Exact Cardinality Query Optimization for Optimizer Testing

    The accuracy of cardinality estimates is crucial for obtaining a good query execution plan. Today's optimizers make several simplifying assumptions during cardinality estimation that can lead to large errors and hence poor plans. In a scenario such as query optimizer testing it is very desirable to obtain the "Best" plan,...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Managing Massive Time Series Streams with Multi-Scale Compressed Trickles

    The authors present Cypress, a novel framework to archive and query massive time series streams such as those generated by sensor networks, data centers, and scientific computing. Cypress applies multi-scale analysis to decompose time series and to obtain sparse representations in various domains (e.g. frequency domain and time domain). Relying...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Declarative Management in Microsoft SQL Server

    This paper describes the principles and practice of Declarative Management - a new approach to the management of database systems. The standard approach to database systems management involves a brittle coupling of interactive operations and procedural scripts. Such an ad hoc approach results in incorrect administration, which leads to increased management...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Efficient Outer Join Data Skew Handling in Parallel DBMS

    Large enterprises have been relying on Parallel DataBase Management Systems (PDBMS) to process their ever-increasing data volume and complex queries. The scalability and performance of a PDBMS comes from load balancing on all nodes in the system. Skewed processing will significantly slow down query response time and degrade the overall...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Path Oracles for Spatial Networks

    The advent of location-based services has led to an increased demand for performing operations on spatial networks in real time. The challenge lies in being able to cast operations on spatial networks in terms of relational operators so that they can be performed in the context of a database. A...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    On Chase Termination Beyond Stratification

    The authors study the termination problem of the chase algorithm, a central tool in various database problems such as the constraint implication problem, Conjunctive Query optimization, rewriting queries using views, data exchange, and data integration. The basic idea of the chase is, given a database instance and a set of...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Mining Document Collections to Facilitate Accurate Approximate Entity Matching

    Many entity extraction techniques leverage large reference entity tables to identify entities in documents. Often, an entity is referenced in document collections differently from that in the reference entity tables. Therefore, the authors study the problem of determining whether or not a substring "approximately" matches with a reference entity. Similarity...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Reference-Based Alignment in Large Sequence Databases

    This paper introduces a novel method, called Reference-Based String Alignment (RBSA), that speeds up retrieval of optimal subsequence matches in large databases of sequences under the edit distance and the Smith-Waterman similarity measure. RBSA operates using the assumption that the optimal match deviates by a relatively small amount from the...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Predictable Performance for Unpredictable Workloads

    This paper introduces Crescando: a scalable, distributed relational table implementation designed to perform large numbers of queries and updates with guaranteed access latency and data freshness. To this end, Crescando leverages a number of modern query processing techniques and hardware trends. Specifically, Crescando is based on parallel, collaborative scans in...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    SIMD-Scan: Ultra Fast In-Memory Table Scan Using On-Chip Vector Processing Units

    The availability of huge system memory, even on standard servers, generated a lot of interest in main memory database engines. In data warehouse systems, highly compressed column-oriented data structures are quite prominent. In order to scale with the data volume and the system load, many of these systems are highly...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Lazy Updates: An Efficient Technique to Continuously Monitoring Reverse k-NN

    In this paper, the authors study the problem of continuous monitoring of reverse k nearest neighbor queries. Existing continuous reverse nearest neighbor monitoring techniques are sensitive to the movement of objects and queries. For example, the results of a query are to be recomputed whenever the query changes its location. They present...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Adaptively Parallelizing Distributed Range Queries

    The authors consider the problem of how to best parallelize range queries in a massive scale distributed database. In traditional systems the focus has been on maximizing parallelism, for example by laying out data to achieve the highest throughput. However, in a massive scale database such as the authors' PNUTS...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Synergy-Based Workload Management

    Workload management aims at the efficient execution of queries on a database. In this paper, scheduling plays a crucial role. A vast number of scheduling approaches have been developed, most of them belonging to one of two categories: analysis and monitoring. However, they mainly either focus only on one possible...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Linkage Query Writer

    The authors present Linkage Query Writer (LinQuer), a system for generating SQL queries for semantic link discovery over relational data. The LinQuer framework consists of LinQL, a language for specification of linkage requirements; a web interface and an API for translating LinQL queries to standard SQL queries; an interface that...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    Tuning Database Configuration Parameters With iTuned

    Database systems have a large number of configuration parameters that control memory distribution, I/O optimization, costing of query plans, parallelism, many aspects of logging, recovery, and other behavior. Regular users and even expert database administrators struggle to tune these parameters for good performance. The wave of research on improving database...

    Provided By VLDB Endowment

  • White Papers // Aug 2009

    MCC-DB: Minimizing Cache Conflicts in Multi-Core Processors for Databases

    In a typical commercial multi-core processor, the Last Level Cache (LLC) is shared by two or more cores. Existing studies have shown that the shared LLC is beneficial to concurrent query processes with commonly shared data sets. However, the shared LLC can also be a performance bottleneck to concurrent queries,...

    Provided By VLD Digital

  • White Papers // Aug 2009

    Break The Golden Rule

    It's not hard for most people to recall a time when they let someone down by inaccurately anticipating that person's preferences or expectations. Whether it's the wrong birthday gift for a spouse or sales approach for a customer, the culprit is often thinking that what one would want for...

    Provided By American Express

  • White Papers // Aug 2009

    A Computational Framework for Certificate Policy Operations

    The trustworthiness of any Public Key Infrastructure (PKI) rests upon the expectations for trust, and the degree to which those expectations are met. Policies, whether implicit as in PGP and SDSI/SPKI or explicitly required as in X.509, document expectations for trust in a PKI. The widespread use of X.509 in...

    Provided By Dartmouth College

  • Webcasts // Aug 2009

    Don't Gamble With Your Recovery: Protect More and Store Less (Part 4 of 4)

    Backup vs. archiving? Recovery vs. discovery? Which protects the data and which keeps the legal department off your back? This webcast looks at backup, recovery, and archiving. The presenter will discuss the business needs driving these areas of IT, as well as technical differentiation to demonstrate which information management solutions...

    Provided By Symantec

  • Case Studies // Aug 2009

    IBM Reduces Cost and Supports Growth With a Consolidated Solution for Wells' Dairy

    Founded in 1913 and based in Le Mars, Iowa, Wells' Dairy, Inc., is the largest family-owned and managed dairy processor in the United States, known for their popular BLUE BUNNY ice cream and novelties. Wells' Dairy was looking for an infrastructure refresh to improve performance while providing operating system flexibility,...

    Provided By IBM

  • White Papers // Aug 2009

    Generic Interactive Natural Language Interface to Databases (GINLIDB)

    To overcome the complexity of SQL and to make it easier for non-specialists (not SQL professionals) to manipulate data in databases, many researchers have turned to natural language instead of SQL. The idea of using natural language instead of SQL has prompted the development of a new type of...

    Provided By NORTH ATLANTIC UNIVERSITY UNION