Overview

Sensor-based observing systems face significant challenges in design and operation, chief among them the heterogeneity of their instrumentation and the complexity of processing their data streams. These systems incorporate instruments from across the spectrum of complexity, from temperature sensors to acoustic Doppler current profilers to streaming video cameras, and they typically require integrating instruments from various vendors and R&D labs. The supporting cyberinfrastructure must therefore provide scalable and secure real-time data acquisition, instrument and data stream management, and analysis and visualization. Many projects address these requirements by building custom systems that are inevitably complex and difficult to support. Others buy into a proprietary commercial system. Either approach often sacrifices extensibility, scalability, and interoperability. DataTurbine was developed to address the challenges of building and managing sensor-based observing systems.

DataTurbine is a robust real-time streaming data engine. It is free, open-source middleware supported by NSF, NASA, and private industry. It has been tested in a variety of real-world streaming data applications, from civil engineering to environmental monitoring to autonomous vehicles. The DataTurbine middleware satisfies a core set of infrastructure requirements that are common in sensor-based systems, including reliable data transport, a framework for integrating heterogeneous instruments, and a comprehensive suite of services for data management, routing, synchronization, monitoring, and visualization.

From the perspective of distributed systems, the DataTurbine middleware is a "black box" to which applications and devices send data and from which they receive it. DataTurbine handles all data management operations between data sources and sinks, including reliable transport, routing, scheduling, and security. It accomplishes this by combining flexible network bus objects with memory- and file-based ring buffers. Network bus objects perform data stream multiplexing and routing. Ring buffers provide tunable persistent storage at key network nodes to facilitate reliable data transport.

An active developer and user community continues to evolve the software and assist in application development. An overview of DataTurbine is provided in our e-Science 2007 paper.
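The pairing of network bus objects and ring buffers described above can be illustrated with a small conceptual sketch. This is not DataTurbine's code or API: the `RingBuffer` and `Bus` classes, their method names, and the channel names are all hypothetical, and the buffer is memory-only. The sketch assumes that each channel keeps a fixed number of timestamped frames, that the oldest frame is overwritten once the buffer is full, and that sinks request data by time range.

```python
from collections import deque


class RingBuffer:
    """Fixed-capacity store of (timestamp, value) frames (hypothetical)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest frame when full.
        self._frames = deque(maxlen=capacity)

    def put(self, timestamp, value):
        self._frames.append((timestamp, value))

    def fetch(self, start, duration):
        """Return frames with start <= timestamp < start + duration."""
        end = start + duration
        return [(t, v) for t, v in self._frames if start <= t < end]


class Bus:
    """Hypothetical stand-in for a bus object: multiplexes frames by channel."""

    def __init__(self, capacity):
        self._capacity = capacity  # the "tunable" size of each channel's buffer
        self._channels = {}

    def put(self, channel, timestamp, value):
        # Sources write to a named channel; a buffer is created on first use.
        if channel not in self._channels:
            self._channels[channel] = RingBuffer(self._capacity)
        self._channels[channel].put(timestamp, value)

    def fetch(self, channel, start, duration):
        # Sinks read from the buffer, never directly from the source.
        ring = self._channels.get(channel)
        return ring.fetch(start, duration) if ring is not None else []


bus = Bus(capacity=3)
for t in range(5):
    bus.put("buoy/temperature", t, 20.0 + t)  # five frames into three slots
bus.put("buoy/current", 4, 0.35)

print(bus.fetch("buoy/temperature", 0, 10))  # only the three newest frames remain
print(bus.fetch("buoy/current", 0, 10))
```

Because sinks read from the buffer rather than from the instrument, a sink that connects late or reconnects after a dropout can still retrieve recent data, up to the configured capacity. That decoupling is the property the ring buffers are meant to provide.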

 
