First Open Source DataTurbine Workshop Report

October 7, 2008,
8:00 am - 5:30 pm
La Jolla room,
La Jolla Shores Hotel,
San Diego.
- 8:00 am - 9:00 am breakfast
- Welcome, Introduction, and Overview -- Tony Fountain, UCSD [Slides]
- The Open Source DataTurbine (OSDT) Initiative is an international community of scientists and engineers who share a common interest in real-time streaming data middleware and applications (www.dataturbine.org). Community members are drawn from academia and industry, and represent a variety of science and engineering domains, from ecology to aerospace. The technology base of the OSDT Initiative is the DataTurbine open source software. On 7 October 2008, the First Open Source DataTurbine (OSDT) Initiative workshop was held in La Jolla, CA. This workshop celebrated the one-year anniversary of the Initiative as an open source software community. It provided a forum for presentations and discussions on the current status and future plans for the OSDT Initiative. The workshop was sponsored by National Science Foundation award number OCI-0722067. There were 25 workshop participants.
- The Origins of DataTurbine -- Matt Miller, Erigo Technologies [Slides]
- The primary goal of this discussion is not so much to provide a history lesson, but to show the key and unique features of RBNB via a discussion of how and why it came into being. With this understanding, current and future users may better understand and leverage its capabilities. This discussion is from the perspective of the author and primary inventor of RBNB, and shows the experience and thought process leading up to the RBNB invention.
- DataTurbine Activities in GLEON, CREON, and MoveBank -- Sameer Tilak, UCSD [Slides]
- Since its inception, Open Source DataTurbine Initiative team members have been involved in providing streaming data middleware components to a number of observing system domains and communities. The Global Lake Ecological Observatory Network (GLEON) is a grassroots network of limnologists, information technology experts, and engineers who have a common goal of building a scalable, persistent network of lake ecology observatories to improve understanding of lake processes at local, regional, continental, and global scales. The Coral Reef Environmental Observatory Network (CREON) is a collaborating association of scientists and engineers from around the world striving to design and build marine sensor networks. MoveBank is a grassroots network of animal tracking researchers collecting data using a variety of collection methodologies, including satellite-based tags, radio transmitters, and camera traps. In this talk we give a quick overview of the GLEON, CREON, and MoveBank communities, our experiences with DataTurbine deployments at various GLEON and CREON sites over the past year, and our plans for DataTurbine deployment for managing live animal tracking data in MoveBank.
- DataTurbine Activities at NASA Dryden -- Larry Freudinger, NASA [Slides]
- NASA's Global Test Range Development Laboratory at NASA Dryden serves the airborne science and aeronautics research community. Online, near realtime network computing infrastructure is enabled on the ground with an extensible hierarchy of a half dozen DataTurbine servers supporting acquisition, transport, processing, and display functions for multiple simultaneous aircraft and missions. DataTurbine servers fill similar roles on two airborne laboratory aircraft that carry research teams in addition to instrumentation. This overview brief included a video clip showing a recent Google Earth mash-up with ad hoc display applications enabled through DataTurbine.
- DataTurbine Activities in Connecticut Bridge Monitoring Program -- Richard Christenson, University of Connecticut [Slides] [Video1] [Video2] [Video3]
- Bridge monitoring in Connecticut is a combined effort between the University of Connecticut and Connecticut Department of Transportation. This program of short and long term monitoring currently has a network of six bridges with long-term monitoring systems. DataTurbine meets a need to provide fully automated continuous monitoring from remote locations and can be used to effectively convey the results of bridge monitoring to the end user. DataTurbine, streaming data and video, is currently being installed on two of the highway bridges in Connecticut.
- Applications of the DataTurbine at Creare Inc. -- Bill Finger, Creare Inc. [Slides] [Video]
- Creare continues to leverage their RBNB(TM) DataTurbine software in a wide variety of application areas. The presentation describes three current activities. The first is distributed battlefield surveillance using web cameras and the DataTurbine as the backbone of a hierarchical intelligent detection network. The second application is real-time monitoring of aircraft flight tracks, for use in airborne science. Lastly, the presentation describes Creare's Flight Test Assistant, a software and hardware integrated system for use by aircraft test pilots to reduce the cost of instrumenting test aircraft and improve in-flight productivity.
- DataTurbine Activities in CUAHSI -- Ilya Zaslavsky, UCSD [Slides]
- The Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) is an organization representing 120+ universities across the country and internationally. As part of its mission, CUAHSI supports the development of cyberinfrastructure for the hydrologic sciences. In particular, the CUAHSI HIS (Hydrologic Information System) project has focused on consistent management of observations data available from several federal agencies (USGS, EPA, USDA, NOAA, etc.) as well as data published by individual investigators. Management of real-time data is an important component of the project, in particular support for publication and analysis of real-time observational data generated at NSF-supported hydrologic observatory test beds, and their integration with datasets from other agencies. Experimental deployments showing DataTurbine integration with hydrologic cyberinfrastructure have been demonstrated at several recent CUAHSI and WATERS meetings and other conferences (Baltimore in Oct 2007, AGU Fall 2007, Boulder in July 2008, Chicago in Oct 2007). The demonstrations included an online map interface called DASH (Data Access System for Hydrology) configured to access several observation networks for the San Diego area, including real-time stations. Data streams from these stations were ingested into the CUAHSI Observations Data Model (ODM) and made available for querying via CUAHSI WaterML-compliant web services. As part of further development, we envision DataTurbine servers having a WaterML-based API and supporting GetSites, GetSiteInfo, GetVariableInfo, and GetValues methods. This will streamline integration of real-time hydrologic data sources with observational data from many federal, state, and local sources already registered in CUAHSI HIS.
- DataTurbine Activities and UCSD Smart buildings initiative -- Jan Kleissl, UCSD [Slides]
- Demand response (DR) describes control methodologies that enhance load shedding or load shifting during times when the electric grid is near its capacity, e.g., coincident with a drop in renewable energy production. Due to their large electric load, thermal storage opportunities, and automation, Heating, Ventilating, and Air Conditioning (HVAC) systems are ideal for DR programs. Coupling HVAC control to dense building cyberinfrastructure and heterogeneous outdoor monitoring networks will be achieved with the DataTurbine. Event detection algorithms will be applied to trigger HVAC control loops for DR.
- DataTurbine Activities in REAP and Kepler -- Derik Barseghian, NCEAS [Slides]
- The REAP project is focused on creating technology in which scientific workflow tools can be used to access, monitor, analyze, and present information from field-deployed sensor networks, for both the oceanic and terrestrial environments, and across multiple spatiotemporal scales. Initial development for a terrestrial use case uses DataTurbine and the scientific workflow software Kepler. In this use case, Kepler workflows are used to develop and test models exploring the impacts of abiotic factors (real-time light, temperature, and rainfall measurements) on the dynamics of plant host populations and their susceptibility to viral pathogens. REAP has developed a DataTurbine Source program to parse and push data into DataTurbine from our remote weather station, and within Kepler a DataTurbine Sink has been developed in the form of a Kepler actor (workflow component), giving workflow authors a versatile means of requesting and retrieving data from DataTurbine servers. Additional work in these areas is planned, including improvement of the Kepler DataTurbine actor, additional and improved workflows, and the deployment of an on-site data buffer in the form of a DataTurbine server running on a low-power computer at the weather station.
- DataTurbine Activities in COMET -- Quinn Hart, UCD [Slides]
- The Coast to Mountain Environmental Transect (COMET) is a practical cyberinfrastructure prototype to facilitate the study of how environmental factors like climatic variability affect ecosystems along an elevation gradient from the coast of California to the summit of the Sierra Nevada. COMET is particularly interested in GIS data and standards, including Web Map Server (WMS), Web Feature Server (WFS), Web Coverage Server (WCS), and the Sensor Observation Service (SOS). We are particularly interested in DataTurbine as a backend for sensor observations, since in our experience most SOS implementations built on traditional database solutions suffer from performance and maintenance issues. Preliminary experiments with DataTurbine show that it might be a robust and effective backend for sensor data. Since DataTurbine offers limited default accessibility to the data stream, application-specific layering over DataTurbine needs to be investigated.
- DataTurbine Activities in PISCO -- Chris Jones, UCSB [Slides]
- Researchers at the Hawaii Ocean Observing System and at the Partnership for Interdisciplinary Studies of Coastal Oceans have been exploring the use of near real-time data acquisition from oceanographic sensor arrays. A prototype system of the Open Source DataTurbine has been deployed at the Kilo Nalu Observatory off the coast of Honolulu, and streams oceanographic data including ocean currents, temperature, pressure, wave spectra, and water quality characteristics. Shore-side client applications archive, process, and display the data in near real-time, producing web-based graphics and summaries that provide public information on the coastal environment.
- DataTurbine Activities at NCHC, Kenting Coral Reef, Floodgrid -- Hsiu-Mei Chou, NCHC [Slides]
- DataTurbine has been adapted in the Kenting Coral Reefs Observation project to stream real-time video from the monitoring site to researchers' desktops. Video streams are shared with co-developers through DataTurbine's routing mechanism and displayed on the Tiled Display Wall at UCSD/CALIT2. For the Flood Mitigation project, we adapted DataTurbine as a data buffer for retrieving historical data from storage facilities.
- DataTurbine and Semantic Web Technologies -- Alejandro Rodriguez, NCSA [Slides]
- The exchange of data among large scientific communities requires the use of self-describing data formats that keep provenance and other metadata, for which semantic web technologies are especially suitable, except when these data are time-related. We use Open Source DataTurbine to bridge time series data and semantic web repositories, leveraging the high-throughput, real-time performance of stream managers, while making the data available for online and offline manipulation and exchange.
- Lunch: 12:30 pm - 2 pm
- Discussion: Open Source DataTurbine Testbed Activities Session leads – Shava Smallen, Sameer Tilak [Slides] [Slides]
- Discussion: Open Source DataTurbine Software Extensions Session leads - Larry Freudinger, Matt Miller
- Application Note: LabVIEW/DataTurbine Interface [Slides] [Video]
- Protocol Buffer Study [Slides] [Whitepaper] [Appendix-A] [Appendix-B]
- Discussion: Open Source DataTurbine Code Management – Session leads - Tony Fountain, Paul Hubbard [Slides]
- Working Dinner: 6:30 pm
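
Several of the abstracts above (REAP, NCHC, PISCO) describe the same pattern: a Source program parses sensor records and pushes them to a DataTurbine server, the server acts as a bounded data buffer, and Sink clients request data by time range. The following is a minimal Python sketch of that pattern only. It does not use the DataTurbine API; the class `RingBufferServer`, its methods, the channel name, and the `epoch_seconds,temperature_c` record format are all hypothetical names chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RingBufferServer:
    """Hypothetical in-memory stand-in for a streaming data server.

    Each channel keeps only the most recent `capacity` frames,
    which is the ring-buffer behavior this sketch illustrates.
    """
    capacity: int
    channels: dict = field(default_factory=dict)

    def put(self, channel, timestamp, value):
        """Source side: append one timestamped frame.

        When the channel is full, deque(maxlen=...) drops the oldest frame.
        """
        buf = self.channels.setdefault(channel, deque(maxlen=self.capacity))
        buf.append((timestamp, value))

    def fetch(self, channel, start, duration):
        """Sink side: return frames with start <= timestamp < start + duration."""
        buf = self.channels.get(channel, ())
        return [(t, v) for t, v in buf if start <= t < start + duration]

    def newest(self, channel):
        """Sink side: return the most recent frame, or None if there is none."""
        buf = self.channels.get(channel)
        return buf[-1] if buf else None


def parse_weather_line(line):
    """Parse one record in the assumed 'epoch_seconds,temperature_c' format."""
    ts, temp = line.strip().split(",")
    return float(ts), float(temp)


# Source: parse four made-up records and push them into a 3-frame buffer.
server = RingBufferServer(capacity=3)
for line in ["0,18.5", "60,18.7", "120,19.0", "180,19.4"]:
    ts, temp = parse_weather_line(line)
    server.put("weather/temperature", ts, temp)

# Sink: request the two-minute window starting at t = 60 s.
recent = server.fetch("weather/temperature", start=60, duration=120)
```

Because the buffer holds three frames, the record at t = 0 has already been evicted by the time the sink runs, so a request for that interval returns nothing. A real deployment would size the buffer, and any disk archive behind it, to match how far back clients need to look.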