Welcome to the Earth Science Information Partners (ESIP) 2018 Summer Meeting! The 2018 theme is Realizing the Socioeconomic Value of Data. The theme is based on one of the goals in the 2015 - 2020 ESIP Strategic Plan, which provides a framework for ESIP’s activities over the next three years.

If you haven’t already, register here!

Room Block Update: Our block is full. We recommend the AC Hotel Tucson Downtown, which is about 5 minutes away by car or about 15 minutes via the Tucson Streetcar.

Sabino
Tuesday, July 17


Introduction to Jupyter technologies and how they are used in the ESIP community
You’ve heard a lot about Jupyter. There are Notebooks and Hubs, but what are they? Do they make it easier for you to do or share your work?

Participants in this session will be given an overview of how ESIP members are using the Jupyter Project’s applications to accelerate their own research. This breakout session is intended as an introduction to Jupyter applications and their usage in ESIP member organizations. Workshops using the technologies via ESIPhub later in the Meeting will also be discussed. We will hold a ten-minute discussion after the presentations on the topics brought up during the talks and how we as a community can use the ESIPhub resource.

Frank Greguska, NASA JPL (15min)
Title: Using Apache Science Data Analytics Platform from Jupyter
Description: Apache Science Data Analytics Platform (SDAP) is an open source Apache Incubator project that, among other things, allows for analysis of scientific data on the cloud. SDAP consists of a collection of webservices that enable science and allow user interaction through Jupyter notebooks. This talk will introduce the Apache SDAP project and walk attendees through some of the algorithms that are available for use.

Tyler Erickson, Google (15min)
Title: Jupyter and Google Earth Engine
Description: Google Earth Engine is a cloud-based geospatial analysis platform that supports analysis of multi-petabyte archives via JavaScript and Python APIs. For users of the JavaScript API, the Earth Engine team maintains an online GUI. For the Python API, we promote the use of Jupyter project tools (JupyterLab, JupyterHub, Jupyter Widgets) for accessing data and developing algorithms.
Presentation: g.co/earth/esip2018-jupyter

John Readey, HDF Group (15min)
Title: HDF Kita Lab
Description: HDF Kita Lab is a Jupyter environment hosted on AWS that makes it easy to read and write large HDF datasets. Users can use HDF Server to access data that would otherwise be too large to copy to the user disk volume. Data used by HDF Server is stored in AWS S3, which provides cost-effective and reliable storage. HDF Kita Lab can be accessed at https://hdflab.hdfgroup.org (HDF Group registration is required).

Rich Signell, USGS (15min)
Title: Jupyter Success Stories from IOOS and USGS
Description: The Integrated Ocean Observing System and the US Geological Survey have been using Jupyter technologies since 2012 to help spread the use of effective and efficient tools across their communities. These notebooks often demonstrate reproducible workflows based on catalog and data web services and come with reproducible environments made possible by the conda-forge project. A series of notebooks will be demonstrated, ranging from catalog-driven workflows to notebooks on Binder that appear like web applications.

Keith Maull, NCAR Library (15min)
Title: ESIPHub Pilot | Exploring services and infrastructure to support computational geosciences research and collaboration with JupyterHub
Description: ESIPHub, a JupyterHub-based infrastructure for the ESIP community, is now available and being used within several workshops during the summer meeting.
In this talk, I will discuss the pilot of ESIPHub with UCAR/NCAR's highly successful Research Experiences for Undergraduates (REU) program, SOARS (Significant Opportunities in Atmospheric Research and Science; https://www.soars.ucar.edu). Over the last three years, we have been developing computational workshops to introduce SOARS Protégés to Python, Jupyter, computational thinking and data analysis, and this summer we piloted ESIPHub within these workshops. I will report on the exciting potential the platform has not only for education and training, but also for collaborative research.

Discussion (10min)

Learn more about Jupyter and attend the other workshops using ESIPhub:

* Directly after this session is the Metadata Improvement Lab, where participants will learn how to translate their XML into JSON-LD using the schema.org vocabulary Google recommends for datasets.
* Wednesday afternoon is a workshop for cloud-based analysis.
* Thursday morning we'll learn about some custom widgets for earth science.

Speakers & Moderators

Tyler Erickson

Developer Advocate, Google

Sean Gordon

Metadata Developer, The HDF Group
Talk to me about the ESIP Labs project and ESIPhub, a JupyterHub-based shared computational environment for workshops at Meetings. My research focuses on the connections between documentation structures and the evaluation of content for the metadata needs of diverse communities of practice...

Rich Signell

Oceanographer, USGS
Ocean Modeling, Python, NetCDF, THREDDS, ERDDAP, UGRID, SGRID, CF-Conventions, Jupyter, JupyterHub, CSW, TerriaJS

Tuesday July 17, 2018 9:30am - 11:00am
Wednesday, July 18


Publishing schema.org Dataset: Lessons Learned and Paths Forward
Progress surrounding the schema.org type Dataset has made it an attractive way for repositories to expose dataset metadata to search engines. The NSF EarthCube initiative funded a short-term project, P418, to explore what could be achieved if repositories adopted schema.org as a mechanism for self-publishing information using a common schema. As part of this project, a number of repositories volunteered to try publishing schema.org by embedding it in their websites.
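
As a rough illustration of what "embedding schema.org in a website" can look like, the sketch below builds a minimal schema.org Dataset description and wraps it in a JSON-LD script tag. It is only a sketch under stated assumptions: the helper names, the dataset name, and the example.org URL are hypothetical, and the handful of properties shown is not the markup that P418 or any participating repository actually published.

```python
import json


def dataset_jsonld(name, description, url, keywords):
    """Build a minimal schema.org Dataset description (hypothetical helper)."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
    }


def as_script_tag(doc):
    """Serialize a JSON-LD document for embedding in an HTML page."""
    # Escape "<" so that a value containing "</script>" cannot end the tag early.
    body = json.dumps(doc, indent=2).replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# All values below are made up for illustration only.
example = dataset_jsonld(
    name="Example sea surface temperature grid",
    description="Hypothetical dataset used only to illustrate the markup.",
    url="https://example.org/datasets/sst",
    keywords=["oceanography", "sea surface temperature"],
)
snippet = as_script_tag(example)
print(snippet)
```

The resulting snippet would be placed in the HTML of a dataset landing page, where crawlers that understand JSON-LD can read it.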

In this session, we will:
* introduce the P418 project goals and the philosophy behind using schema.org (15min),
* and then explore some real-world schema.org publishing stories (30min) to:
  * hear about the various techniques used and challenges encountered for embedding the schema.org markup in web pages,
  * understand how well schema.org covers a repository’s own metadata model,
  * discuss where schema.org needs extensions and how the geoscience community can collectively move forward to improve the quality of the markup.

For more schema.org sessions see:
Tuesday, July 17 • 11:30am - 1:00pm Metadata Evaluation Lab at ESIP: Assessing if community metadata is ready for a Schema.org
Tuesday, July 17 • 4:00pm - 5:30pm Semantics in Action

Speakers & Moderators

John Relph

Disruptor, NESDIS/NCEI
OneStop, Metadata, Archival, Automation, Data Management, Canaan Dogs

Adam Shepherd

Technical Director, BCO-DMO
Linked Data | Semantic Web | Vocabularies

Wednesday July 18, 2018 2:00pm - 3:30pm


TaskAPI - A Scalable Computing Platform for Large Scientific Data Systems
TaskAPI is a workflow platform and DSL (Domain Specific Language) that provides automatic horizontal and vertical scaling of multi-language, data-intensive scientific software systems using a functionally declarative workflow paradigm. TaskAPI can quickly wrap legacy systems, provides structured guidance for best practices in continued or new development via its JSON DSL, and automatically gives system components a unified, straightforward API for centralized logging, job and task killing, and configurable property use.

TaskAPI was developed to serve as the backbone for the reengineered US ASOS Ingest software system and exists as its own distributable package for use by other large polyglot systems.

This session will begin by providing a broad overview (surface skim) of the TaskAPI platform, including motivation, capabilities, current and potential use cases, and design and performance characteristics.

The summary will lead into a more detailed look at the TaskAPI structure, including the DSL setup, workflow branching, task types, multi-language parallelization techniques in Java, C, and Fortran, and current and planned language support (Python via Jep, Clojure, Scala via drivers), and other features (Kafka messaging queues).

After we thoroughly explain the system and its capabilities, we will deep dive into a live example of TaskAPI as it was implemented in ASOS, examining real-life challenges we faced and how to think about and implement best use practices.

Finally, we will assist session attendees and participants in determining if this system could serve their own projects, provide assistance in TaskAPI download and setup, and solicit feature requests and needs.

Speakers & Moderators

Ryan Berkheimer

Software Research, GST at NOAA NCEI

Wednesday July 18, 2018 4:00pm - 5:30pm
Thursday, July 19


Machine Learning Working Session
Machine Learning engagement activities to increase the connectivity among data providers, Earth scientists, machine learning practitioners and computer service providers.

Speakers & Moderators

Erin Robinson

Executive Director, ESIP Federation
Erin Robinson works at the intersection of community informatics, Earth science and non-profit management. Over the last 10 years, she has honed an eclectic skill set both technical and managerial, creating communities and programs with lasting impact around science, data, and technology...

Thursday July 19, 2018 11:30am - 1:00pm
Friday, July 20


Metadata Times, They Are Changing - New Capabilities and Applications
We will cover new developments in metadata standards from a variety of communities: ISO, DataCite, DataONE, and NASA.

Speakers & Moderators

Ted Habermann

The HDF Group

Matt Jones

Director of Informatics, UC Santa Barbara
Data Federation | Open Science | Provenance and Semantics

Tyler Stevens

Senior Discipline Engineer, NASA EED-2 / SGT

Friday July 20, 2018 9:30am - 11:00am


HDF Townhall
Data in HDF continues to play an important role for Earth Scientists in the U.S. and around the world. The HDF Group will update ESIP members on interesting projects that have come to fruition during the last year, including the TerraFusion project which brings the entire history of Terra as well as recent releases of HDF5. We will also demonstrate how HDF tools support HDF-EOS data from product design to production and standards compliance testing to user support.

Suggestion to include:

Potential benefits of shuffle
Third-party compression filters
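
To make the "potential benefits of shuffle" item concrete, here is a small standard-library sketch of the idea behind a byte-shuffle filter: the bytes of fixed-size elements are regrouped so that all first bytes come first, then all second bytes, and so on, which tends to help a deflate-style compressor on slowly varying numeric data. This is an illustration only, not HDF5's implementation; the function names and the sample data are assumptions made for the example. In HDF5 itself the shuffle filter is applied per chunk ahead of the compression filter (in h5py, for instance, via `shuffle=True` together with `compression="gzip"` on `create_dataset`).

```python
import struct
import zlib


def byte_shuffle(data: bytes, itemsize: int) -> bytes:
    """Group byte i of every element together (hypothetical helper)."""
    if len(data) % itemsize != 0:
        raise ValueError("data length must be a multiple of itemsize")
    # data[i::itemsize] selects byte i of each element.
    return b"".join(data[i::itemsize] for i in range(itemsize))


def byte_unshuffle(data: bytes, itemsize: int) -> bytes:
    """Invert byte_shuffle and restore the original element layout."""
    if len(data) % itemsize != 0:
        raise ValueError("data length must be a multiple of itemsize")
    count = len(data) // itemsize
    out = bytearray(len(data))
    for i in range(itemsize):
        # Plane i holds byte i of every element, stored contiguously.
        out[i::itemsize] = data[i * count:(i + 1) * count]
    return bytes(out)


# Made-up sample: 10,000 slowly increasing 32-bit little-endian integers.
values = [20000 + 3 * i for i in range(10_000)]
raw = struct.pack(f"<{len(values)}i", *values)

raw_size = len(zlib.compress(raw, 6))
shuffled_size = len(zlib.compress(byte_shuffle(raw, 4), 6))

print(f"deflate only:      {raw_size} bytes")
print(f"shuffle + deflate: {shuffled_size} bytes")
```

On this kind of smooth integer series the shuffled stream compresses to fewer bytes, because the high-order byte planes are nearly constant. The gain depends heavily on the data, so noisy or already-compressed values may see little or no benefit.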

Speakers & Moderators

Friday July 20, 2018 11:30am - 1:00pm