Focus: Campus Bridging Networking and Data-centric Issues

Indiana University is hosting a workshop focusing on Networking and Data-centric Issues as they apply to the NSF-sponsored Campus Bridging Taskforce charter. The workshop is being conducted in Indianapolis, IN on April 7th from 8am until 5pm. Dinner will be provided that evening beginning at 7pm. The conference resumes on April 8th from 8am until 4pm.

We invite position papers from the software and scientific communities. Position papers serve several functions. They give people who cannot attend the workshop a way to contribute to it, and they enter your viewpoints into the final workshop report (all position papers will be included in the report as appendices). Position papers are not required for participation: you may register for the workshop through the registration page without one, although a position paper accompanying a registration will be appreciated. At present the workshop registration is roughly half full, so please register if you are interested!

In order to provide good opportunities for input, please submit position papers by 31 March 2010 if at all possible.

Please register to participate as soon as possible - this is an important workshop that will help guide NSF plans for the future!

The workshop will be held at the University Place Conference Center (UPCC) on the Indiana University-Purdue University Indianapolis (IUPUI) campus in Indianapolis, Indiana on April 7th & 8th, 2010. For more information about the workshop location, see the UPCC and IUPUI links.

If you would like to add a position paper or comment on papers, you must first log in or create a new account.

Campus Bridging Taskforce

Overview
Cyberinfrastructure may be defined as "…computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people, all linked together by software and high performance networks to improve research productivity and enable breakthroughs not otherwise possible." There is widespread agreement that U.S. investments in research cyberinfrastructure are not coordinated well enough to deliver optimum benefit to the science and engineering research communities.

The NSF has established a hierarchy of Tier 1, 2, and 3 centers. The Tier 1 center is the largest NSF-funded supercomputer in the US (located at NCSA). Tier 2 centers are very large systems roughly one order of magnitude less powerful (e.g. systems at TACC, NICS, and systems planned at PSC and SDSC). Datanet awardees may also establish Tier 2 data storage systems. Tier 3 systems are still very powerful systems operated at the regional or state levels, or at individual universities and colleges. The existing NSF-funded Tier 2 systems are part of the current TeraGrid. The Open Science Grid does not fit particularly well within this hierarchy. It is widely agreed that interoperability of Tier 3 and Tier 2 systems is inadequate to serve the best interests of the U.S. science and engineering communities. From the standpoint of an individual researcher, it is viewed as far too difficult to migrate from a Tier 3 facility to a Tier 2 facility. Problems of coordinating cyberinfrastructure are not limited to the national level; organizing cyberinfrastructure within a university or even a single campus can be challenging.

This taskforce is meant to address the issues involved in improving campus interactions with cyberinfrastructure, broadly construed. It will include a number of different types of bridging:

  • Campus grids to national infrastructure (both compute and data oriented approaches)
  • Campus networks to state, regional and national
  • Departmental cluster to campus HPC infrastructure
  • Campus-to-campus and campus-to-state/regional approaches

Goals
Proposed goals for the taskforce include:

  • Identification of best practices for the general process of bridging to national infrastructure
  • Identification of best practices for interoperable identification and authentication
  • Identification of best practices for dissemination of and use of shared data collections
  • Identification of best practices for vetting and sharing definitive, open-use educational materials
  • Suggestion of common elements of software stacks widely usable across the nation/world to promote interoperability/economy of scale
  • Suggestion of policy documents that any research university should have in place
  • Identification of solicitations to support this work

The first three items above are the most immediate priorities. The key overarching goal is to implement processes, tools, and solicitations that will achieve better coordination of cyberinfrastructure to optimize innovation and discovery by the U.S. science and engineering communities.

Networking Workshop Focus - Lead, Dave Jent (IU):

  • Find best practices where campus networking is done well
  • From the campus perspective: how to identify best practices in campus networking and end-to-end computing architecture (where that may mean lab to campus to regional optical network (RON) to national backbone to a resource hanging off of national backbones)
  • Can we also identify best practices in avoiding (or safely sitting outside) campus CI when campus cybersecurity policies are at odds with meeting research needs?
  • How do you design a network for researchers when you are designing overall for the masses or for campus business operations? One particular area of discussion was how to make sensible recommendations to solve networking bottlenecks in a targeted way, rather than trying to create a campus CI that delivers the services researchers need to everyone. From the researcher’s standpoint, how do you discover what resources are available and what rational expectations are?
  • Campus intellectual property issues and safeguarding?
  • What is the role of IPv6?
  • What is the role of wireless?
  • Particular objective: create a document describing the minimal networking facilities to support research within campus and from campus to RON to national backbone networking

Data-centric Workshop Focus - Lead, Guy Almes (Texas A&M):

  • Networking design within campus and how it interacts with national trends, international trends (including technology trends of different scaling rates for different areas of technology). Specifically, how must the end-to-end network architecture meet the needs (including data access and remote visualization) stemming from Campus Bridging? Due to trends in data storage and instrumentation, the volumes of data to be moved are increasing in an aggressive exponential fashion – the (end-to-end) network will need to keep up.
  • From a researcher’s standpoint, how do I know where my data are and how do I get them? The answer to these questions must not depend on whether the data are (currently) local or remote.
  • Data storage infrastructure - what are the expectations on campuses? Issues of reliability and data management become more difficult as the quantity and complexity of data continue to grow.
  • How to handle both a large number of small files AND a small number of large files
  • Discovery of data resources generally (in the sense of finding public and/or shareable data resources). Again, from a researcher’s standpoint, how this is requested must not depend on data location.
  • Each of the various forms of remote visualization involves some participation by the campus (perhaps ‘merely’ with excellent networking or perhaps in other ways).
  • Note: data do not equal files; services and streams matter a lot and may become very important in the future.
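
As a rough illustration of two points in the list above (growing data volumes, and many small files versus a few large ones), the following Python sketch estimates how long a data set takes to move over a single link. It is a back-of-the-envelope model, not something taken from the taskforce materials: the function name `transfer_time_seconds`, the fixed per-file cost, and all sample values (1 TB, 1 and 10 Gbps, 0.01 s per file) are assumptions chosen only for illustration. Real transfers also depend on protocol overhead, congestion, and storage speed, which this model ignores.

```python
def transfer_time_seconds(total_bytes, link_bits_per_second,
                          n_files=1, per_file_overhead_s=0.0):
    """Estimate the seconds needed to move a data set over one link.

    Simplified model (an assumption for illustration): time to send the
    payload at the full link rate, plus a fixed cost for every file.
    """
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    payload_s = total_bytes * 8 / link_bits_per_second
    return payload_s + n_files * per_file_overhead_s


one_terabyte = 10**12  # bytes, decimal terabyte (illustrative volume)

# Hypothetical link rates and per-file cost; adjust to local measurements.
for label, rate in [("1 Gbps", 10**9), ("10 Gbps", 10**10)]:
    few_large = transfer_time_seconds(
        one_terabyte, rate, n_files=10, per_file_overhead_s=0.01)
    many_small = transfer_time_seconds(
        one_terabyte, rate, n_files=1_000_000, per_file_overhead_s=0.01)
    print(f"{label}: 10 large files {few_large / 3600:.2f} h, "
          f"1,000,000 small files {many_small / 3600:.2f} h")
```

Under these assumed values the per-file cost for one million files (10,000 s) exceeds the payload time even at 1 Gbps (8,000 s), so a faster link alone would not help much with the many-small-files case. The model is meant only to show why both cases need separate attention.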