Science Of Future

SECTION: Computer Technologies

SCIENTIFIC ORGANIZATION:
NATIONAL RESEARCH CENTRE "KURCHATOV INSTITUTE"

REPORT FORM:
«Oral report»

AUTHOR(S) OF THE REPORT:
Alexei Klimentov

SPEAKER:
Alexei Klimentov

REPORT TITLE:
A Scalable Workload Management System For Data Intensive Science

TALKING POINTS:

The LHC experiments are today at the leading edge of large-scale distributed data-intensive computational science. The LHC's ATLAS experiment processes particularly extreme data volumes: over 150 PB to date, distributed worldwide across more than 120 sites. An important element in the success of the exciting physics results from ATLAS is the highly scalable integrated workflow and data flow management afforded by the PanDA workload management system (WMS), which is used for all the distributed computing needs of the experiment. The PanDA design is not experiment-specific, and PanDA is now being extended to support other data-intensive scientific applications. In this talk, we will describe the new program of work to develop a generic version of PanDA, as well as the progress in extending PanDA's capabilities to support supercomputers and clouds and to leverage intelligent networking, while accommodating the ever-growing needs of current users. In particular, we will present our plans to refactor PanDA and to develop a VO-neutral WMS package to be used by new experiments, such as LSST, LBNE and NICA, as well as by the running LHC experiments. PanDA has already demonstrated at a very large scale the value of automated, data-aware dynamic brokering of diverse workloads across distributed computing resources.

The next generation of PanDA will allow many data-intensive sciences employing a variety of computing platforms to benefit from the experience and proven tools of High Energy and Nuclear Physics (HENP) in highly scalable processing. We will present our current accomplishments in running the PanDA WMS at OLCF and other supercomputers, and demonstrate our ability to use PanDA as a portal, independent of the underlying computing facility infrastructure, for HENP as well as other data-intensive science applications.