Susana Sánchez Expósito Instituto de Astrofísica de Andalucía - CSIC Pablo Martin, Jose Enrique...


  • Slide 1
  • Susana Sánchez Expósito (Instituto de Astrofísica de Andalucía - CSIC), Pablo Martin, Jose Enrique Ruiz, Lourdes Verdes-Montenegro, Julian Garrido, Raül Sirvent, Antonio Ruíz Falcó and Rosa Badia. Web Services as Building Blocks for Science Gateways in Astronomy. IWSG 2015, Budapest.
  • Slide 2
  • Context: Big Data Era. The scientific community is facing a data deluge, in particular in astronomy. Paradigmatic Big Data instrument: the Square Kilometre Array (SKA). SKA1 is currently in the detailed design phase; its construction phase is 2018-2023. SKA2 design and construction phases: 2018-2030. The aperture arrays in the SKA could produce more than 100 times the global internet traffic. Credit: skatelescope.org (SKA Organisation).
  • Slide 3
  • Context: Facing the data deluge. Data deluge, a twofold challenge: technological and scientific. Infrastructures: GRID, Super Computing clusters, Clouds. Populated with advanced tools: grid engine, Taverna, visualization tools, custom-made scripts, custom-made portals. Advanced tools: They are built upon the scientific applications. They adapt the scientific applications to the DCIs (using the DCI API and/or tools for making good use of them). They are designed to be reused (e.g. they can be customized with specific parameters/inputs). They are usually workflows, visualization tools or data management tools. Workflows (WFs) expose the scientific method, which makes them easier to reuse. This work: advanced tools (workflows) for analysis tasks of interest for SKA use cases.
  • Slide 4
  • Context: Facing the data deluge in Astronomy. Virtual Observatory (www.ivoa.net): The Virtual Observatory (VO) is a network of Web data services which provides access to astronomical archives. Standards: data interoperability; uniform access to multiple archives. Tools for local visualization and analysis. Standards for analysis services (avoid moving the data): work in progress. Big Data techniques for data reduction pipelines: transforming raw data into science-ready data. E.g.: pipeline to detect transient sources in LOFAR (an SKA pathfinder); data stored in a MonetDB to accelerate data access. A transient source: the Crab pulsar. An effort is still needed: to improve the tools for analysing large volumes of science-ready data; to publish these tools in collaborative environments.
  • Slide 5
  • Science use case: Analysis of HI datacubes for kinematical modelling of galaxies. Radio interferometers such as the SKA generate data with two spatial coordinate axes (RA, DEC) and a spectral axis (frequency). Astronomers analyse these HI datacubes to produce kinematical models of galaxies. Studies of this kind are included in the SKA use cases. Based on GIPSY: powerful software for analysing radio interferometric data.
  • Slide 6
  • Science use case: Analysis of HI datacubes for kinematical modelling of galaxies. ROTCUR derives the kinematical parameters from the datacube, including the rotation curve. ELLINT calculates the radial profile of the galaxy emission. GALMOD builds a model from the kinematical parameters. Parameter space exploration (parameter sweep workflow): the astronomer usually sweeps a parameter range in order to find the best model.
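The parameter sweep described on this slide can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: `build_model` is a toy stand-in for a model builder such as GALMOD (it is not a GIPSY call), `residual` is a plain sum of squared differences, and the swept parameter `v_max` is hypothetical. The sweep keeps the candidate whose model is closest to the observed values.

```python
import math

def build_model(v_max, radii, r_scale=2.0):
    """Hypothetical stand-in for a model builder: a toy rotation curve v(r)."""
    return [v_max * (1.0 - math.exp(-r / r_scale)) for r in radii]

def residual(model, observed):
    """Sum of squared differences between model and observed velocities."""
    return sum((m - o) ** 2 for m, o in zip(model, observed))

def sweep(observed, radii, candidates):
    """Try every candidate v_max and return the one with the smallest residual."""
    return min(candidates, key=lambda v: residual(build_model(v, radii), observed))

radii = [1.0, 2.0, 4.0, 8.0]
observed = build_model(150.0, radii)  # synthetic "observation" for the demo
best = sweep(observed, radii, [100.0, 125.0, 150.0, 175.0])
print(best)
```

Each candidate is evaluated independently of the others, which is why a sweep of this kind lends itself to being distributed over a computing infrastructure.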
  • Slide 7
  • Framework: AMIGA for GTC, ALMA and SKA pathfinders (AMIGA4GAS). AMIGA4GAS was presented at IWSG 2013, Zurich; this talk presents the follow-up work. Collaborators of the SKA SDP consortium, which is in charge of designing the systems for processing the SKA data and for delivering them to the astronomers. GOAL: make it easier for astronomers to launch their workflows in heterogeneous DCIs. http://amiga.iaa.es/p/263-federated-computing.htm
  • Slide 8
  • Two-level workflow system. Infrastructure level workflows: AMIGA4GAS web services as interface of one or several GIPSY tasks; GIPSY tasks orchestrated by COMPSs. User level workflows: connecting AMIGA4GAS services to build (Taverna) WFs. [Diagram: chains of the GIPSY tasks ROTCUR, ELLINT and GALMOD]
  • Slide 9
  • Infrastructure level workflows: Workflows as Web Services. Topics: architecture, user interface design, granularity. Goals: improve the service performance and make good use of the infrastructures (COMPSs role); improve the user experience (UI design and granularity). GIPSY tasks have a large number of input parameters. Handling all of them in a graphical workflow manager can be tedious. Setting some of them to a default value would reduce the number of required inputs but would constrain the specific operation that the service performs: a compromise between flexibility and usability. Granularity levels range from services with 1 GIPSY task to services with 3 GIPSY tasks. High granularity: lower execution time (task communication inside the web server); COMPSs orchestration capability.
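The compromise between flexibility and usability can be illustrated with a short Python sketch. It is not the AMIGA4GAS interface: the parameter names in `DEFAULTS` and the function `coarse_service` are hypothetical. The idea shown is only that a service asks for a few required inputs, pre-sets the rest, and still lets an advanced user override a pre-set value.

```python
# Hypothetical defaults; these parameter names are illustrative only.
DEFAULTS = {"weight": "uniform", "side": "both", "free_angle": 0.0}

def coarse_service(datacube, radii, **overrides):
    """Assemble the full parameter set from a few required inputs plus defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Reject names the service does not expose instead of silently ignoring them.
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides, "datacube": datacube, "radii": radii}

simple = coarse_service("cube.fits", [10, 20, 30])
tuned = coarse_service("cube.fits", [10, 20, 30], side="receding")
print(simple["side"], tuned["side"])
```

The fewer names `DEFAULTS` exposes for overriding, the simpler the block looks in a graphical workflow manager, at the price of a narrower range of operations.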
  • Slide 10
  • User level workflows: Taverna workflows. AMIGA4GAS services as blocks that the users can combine to build their experiments. Service description: https://srv-prj-wsamiga.fcsc.es:8444/Amiga4GasServiceLadon/soap11/description
  • Slide 11
  • User level workflows: Taverna workflows. The workflows have been built using Taverna Workbench Astronomy Edition. AstroTaverna plugin ("AstroTaverna - Building workflows with Virtual Observatory services", Ruiz, J.E. et al., A&C 2014): VO services for accessing the data; tools for managing VOTables.
  • Slide 12
  • User level workflows: Taverna workflows The workflows have been published on myExperiment
  • Slide 13
  • User level workflows: Taverna workflows. Using VO services to get the data location: the SIAv2 (1) service queries a particular archive, searching for datacubes inside a sky region. The DataLink (1) service delivers related complementary metadata, including the location of the datacubes. Specific formats to define the data location: In clusters: @: In gLite grids: the Logical File Name. The AMIGA4GAS service is able to discover the computing infrastructure where the GIPSY task will finally be executed. Business rules and federation systems: "A federated access to DCIs for executing scientific workflows", P. Martin et al., demo presented at EGI2015. (1) The SIAv2 and DataLink protocols are in the state of Proposed Recommendation in IVOA.
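As a rough illustration of "searching for datacubes inside a sky region", the sketch below builds an SIAv2 query URL in Python. `POS` (with a `CIRCLE ra dec radius` value in degrees) and `DPTYPE` are query parameters defined by the SIAv2 protocol; the base URL, the helper name `siav2_query_url` and the coordinates are assumptions invented for the example, and the sketch only builds the URL without contacting any archive.

```python
from urllib.parse import urlencode

def siav2_query_url(base_url, ra_deg, dec_deg, radius_deg):
    """Build an SIAv2 query URL for datacubes inside a circular sky region."""
    params = {
        "POS": f"CIRCLE {ra_deg} {dec_deg} {radius_deg}",  # region in degrees
        "DPTYPE": "cube",                                   # datacubes, not 2D images
    }
    return f"{base_url}?{urlencode(params)}"

# Hypothetical archive endpoint, used only for illustration.
url = siav2_query_url("https://archive.example.org/sia2/query", 180.0, -30.0, 0.5)
print(url)
```

The response to such a query is a VOTable, which is where the VOTable handling tools mentioned for the AstroTaverna plugin come in.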
  • Slide 14
  • Conclusions. The astronomical community is paving the way to the SKA: VO services for accessing astronomical data; innovative solutions for data transmission and reduction. But astronomers also need innovative tools that help them analyse science-ready data in collaborative environments. We have developed a set of advanced tools: the AMIGA4GAS services. They implement analysis tasks of interest for the SKA use cases focused on galaxies. These tools implement a two-level workflow system. Infrastructure level: efficient exploitation of the DCIs (COMPSs). User level: integration in Taverna, where the astronomers can combine them with VO tools.
  • Slide 15
  • Thanks! [email protected] http://amiga.iaa.es/p/263-federated-computing.htm http://www.fcsc.es
  • Slide 16
  • Infrastructure level workflows. Architecture: WEB SERVICES; federated layer (Business Rules, Authentication); COMPSs; JavaGAT (OCCI connector, Ibergrid adaptor, SGE adaptor); GRID, Super Computer / Cluster, Clouds. Authentication: LDAPs for clusters, Grid certificates and Cloud accounts. Business Rules: scheduling among DCIs according to latency, energy efficiency, time to complete, probability of completion or costs. COMPSs [1]: job launcher. [1] http://www.bsc.es/es/publications/comp-superscalar-bringing-grid-superscalar-and-gcm-together
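The slide names the criteria used by the business rules but not how they are combined. The Python sketch below assumes one simple possibility, a weighted sum in which lower values are better; the function `choose_dci`, the metric values and the weights are all hypothetical and do not describe the rules actually implemented in the federated layer.

```python
def choose_dci(candidates, weights):
    """Return the name of the DCI with the lowest weighted cost (lower is better)."""
    def cost(metrics):
        return sum(weights[name] * metrics[name] for name in weights)
    return min(candidates, key=lambda name: cost(candidates[name]))

# Hypothetical, normalised metrics for three infrastructures.
candidates = {
    "grid":    {"latency": 0.8, "time_to_complete": 0.5, "cost": 0.1},
    "cluster": {"latency": 0.2, "time_to_complete": 0.4, "cost": 0.3},
    "cloud":   {"latency": 0.3, "time_to_complete": 0.3, "cost": 0.9},
}
weights = {"latency": 1.0, "time_to_complete": 2.0, "cost": 1.0}
selected = choose_dci(candidates, weights)
print(selected)
```

Changing the weights changes the choice: with a weight only on `cost`, the same candidates would favour the grid.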
  • Slide 17
  • Outcomes of the project. 1) Specific (ready-to-use) outcomes: a set of web services for analysing interferometric multidimensional data, running on heterogeneous DCIs. 2) General outcomes: How to schedule among DCIs: business rules; statistics of the performance of the applications on different infrastructures. How to get astrophysics Software as a Service running on DCIs: installation and deployment issues; web service design (efficiency and reusability). Characterization of analysis services: IVOA (International Virtual Observatory Alliance); PDL (Parameter Description Language).