Open Databases for crystallography and other techniques

Europe/Paris
Description

This satellite meeting is dedicated to existing open databases and proposals for open databases to encourage sharing of research data.

The Crystallography Open Database (COD) will be presented as an example of best practices for Open Databases, with a presentation by Dr Saulius Grazulis.

Following this presentation, other databases or proposals will be presented and discussed (see timetable).

Proposals for presentations are welcome!

Participants
  • Benjamin Watts
  • Chun Li
  • Gilles Renaud
  • Haixing Fang
  • haodong yao
  • Hector Perez Ponce
  • Isabelle Kieffer
  • Jean Baptiste FLorial
  • Jiangtao Zhao
  • LEI WANG
  • Mani Lokamani
  • Manuel Sanchez del Rio
  • Mauro Rovezzi
  • Nicolas Soler
  • Patrick Austin
  • Sebastian Paripsa
  • Stephen Collins
  • Takahiro Matsumoto
  • Vincent Favre Nicolin
  • Yu Hu
  • +16
    • 2:00 PM 3:00 PM
      Crystallography Open Databases
      • 2:00 PM
        Crystallography Open Database - how to use it 1h

        This talk will take an in-depth look at COD and how it can be integrated into the scientific workflow at photon and neutron sources. The talk will cover:

        a) the COD founding principles (openness, scientific rigour, FAIRness, availability on-line);
        b) COD history and development;
        c) COD data collection, its scope, data definitions;
        d) COD curation practices;
        e) COD links with other databases;
        f) derived data from the COD;
        g) data deposition to the COD;
        h) data search and retrieval from the COD;
        i) data extraction and processing tools in the COD.

        Speaker: Dr Saulius Grazulis
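
        As an illustration of item h) (data search and retrieval), the following is a minimal sketch, not the speaker's material. It assumes that a COD entry can be downloaded as a CIF file from the COD web site under its 7-digit identifier; the helper names (cod_cif_url, fetch_cif, parse_cell), the identifier 1234567 and the sample values are hypothetical and chosen only for illustration.

```python
import re
from urllib.request import urlopen

# Assumed public address of the COD web site.
COD_BASE = "https://www.crystallography.net/cod"

def cod_cif_url(cod_id: int) -> str:
    """Build the (assumed) download URL of a COD entry from its 7-digit ID."""
    if not 1000000 <= cod_id <= 9999999:
        raise ValueError("COD identifiers are 7-digit numbers")
    return f"{COD_BASE}/{cod_id}.cif"

def fetch_cif(cod_id: int, timeout: float = 30.0) -> str:
    """Download the CIF text of one entry (needs network access)."""
    with urlopen(cod_cif_url(cod_id), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")

# Matches lines such as "_cell_length_a 5.6400(5)" and drops the "(5)" uncertainty.
_CELL_RE = re.compile(
    r"^_cell_(length_[abc]|angle_(?:alpha|beta|gamma))\s+"
    r"([-+]?[0-9.]+(?:[eE][-+]?\d+)?)(?:\(\d+\))?\s*$",
    re.MULTILINE,
)

def parse_cell(cif_text: str) -> dict:
    """Extract the six unit-cell parameters from CIF text as floats."""
    return {name: float(value) for name, value in _CELL_RE.findall(cif_text)}

# Illustrative CIF fragment with made-up values (not an actual COD record).
SAMPLE_CIF = """data_example
_chemical_name_common 'illustrative entry'
_cell_length_a 5.6400(5)
_cell_length_b 5.6400(5)
_cell_length_c 5.6400(5)
_cell_angle_alpha 90
_cell_angle_beta 90
_cell_angle_gamma 90
"""

if __name__ == "__main__":
    print(cod_cif_url(1234567))    # arbitrary identifier, for illustration only
    print(parse_cell(SAMPLE_CIF))  # offline parsing of the sample fragment
```

        For real work, a full CIF parser is preferable to the regular expression used here, which only handles simple one-line numeric items.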
    • 3:00 PM 3:30 PM
      Discussion 30m
    • 3:30 PM 4:00 PM
      Coffee break 30m
    • 4:00 PM 5:30 PM
      Spectroscopy Open Databases
      • 4:00 PM
        Integrated XAFS Databases for Data-Driven Advances in Material Science. 30m

        The advancement of data-driven approaches in materials research emphasizes the critical role of data integration. In the realm of X-ray absorption fine structure (XAFS) spectroscopy, effective data sharing and integration enhance the reliability of results, revealing intricate atomic-scale structures and electronic states. The MDR XAFS DB project, a collaboration among leading research institutes, leverages the FAIR principles to amalgamate XAFS data into the NIMS Materials Data Repository. This integration allows for seamless cross-searchability and standardized metadata handling, supporting machine learning applications in materials development. The database features comprehensive spectral data accessible via a robust API, promoting automated data processing and advancing the field of materials science. This initiative is underpinned by a CC BY-NC-SA 4.0 license, ensuring open access to the scientific community while fostering collaborative advancements.

        Speaker: Takahiro Matsumoto (Japan Synchrotron Radiation Research Institute)
      • 4:30 PM
        Databases supporting X-ray optics simulations 20m

        We summarize here different databases (DABAX, xraylib, DABAM) that have been developed to support our activity in X-ray simulations over two decades. DABAX (DAtaBAse for X-ray applications) is a compilation of tables for X-ray applications. DABAX was created to unify the tabulated data for scattering factors, X-ray atomic cross-sections, refraction indices, structure of crystals used for monochromators, etc. The DABAX data files are well-structured and customizable ASCII files (SPEC-structure). Developed for XOP [1] in an old IDL environment, the data files were indexed to allow fast access. XOP provided a collection of computer programs to access, visualize, and process these tables. Today, the files are publicly available at https://github.com/oasys-kit/DabaxFiles and a Python library to retrieve and manage them is available at https://github.com/oasys-kit/dabax.
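
        To give an idea of what a SPEC-structured ASCII table looks like and how it can be indexed by block, here is a minimal sketch. It does not use the dabax library or reproduce an actual DABAX file: the function name parse_spec_blocks and the sample content are hypothetical, and the sketch assumes the usual SPEC conventions (a "#S" line opening each block, a "#L" line holding column labels separated by two or more spaces, other "#" lines treated as comments).

```python
import re

def parse_spec_blocks(text: str) -> dict:
    """Index SPEC-structured text by block number.

    Returns {number: {"title": str, "labels": [str], "data": [[float]]}}.
    """
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#S"):
            # "#S <number> <title>" opens a new block.
            parts = line.split(None, 2)
            title = parts[2] if len(parts) > 2 else ""
            current = {"title": title, "labels": [], "data": []}
            blocks[int(parts[1])] = current
        elif current is None:
            continue  # file-level header lines before the first block
        elif line.startswith("#L"):
            # Column labels are separated by two or more spaces.
            labels = re.split(r"\s{2,}", line[2:].strip())
            current["labels"] = [label for label in labels if label]
        elif line.startswith("#"):
            continue  # other header or comment lines are skipped here
        else:
            current["data"].append([float(item) for item in line.split()])
    return blocks

# Made-up sample, for illustration only (not taken from DabaxFiles).
SAMPLE = """#F example.dat
#UT Illustrative table (made-up numbers)

#S 1 first
#N 2
#L x  y squared
1.0 1.0
2.0 4.0

#S 2 second
#N 2
#L x  y
0.0 5.0
"""

if __name__ == "__main__":
    blocks = parse_spec_blocks(SAMPLE)
    for number in sorted(blocks):
        block = blocks[number]
        print(number, block["title"], block["labels"], block["data"])
```

        Keying the blocks by number plays the role of a simple index: one block can be looked up directly without scanning the rest of the table.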

        The need for fast access and data simplification for X-ray fluorescence applications drove us to develop a completely new tool for managing this kind of data: xraylib [2]. It is an ANSI C library that provides convenient access to a large number of X-ray related databases, with a focus on quantitative X-ray fluorescence applications. Developed and managed by Tom Schoonjans, it has many users today and offers a powerful Python interface, available at https://github.com/tschoonj/xraylib.

        It is well known that the performance of a synchrotron beamline is limited by the quality of the optical surfaces of mirrors, lenses, etc. To perform realistic simulations, it is important to include real measured data of the surface errors. To make these data available and facilitate their access, we created DABAM (DAtaBAse for Metrology) [3]. It provides metrology data (mirror height and slope profiles) that can be used with simulation tools to calculate the effects of optical surface errors on the performance of an optical instrument, such as a synchrotron beamline. A proposal for DABAM2D, containing 2D data of the surfaces (instead of the single 1D profiles in DABAM), has been launched at https://github.com/oasys-esrf-kit/dabam2d.

        All databases described here are extensively used in the OASYS suite [4], an adaptable, customizable and efficient beamline modelling platform.
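
        The relation between the height and slope profiles mentioned above can be sketched as follows. This is an illustrative calculation only, not DABAM code: the function names and the synthetic sinusoidal profile are assumptions, slopes are obtained by simple finite differences, and the RMS is taken about the mean without any prior removal of the best-fit shape.

```python
import math

def slopes_from_heights(x, heights):
    """Finite-difference slopes (rad) from abscissas x (m) and heights (m).

    Central differences are used inside the profile and one-sided
    differences at the two ends.
    """
    n = len(x)
    if n != len(heights) or n < 2:
        raise ValueError("x and heights must have the same length (>= 2)")
    slopes = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        slopes.append((heights[hi] - heights[lo]) / (x[hi] - x[lo]))
    return slopes

def rms(values):
    """Root-mean-square deviation of the values about their mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

if __name__ == "__main__":
    # Synthetic 1D profile: 0.2 m long mirror, 1 mm sampling, and a
    # sinusoidal height error of 5 nm amplitude and 20 mm period.
    x = [i * 1e-3 for i in range(201)]
    heights = [5e-9 * math.sin(2 * math.pi * xi / 20e-3) for xi in x]
    slopes = slopes_from_heights(x, heights)
    print("height error RMS [nm]:", rms(heights) * 1e9)
    print("slope error RMS [urad]:", rms(slopes) * 1e6)
```

        Simulation tools typically take such profiles as input and propagate their effect through the beamline model; the sketch only shows how the two kinds of profile are related.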

        References
        [1] Manuel Sanchez del Rio and Roger J. Dejus. XOP: a multiplatform graphical user interface for synchrotron radiation spectral and optics calculations. In Peter Z. Takacs and Thomas W. Tonnessen, editors, Materials, Manufacturing, and Measurement for Synchrotron Radiation Mirrors, volume 3152, pages 148–157. International Society for Optics and Photonics, SPIE, 1997.
        [2] Tom Schoonjans, Antonio Brunetti, Bruno Golosio, Manuel Sanchez del Rio, Vicente Armando Solé, Claudio Ferrero, and Laszlo Vincze. The xraylib library for x-ray–matter interactions. Recent developments. Spectrochimica Acta Part B: Atomic Spectroscopy, 66(11):776–784, 2011.
        [3] Manuel Sanchez del Rio, Davide Bianchi, Daniele Cocco, Mark Glass, Mourad Idir, Jim Metz, Lorenzo Raimondi, Luca Rebuffi, Ruben Reininger, Xianbo Shi, Frank Siewert, Sibylle Spielmann-Jaeggi, Peter Takacs, Muriel Tomasset, Tom Tonnessen, Amparo Vivo, and Valeriy Yashchuk. DABAM: an open-source database of x-ray mirrors metrology. Journal of Synchrotron Radiation, 23(3):665–678, April 2016.
        [4] Luca Rebuffi and Manuel Sanchez del Rio. OASYS (OrAnge SYnchrotron suite): an open-source graphical environment for x-ray virtual experiments. SPIE Proceedings, 10388, Advances in Computational Methods for X-Ray Optics IV, August 2017.

        Speaker: Manuel Sanchez del Rio (ESRF)
      • 4:50 PM
        RefXAS: an open access database of X-ray absorption spectra 20m

        Under DAPHNE4NFDI, the X-ray absorption spectroscopy (XAS) reference database, RefXAS, has been set up. For this purpose, we developed a method to enable users to submit a raw dataset, with its associated metadata, via a dedicated website for inclusion in the database. Implementation of the database includes an upload of metadata to the scientific catalogue and an upload of files via object storage, with automated query capabilities through a web server and visualization of the data and files. Based on the mode of measurements, quality criteria have been formulated for the automated check of any uploaded data. In the present work, the significant metadata fields for reusability, as well as reproducibility of results (FAIR data principles), are discussed. Quality criteria for the data uploaded to the database have been formulated and assessed. Moreover, the usability and interoperability of available XAS data/file formats have been explored. The first version of the RefXAS database prototype is presented, which features a human verification procedure, currently being tested with a new user interface designed specifically for curators; a user-friendly landing page; a full list of datasets; advanced search capabilities; a streamlined upload process; and, finally, server-side automatic authentication and (meta)data storage via MongoDB and PostgreSQL, with data files handled via relevant APIs.

        [1] Paripsa, S., Gaur, A., Forste, F., Doronkin, D. E., Malzer, W., Schlesiger, C., Kanngiesser, B., Welter, E., Grunwaldt, J.-D. & Lutzenkirchen-Hecht, D. (2024). J. Synchrotron Rad. 31, 1105-1117.

        Speaker: Sebastian Paripsa (Uni Wuppertal)
    • 5:30 PM 6:00 PM
      Discussion 30m