Sep 23 – 27, 2024
ESRF Auditorium
Europe/Paris timezone

20 years of the COD – disseminating crystallographic data

Sep 25, 2024, 8:40 AM
45m
Hybrid event (ESRF Auditorium)

EPN Campus ESRF - ILL 71 Av. des Martyrs, 38000 Grenoble
Invited Talk FAIR data management Invited Speakers

Speaker

Saulius Grazulis (VU Institute of Biotechnology, Life Science Center)

Description

For more than 20 years, the Crystallography Open Database (COD) has collected published crystal structure data and made it available on the Web under the CC0 license in an organised, machine-readable and searchable form. Currently, the COD collection holds over 500 thousand records and is used for crystal analysis, material identification, DFT calculations, machine learning, teaching and much more. To be usable for such applications, the COD data must satisfy certain quality criteria. All data deposited to the COD undergo three levels of checks: syntax checks, semantic validation against the IUCr dictionaries, and COD-specific crystallographic checks based on the IUCr journal publication rules.

Over the years, software tools have been developed for these tasks; they are routinely used in the COD pipelines but can also be used in other applications. The software is developed to high standards: it undergoes systematic testing and code review and is versioned according to SemVer principles. Recently, we have been exploring possibilities to apply methods of formal validation to the COD and other crystallographic software, using durable, time-proven systems such as Ada/SPARK, Perl, SQL and C.

The use of Open Source software and the support of the community, European and Lithuanian funders and Vilnius University have allowed us to sustain the COD for more than two decades. The COD is now becoming an essential part of the Open Science FAIR data infrastructure, as attested by the Vilnius University Open Science policy roadmap and by the inclusion of the COD in numerous open or database catalogues. The goal is to collect and make openly searchable all crystallographic data that have been published in reliable sources. This will open paths to crystal structure and property prediction and to a better understanding of how matter is organised in its crystalline form.

Primary author

Saulius Grazulis (VU Institute of Biotechnology, Life Science Center)
