Cyprus Open / Astroparticle Physics meets Data Science

Open Lecture & Workshop Schedule

11 May

6:30 pm – 8:00 pm

D. Kostunin "Astroparticle Physics and Multi-Messenger Astronomy: New Challenges in the Petascale Era"

Astroparticle physics studies high-energy particles of cosmic origin that are detected on Earth. Having emerged more than a century ago with the detection of cosmic rays, the field now also deals with high-energy gamma rays and neutrinos. Over the last few decades, astroparticle physics has evolved towards multi-messenger astronomy, where information from multiple messengers such as photons, gravitational waves, and cosmic particles is combined to gain a deeper understanding of astrophysical phenomena. The development of new instruments and the connection of different telescopes into a common network are bringing the field into the petascale era, where data science techniques are becoming crucial for future discoveries. In this talk, I will give an overview of astroparticle physics and multi-messenger astronomy, highlight the latest achievements, and discuss the synergies between astrophysics and data science, with examples from the Astroparticle Physics Lab at JetBrains Research.

Dr. Dmitriy Kostunin is a researcher and developer in astroparticle physics with over a decade of experience. His primary focus is on improving the sensitivity of instrumentation for detecting the highest-energy particles, with the goal of uncovering the origins of the most extreme events in the universe. He began his career by designing and prototyping instruments for neutrino and cosmic-ray detection, and now works mainly on the development and construction of the next-generation gamma-ray observatory, the Cherenkov Telescope Array.

12 May

11:45 am – 12:20 pm

A. Alkan "Astro-COLIBRI and NLP: Enhancing Astrophysics Discoveries with Automated Message Analysis"

Observations performed by astronomical facilities worldwide are significantly increasing the amount of data, which arrives in different formats: machine-readable messages such as VOEvents, and human-written reports such as GCN Circulars, ATels and AstroNotes.

Analyzing this data is crucial for understanding the complex phenomena of the universe. However, one of the major remaining challenges is the speed of data analysis. As the amount of data continues to grow, the time required for analyzing this data also increases, leading to a bottleneck in the research process.

In this contribution, we will first present Astro-COLIBRI, a platform that tackles the challenges of fast analysis of machine-readable messages.

Astro-COLIBRI evaluates alerts of transient observations in real time, filters them by user-specified criteria, and puts them into their multiwavelength and multimessenger context.
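
The filtering step described above can be illustrated with a minimal sketch. It is not Astro-COLIBRI's actual code or API: the `Alert` fields, the `filter_alerts` helper, and the example values are all hypothetical, chosen only to show how user-specified criteria might be applied to a stream of machine-readable alerts.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Alert:
    """Hypothetical, simplified alert record (not a real VOEvent schema)."""
    source: str          # name of the issuing facility (made up below)
    event_type: str      # e.g. "grb", "neutrino", "optical"
    significance: float  # detection significance in arbitrary units


# A criterion is any function that accepts or rejects a single alert.
Criterion = Callable[[Alert], bool]


def filter_alerts(alerts: Iterable[Alert],
                  criteria: Iterable[Criterion]) -> List[Alert]:
    """Keep only the alerts that satisfy every user-specified criterion."""
    checks = list(criteria)
    return [alert for alert in alerts if all(check(alert) for check in checks)]


# Made-up alerts, for illustration only.
alerts = [
    Alert("Facility A", "grb", 7.2),
    Alert("Facility B", "neutrino", 3.1),
    Alert("Facility C", "optical", 9.0),
]

# Example user criteria: high significance and a type of interest.
criteria = [
    lambda a: a.significance >= 5.0,
    lambda a: a.event_type in {"grb", "neutrino"},
]

selected = filter_alerts(alerts, criteria)
print([a.source for a in selected])
```

Expressing each criterion as an independent predicate keeps the filter composable: a user can add or remove conditions without changing the filtering function itself.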

In addition, we will present the use of Natural Language Processing (NLP) methods for enriching the Astro-COLIBRI platform and enabling the automatic analysis of human-written observation reports for information extraction purposes (Named Entity Recognition).

12:20 pm – 12:55 pm

A. Chaikova "A Deeper Dive into Astronomy Reports using NIMBUS"

Multi-messenger astronomy has gained significant interest in recent years, resulting in a substantial increase in published observation reports. Human-written reports such as GCN Circulars and ATels provide important interpretations and discussions of observations, but lack a defined format, making it difficult to analyze and cross-link data to enable the discovery of new patterns.

In the field of natural language processing, named entity recognition (NER) refers to the process of identifying named entities such as people, organizations, and other relevant terms in a particular domain. Our approach is to develop an end-to-end NER system that harnesses the power of large language models (LLMs) to overcome the challenges typically associated with NER, including limited labeled datasets and textual ambiguity.

Our research compares various techniques for extracting information from reports, using both zero-shot and few-shot learning approaches, as well as incorporating human feedback into these methods. Additionally, we investigate methods for ranking the outputs of LLMs, which is necessary because the model is sampled multiple times, a common way to improve model performance. To understand how model architecture and size affect the quality of the output, we compare several models, including the recently released ChatGPT and GPT-4.
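
As an illustration of why ranking is needed when a model is sampled several times, here is a minimal sketch of one simple strategy: scoring each extracted entity by the fraction of samples in which it appears. This is an assumption for illustration, not necessarily the ranking method studied in the talk; the `rank_entities` helper, the labels, and the sample strings are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

# An entity is a (surface text, label) pair; the labels here are made up.
Entity = Tuple[str, str]


def rank_entities(samples: Iterable[Set[Entity]]) -> List[Tuple[Entity, float]]:
    """Rank entities by the fraction of model samples that contain them."""
    sample_list = list(samples)
    if not sample_list:
        return []
    # Count each entity at most once per sample.
    counts = Counter(entity for sample in sample_list for entity in set(sample))
    total = len(sample_list)
    scored = [(entity, count / total) for entity, count in counts.items()]
    # Highest agreement first; ties are broken alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Three made-up extractions of the same report, as if sampled from an LLM.
samples = [
    {("GRB X", "EVENT"), ("Telescope A", "INSTRUMENT")},
    {("GRB X", "EVENT"), ("Telescope", "INSTRUMENT")},
    {("GRB X", "EVENT"), ("Telescope A", "INSTRUMENT")},
]

for entity, score in rank_entities(samples):
    print(f"{score:.2f}  {entity}")
```

Entities that the model produces consistently rise to the top, while one-off extractions, which are more likely to be errors, fall to the bottom where they can be discarded or sent for human review.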

The data collected through this research is used to develop NIMBUS, a web application designed to search and cross-reference ATels and GCN Circulars.

12:55 pm – 1:30 pm

S. Ohm "Making cosmic particle accelerators visible and audible"

Communicating scientific results to the public is becoming ever more important. In this contribution, I will present different outreach projects that are set at the interface between art and science and aim to make cosmic particle accelerators, as measured through gamma rays, more approachable and accessible. The animations are driven by realistic physics input from measurements and theory, while being visually appealing. Based on three very different cosmic objects (a binary star system, a gamma-ray burst, and a nova explosion), I will discuss which inputs were used in the simulation and how different physical concepts were brought to life. The soundtrack is composed based on scenes, elements and cuts in the video and captures the extreme conditions in cosmic sources, thereby adding another dimension to the experience.

The sonification was based on the final animation, which pre-defined the appearance and disappearance of different sound elements as well as their transitions to form the final soundtrack. A dedicated social media campaign on Instagram and Twitter highlighting the sound aspect of the project was developed and realised in cooperation between the social media teams of Carsten Nicolai, DESY, and Science Communication Lab.

Collaborators: Konrad Rappaport, Carsten Nicolai