Machine Learning Methods in Software Engineering

Large-scale pre-training of graph neural networks for ML4SE tasks

This project studies how graph neural networks (GNNs) can be pre-trained on source code. It is an umbrella project consisting of several parts:

A tool for mining graph representations from source code in different languages.
Implementations of GNNs for various ML4SE tasks and pre-training objectives. We implemented and evaluated eight types of GNNs on top of the PyTorch Geometric library, with scalability in mind.
A framework (a configurable pipeline) for convenient experimentation with ML4SE tasks. The framework is already available.
Proposed improvements to GNN architectures and training objectives.
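
To make the first part more concrete, here is a minimal sketch of what mining a graph representation from source code can look like. It is not the project's tool: it handles only Python, uses the standard `ast` module, and produces only parent-to-child syntax-tree edges. The function name `code_to_graph` and the output layout are assumptions made for this illustration.

```python
import ast


def code_to_graph(source):
    """Turn Python source into (nodes, edges) built from its syntax tree.

    Hypothetical helper for illustration only.
    nodes: list of AST node type names, indexed by node id.
    edges: list of (parent_id, child_id) pairs.
    """
    tree = ast.parse(source)
    nodes = []
    edges = []

    def visit(node, parent_id):
        # Assign the next free index to this node and record its type.
        node_id = len(nodes)
        nodes.append(type(node).__name__)
        if parent_id is not None:
            edges.append((parent_id, node_id))
        # Recurse into the direct children in syntax order.
        for child in ast.iter_child_nodes(node):
            visit(child, node_id)

    visit(tree, None)
    return nodes, edges


nodes, edges = code_to_graph("def add(a, b):\n    return a + b\n")
print(len(nodes), "nodes,", len(edges), "edges")
print(nodes[:3])
```

A node-type list plus an edge list of this kind is the usual starting point for a GNN input: the types can be mapped to integer features, and the edge pairs can be transposed into the two-row `edge_index` layout that PyTorch Geometric expects. Richer representations would add further edge kinds, such as data-flow or control-flow edges.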

Egor Bogomolov

Olga Petrova

Mikhail Evtikheev