Thesis topics from Simmo Saan

This page lists concrete thesis topics I can offer (as of autumn 2024), but other related topics could be possible on request. You may be inspired by the list of defended theses which I (co-)supervised.

As of autumn 2024, I am at capacity and not taking on additional students to supervise.

Automaadid, keeled ja translaatorid (AKT)

Our compiler course has remained largely unchanged for many years but we would like to make some updates. This is where you can help us if you're interested in Java development.
(These are in Estonian because the course is in Estonian.)

BSc:

  1. AKT koodibaasi moderniseerimine Java 21-le. AKT üheks keskseks teemaks on avaldispuu läbimine, milleks kasutatakse visitor-i disainimustrit. Visitor-id on keerulised ja koodirohked, mistõttu võiks need asendada moodsamate Java 21 võimalustega: mustrisobitus, sealed klassid ja kirjed. Lõputöö eesmärk on AKT koodibaas nende abil moderniseerida. See vajab Java koodi refaktoriseerimist.
  2. Funktsioonide tugi CMa teegile. AKT-s õpetatakse kompileerimist abstraktse masina CMa põhjal. Selleks on loodud CMa teek, mis on inspireeritud ASM teegist Java baitkoodi genereerimiseks. CMa võimaldab funktsioonikutseid, kuid AKT-s seda osa ei õpetata ja meie CMa teek seda ei toeta. Lõputöö eesmärk on täiendada seda CMa teeki funktsioonide toega. See vajab Java teegi disainimist ja programmeerimist.

(Võimalus oleks ka AKT süvenduspraktikumi jaoks mingeid materjale luua OCaml-is.)


Goblint

My research is centered around our Goblint static analyzer for (multi-threaded) C programs. Since we don't (widely) teach program analysis or OCaml, these topics require some learning but only about specific parts of Goblint. These are a good step if you're interested in continuing with a MSc degree and getting deeper into program analysis.
(These are in English because Goblint research and development is in English, but BSc theses are usually written still in Estonian.)

BSc:

  1. Code Coverage Reports for Static Analysis. In software testing, code coverage is used to measure and visualize which parts of the program are executed by a test suite to discover untested code. The goal of this thesis is to produce similar reports from static analysis to discover unanalyzed code. It requires simple OCaml programming and little knowledge of Goblint's analyses.
  2. Coloring Data Race Interference Graphs. Goblint detects and reports data races based on an interference graph of memory accesses. The goal of this thesis is to apply graph coloring algorithms to such graphs and improve Goblint's race warnings using this information. It requires OCaml programming and understanding of Goblint's race analysis.
  3. Bit-Vector Abstract Domain for Goblint. C code, which is analyzed by Goblint, often uses low-level bitwise operations on integers. Goblint currently lacks an abstraction of integers that can reason about individual bits. The goal of this thesis is to add a bit-vector abstract domain to Goblint. This requires OCaml programming and understanding of Goblint's integer domains.
  4. Weak Dependencies for Side-Effecting Constraint Systems. Goblint uses side-effecting constraint systems and the TD3 solver to analyze multi-threaded programs. Due to the demand-driven nature of the solver, Goblint's constraint system has an inefficiency: the spawning thread fully depends on the spawned thread. The goal of this thesis is to avoid this by introducing weak dependencies which hopefully can speed up Goblint. It requires OCaml programming and understanding of the TD3 solver.
  5. Alternative Control-Flow Graphs for Goblint. Goblint represents code to-be analyzed using control-flow graphs (CFGs). Sometimes they are too optimized for debugging, other times they may be too unoptimized for usability or performance. The goal of this thesis is to add alternative CFG constructions to Goblint. It requires OCaml programming.

MSc:

  1. Garbage-Collector–Aware Static Analysis of OCaml C stubs. OCaml programs can be extended using low-level C code (aka C stubs) to integrate existing C libraries or achieve higher performance. Writing C stubs is extremely error-prone, because they have to live in harmony with the OCaml garbage collector (GC), and difficult to debug. Goblint has previously been used to do race detection of OCaml 5 C stubs. The goal of this thesis is to use Goblint for GC-aware analysis of OCaml C stubs. It requires OCaml programming, understanding of Goblint and OCaml GC.

Goblint-adjacent

Goblint is a decently-sized project which requires and uses various supporting software. These topics don't involve OCaml programming and require less program analysis background knowledge, but still help the Goblint project as a whole.
(These are in English because Goblint research and development is in English, but BSc theses are usually written still in Estonian.)

BSc:

  1. Minimization for Octagon Abstract Domain in Apron. Goblint uses the Apron library for relational numeric analysis. Its octagon domain infers two-variable linear inequalities of the form ±x ± y ≤ c . The output from Apron is not very user-friendly: for large programs with many variables there are many redundant constraints which are implied by others. The goal of this thesis is to implement octagon minimization in Apron. It requires C programming and basic linear algebra.
  2. Verified history-based thread IDs. Goblint recently introduced history-based thread IDs which are an abstraction of the stack data structure. Some operations in the paper can be made more precise by relying on additional properties. The goal of this thesis is to formalize these thread IDs/stack abstraction, prove such properties and the correctness of the improved operations. It requires proving in the Coq proof assistant.
  3. GobExec for testing. For over 16 years, Goblint has been tested using a Ruby script which no longer fulfills our testing needs. The goal of this thesis is to extend GobExec, a new benchmarking framework for Goblint, with testing support to replace our existing script. It requires advanced Python programming and minimal knowledge of Goblint itself.