Thesis topics from Simmo Saan

This page lists concrete thesis topics I can offer (as of autumn 2025), but other related topics could be possible on request. You may be inspired by the list of defended theses which I (co-)supervised.


Goblint

My research is centered around our Goblint static analyzer for (multi-threaded) C programs. Since we don't (widely) teach program analysis or OCaml, these topics require some learning but only about specific parts of Goblint. These are a good step if you're interested in continuing with a MSc degree and getting deeper into program analysis.
(These are in English because Goblint research and development is in English, but BSc theses are usually still written in Estonian.)

BSc:

  1. Evaluating Side-Effecting Constraint System Solvers. Goblint analyzes multi-threaded programs using side-effecting constraint systems, which were introduced 22 years ago by Seidl, Vene and Müller-Olm. Many solvers have been proposed for these systems: naïve, worklist, SLR and TD3. They have been evaluated individually, but not against each other. The goal of this thesis is to compare them all experimentally and understand the differences. It requires OCaml programming and understanding of the solvers.
  2. Early Destabilization in the Top-Down Constraint Solver. Goblint uses side-effecting constraint systems and the TD3 solver to analyze multi-threaded programs. This solver has an inefficiency: it keeps computing things which it knows have to be recomputed. The goal of this thesis is to avoid this by introducing early destabilization which hopefully can speed up Goblint. It requires OCaml programming and understanding of the TD3 solver.
  3. Value Analysis of Thread-Local Variables in Goblint. One idiom found in multi-threaded programs is the use of thread-local variables. However, they are poorly supported by static analyzers, including Goblint. The goal of this thesis is to improve the precision of the value analysis of thread-local variables in Goblint. It requires OCaml programming.
  4. Value Analysis of Atomic Variables in Goblint. One idiom found in multi-threaded programs is the use of atomic variables. However, they are poorly supported by static analyzers, including Goblint. The goal of this thesis is to improve the precision of the value analysis of atomic variables in Goblint. It requires OCaml programming.
  5. Subsumption in Data Race Analysis. Goblint detects and reports data races using a combination of locksets, thread IDs, etc. The goal of this thesis is to automatically identify certain redundancies in Goblint's race warnings and simplify them. It requires OCaml programming and understanding of Goblint's race analysis.
  6. Alternative Control-Flow Graphs for Goblint. Goblint represents code-to-be-analyzed using control-flow graphs (CFGs). Sometimes they are too optimized for debugging, other times they may be too unoptimized for usability or performance. The goal of this thesis is to add alternative CFG constructions to Goblint. It requires OCaml programming.
  7. Code Coverage Reports for Static Analysis. In software testing, code coverage is used to measure and visualize which parts of the program are executed by a test suite to discover untested code. The goal of this thesis is to produce similar reports from static analysis to discover unanalyzed code. It requires simple OCaml programming and little knowledge of Goblint's analyses.

MSc:

  1. Self-Refining Abstract Interpretation. Counterexample-guided abstraction refinement (CEGAR) is a general method for automatically improving the precision of program analysis by excluding impossible execution paths. The version currently implemented in Goblint is very limited and relies on an external SMT solver. The goal of this thesis is to use abstract interpretation itself to identify impossible execution paths. It requires OCaml programming and understanding of abstract interpretation.
  2. Why Doesn't Goblint Scale Up? Goblint uses abstract interpretation which is supposed to be fast. Although Goblint is quick on small and medium-sized programs, it does not scale so well to large real-world programs, which other abstract interpreters can handle. The goal of this thesis is to find out why and try to improve the scalability of Goblint. This requires profiling OCaml code and understanding of Goblint's analyses.

Goblint-adjacent

Goblint is a decently-sized project which requires and uses various supporting software. These topics don't involve OCaml programming and require less program analysis background knowledge, but still help the Goblint project as a whole.
(These are in English because Goblint research and development is in English, but BSc theses are usually still written in Estonian.)

BSc:

  1. Minimization for Octagon Abstract Domain in Apron. Goblint uses the Apron library for relational numeric analysis. Its octagon domain infers two-variable linear inequalities of the form ±x ± y ≤ c . The output from Apron is not very user-friendly: for large programs with many variables there are many redundant constraints which are implied by others. The goal of this thesis is to implement octagon minimization in Apron. It requires C programming and basic linear algebra.
  2. SV-COMP Benchmarks from Goblint Regression Tests. The Competition on Software Verification (SV-COMP) has the largest benchmark suite of C programs for automated software verification. Goblint itself uses C programs as regression tests, but they are not directly suitable for SV-COMP. The goal of this thesis is to develop a robust automatic method for converting Goblint regression tests into SV-COMP benchmarks. It requires minimal knowledge of Goblint itself.
  3. Verified History-Based Thread IDs. Goblint uses history-based thread IDs which are an abstraction of the stack data structure. The goal of this thesis is to formalize these thread IDs/stack abstraction and prove them correct. It requires proving in a proof assistant (e.g. Rocq).
  4. GobExec for Testing. For over 16 years, Goblint has been tested using a Ruby script which no longer fulfills our testing needs. The goal of this thesis is to extend GobExec, a new benchmarking framework for Goblint, with testing support to replace our existing script. It requires advanced Python programming and minimal knowledge of Goblint itself.

Automaadid, keeled ja translaatorid (AKT)

Our compiler course has remained largely unchanged for many years but we would like to make some updates. This is where you can help us if you're interested in Java development.
(These are in Estonian because the course is in Estonian.)

BSc:

  1. Funktsioonide tugi CMa teegile. AKT-s õpetatakse kompileerimist abstraktse masina CMa põhjal. Selleks on loodud CMa teek, mis on inspireeritud ASM teegist Java baitkoodi genereerimiseks. CMa võimaldab funktsioonikutseid, kuid AKT-s seda osa ei õpetata ja meie CMa teek seda ei toeta. Lõputöö eesmärk on täiendada seda CMa teeki funktsioonide toega. See vajab Java teegi disainimist ja programmeerimist.

(Võimalus oleks ka AKT süvenduspraktikumi jaoks mingeid materjale luua OCaml-is.)