Meno:Michal
Priezvisko:Linhard
Názov:Data structure for representation of maximal repeats in strings
Vedúci:Mgr. Tibor Hegedüs
Rok:2007
Blok:PPS
Kµúčové slová:maximal repeats, representation of repeats, repetition in strings
Abstrakt:In this diploma thesis we present data structure for representation of maximal repeats in strings - R3 tree, based on well known data structure - suffix tree. It requires O(n) space and it can be constructed in O(n) time and space for string of length n over constant-sized alphabet. We formalize repeat in string S as triple (p1, p2, l), where p1, p2 are two distinct positions in S and l is the length of the repeat. We formulate query for maximal repeats in S in the form of the function findPairs(p1, k, S) that returns all pairs (p2, l) such that (p1, p2, l) is maximal repeat in S with l >= k. R3 tree allows computation of findPairs queries in optimal time O(z), where z is the number of found pairs. We also describe design and functionality of R3lib - library written in C, for finding maximal repeats in arbitrary binary data, that works with proposed structure.

Súbory diplomovej práce:

DiplomaThesis.pdf
r3lib1.0.0.zip