STARK
STARK is a command-line tool for bottom-up statistical analysis of dependency-parsed corpora, complementing the prevailing approaches to treebank browsing, which rely on predefined queries. For a given treebank in the CoNLL-U format, the tool extracts all dependency trees that match the user's settings (from specific phrases to more abstract syntactic patterns) and reports their frequency along with other useful statistics. Within SPOT, STARK will be used to identify speech-specific syntactic patterns in the SST treebank by comparing it to its written SSJ counterpart.
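To make the bottom-up idea concrete, the sketch below counts the simplest possible tree pattern, a head and one dependent, in a CoNLL-U fragment. It is only an illustration of the general technique and is not STARK's own implementation or command-line interface; the function names `read_sentences` and `count_head_dependent_patterns` and the English sample sentences are hypothetical and invented for this example.

```python
from collections import Counter

# Toy CoNLL-U fragment (10 tab-separated columns per token line).
# The sentences are invented for illustration only.
SAMPLE = """\
# text = The dog barks loudly
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\tdog\tdog\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tbarks\tbark\tVERB\t_\t_\t0\troot\t_\t_
4\tloudly\tloudly\tADV\t_\t_\t3\tadvmod\t_\t_

# text = A cat sleeps
1\tA\ta\tDET\t_\t_\t2\tdet\t_\t_
2\tcat\tcat\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\tsleeps\tsleep\tVERB\t_\t_\t0\troot\t_\t_
"""


def read_sentences(conllu_text):
    """Yield each sentence as a list of (id, upos, head, deprel) tuples."""
    sentence = []
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        sentence.append((int(cols[0]), cols[3], int(cols[6]), cols[7]))
    if sentence:
        yield sentence


def count_head_dependent_patterns(conllu_text):
    """Count (head UPOS, relation, dependent UPOS) patterns across all sentences."""
    counts = Counter()
    for sentence in read_sentences(conllu_text):
        upos_by_id = {tok_id: upos for tok_id, upos, _, _ in sentence}
        for tok_id, upos, head, deprel in sentence:
            if head == 0:
                continue  # the root has no head token
            counts[(upos_by_id[head], deprel, upos)] += 1
    return counts


for pattern, freq in count_head_dependent_patterns(SAMPLE).most_common():
    print(freq, pattern)
```

Running the same count on two treebanks and comparing relative frequencies is one simple way to approximate the spoken-versus-written comparison described above; STARK itself covers larger trees and additional statistics.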
Drevesnik
Drevesnik is a web interface for querying Slovenian dependency-parsed corpora, which allows linguists and other researchers to explore various grammatical phenomena in Slovenian. After the user enters a query and selects the corpora of interest, the results are displayed as dependency trees (graphs) and can also be downloaded for further analysis. Within SPOT, Drevesnik is used for qualitative linguistic analysis of syntactic patterns in the treebanks of written and spoken Slovenian.
Q-CAT
Q-CAT is a desktop application for customizable manual linguistic annotation of corpora, offering advanced corpus query capabilities based on these annotations. The tool has been used in numerous annotation campaigns for Slovenian, including the annotation of dependency relations, semantic roles, named entities, and multi-word expressions. Within SPOT, Q-CAT is employed for the manual dependency parsing of new SST transcripts, for which integration of audio recordings has also been enabled.
Označevalnik
Označevalnik CJVT is an online service for automatic grammatical annotation of Slovenian texts, based on the CLASSLA-Stanza tool for Slovenian language processing. It assigns various morphological, syntactic, and semantic features to the surface words in a text, such as base forms, parts of speech, and syntactic roles. Texts annotated in this way are considerably easier to analyze further, because relevant linguistic phenomena can be retrieved more quickly for purposes such as linguistic research or data mining.