The topics offered by Wilfried Gansterer (wilfried.gansterer(at)univie.ac.at) focus on various aspects of numerical algorithms. They usually require an interest in numerical algorithms and (large-scale) matrix computations as well as in high performance computing and parallel computing. A list of currently open topics follows, but you can also contact me and suggest your own project idea!

  • Efficient checkpointing for achieving fault tolerance in large-scale computations
  • Fault-tolerant iterative linear solvers
    • Interpolation-based fault tolerance for the GMRES algorithm
    • Exact state reconstruction for the GMRES algorithm
  • Robustness and fault tolerance in (deep) neural networks
  • Spectral divide-and-conquer algorithms for solving large-scale eigenvalue problems
  • Crime prediction challenge: designing, implementing and evaluating a method for predicting time and location of criminal incidents based on historical data
  • Real-Time Breaking News Detection
  • With the recent rise in the popularity and size of social media, there is a growing need for systems that can extract useful information from this volume of data. 500 million tweets are produced every day, many of them retweets or follow-up messages. The challenge is to find the initial news item that corresponds to a real-world event. To make event detection feasible on web-scale corpora, algorithms based on locality-sensitive hashing will be examined more closely. These algorithms will be implemented and tested in practice, with a close look at their runtime requirements. The goal is to detect events in real time based on the Twitter streaming API.

    Area: High Performance Computing, Data Mining
    Recommended: FDA, Multimedia Retrieval and Content Based Search
    Language: C++ / Python

    Contact: Univ. Prof. Dipl. Ing. Dr. Wilfried Gansterer, M.Sc. / Markus Tretzmüller

    Note: This project is supported by a startup dedicated to financial time series forecasting.
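
To give a rough idea of the approach, the following is a minimal sketch of first-story detection with locality-sensitive hashing. It is an illustration under stated assumptions, not the method to be developed in the project: the SimHash-style signature, the banding into four hash tables, the similarity threshold and all names (`FirstStoryDetector`, `is_new_event`, `NUM_BITS`, `BAND_SIZE`, `THRESHOLD`) are hypothetical choices, and plain strings stand in for messages from the streaming API.

```python
import hashlib
import math
from collections import Counter, defaultdict

NUM_BITS = 32    # signature length (assumed value)
BAND_SIZE = 8    # bits per band, giving 32 / 8 = 4 hash tables (assumed value)
THRESHOLD = 0.5  # cosine similarity above which a tweet counts as a follow-up (assumed)


def bag_of_words(text):
    """Very simple tokenization: lower-case and split on whitespace."""
    return Counter(text.lower().split())


def word_pattern(word):
    """Deterministic pseudo-random +1/-1 pattern, one entry per signature bit."""
    value = int.from_bytes(hashlib.md5(word.encode("utf-8")).digest(), "big")
    return [1 if (value >> i) & 1 else -1 for i in range(NUM_BITS)]


def signature(bag):
    """SimHash-style signature: sign of the weighted sum of the word patterns."""
    totals = [0] * NUM_BITS
    for word, count in bag.items():
        for i, sign in enumerate(word_pattern(word)):
            totals[i] += sign * count
    return tuple(int(t >= 0) for t in totals)


def cosine(a, b):
    """Exact cosine similarity between two bags of words."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class FirstStoryDetector:
    def __init__(self):
        # (band start, band bits) -> bags of all earlier tweets in that bucket
        self.tables = defaultdict(list)

    def _keys(self, sig):
        return [(start, sig[start:start + BAND_SIZE]) for start in range(0, NUM_BITS, BAND_SIZE)]

    def is_new_event(self, text):
        """Return True if no sufficiently similar earlier tweet shares a bucket."""
        bag = bag_of_words(text)
        keys = self._keys(signature(bag))
        best = 0.0
        for key in keys:
            # Only tweets that collide in at least one band are compared exactly.
            for other in self.tables[key]:
                best = max(best, cosine(bag, other))
        for key in keys:
            self.tables[key].append(bag)
        return best < THRESHOLD
```

Calling `is_new_event` once per incoming tweet flags candidates for initial news. The buckets restrict the exact cosine comparison to a small candidate set, which is what keeps the cost per tweet low. In this sketch the buckets grow without bound, so a real stream would additionally need some limit on bucket size or age.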

  • High Performance Topic Models for Short-Text Classification
  • Short texts are popular on today's Web, especially with the emergence of social media. Inferring topics from large collections of short texts is a critical but challenging step in many content analysis tasks. An extensive collection of tweets is used to test topic models, such as the Mixture of Unigrams or Latent Dirichlet Allocation, in terms of effectiveness and efficiency. The goal is to extract hidden topics from a text corpus of tweets within a reasonable runtime.

    Area: Data Mining, High Performance Computing
    Requirements: FDA
    Language: C++ / Python

    Contact: Univ. Prof. Dipl. Ing. Dr. Wilfried Gansterer, M.Sc. / Markus Tretzmüller

    Note: This project is supported by a startup dedicated to financial time series forecasting.
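
As an illustration of one of the models mentioned above, here is a minimal sketch of a Mixture of Unigrams fitted with expectation maximization. It is only a sketch under assumptions: the function name `fit_mixture_of_unigrams`, the smoothing parameters `alpha` and `beta`, the whitespace tokenization and the random initialization are illustrative choices, and the code makes no attempt at the performance the project is about.

```python
import math
import random
from collections import Counter


def fit_mixture_of_unigrams(docs, num_topics, iterations=50, alpha=0.1, beta=0.1, seed=0):
    """Fit a mixture of unigrams: every document is generated by exactly one topic.

    Returns (pi, phi, resp): topic weights, per-topic word distributions and
    per-document topic responsibilities.
    """
    rng = random.Random(seed)
    bags = [Counter(doc.lower().split()) for doc in docs]
    vocab = sorted({word for bag in bags for word in bag})

    # Start from a random soft assignment of documents to topics.
    resp = []
    for _ in bags:
        row = [rng.random() + 1e-3 for _ in range(num_topics)]
        total = sum(row)
        resp.append([r / total for r in row])

    for _ in range(max(1, iterations)):
        # M-step: re-estimate topic weights and word distributions (smoothed).
        pi = [
            (sum(row[k] for row in resp) + alpha) / (len(bags) + num_topics * alpha)
            for k in range(num_topics)
        ]
        phi = []
        for k in range(num_topics):
            counts = {word: beta for word in vocab}
            for row, bag in zip(resp, bags):
                for word, count in bag.items():
                    counts[word] += row[k] * count
            total = sum(counts.values())
            phi.append({word: value / total for word, value in counts.items()})

        # E-step: posterior probability of each topic given the document,
        # computed in log space to avoid underflow.
        for d, bag in enumerate(bags):
            logp = [
                math.log(pi[k]) + sum(c * math.log(phi[k][w]) for w, c in bag.items())
                for k in range(num_topics)
            ]
            top = max(logp)
            weights = [math.exp(lp - top) for lp in logp]
            total = sum(weights)
            resp[d] = [w / total for w in weights]

    return pi, phi, resp
```

The most probable words of topic `k` can be read off by sorting `phi[k]` by value. Because the model assigns a single topic to each document, it is often considered a natural fit for very short texts, whereas Latent Dirichlet Allocation allows a mixture of topics per document.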

  • Outlier Detection for Text Data Streams by Low Rank Approximation
  • Outlier detection is a key task in data stream analysis. It is extremely challenging in domains such as text, where almost all attribute values of the feature vectors are zero. In such cases it often becomes difficult to separate the outliers from the natural variations. One strategy for outlier detection in text data is an iterative algorithm called TONMF, which is based on a block coordinate descent method. An advantage of matrix factorization methods is that they decompose the term-document structure of the underlying corpus into a set of semantic term clusters and document clusters. The semantic nature of this decomposition provides the context in which a document may be interpreted for outlier analysis. Thus, with a low-rank approximation, documents are decomposed into word clusters and words are decomposed into document clusters. Outliers are therefore defined as data points that cannot be naturally expressed in terms of this decomposition.

    Area: Data Mining, High Performance Computing
    Preferred Requirements: Numerical High Performance Algorithms

    Contact: Univ. Prof. Dipl. Ing. Dr. Wilfried Gansterer, M.Sc. / Michael Trimmel

    Note: This project is supported by a startup dedicated to financial time series forecasting.
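
The idea of scoring documents by how poorly a low-rank factorization reproduces them can be sketched as follows. Please note that this is not TONMF and does not use block coordinate descent: as a simpler stand-in, it uses plain non-negative matrix factorization with multiplicative updates and takes the residual norm of each document column as its outlier score. The function names (`nmf`, `outlier_scores`), the parameters and the dense list-of-lists matrix format (terms in rows, documents in columns) are assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Product of two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def nmf(a, rank, iterations=200, eps=1e-9, seed=0):
    """Approximate the non-negative matrix a (terms x documents) by w * h.

    w (terms x rank) groups terms, h (rank x documents) groups documents.
    Uses multiplicative updates for the Frobenius norm objective.
    """
    rng = random.Random(seed)
    n_terms, n_docs = len(a), len(a[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_terms)]
    h = [[rng.random() + 0.1 for _ in range(n_docs)] for _ in range(rank)]
    for _ in range(iterations):
        # Update h with w fixed.
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_docs)]
             for k in range(rank)]
        # Update w with h fixed.
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n_terms)]
    return w, h


def outlier_scores(a, rank, **kwargs):
    """Residual norm per document: large values are poorly explained by w * h."""
    w, h = nmf(a, rank, **kwargs)
    approx = matmul(w, h)
    return [
        math.sqrt(sum((a[i][j] - approx[i][j]) ** 2 for i in range(len(a))))
        for j in range(len(a[0]))
    ]
```

Documents with the largest scores are the ones that cannot be expressed well by the shared term clusters. A realistic term-document matrix is large and sparse, so an actual implementation would rely on sparse data structures and optimized linear algebra routines instead of Python lists.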