2:30-3:15 | Bei Wang (University of Utah) | Stratification Learning with Computational Topology: Overview, Challenges, and Opportunities |
3:15-4:00 | Brittany Fasy (Montana State University) | Topological Data Analysis for Roads and Histology |
4:30-5:15 | Melanie Schmidt (University of Bonn) | Practical Theory for Geometric Center Based Clustering |
5:15-6:00 | Ioannis Z. Emiris (NKU Athens, and ATHENA Research Center) | Randomized Projections for Geometric Search in High Dimension |
2:30-3:15 | Bei Wang (University of Utah) |
A fundamental question in the study of high-dimensional data is the following: given high-dimensional point cloud samples, how can we infer the structure of the underlying data? In manifold learning, we assume the data is supported on a low-dimensional space with a manifold structure. However, such an assumption may be too restrictive in practice, when the point cloud samples come not from a manifold but from a stratified space, which may contain singularities and regions of mixed dimensionality. Stratification learning can be viewed as an unsupervised, exploratory clustering process that infers a decomposition of data sampled from a stratified space into disjoint subsets capturing recognizable and meaningful structural information. In recent years, there have been significant efforts in computational topology that are relevant to stratification learning. In this talk, I will give an overview of these efforts and discuss challenges and opportunities. In particular, I will focus on stratification learning using local homology, persistent homology, sheaves, and discrete stratified Morse theory.
Bio: Dr. Bei Wang is an assistant professor in the School of Computing and a faculty member at the Scientific Computing and Imaging (SCI) Institute, University of Utah. She received her Ph.D. in Computer Science from Duke University. Her research interests include topological data analysis, data visualization, machine learning, and data mining. |
3:15-4:00 | Brittany Fasy (Montana State University) |
Topological (and geometric) descriptors of data have the potential to be rich sources of information for analysis and comparison. While we aim for methods that are parameter-free in theory, the reality in practice is that there are tuning parameters and decisions to make among the various topological and geometric descriptors. We motivate this talk with two problems: (1) road network comparison; and (2) prostate cancer biopsy analysis. In both problems, we see that collaboration with domain experts is important for motivating and advancing the theory.
Bio: Dr. Brittany Terese Fasy is an assistant professor in the School of Computing, and affiliated with the Department of Mathematical Sciences at Montana State University. She received her Ph.D. in Computer Science from Duke University in 2012, and did two post-docs: at Carnegie Mellon University and at Tulane University. Her research is in computational topology. |
4:30-5:15 | Melanie Schmidt (University of Bonn) |
Clustering is an important unsupervised learning task. In this talk we discuss geometric center-based clustering problems such as facility location, k-median, and k-means. These are well studied in their standard form, yet our focus lies on extending them to more practical scenarios. First, we are interested in adding constraints to the standard formulations of these problems. While capacitated clustering and clustering with outliers have received considerable attention, constraints like fairness (where points have colors and clusters should be balanced) and anonymity (where centers have a lower bound on the number of assigned points) are less studied. Second, we review results on big-data clustering, in particular in data streams. It is possible to reduce both the number of points in a data set (via so-called coresets) and its dimension (via dimensionality reduction) to a size polynomial only in k and the precision of the summary. We conclude by posing some open questions that arise when combining constraints with the big-data scenario.
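The dimensionality reduction mentioned above is commonly done with a random Gaussian projection in the Johnson-Lindenstrauss style, which approximately preserves pairwise Euclidean distances and hence k-means costs. The following is a minimal illustrative sketch of that standard idea (not the speaker's specific construction); the function name and toy data are our own:

```python
import numpy as np

def jl_project(points, target_dim, rng=None):
    """Project points into target_dim dimensions with a random Gaussian
    matrix; pairwise distances are approximately preserved with high
    probability (Johnson-Lindenstrauss style)."""
    rng = np.random.default_rng(rng)
    d = points.shape[1]
    # Entries scaled by 1/sqrt(target_dim) so that squared norms, and
    # hence pairwise distances, are preserved in expectation.
    R = rng.normal(0.0, 1.0 / np.sqrt(target_dim), size=(d, target_dim))
    return points @ R

# Toy data: two well-separated Gaussian blobs in 100 dimensions.
rng = np.random.default_rng(0)
blob_a = rng.normal(0.0, 1.0, size=(50, 100))
blob_b = rng.normal(8.0, 1.0, size=(50, 100))
X = np.vstack([blob_a, blob_b])

# After projecting to 10 dimensions, the two blobs remain far apart
# relative to their internal spread, so a k-means step on the reduced
# data would still separate them.
Y = jl_project(X, target_dim=10, rng=1)
```

In a streaming setting one would combine such a projection with a coreset construction, so that both the number of points and the dimension of the summary depend only on k and the desired precision.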
Bio: Melanie Schmidt received her PhD from TU Dortmund in Germany. After visiting Carnegie Mellon University in Pittsburgh, USA, for a year, she is now working as a postdoc in the theoretical computer science group at Bonn University. Her work focuses on clustering problems. |
5:15-6:00 | Ioannis Z. Emiris (NKU Athens, and ATHENA Research Center) |
Several successful algorithms for approximate geometric search, typically relying on tree-based data structures, exhibit complexity exponential in the ambient dimension, a manifestation of the curse of dimensionality. This has been addressed by Locality Sensitive Hashing and randomized hash-tables. We follow a different tack by employing randomized projections, which are simpler to define and implement, but have long been overlooked since they did not seem powerful enough. We manage to show that they achieve competitive construction time and query time complexities as well as smaller space consumption for the fundamental question of approximate nearest-neighbor search in Euclidean space. We extend our approach to more general topologies, including all LSH-able metrics, as well as products of Euclidean metrics, which generalize the discrete Frechet and dynamic time warping distances. Our work is motivated by H2020 project LAMBDA which focuses on expressive, yet long, vector descriptors of complex shapes, and on manipulating one-dimensional objects such as trajectories or time-series.
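To give a flavor of the projection idea, here is a toy sketch of approximate nearest-neighbor search via random one-dimensional projections: project the data set onto a few random directions, keep each projection sorted, and at query time examine only the points whose projections fall near the query's. This is a simplified illustration of the general technique, not the data structure analyzed in the talk; the class name and parameters are our own:

```python
import numpy as np

class RandomProjectionIndex:
    """Toy approximate nearest-neighbor index via random 1-D projections."""

    def __init__(self, points, num_projections=8, rng=None):
        rng = np.random.default_rng(rng)
        self.points = points
        d = points.shape[1]
        # Random directions; each gives a 1-D ordering of the data set.
        self.dirs = rng.normal(size=(num_projections, d))
        self.proj = points @ self.dirs.T            # shape (n, num_projections)
        self.order = np.argsort(self.proj, axis=0)  # sorted order per direction

    def query(self, q, candidates_per_projection=10):
        """Return the index of the best candidate near q."""
        cand = set()
        qp = q @ self.dirs.T
        for j in range(self.dirs.shape[0]):
            # Binary-search the query's projection in the sorted column,
            # then collect a window of neighbors around that position.
            col = self.proj[self.order[:, j], j]
            pos = np.searchsorted(col, qp[j])
            lo = max(0, pos - candidates_per_projection)
            hi = min(len(col), pos + candidates_per_projection)
            cand.update(self.order[lo:hi, j].tolist())
        # Exact distance computation only on the small candidate set.
        cand = np.fromiter(cand, dtype=int)
        dists = np.linalg.norm(self.points[cand] - q, axis=1)
        return cand[np.argmin(dists)]

# Usage: querying with a data point should recover that point.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 20))
index = RandomProjectionIndex(X, rng=7)
nearest = index.query(X[17])
```

The appeal is that construction is a single matrix multiplication plus sorting, and each query inspects only a small candidate set rather than the whole data set.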
Bio: Ioannis Emiris received his PhD in 1994 from the University of California at Berkeley. Since 2007 he has been Professor of Theoretical Computer Science at the University of Athens, where he heads the Laboratory of Algebraic and Geometric Algorithms. The lab focuses on problems in scientific and geometric computing, in particular computational algebra, the study and solution of polynomial equations, and computational, convex, and metric geometry. |