Andrea Karlova

Poster – Quantile Distortion Measures

Fractal Labs

Andrea Karlova

Bio

Andrea Karlova is Head of Data Science at Fractal Labs Ltd. Before moving to the startup industry, she was funded by Gorilla Science and, under the supervision of Patrick S. Hagan, co-authored a new volatility surface model.


During that time she visited Columbia University, the Oxford Man Institute, and University College London.


Early in her career, Andrea spent five years working in the Model Validation Unit at KBC Global Services and as a consultant for PwC.


Her background is in mathematics, primarily probability theory. She developed extensive knowledge of deep learning and reinforcement learning through close collaboration with the Computer Science Department at University College London.

Abstract

As part of feature preprocessing, we often need to identify an injective mapping from the original data manifold into a more convenient space, say Euclidean. Examples of such tasks are dimensionality reduction methods such as UMAP, t-SNE, and PCA. In natural language tasks, the same problem arises in the various word embedding algorithms.

To measure the quality of an embedding, it is often helpful to measure the distortion caused by moving from a manifold equipped with certain geometric properties into a different metric space. Typically, distortion measures are constructed as a functional of distances in the new metric space compared with distances in the original space. The problem has been actively addressed by defining, e.g., sigma-distortion, worst-case distortion, average-case distortion, and k-local distortion. All of the listed measures concentrate on particular regions of the density of the ratio of distances rather than providing information about the character of this density. This motivated us to introduce a new measure based on the quantiles of the density of functionals of these distances. We thus obtain a set of measures that quantify the overall distortion caused by the embedding.
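To make the idea of summarising the distribution of distance ratios concrete, the following is a minimal Python sketch. It is not the measure defined on the poster, whose exact construction the abstract does not give. It only assumes that distortion is read off the pairwise ratios of embedded to original Euclidean distances, and it reports empirical quantiles of those ratios alongside one common form of worst-case distortion (largest ratio divided by smallest ratio). The function names, the quantile levels, and the PCA-by-SVD toy embedding are all illustrative assumptions.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distances over all distinct pairs of rows (upper triangle)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(len(points), k=1)
    return dist[i, j]


def distance_ratios(original, embedded):
    """Ratio of embedded to original distance for every pair of points.

    Assumes the original points are pairwise distinct, so no division by zero.
    """
    return pairwise_distances(embedded) / pairwise_distances(original)


def distortion_summary(ratios, probs=(0.05, 0.25, 0.5, 0.75, 0.95)):
    """Summarise the ratio distribution (illustrative choice of statistics)."""
    return {
        # One common worst-case form: max expansion times max contraction.
        "worst_case": ratios.max() / ratios.min(),
        "mean_ratio": ratios.mean(),
        # Empirical quantiles describe the whole distribution, not one region.
        "quantiles": dict(zip(probs, np.quantile(ratios, probs))),
    }


# Toy data: 200 points in 10 dimensions, embedded into 2 dimensions by PCA
# (computed here with an SVD of the centred data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Xc = X - X.mean(axis=0)
_, _, vt = np.linalg.svd(Xc, full_matrices=False)
Y = Xc @ vt[:2].T

ratios = distance_ratios(X, Y)
summary = distortion_summary(ratios)
print(summary)
```

Because PCA is an orthogonal projection, every ratio in this toy example is at most 1, so the quantiles show how strongly distances are contracted across all pairs rather than only in the worst or average case.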

We present novel theoretical results as well as simulation experiments on problems related to dimensionality reduction algorithms and word embeddings.
