Adi Watzman

Poster – How to trust your data and results when there is no ground truth?

Weizmann Institute of Science

Bio

Adi is a computer science researcher with a passion for data and science. She is currently developing unsupervised methods for human microbiome research as part of her master's studies at the Weizmann Institute of Science (supervised by Prof. Eran Segal), and for improving public transportation in Israel as part of the Open-Bus team at The Public Knowledge Workshop.

She deals daily with the challenge of uncovering hidden structure in complex data that lacks ground truth.

Abstract

How to trust your data and results when there is no ground truth?
Many data-driven research questions share a common challenge: there is no ground truth. All we have is the raw, unlabeled data and our research question. In this talk/poster I will present my journey toward uncovering human microbiome dynamics from raw, noisy and biased DNA sequencing data. I will show how I combined a few basic domain-based assumptions with simulations and permutation tests to gain confidence in my method and results. I will also demonstrate how I took advantage of some simple properties of the data to reveal new insights.


In my research I studied how environmental agents such as antibiotics, probiotics and the people we live with affect the bacterial populations that reside in our gut. I developed a computational method for estimating the DNA sequence similarity between bacterial strains, and applied it to samples acquired from different individuals and from the same individuals over time. I then used an unsupervised approach to uncover hidden structure in the pairwise dissimilarity matrices and to reveal bacterial sharing between individuals.
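
To make the idea of a permutation test on a pairwise dissimilarity matrix concrete, here is a minimal sketch in Python. It is not the method from the poster: the statistic (mean within-individual minus mean between-individual dissimilarity), the function names, and the simulated toy data are all assumptions chosen for illustration. The logic is that, if samples from the same individual are really more similar to each other, shuffling the individual labels should rarely produce a statistic as extreme as the observed one.

```python
import numpy as np


def within_minus_between(dissim, labels):
    """Mean within-group dissimilarity minus mean between-group dissimilarity.

    Hypothetical test statistic chosen for this illustration.
    """
    labels = np.asarray(labels)
    rows, cols = np.triu_indices(len(labels), k=1)  # each unordered pair once
    same_group = labels[rows] == labels[cols]
    values = dissim[rows, cols]
    return values[same_group].mean() - values[~same_group].mean()


def permutation_test(dissim, labels, n_perm=2000, seed=0):
    """One-sided permutation test: is within-group dissimilarity unusually low?"""
    rng = np.random.default_rng(seed)
    observed = within_minus_between(dissim, labels)
    # Null distribution: the same statistic under randomly shuffled labels.
    null = np.array([
        within_minus_between(dissim, rng.permutation(labels))
        for _ in range(n_perm)
    ])
    # The +1 correction keeps the estimated p-value strictly above zero.
    p_value = (1 + np.sum(null <= observed)) / (1 + n_perm)
    return observed, p_value


# Simulated toy data (an assumption, not real microbiome data):
# 5 "individuals" with 4 samples each, every individual scattered
# around its own center in a 10-dimensional feature space.
rng = np.random.default_rng(42)
labels = np.repeat(np.arange(5), 4)
centers = rng.normal(size=(5, 10))
points = centers[labels] + 0.3 * rng.normal(size=(20, 10))
dissim = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

observed, p_value = permutation_test(dissim, labels)
print(f"observed statistic: {observed:.3f}, p-value: {p_value:.4f}")
```

Because the toy data is simulated with a known structure, it also acts as a small sanity check of the kind described above: a method that fails to detect structure planted on purpose would not be trustworthy on real data without ground truth.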
