Noa Agiv

Roundtables – Agile Data Science

Check Point


Bio

Noa Agiv is the Data Science team lead at Check Point, the cyber security company. Until recently, Noa focused on securing smartphones and mobile application markets against malicious applications. This was achieved by building machine learning models based on applications' code and artifacts, their web reputation, and their behavior as detected by Check Point's security engines. After its success in strengthening mobile security detection, Noa's team now dedicates its expertise and efforts to enabling data science across the organization. The team is building a framework that eases data access, research, and productization, and lets researchers from various departments focus on the data itself by using shared resources and in-code methodologies.

 

Abstract

Two years ago, as a new data science team inside Check Point, and as developers at heart, we wanted to research, compare, and quickly ship maintainable, retrainable cyber detection models to production.

Soon enough, we came to realize that apart from good models, we needed a data science framework to help us achieve that.

Now, two years later, equipped with the data science framework we built, we are set to embark on a new journey: generalizing this framework into a global one for use by data scientists across the organization.

Come hear how this framework is used to conduct research with minimal code, explore the value of that research, and convert it into a valuable production service with little effort.

Among the topics discussed, you will hear about streamed training, caching of intermediate results, combining different data sources into a joint model, human-readable probability prediction, a threshold selection mechanism, advanced automated model error analysis, and continuous delivery of data science production components.

Discussion Points

  • What are the challenges in the data science cycle?
  • How can agile data science development be achieved?
  • Approaches for productization infrastructure
  • Approaches for data pipeline infrastructure
  • Approaches for model evaluation infrastructure
  • Approaches for data exploration infrastructure
  • In-code methodologies as a concept
  • Can data and research functionality be managed the way code is managed in source control?

Planned Agenda
