Lightning Talks: 3 talks covering genomics, KNIME, and data wrangling with NLP

October 7, 2020 @ 6:30 pm - 8:00 pm UTC-4

Join us for three short talks.

—- Drag & Drop Data Science with KNIME Open Source, presented by Pedro Alexander Medina

—- Multi-language Data Wrangling Translations, presented by Anton Antonov

—- High-dimensional Genomics and COVID-19: Analysis of Patient Lung Cells, presented by James Choi

More info about the talks and the speakers:

—- Drag & Drop Data Science with KNIME Open Source
Pedro Alexander Medina is Founder & Chief Analytics Officer at Haystack Data Solutions, an advanced analytics agency specializing in custom managed solutions across the data value chain. With deep expertise in information management and advanced analytics, his mission is to help organizations optimize their strategic data assets by converting complex data into intelligence, intelligence into innovation, and innovation into success.

—- Multi-language Data Wrangling Translations
This presentation discusses how natural language commands can be used to rapidly specify data wrangling code.

We want to do that because:

1. Often, we have to apply the same data wrangling workflows within different programming languages and/or packages

2. It might be time-consuming to express those workflows with the concrete language/package logic and syntax

3. Natural language workflows are “universal”

We will demonstrate data transformation code generation for different programming languages/packages.
We focus on these three: R-base, R-tidyverse, Python-pandas.
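
As a rough illustration of the idea (not the presenter's system), the sketch below maps a few natural language commands to code for each of the three targets. The command phrasings, the translate function, and the default data frame name df are all hypothetical and chosen only for this example.

```python
import re

# Targets named in the abstract; everything else here is an assumption.
TARGETS = ("R-base", "R-tidyverse", "Python-pandas")


def _names(text):
    """Split 'a, b and c' into ['a', 'b', 'c']."""
    parts = re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", text.strip())
    return [p for p in parts if p]


def translate(command, target, var="df"):
    """Translate one toy natural language command into code for `target`.

    Only three hypothetical command forms are recognized:
      select the columns A, B and C
      keep rows where COL OP VALUE
      sort by COL
    """
    if target not in TARGETS:
        raise ValueError(f"unknown target: {target}")
    cmd = command.strip().rstrip(".")

    m = re.fullmatch(r"select (?:the )?columns? (.+)", cmd, re.I)
    if m:
        cols = _names(m.group(1))
        quoted = ", ".join(f'"{c}"' for c in cols)
        if target == "Python-pandas":
            return f"{var} = {var}[[{quoted}]]"
        if target == "R-tidyverse":
            return f"{var} <- {var} %>% dplyr::select({', '.join(cols)})"
        return f"{var} <- {var}[, c({quoted})]"

    m = re.fullmatch(r"keep (?:the )?rows where (\w+) (>=|<=|==|>|<) (\S+)", cmd, re.I)
    if m:
        col, op, val = m.groups()
        if target == "Python-pandas":
            return f'{var} = {var}[{var}["{col}"] {op} {val}]'
        if target == "R-tidyverse":
            return f"{var} <- {var} %>% dplyr::filter({col} {op} {val})"
        return f"{var} <- {var}[{var}${col} {op} {val}, ]"

    m = re.fullmatch(r"sort by (\w+)", cmd, re.I)
    if m:
        col = m.group(1)
        if target == "Python-pandas":
            return f'{var} = {var}.sort_values("{col}")'
        if target == "R-tidyverse":
            return f"{var} <- {var} %>% dplyr::arrange({col})"
        return f"{var} <- {var}[order({var}${col}), ]"

    raise ValueError(f"unrecognized command: {command!r}")


if __name__ == "__main__":
    workflow = [
        "select the columns name, age and sex",
        "keep rows where age > 30",
        "sort by age",
    ]
    for target in TARGETS:
        print(f"# {target}")
        for step in workflow:
            print(translate(step, target))
```

The same three-step workflow is written once in plain English and rendered three ways, which is the sense in which natural language workflows are "universal". A real system would need a proper grammar and far broader coverage than these three regular expressions.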

In addition to code generation examples, we will also outline the software strategy and architecture used, as well as the unit testing procedures.

Anton Antonov, Senior Research Scientist at Accendo Data, LLC, is an applied mathematician (PhD) with 28+ years of experience in algorithm development, scientific computing, mathematical modeling, operations research, natural language processing, machine learning, data science, and data mining. In the last twelve years, he has focused on developing machine learning algorithms and workflows for different industries (music, movies, points of interest, recruiting, and healthcare). Currently, he is working on operations research and data science applications to manufacturing and healthcare. Anton is a former kernel developer of Mathematica.

—- High-dimensional Genomics and COVID-19: Analysis of Patient Lung Cells
High-dimensional datasets in biology are on the rise. For example, genetic data can be extracted from each individual cell in the body to understand which genes are important for fighting disease. James will analyze genomics data from a COVID-19 dataset to demonstrate some of the techniques used in the field.

James Choi is a research associate at the University of Miami and studies inflammation after neurological injury using a mix of laboratory and computational methods. He has a general interest in applying machine learning techniques in medicine.




Data Science Study Group: South Florida

