Biography: Taeho Jo is currently a faculty member at Hongik University, South Korea. He received his Bachelor's degree from Korea University in 1994, his Master's degree from Pohang University of Science and Technology in 1997, and his PhD from the University of Ottawa in 2006. His research mainly spans text mining, neural networks, machine learning, and information retrieval. He has four years of experience working for industrial organizations and ten years of experience working for academic ones. He was recently listed in the worldwide biographical dictionary Marquis Who’s Who in the World twice, in 2016 and 2018.
Speech Title: Types of Text Clustering and Text Encoding Schemes
Abstract: This tutorial is concerned with modifying machine learning algorithms to process textual data. Among the tasks of processing symbolic data, we set text classification as the scope of this tutorial and explore the various types of classification tasks. We describe the process of encoding texts into structured forms: tables, string vectors, and graphs, as well as numerical vectors. We present schemes for modifying machine learning algorithms such as k nearest neighbor, Naïve Bayes, learning vector quantization, and perceptron into versions that can receive these alternative structured forms, rather than numerical vectors, as input data. The goal of this research is to improve word classification performance by solving problems such as high dimensionality and sparse distribution, which arise when textual data is encoded into numerical vectors.
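As an illustration of the idea in the abstract, the following is a minimal sketch of a k nearest neighbor classifier that receives tables instead of numerical vectors. It is not the speaker's implementation. The function names (encode_table, table_similarity, knn_classify), the use of raw word frequencies as weights, and the similarity measure (the share of total weight carried by words common to both tables) are all assumptions chosen for illustration.

```python
from collections import Counter


def encode_table(text):
    """Encode a text as a table mapping each word to its frequency.

    Assumption: whitespace tokenization and raw counts as weights.
    """
    return Counter(text.lower().split())


def table_similarity(a, b):
    """Illustrative similarity between two tables, in the range 0 to 1.

    Assumption: the weight of words shared by both tables divided by
    the total weight of both tables.
    """
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    shared = set(a) & set(b)
    return sum(a[w] + b[w] for w in shared) / total


def knn_classify(text, labeled_texts, k=3):
    """Label a text by majority vote among its k most similar training texts."""
    query = encode_table(text)
    ranked = sorted(
        labeled_texts,
        key=lambda pair: table_similarity(query, encode_table(pair[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set, invented for this sketch.
training = [
    ("stock market prices fall", "business"),
    ("market shares and stock trading", "business"),
    ("team wins the football match", "sports"),
    ("football players score in match", "sports"),
]

label = knn_classify("stock market trading today", training, k=3)
```

Because each text is compared through the words it actually contains, no fixed-size feature vector is built, which is how this kind of encoding sidesteps the dimensionality and sparsity issues mentioned in the abstract.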