05 Dec Analytic and Machine Learning
Innovative and effective analytic techniques and technologies are required to operate, continuously and in real time, on data streams and other data sources (Jayanthi, 2016). Machine learning is a discipline that aims to enable computers, without being explicitly programmed, to automate data-driven model building and the discovery of hidden insights. In other words, it automates the behaviour or logic needed to solve a particular problem through iterative learning from example data or past experience (Alpaydın, 2010; SAS, 2018; Bhavsar et al., 2017). Machine learning has already been applied successfully in many areas, including systems that analyse past sales data to predict customer behaviour, optimize robot behaviour so that a task can be completed using minimum resources, and extract knowledge from bioinformatics data (Bhavsar et al., 2017).
Put differently, machine learning aims to make computers and software learn how to program themselves and improve with experience or data, with the goal of solving particular problems (Mitchell, 2006). Typically, a machine learning algorithm is a specific recipe that tells a computer or piece of software how to improve itself from experience. A model is the result of training a machine learning algorithm on a set of data or experiences for a given problem, and it can then be employed to solve future related problems.
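The distinction between an algorithm and a model can be made concrete with a minimal sketch. The example below assumes ordinary least squares as the learning algorithm and uses made-up numbers for past experience; neither the function name `train_linear_model` nor the data comes from the cited sources. The algorithm is the training function, and the model is what it returns.

```python
from statistics import mean


def train_linear_model(xs, ys):
    """Learning algorithm (illustrative choice): ordinary least squares
    for a line y = slope * x + intercept fitted to example data."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar

    def model(x):
        # The model: the learned parameters applied to a new input.
        return slope * x + intercept

    return model


# Hypothetical past experience: advertising spend and units sold.
spend = [1.0, 2.0, 3.0, 4.0]
sold = [3.0, 5.0, 7.0, 9.0]

model = train_linear_model(spend, sold)  # training produces the model
prediction = model(5.0)                  # the model answers a future, related question
print(prediction)
```

Retraining the same algorithm on different data would yield a different model, which is why the two terms are kept apart.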
Machine learning algorithms fall into three broad categories: supervised learning, unsupervised learning, and reinforcement learning. Each category is briefly described below:
- In supervised learning, the aim is to learn a mapping from an input to an expected output that is provided by a supervisor or oracle (i.e., labelled data) (Alpaydın, 2010). Depending on the type of output, the task is either a classification or a regression problem. In classification, the output takes one of a finite number of discrete values, while in regression the output is numeric and can take infinitely many values (Alpaydın, 2010).
- In unsupervised learning, there is no such supervisor and only the input data is present. The aim of these algorithms is to find regularities in the input (Alpaydın, 2010; Bhavsar et al., 2017).
- Finally, reinforcement learning applies to cases where the learner is a decision-making agent that takes actions in an environment and receives a reward (or penalty) for those actions while trying to solve a problem. Thus, the learning process is guided by a series of feedback/reward cycles (Bhavsar et al., 2017). In contrast to supervised learning, the learning algorithm is not given examples of optimal outputs; instead, it must discover them through a process of trial and error (Bishop, 2006; Fei, 2019).
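The three categories can be contrasted with a small sketch. The specific techniques used here (a 1-nearest-neighbour classifier, one-dimensional 2-means clustering, and an epsilon-greedy agent on a two-armed bandit) are illustrative choices rather than methods named by the cited sources, and all data, function names, and reward probabilities are hypothetical.

```python
import random


def nearest_neighbour_classify(labelled, x):
    """Supervised: predict the label of the closest labelled example."""
    return min(labelled, key=lambda pair: abs(pair[0] - x))[1]


def two_means(values, iterations=10):
    """Unsupervised: find two cluster centres in unlabelled 1-D data."""
    centres = [min(values), max(values)]
    for _ in range(iterations):
        groups = ([], [])
        for v in values:
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres


def epsilon_greedy(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Reinforcement: learn action values by trial and error from rewards."""
    rng = random.Random(seed)
    estimates = [0.0] * len(reward_probs)
    counts = [0] * len(reward_probs)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_probs))  # explore
        else:
            action = max(range(len(reward_probs)), key=lambda a: estimates[a])  # exploit
        # The environment returns a reward of 1 or 0; no optimal action is ever shown.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates


# Supervised: inputs come with labels supplied by a supervisor.
labelled = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
label = nearest_neighbour_classify(labelled, 7.5)

# Unsupervised: the same inputs without labels; structure must be found.
centres = two_means([1.0, 2.0, 8.0, 9.0])

# Reinforcement: only rewards are observed (arm 1 pays off more often).
estimates = epsilon_greedy([0.2, 0.8])
print(label, centres, estimates)
```

The supervised function is the only one that ever sees a correct answer. The clustering function sees inputs only, and the agent sees only the rewards that follow its own actions.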
- Alpaydın, E. (2010). “Introduction to Machine Learning”. Second ed.
- Bhavsar, P., Safro, I., Bouaynaya, N., Polikar, R., Dera, D. (2017). “Machine learning in transportation data analysis”. In: Data Anal. Intell. Transp. Syst., pp. 283–307.
- Bishop, C.M. (2006). “Pattern Recognition and Machine Learning”. Springer, New York, New York, USA.
- Fei, , Shah, N., Verba, N., Chao, K. M., Sanchez-Anguix, V., James, A., Usman, Z. (2019). “CPS data streams analytics based on machine learning for Cloud and Fog Computing: A survey”. Future Generation Computer Systems, Volume 90, Pages 435-450.
- Jayanthi, M.D. (2016). “A framework for real-time streaming analytics using machine learning approach”. pp. 85–92.
- Mitchell, T.M. (2006). “The Discipline of Machine Learning”. Carnegie Mellon University, School of Computer Science, Machine Learning Department.
- SAS. (n.d.). “Machine Learning: What it is & why it matters”. https://www.sas.com/it_it/insights/analytics/machine-learning.html.