ML.NET makes machine learning accessible to the .NET developer community. Until now, the machine learning space has been dominated by other languages such as Python and C++. With ML.NET, .NET developers get access to a whole host of machine learning techniques for solving various problems. ML.NET makes integrating intelligent systems into an existing codebase simple and intuitive.
What is Machine Learning and ML.NET
Machine learning is a class of algorithms in which the desired behavior is not explicitly programmed; instead, it is learned from data and progressively improved. Machine learning is generally used to make sense of data that would be difficult to analyze otherwise. ML.NET is a machine learning framework from Microsoft that is free, open source, and cross-platform.
Common Machine Learning Techniques
ML.NET includes many machine learning techniques to solve a variety of machine learning problems. These techniques range from the very simple to the very complex, and each provides value in different ways. While ML.NET does not include every machine learning technique, it includes enough to get any .NET developer started with machine learning.
Linear Regression
Linear Regression is one of the simplest machine learning algorithms. Linear regression is used to model the relationship between one or more independent variables and a single dependent variable. It is often used to make predictions based on data points or to generate a trend line. ML.NET includes techniques to solve regression problems that are more complex than the simple linear regression algorithm.
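To make the idea of a line of best fit concrete, here is a minimal sketch of simple linear regression with one independent variable, written in plain Python rather than with ML.NET. The function names and the sample data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict the dependent variable for a new value of x."""
    return slope * x + intercept


# Hypothetical data points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
forecast = predict(slope, intercept, 5.0)
```

Because the sample points lie exactly on a line, the fit recovers that line. Real data is noisy, and the same formulas then give the line that minimizes the squared error.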
Decision Trees
Decision trees are among the most intuitive machine learning models, and there are many algorithms for building them. Decision trees often have the advantage of being quick to evaluate, accurate, and easy to interpret.
Decision trees are typically represented as binary trees in which each internal node is a decision based on an input variable, and each leaf represents a prediction. ML.NET includes the FastTree trainer, which is an efficient implementation of Multiple Additive Regression Trees (MART). A regression tree is similar to a decision tree, but a regression tree contains scalar values in its leaves. The FastTree trainer is effective for binary classification, regression, and ranking problems.
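The traversal described above can be sketched in a few lines of plain Python. This is a single hand-built regression tree, not the FastTree trainer (MART combines many such trees), and the `Node` class, feature meanings, thresholds, and leaf values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    # Internal nodes set feature, threshold, left, and right.
    # Leaves set only value, the scalar prediction.
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Optional[float] = None


def tree_predict(node: Node, row: Sequence[float]) -> float:
    """Walk from the root to a leaf, branching on one input variable per node."""
    while node.value is None:
        if row[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.value


# Hypothetical tree: feature 0 is a size, feature 1 is an age.
tree = Node(
    feature=0,
    threshold=100.0,
    left=Node(value=150.0),
    right=Node(
        feature=1,
        threshold=10.0,
        left=Node(value=300.0),
        right=Node(value=220.0),
    ),
)

small = tree_predict(tree, [80.0, 5.0])
large_new = tree_predict(tree, [120.0, 3.0])
large_old = tree_predict(tree, [120.0, 30.0])
```

Each prediction costs only as many comparisons as the tree is deep, which is one reason tree-based models are quick to evaluate.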
Artificial Neural Networks
Artificial neural networks are one of the most popular machine learning algorithms. Artificial neural networks are inspired by biological equivalents and can be visualized as layers of neurons connected by synapses.
The underlying math behind neural networks involves linear algebra and calculus. A vector of inputs is multiplied by a matrix of weights, combined with a constant bias, and sent through an activation function to become the input for the next layer. This step of an artificial neural network is called forward propagation; however, no learning takes place in forward propagation. Learning comes from minimizing the loss of an artificial neural network through back propagation. There are many algorithms to do this, but most of them involve calculating the partial derivative of the loss with respect to each weight and updating that weight accordingly.
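As a rough illustration of the two steps described above, the sketch below implements forward propagation for one layer and a single gradient-descent update for one neuron, in plain Python. The sigmoid activation, the squared-error loss, the learning rate, and all names and values are assumptions made for the example, not part of ML.NET.

```python
import math


def sigmoid(z):
    """Assumed activation function for this sketch."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, weights, biases):
    """One layer: activation(W x + b). weights holds one row per output neuron."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def train_step(inputs, target, weights, bias, lr=0.5):
    """One gradient-descent update for a single sigmoid neuron.

    Returns the updated weights and bias, plus the loss measured before the update.
    """
    out = forward(inputs, [weights], [bias])[0]
    loss = 0.5 * (out - target) ** 2
    # Chain rule: dLoss/dz = (out - target) * sigmoid'(z), with sigmoid'(z) = out * (1 - out).
    delta = (out - target) * out * (1.0 - out)
    # dLoss/dw_i = delta * x_i, and dLoss/db = delta.
    new_weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * delta
    return new_weights, new_bias, loss


# Hypothetical training example: push the output for this input toward 1.0.
weights, bias = [0.0, 0.0], 0.0
losses = []
for _ in range(100):
    weights, bias, loss = train_step([1.0, 0.5], 1.0, weights, bias)
    losses.append(loss)
```

The recorded losses shrink as training proceeds. A full network repeats the same chain-rule calculation layer by layer, passing the error backward from the output.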
Currently, ML.NET does not have great support for neural networks, though there are plans to expand that support in the future. Right now, ML.NET only exposes some aspects of TensorFlow to developers.
Common Machine Learning Problems
ML.NET can solve a variety of problems using some of the techniques outlined above. There are some machine learning problems that show up repeatedly. Two of the most common problems are classification and regression.
Classification
Classification is the task of assigning each item in a data set to one of two or more groups. There are two types of classification problems: binary and multi-class. Binary classification is where items are classified into just two groups. Some examples of binary classification include sentiment analysis, spam detection, and fraud detection. In its simplest form, sentiment analysis has just two classes, positive and negative.
For example, the statement “That was rude” has a negative sentiment. In contrast, the statement “Today is Great!” has a positive sentiment. ML.NET has many trainers for the binary classification problem; for sentiment analysis, Microsoft’s sample uses the FastTree binary classification trainer.
Multi-class classification is where there are more than two groups to classify into. A common example of multi-class classification is identifying handwritten digits. Handwritten digits can be classified into ten different groups, the numbers zero through nine. As with binary classification, there are a variety of techniques used to solve multi-class classification. For identifying handwritten digits, Microsoft’s sample uses the Stochastic Dual Coordinate Ascent multi-class classification trainer.
Regression
Regression is the task of predicting numerical values given a set of input data. Regression is used for numerical predictions such as sales forecasting or price prediction. The simplest solution to the regression problem is linear regression where predictions are made based on a line of best fit.
Regression can also get more complex by using Recurrent Neural Networks (RNNs) or a variation of RNN called Long Short-Term Memory (LSTM) networks. These two types of neural networks can perform sequence-to-sequence machine learning to predict numerical values. The networks achieve sequence-to-sequence prediction by using a hidden state to give the network a memory-like effect. Although ML.NET currently does not support RNNs or LSTMs, it offers other effective techniques to solve the regression problem. Microsoft’s price prediction ML.NET sample uses Stochastic Dual Coordinate Ascent to predict prices of products.
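The memory-like effect of a hidden state can be seen in a very small sketch. The code below is plain Python, not ML.NET, and shows a single scalar recurrent unit; the tanh activation and the weight values are assumptions chosen for illustration, and a real RNN or LSTM uses vectors, learned weights, and (for LSTMs) additional gates.

```python
import math


def rnn_step(x, h_prev, w_x, w_h, b):
    """Combine the current input with the previous hidden state."""
    return math.tanh(w_x * x + w_h * h_prev + b)


def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through the unit and return the final hidden state."""
    h = 0.0  # the hidden state starts empty
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
    return h


# Both sequences end with the same input (0.0) but have different histories.
after_spike = run_rnn([1.0, 0.0])
after_quiet = run_rnn([0.0, 0.0])
```

The two final states differ even though the last input is identical, because the hidden state carries information forward from earlier steps.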
Other Machine Learning Problems
Machine learning has seemingly endless applications and can often significantly improve an existing application’s utility. Some other common machine learning problems are natural language processing, computer vision, and recommendations. People experience machine learning everywhere, from search results from popular search engines to advertisements and product recommendations.
While Microsoft does not provide a sample for a natural language processing problem, developers could implement a solution using the wide variety of algorithms provided by ML.NET. Microsoft has multiple ML.NET recommendation examples to provide functionality that is similar to Netflix’s movie recommendation or Amazon’s product recommendation system.
ML.NET vs Other Machine Learning Frameworks
ML.NET is not the only machine learning framework out there. There are many machine learning frameworks from different companies. Some of these machine learning frameworks have different goals than ML.NET and therefore provide slightly different functionality. ML.NET’s goal is to provide .NET developers with an easy-to-implement machine learning framework.
ML.NET vs TensorFlow
TensorFlow is one of the most popular open source machine learning frameworks and is developed by Google. TensorFlow is written in C++ and supports GPU and TPU acceleration. As of now, ML.NET does not support DNN GPU acceleration, but support will likely be added in future releases. TensorFlow’s main focus is deep learning, and it provides users with an intuitive way to calculate gradients across complex computation graphs.
While TensorFlow is mainly used with Python, there are also other language implementations. TensorFlow.js is a powerful implementation of TensorFlow that allows deep learning models to be created inside of a browser. While TensorFlow excels in certain situations, it is less accessible to developers who are not familiar with Python or JavaScript. ML.NET provides a variety of techniques for machine learning while making these techniques accessible to .NET developers. ML.NET’s focus is not just deep learning, although ML.NET uses TensorFlow for some of its deep learning implementations.
ML.NET vs CNTK
Microsoft’s Cognitive Toolkit (CNTK) is another popular open source machine learning framework. Why does Microsoft have two machine learning frameworks?
While ML.NET makes machine learning accessible to .NET developers, CNTK is a machine learning framework that specializes in deep learning and uses a Python API to train CNTK models. Once trained, however, a model can be run from a variety of languages, including C#/.NET. ML.NET, in contrast, provides a variety of machine learning techniques that can be trained and evaluated inside of C#/.NET. There may be a future where ML.NET adds support for deep learning by using Microsoft’s existing deep learning framework.
What’s Next For ML.NET
ML.NET is in its infancy and is still in preview. ML.NET’s future is bright, and many features are planned. Two that stand out are better support for deep neural networks (DNNs) and a graphical user interface (GUI) for ML.NET. DNNs are currently the hottest machine learning technique and provide the best accuracy in many situations.
Adding DNN support to ML.NET is an important step for ML.NET to be considered a competitor to existing machine learning frameworks such as TensorFlow or CNTK. Adding a GUI for ML.NET is also a big step because it lowers the barrier to entry for working with machine learning. A GUI would allow users with zero development experience to solve problems that were previously out of reach for them, while also giving .NET developers a way to rapidly prototype a solution to a machine learning problem.
If you would like to learn more about ML.NET or need help with other Microsoft products and solutions, contact KTL today.