Machine learning is everywhere these days, and every organization is trying to leverage it in one way or another. I was curious to learn more about Machine Learning myself when an opportunity came up: a proof of concept that could eventually turn into a large enterprise initiative if we could demonstrate the benefits of using Machine Learning.
The problem statement – Our organization had a large number of enterprise applications live in production, supported by a team of more than 50 people. Thousands of tickets were being addressed every day. Management decided to evaluate Machine Learning as a way to reduce this ticket load and save costs.
Solution – The idea was to extract the ticket history and use this data to train a machine learning model. The model would make basic decisions, such as routing a ticket to the correct support team, and thereby reduce the number of tickets handled by first-level support executives.
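As a rough illustration of this idea, the sketch below trains a tiny bag-of-words naive Bayes classifier on a few made-up historical tickets and uses it to pick a support team. Everything in it is hypothetical: the sample tickets, the team names, the `TicketRouter` class, and the confidence threshold that falls back to first-level support are assumptions for illustration, not the model we eventually built.

```python
import math
from collections import Counter, defaultdict


class TicketRouter:
    """Toy multinomial naive Bayes router (illustrative only)."""

    def __init__(self, fallback="first-level", threshold=0.6):
        self.fallback = fallback      # hypothetical queue for uncertain tickets
        self.threshold = threshold    # hypothetical minimum confidence
        self.word_counts = defaultdict(Counter)  # team -> word frequencies
        self.team_counts = Counter()             # team -> number of tickets
        self.vocab = set()

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def fit(self, history):
        """history: iterable of (ticket_text, team) pairs."""
        for text, team in history:
            words = self.tokenize(text)
            self.word_counts[team].update(words)
            self.team_counts[team] += 1
            self.vocab.update(words)
        return self

    def predict_proba(self, text):
        """Return {team: probability} from Laplace-smoothed log likelihoods."""
        total = sum(self.team_counts.values())
        scores = {}
        for team, n in self.team_counts.items():
            denom = sum(self.word_counts[team].values()) + len(self.vocab)
            score = math.log(n / total)
            for word in self.tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[team][word] + 1) / denom)
            scores[team] = score
        top = max(scores.values())
        exp = {t: math.exp(s - top) for t, s in scores.items()}
        norm = sum(exp.values())
        return {t: v / norm for t, v in exp.items()}

    def route(self, text):
        """Pick the most likely team, or the fallback queue if unsure."""
        probs = self.predict_proba(text)
        team = max(probs, key=probs.get)
        return team if probs[team] >= self.threshold else self.fallback


# Made-up ticket history: (ticket text, team that resolved it)
history = [
    ("cannot login password expired", "identity"),
    ("password reset link not working", "identity"),
    ("invoice total is wrong", "billing"),
    ("payment failed for invoice", "billing"),
]
router = TicketRouter().fit(history)
print(router.route("password expired again"))  # words seen in identity tickets
print(router.route("printer on fire"))         # no known words, so falls back
```

The fallback is the part that matters for ticket load: a ticket the model is confident about can skip first-level support, while anything uncertain still goes to a person.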
Our initial approach – Since we were novices in the field of Machine Learning, we started with some exploratory research on the topic. We were not sure whether we should go with open source or use a commercial platform. A few of the platforms we tried –
- Weka – An open source data mining and machine learning platform.
- Accord.net – An open source machine learning framework for .NET
- TensorFlow by Google – An open source machine learning library, used mainly through its Python API.
- Azure ML Studio – A commercial cloud platform offered by Microsoft.
Although Accord.net has the advantage of being based on .NET, which is my primary technology, I was not convinced that we should build a large enterprise solution on this framework. The same was true of Weka.
TensorFlow was a good choice, but our management wanted us to use Microsoft Azure because it was being adopted in other departments and new IT infrastructure was being provisioned in the Azure cloud.
Azure ML Studio – It was easy to get started with ML Studio. We used a free Azure account to create our initial experiments. The good part about Azure ML Studio is that it guides you from data preparation all the way to deploying your model as a web API. The built-in modules are sufficient to build a fairly advanced ML model.
Some of the modules we used were –
1. Data import – ML Studio provides various ways of importing data and converting it into different formats such as CSV and TSV.
2. Data transformation – This category of modules provides some essential data preparation functions. The ones we used were –
- Clean missing data
- Edit metadata
- Remove duplicates
- Select columns in dataset
- Split Data (into training set and test set)
- Normalize data
3. Text Analytics – We used language detection to filter the data set down to English-speaking regions. We also tried extracting N-gram features from text, and feature hashing was useful in some cases.
4. Feature Selection – These modules identify the columns in your input dataset that have the greatest predictive power.
5. Machine Learning modules – These deal with choosing an appropriate algorithm, training it on the training set, and then evaluating it on the test set. ML Studio provides the following kinds of ML algorithms, to name a few –
- Anomaly detection
- Classification – We tried a couple of them (Bayes, multiclass neural network) before we found better results with logistic regression.
6. Python and R language modules – You can add custom modules using Python and R.
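To make a few of the modules above more concrete, here is a minimal plain-Python sketch of what some of them do conceptually: dropping rows with missing values, removing duplicates, splitting the data into training and test sets, and turning ticket text into hashed N-gram features. It is not ML Studio code. The sample rows, the function names, the bucket count of 16, and the use of CRC32 as the hash function are all assumptions chosen for illustration; the ML Studio modules have their own implementations.

```python
import random
import zlib


def clean_and_dedupe(rows):
    """Drop rows with a missing field, then drop case-insensitive duplicates."""
    seen, result = set(), []
    for text, team in rows:
        if not text or not team:
            continue  # analogous to removing rows with missing data
        key = (text.strip().lower(), team)
        if key not in seen:
            seen.add(key)
            result.append((text.strip(), team))
    return result


def split_rows(rows, train_fraction=0.75, seed=0):
    """Shuffle with a fixed seed and split into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]


def hash_features(text, buckets=16):
    """Feature hashing: count n-grams into a fixed-size vector."""
    vector = [0] * buckets
    for gram in ngrams(text):
        # CRC32 is stable across runs, unlike Python's built-in hash() for str
        vector[zlib.crc32(gram.encode("utf-8")) % buckets] += 1
    return vector


# Made-up rows: (ticket text, support team)
rows = [
    ("Cannot login to portal", "identity"),
    ("cannot login to portal", "identity"),   # duplicate after lowercasing
    ("Invoice total is wrong", "billing"),
    ("", "billing"),                          # missing ticket text
    ("Payment failed", None),                 # missing label
    ("Password reset link broken", "identity"),
    ("Refund not received", "billing"),
]
prepared = clean_and_dedupe(rows)
train, test = split_rows(prepared)
features = [hash_features(text) for text, _ in train]
print(len(prepared), len(train), len(test))
print(ngrams("cannot login to portal"))
```

Feature hashing keeps the feature vector at a fixed size no matter how large the vocabulary grows, at the cost of occasional collisions between unrelated N-grams. The resulting vectors are the kind of input a classifier such as logistic regression would then be trained on.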
Summarizing the entire experience, Azure ML Studio is a great platform to get started with Machine Learning. The documentation is solid and easy to follow. We were able to create a working proof of concept in three months, and it was used as a reference for the technology road map.