---
layout: page-steps
language: Python
title: Perform customer clustering
permalink: /python/customerclustering/
redirect_from:
  - /Python/
  - /Python/customerclustering/step/
  - /Python/customerclustering/step/1
---

In this tutorial, we will get familiar with clustering: organizing data into groups whose members are similar in some way. We will use the K-means algorithm to cluster customers, which can be used, for example, to target a specific group of customers for marketing efforts. K-means is an unsupervised learning algorithm that groups data based on similarity. Unsupervised learning means that there is no outcome to predict; the algorithm only tries to find patterns in the data. You will learn how to perform clustering with K-means and analyze the results. We will also cover how to deploy a clustering solution using SQL Server. You can copy code as you follow the tutorial. All code is also available on GitHub.
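To make the idea concrete before any setup, here is a minimal pure-Python sketch of the two steps K-means alternates between: assigning each point to its nearest centroid, then moving each centroid to the mean of its members. The customer values, the `kmeans` helper, and the choice of starting centroids are all made up for illustration; this is not the implementation used later in the tutorial.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Minimal K-means (Lloyd's algorithm) with caller-supplied starting centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(old)
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return labels, centroids


# Hypothetical customers: (orders per year, average order value).
customers = [(2, 20.0), (3, 25.0), (2, 22.0), (20, 210.0), (22, 190.0), (21, 200.0)]
labels, centers = kmeans(customers, centroids=[customers[0], customers[3]])
print(labels)
print(centers)
```

With real data the features usually sit on very different scales, so they are typically standardized before clustering, and the starting centroids are chosen randomly rather than by hand.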

Step 1.1 Install SQL Server with in-database R / Machine Learning Services

{% include partials/install_sql_server_windows_ML.md %}

Step 1.2 Install SQL Server Management Studio (SSMS)

Download and install SQL Server Management Studio: SSMS

Now you have installed a tool you can use to easily manage your database objects and scripts.

Step 1.3 Enable external script execution

Run SSMS and open a new query window. Then execute the script below to enable your instance to run external scripts, including Python, in SQL Server.

EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE WITH OVERRIDE;

You can read more about configuring Machine Learning Services here. Remember to restart your SQL Server instance after changing the configuration. You can restart it in SSMS by right-clicking the instance name in the Object Explorer and choosing Restart.

Optional: You can also download the SSMS custom reports available on GitHub. The report "R Services - Configuration.rdl", for example, provides an overview of the R runtime parameters and lets you configure your instance with a button click. To import a report in SSMS, right-click Server Objects in the SSMS Object Explorer and choose Reports -> Custom reports. Upload the .rdl file.

Now you have enabled external script execution so that you can run Python code inside SQL Server!
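If you would like to confirm the setting from Python after the restart, the sketch below reads the option back from the `sys.configurations` catalog view. It assumes you already have a DB-API style cursor (for example from `pyodbc`, which is not part of this tutorial's setup); the helper name `external_scripts_enabled` and the stand-in cursor are hypothetical and exist only so the example runs without a server.

```python
def external_scripts_enabled(cursor):
    """Return True if 'external scripts enabled' is currently in effect.

    `cursor` is any DB-API cursor connected to the SQL Server instance.
    value_in_use is cast to int so the driver returns a plain number.
    """
    cursor.execute(
        "SELECT CAST(value_in_use AS int) "
        "FROM sys.configurations "
        "WHERE name = 'external scripts enabled';"
    )
    row = cursor.fetchone()
    return row is not None and int(row[0]) == 1


class FakeCursor:
    """Stand-in cursor for a dry run without a server (illustration only)."""

    def __init__(self, value):
        self._value = value

    def execute(self, sql):
        self.last_sql = sql

    def fetchone(self):
        return (self._value,)


# Dry run: pretend the instance reports the option as enabled.
print(external_scripts_enabled(FakeCursor(1)))
```

Against a real instance you would pass the cursor from your own connection instead of `FakeCursor`. If the function returns `False` right after running the script above, the instance most likely still needs the restart.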

Step 1.4 Install and configure your Python development environment

1. You need to install a Python IDE. Here are some suggestions:

Step 1.5 Install remote Python client libraries

Note: To use some of the functions in this tutorial, you need the revoscalepy package.

Follow the instructions here to install the Python client libraries for remote execution against SQL Server ML Services:

How to install Python client libraries
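Once the client libraries are installed, a quick way to check that your Python environment can see revoscalepy is to ask the import system whether it can locate the package. This is a small sketch using only the standard library; the `has_package` helper is a hypothetical name, and the check only confirms that the package can be found, not that it is configured correctly.

```python
import importlib.util


def has_package(name):
    """Return True if a top-level package called `name` can be located."""
    return importlib.util.find_spec(name) is not None


if has_package("revoscalepy"):
    print("revoscalepy found in this Python environment.")
else:
    print("revoscalepy not found; check which interpreter your IDE is using.")
```

If the package is reported missing even though you installed the client libraries, your IDE is probably pointing at a different Python interpreter than the one the installer set up.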

Terrific, now your SQL Server instance is able to host and run Python code and you have the necessary development tools installed and configured! The next section will walk you through how to do clustering using Python.