20 September 2023
Simplifying Customer Churn Prediction with Amazon SageMaker Canvas

Use Case

The Churn Customer dataset used for this analysis is a publicly available dataset obtained from Kaggle (Churn Modelling). It contains information about a bank's customers, including their churn status (whether they have left the bank or remain customers).

Amazon SageMaker Canvas lets you apply machine learning to generate predictions without writing any code. It can be used in many practical contexts, one of which is customer churn prediction.

There are only a few steps to complete:

  1. Import the dataset
  2. Build the Model
  3. Analyze the Model
  4. Make Predictions

Procedure

Import the Dataset

Download the file from Kaggle to your local computer.

The column descriptions are:

The file is then uploaded to Amazon S3.
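If you prefer to script the upload rather than use the S3 console, a minimal sketch with boto3 could look like the following. The bucket name, the key prefix, and the file name `Churn_Modelling.csv` are assumptions made for illustration, and the upload itself needs AWS credentials configured on your machine.

```python
from pathlib import Path


def build_s3_key(local_path: str, prefix: str = "canvas/churn") -> str:
    """Return the object key under which the file would be stored.

    The default prefix is a hypothetical choice for this sketch.
    """
    return f"{prefix.strip('/')}/{Path(local_path).name}"


def upload_to_s3(local_path: str, bucket: str, prefix: str = "canvas/churn") -> str:
    """Upload the file and return its s3:// URI (requires AWS credentials)."""
    import boto3  # imported lazily so build_s3_key works without boto3 installed

    key = build_s3_key(local_path, prefix)
    boto3.client("s3").upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"


# Example call (hypothetical bucket name, replace it with a bucket you own):
# upload_to_s3("Churn_Modelling.csv", "my-canvas-demo-bucket")
```

The returned URI points at the object you would then select when creating the dataset from S3.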

To access the Canvas application in Amazon SageMaker, navigate to the left side menu and select the Canvas option.

Once on the Canvas page, wait for the application to load.

From there, you have the option to either choose a pre-built model or create a custom model tailored to your specific needs.

To create a custom model, begin by specifying the model name and determining the problem type. In this instance, we are conducting predictive analysis to determine whether a customer will remain with the bank or leave.

Select S3 as the data source to create the dataset.

Once the data has loaded, review the columns to confirm they are correct. If everything is in order, proceed with creating the dataset.

Build the Model

Select the newly created dataset.

Before proceeding, review the data thoroughly. Check for missing values, assess the distribution of each column and data type, and run any other quality checks you need. Then specify the target column, which in this case is labeled 'Exited'. SageMaker Canvas analyzes the target column and automatically determines the appropriate model type. If the target column is a binary or categorical field, SageMaker Canvas treats the problem as classification.
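Canvas performs these checks in its interface, but the same idea can be sketched outside Canvas with the Python standard library. The rows below are toy values invented for illustration and are not taken from the real dataset; the column names other than `Exited` are assumed from the Kaggle file.

```python
import csv
import io
from collections import Counter

# Toy rows for illustration only; they are NOT real records from the dataset.
SAMPLE_CSV = """CreditScore,Geography,Gender,Age,NumOfProducts,IsActiveMember,Exited
619,France,Female,42,1,1,1
608,Spain,Female,41,1,1,0
502,France,Female,,3,0,1
699,France,Female,39,2,0,0
850,Spain,,43,1,1,0
"""


def profile(csv_text: str, target: str):
    """Count empty cells per column and the class distribution of the target."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] += 1
    classes = Counter(row[target] for row in rows)
    return dict(missing), dict(classes)


missing, classes = profile(SAMPLE_CSV, "Exited")
print("missing values per column:", missing)
print("target distribution:", classes)
```

A heavily imbalanced target distribution or a column with many empty cells is worth noticing before you build the model.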

| Model Type | Example Use Case | Supported Data Types | Supported Data Sources |
| --- | --- | --- | --- |
| Numeric prediction | Predicting house prices based on features like square footage | Numeric | Local upload, Amazon S3, SaaS connectors |
| 2 category prediction | Predicting whether or not a customer is likely to churn | Binary or Categorical | Local upload, Amazon S3, SaaS connectors |
| 3+ category prediction | Predicting patient outcomes after being discharged from the hospital | Categorical | Local upload, Amazon S3, SaaS connectors |
| Time series forecasting | Predicting your inventory for the next quarter | Timeseries | Local upload, Amazon S3, SaaS connectors |
| Single-label image prediction | Predicting types of manufacturing defects in images | Image (JPG, PNG) | Local upload, Amazon S3 |
| Multi-category text prediction | Predicting categories of products, such as clothing, electronics, or household goods, based on product descriptions | Source column: Text; Target column: Binary or Categorical | Local upload, Amazon S3 |
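To make the link between a target column and the tabular model types more concrete, here is a rough heuristic sketch. It is not how Canvas actually decides; the function name and the distinct-value threshold are hypothetical and chosen only for illustration.

```python
# Illustrative heuristic only; this is NOT Canvas's actual selection logic.
NUMERIC_DISTINCT_THRESHOLD = 20  # hypothetical cut-off chosen for this sketch


def _is_number(value: str) -> bool:
    """Return True if the string can be parsed as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def guess_model_type(target_values) -> str:
    """Guess a tabular model type from the values of the target column."""
    distinct = {str(v).strip() for v in target_values if v is not None}
    distinct.discard("")
    if len(distinct) < 2:
        raise ValueError("target needs at least two distinct values")
    if len(distinct) == 2:
        return "2 category prediction"
    if len(distinct) > NUMERIC_DISTINCT_THRESHOLD and all(
        _is_number(v) for v in distinct
    ):
        return "Numeric prediction"
    return "3+ category prediction"


# An 'Exited'-style target holding only 0 and 1 maps to the churn use case.
print(guess_model_type(["1", "0", "0", "1"]))
```

For the churn dataset, the target has two distinct values, which corresponds to the "2 category prediction" row of the table.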

Take a thorough look at the column distributions, categories, minimum and maximum values, and other relevant aspects.

Review the features and eliminate any that are deemed irrelevant.

Before removing features, check the correlations in the Analytics section to determine which ones are irrelevant.

Then preview the model.
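As a rough illustration of what a correlation check measures, the sketch below computes the Pearson correlation between each numeric feature and the target and ranks the features by its absolute value. The values are toy numbers invented for this example, the column names are assumed from the Kaggle file, and Canvas may compute its analytics differently.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant column; 0.0 is a convention here.
        return 0.0
    return cov / sqrt(var_x * var_y)


# Toy values for illustration only; they are NOT real records.
features = {
    "Age": [25, 32, 41, 47, 58, 63],
    "HasCrCard": [1, 0, 1, 1, 0, 1],
}
exited = [0, 0, 0, 1, 1, 1]

ranking = sorted(
    ((name, pearson(values, exited)) for name, values in features.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranking:
    print(f"{name}: {r:+.2f}")
```

A feature whose correlation with the target is close to zero is a candidate for removal, although a low linear correlation alone does not prove a feature is useless.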

Check the Feature Impacts and Model Accuracy.

Analyze the Model and understand Feature Impacts on the target field

Here is how `Age` affects the outcome

Here is how `Number of Products` affects the outcome

Here is how `Is Active Member?` affects the outcome

Here is how `Gender` affects the outcome

Check Model Accuracy, False Positive and True Negative percentages:
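For readers who want to see how these percentages relate to each other, here is a small sketch that derives accuracy, false positive rate, and true negative rate from actual and predicted labels. The labels are toy values invented for illustration and do not come from the model in this post.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary target."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def rates(counts):
    """Derive accuracy, false positive rate and true negative rate."""
    total = sum(counts.values())
    negatives = counts["fp"] + counts["tn"]
    return {
        "accuracy": (counts["tp"] + counts["tn"]) / total,
        "false_positive_rate": counts["fp"] / negatives,
        "true_negative_rate": counts["tn"] / negatives,
    }


# Toy labels for illustration only: 1 = exited, 0 = stayed.
actual = [1, 0, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 0, 1, 0]

counts = confusion_counts(actual, predicted)
metrics = rates(counts)
print(counts)
print(metrics)
```

The false positive rate and the true negative rate always sum to 1, because both are fractions of the customers who actually stayed.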

Make Predictions

Custom models for numeric and categorical prediction, image prediction, and text prediction support two types of predictions: single predictions and batch predictions.

Batch predictions suit cases where you have a whole dataset to score. For example, you might have a CSV file of customer reviews for which you'd like to predict the customer sentiment, or a folder of image files that you'd like to classify. The dataset you use for predictions should match your input dataset. Canvas lets you run batch predictions manually, or you can configure automatic batch predictions that start whenever a specified dataset is updated in Canvas.

To check results in real time, use Single prediction.
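One way to catch a mismatched batch file before uploading it is to compare its header with the training columns. The sketch below interprets "matches your input dataset" as "contains every training column except the target", which is an assumption; the function name and sample columns are hypothetical.

```python
import csv
import io


def missing_columns(training_header, target, batch_csv_text):
    """Return training feature columns that are absent from the batch file."""
    reader = csv.reader(io.StringIO(batch_csv_text))
    batch_header = next(reader, [])
    required = [column for column in training_header if column != target]
    return [column for column in required if column not in batch_header]


# Hypothetical subset of columns, used only to demonstrate the check.
training_header = ["Age", "Gender", "NumOfProducts", "Exited"]
batch_csv = "Age,Gender\n42,Female\n"

print(missing_columns(training_header, "Exited", batch_csv))
```

An empty result means the batch file carries every feature column the model was trained on.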

Within the Single Prediction functionality, you will encounter different features or columns. By entering the desired values for these features, you can obtain the corresponding prediction result. Additionally, the system provides the probability percentages for both the "No" and "Yes" outcomes.

Here are the results for our first prediction:

Here are the results for our second prediction:

How can we help?

Sonne Technology, founded in 2021, is a leading provider of precision-crafted AWS solutions that revolutionize cloud computing. Leveraging our AI-powered expertise, we specialize in tailoring solutions to meet the unique business needs of our clients. With a focus on delivering Function as a Service (serverless) products, we ensure an easy-to-manage, stress-free experience. Our specialized services cater to startups and SMEs, prioritizing flexibility and cost-effectiveness. We thrive in an ultra-agile environment, ready to tackle any challenge. Join us and transform your cloud experience.
