Transparent Machine Learning Predictions
Other machine learning methods provide only a prediction; simMachines provides much more. With every machine learning prediction, our technology reveals the justification for the prediction, or "the Why": the factors driving the prediction, listed in weighted sequence. The nearest neighbors for that prediction are also provided by distance, with keys that tie back to each nearest neighbor object in your database.
“Having transparency into why predictions are made is pretty high on the list. That goes back to believability: you need to have transparency to see how you got there. It can’t be a black box, that’s when you lose everybody.”
– Director of Shopper Marketing Initiatives, CPG company
Capture The Customer Moment With Dynamic Predictive Segmentation, a January 2018 commissioned study conducted by Forrester Consulting on behalf of simMachines
What Makes Us Different: Similarity-Based Machine Learning
simMachines uses a proprietary similarity-based (nearest neighbor) machine learning method, rather than decision trees or neural networks, to provide the Why behind every machine learning prediction. No other machine learning method can provide "the Why" at a local level.
The similarity of one object to another (its nearest neighbor) is commonly employed in customer segmentation using statistical modeling techniques, because the Why factors behind the similarity of two objects are critical for downstream marketing actions to occur. When applied in machine learning, the same value is provided. Historically, however, similarity-based approaches could not scale because of the Curse of Dimensionality. simMachines is the first and only technology to solve this problem.
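To make the idea concrete, here is a minimal sketch of how a plain nearest-neighbor classifier can surface both the neighbors behind a prediction and a per-feature "Why" ranking. This is an illustrative toy, not simMachines' proprietary method; the function name, data layout, and the choice of squared-difference contributions as "Why" weights are all assumptions for the example.

```python
# Toy nearest-neighbor prediction with "Why" factors.
# NOT simMachines' proprietary algorithm -- a generic k-NN sketch.
import math

def predict_with_why(query, data, k=3):
    """data: list of (key, features, label) records.
    Returns (prediction, neighbors by distance, features ranked by
    contribution to similarity)."""
    scored = []
    for key, feats, label in data:
        # Per-feature squared differences let us attribute the
        # distance back to individual input factors.
        per_feat = [(q - f) ** 2 for q, f in zip(query, feats)]
        scored.append((math.sqrt(sum(per_feat)), key, label, per_feat))
    scored.sort(key=lambda s: s[0])
    nearest = scored[:k]

    # Prediction: majority vote over the k nearest neighbors.
    votes = {}
    for _, _, label, _ in nearest:
        votes[label] = votes.get(label, 0) + 1
    prediction = max(votes, key=votes.get)

    # "Why" ranking: features with the smallest summed squared
    # difference across the neighbors contribute most to similarity.
    contrib = [sum(nb[3][i] for nb in nearest) for i in range(len(query))]
    why = sorted(range(len(query)), key=lambda i: contrib[i])

    neighbors = [(key, round(d, 3)) for d, key, _, _ in nearest]
    return prediction, neighbors, why

# Hypothetical customer records: (database key, features, label).
data = [
    ("cust_1", [1.0, 0.0, 5.0], "buyer"),
    ("cust_2", [1.1, 0.2, 4.8], "buyer"),
    ("cust_3", [9.0, 8.0, 1.0], "non-buyer"),
]
pred, neighbors, why = predict_with_why([1.0, 0.1, 5.0], data, k=2)
print(pred)       # "buyer"
print(neighbors)  # nearest neighbors, with keys tying back to the database
print(why)        # feature indices, most similar factor first
```

Because every prediction is anchored to concrete neighbor records, the keys returned alongside the distances can be joined back to the source database, which is what makes the explanation actionable downstream.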
Other machine learning methods do not compare
Neural Networks
Inherent in this approach are multiple layers of neurons that transform input objects into an output. However, there is no way to know which input factors produced the output, so only the prediction can be provided, not the Why factors behind it.
Decision Trees
Inherent in this approach is the requirement to build thousands of trees and then use gradient boosting methods to optimize across them. Single-tree predictions are not very accurate, and gradient boosting is time-consuming, hard to maintain, and, due to its complexity, cannot carry forward the factors behind its predictions.