Global accuracy
Model accuracy refers to the degree to which a machine learning model correctly predicts the target outcome based on the input data. It is a common classification performance metric used to evaluate the effectiveness of a machine learning model. Model accuracy is typically expressed as a percentage, with 100% accuracy meaning that the model has correctly predicted the target outcome for all examples in the evaluation dataset. A Sample Model Accuracy Graph (https://storage.
Advanced Statistics
Machine Learning Evaluation Metrics: There are many metrics used to evaluate machine learning models, each with its own pros and cons. We can broadly group them into classification and regression metrics. Classification Metrics: Classification metrics are used to evaluate the performance of machine learning models that predict discrete values. Before proceeding to explain some frequently used classification metrics, it is crucial to understand the confusion matrix show
Feature importance
NextBrain displays a Sankey diagram for each model. A Sankey diagram is a visual representation of the flow of predictions in a model, showing the relative importance of each feature in the dataset towards the final prediction. The width of each arrow corresponds to the amount of information or resources flowing through that particular channel. The colors used in a Sankey diagram indicate which flow separation is most likely. Darker colors are used to indicate
Confusion Matrix
A confusion matrix is a table used to describe the performance of a classification model on a set of test data for which the true values are known. For binary classification, it holds the four combinations of predicted and actual values, typically referred to as True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN). True Positives (TP) are the cases in which the model correctly predicted the positive class. False Positives (FP) ar
Variable Correlation
Correlation between two features refers to the relationship between two variables in a dataset. Plotting one variable against another on a graph colored by the target variable makes the relationship between the two features easier to see. For instance, we can observe a strong positive correlation between the Spent and Clicks features for a particular ROAS range on the variable correlation graph below. We can see that for
Visual Explainability
Machine learning methods use statistical learning to identify boundaries. One example of a machine learning method is a decision tree. A decision tree uses if-then statements to define patterns in data. In machine learning, these statements are called forks, and they split the data into two branches based on some value. That value between the branches is called a split point. A split point is the decision tree's version of a boundary. Every fork is adding information abo
Column importance
Column importance in machine learning refers to the ability to determine which features of the data are most important in predicting the target variable. It is a way to understand the relationship between the input features and the target variable, and to identify which features have the most impact on the model's predictions. We can see a sample feature importance plot below. A Sample Column Importanc
Storytelling
Custom user storytelling detailing the problem, objectives, measures taken, or any other insights of interest.
Pearson Matrix
The Pearson correlation matrix is a square matrix that provides the pairwise Pearson correlation coefficients between all the variables in a dataset. Pearson correlation measures the linear association between two variables, where a value of 1 indicates a perfect positive linear relationship, a value of -1 indicates a perfect negative linear relationship, and a value of 0 indicates no linear relationship. For example, consider a dataset that contains information about marketing and sales. The P
Predictions for Python Developers
After training a new model with NextBrain, you can make new predictions with Python in a few easy steps. In this tutorial, you will learn how to use NextBrain to make predictions with the Uplift modeling model in the Marketing and Sales workspace. First, upload the nextbrain module: Make sure that the set you provide to nextbrain has the same number of columns as the training data. Moreover, it must have
Interpreting Data Distribution
Data distribution: When building machine learning models, it's essential to understand the distribution of the data you're working with. Understanding data distribution can help you choose appropriate algorithms and model parameters, identify potential biases, and evaluate model performance. In this article, we'll discuss the basics of data distribution and how to interpret it for machine learning models. What is Data Distribution? Data distribution refers to the way that data is spread o
Model Training History
NextBrain.ai shows you the steps taken when training a model, such as the number of columns and rows that are removed and the training/test set ratio. This information can tell you more about both your data and your model. NextBrain saves you the hassle of optimizing your data to best fit the model, and takes care of nuances that can yield better metric results. A Sample Model Training History (https://storage.crisp.chat/users/helpdesk/website/ac02aa4ca9b97000/modeltrainhistvvs
Predicted vs Actual
A display of each predicted value against its actual counterpart, centered around a linear estimate.
Column distribution
Distribution of column values in the training data set.
Predict
Prediction in machine learning refers to the process of using a trained model to make predictions on new, unseen data. Given a set of input features, the model generates a prediction for the target variable based on the relationships it learned during the training process. The data type of the features can vary, but it's important to ensure that the data is in a format that can be processed by the model. Some common data types for features include: Numeric: Numeric data can be co
Predictive Power Score
How much each column affects a prediction when one is generated.
Algorithms used
Algorithms used to train the model and the workload placed on each during training.
Model prediction
Forecast generated over the training data to assess its accuracy.
Forecast accuracy
Visualization of various error metrics as the forecast horizon expands.
Forecast Prediction
Visual representation of the future forecast generated from the model, with error margins that widen as the timeline progresses.
Forecast yearly data trends
Forecast model's original data trends grouped by month.
Performance
How our model performs with the current parameters compared to a simpler prediction algorithm.
Forecast data trends
Forecast data
Saturation
Line contrasting investment-to-return ratios and visualizing the saturation point, the theoretical income limit.
Original forecast data
The original values from the model's dataset.
Weekly data trend
Forecast model's original data trends grouped by day of the week.
Saturation per channel
Lines to contrast the investment/return ratios per channel.
Seasonal Trend
Split of contribution per season.
Media Contribution
Contribution per channel to the total outcome.