###### Advanced Statistics

###### Machine Learning Evaluation Metrics

There are many metrics used to evaluate machine learning models, each with its own pros and cons. They fall broadly into two groups: classification metrics and regression metrics. Classification metrics evaluate the performance of models that predict discrete values. Before turning to some frequently used classification metrics, it is crucial to understand the confusion matrix show...

###### Global accuracy

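Accuracy is the share of predictions that match the true label. The following is a minimal sketch of that calculation, assuming plain Python lists of labels; the function name `accuracy` and the sample labels are illustrative only and are not part of NextBrain.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty dataset")
    # Count the positions where prediction and true label agree.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# 6 of the 8 predictions match, so this prints "Accuracy: 75%".
print(f"Accuracy: {accuracy(y_true, y_pred):.0%}")
```
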
Model accuracy is the degree to which a machine learning model correctly predicts the target outcome from the input data. It is a common classification metric, typically expressed as a percentage: 100% accuracy means the model predicted the target outcome correctly for every example in the evaluation dataset. A Sample Model Accuracy Graph (https://storage.

###### Feature importance

NextBrain shows a Sankey diagram for each model. A Sankey diagram is a visual representation of the flow of predictions in a model, showing the relative importance of each feature in the dataset towards the final prediction. The width of each arrow corresponds to the amount of information or resources flowing through that channel. The colors indicate which flow separation is most likely. Darker colors are used to indicate...

###### Confusion Matrix

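The four cells of a binary confusion matrix can be counted directly from paired true and predicted labels. This is a rough sketch under stated assumptions: the function name `confusion_counts`, the `positive` parameter, and the sample labels are hypothetical and do not reflect how NextBrain computes its matrix.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN and FN for a binary classification problem."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            counts["TP"] += 1  # predicted positive, actually positive
        elif pred == positive:
            counts["FP"] += 1  # predicted positive, actually negative
        elif truth == positive:
            counts["FN"] += 1  # predicted negative, actually positive
        else:
            counts["TN"] += 1  # predicted negative, actually negative
    return counts


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```
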
A confusion matrix is a table that describes the performance of a classification model on a set of test data for which the true values are known. It holds the four combinations of predicted and actual values, typically referred to as True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN). True Positives (TP) are the cases in which the model correctly predicted the positive class. False Positives (FP) ar...

###### Column importance

Column importance in machine learning means determining which features of the data matter most in predicting the target variable. It is a way to understand the relationship between the input features and the target variable, and to identify which features have the most impact on the model's predictions. We can see a sample feature importance plot below. A Sample Column Importanc...

###### Variable Correlation

Correlation between two features refers to the relationship between two variables in a dataset. By plotting one variable against another on a graph, colored with respect to the target variable, you can see the relationship between the two features more clearly. For instance, we can observe a strong positive correlation between the Spent and Clicks features for a particular ROAS range on the variable correlation graph below. We can see that for...

###### Visual Explainability

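A single fork of a decision tree can be pictured as one if-then test against a split point that sends each row to one of two branches. The sketch below assumes rows stored as Python dictionaries; the `fork` helper, the `clicks` feature, and the split value of 100 are all hypothetical and chosen only for illustration.

```python
def fork(rows, feature, split_point):
    """Split rows into two branches around a split point on one feature."""
    # Rows at or below the split point go left, the rest go right.
    left = [row for row in rows if row[feature] <= split_point]
    right = [row for row in rows if row[feature] > split_point]
    return left, right


# Hypothetical rows, for illustration only.
rows = [
    {"clicks": 40, "converted": 0},
    {"clicks": 85, "converted": 0},
    {"clicks": 120, "converted": 1},
    {"clicks": 300, "converted": 1},
]

low, high = fork(rows, feature="clicks", split_point=100)
print(len(low), len(high))  # 2 2
```
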
Machine learning methods use statistical learning to identify boundaries. One example is a decision tree, which uses if-then statements to define patterns in data. In machine learning, these statements are called forks, and each one splits the data into two branches based on some value. The value that separates the branches is called a split point. A split point is the decision tree's version of a boundary. Every fork is adding information abo...

###### Pearson Matrix

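A Pearson correlation matrix can be built by computing the pairwise coefficient for every pair of columns. The following is a minimal pure-Python sketch; the helper names `pearson` and `pearson_matrix` and the marketing-style column values are assumptions made up for this example, not NextBrain output.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def pearson_matrix(columns):
    """Square matrix (as nested dicts) of pairwise Pearson coefficients."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical columns, for illustration only.
columns = {
    "spend": [1, 2, 3, 4, 5],
    "clicks": [2, 4, 6, 8, 10],     # rises linearly with spend
    "discount": [5, 4, 3, 2, 1],    # falls linearly as spend rises
}

matrix = pearson_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```
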
The Pearson correlation matrix is a square matrix that provides the pairwise Pearson correlation coefficients between all the variables in a dataset. Pearson correlation measures the linear association between two variables, where a value of 1 indicates a perfect positive linear relationship, a value of -1 indicates a perfect negative linear relationship, and a value of 0 indicates no linear relationship. For example, consider a dataset that contains information about marketing and sales. The P...

###### Storytelling

Custom user storytelling that details the problem, the objectives, the measures taken, and any other insights of interest.

###### Model Training History

NextBrain.ai shows you the steps taken when training a model, such as the number of columns and rows removed and the training/test set ratio. This information can tell you more about both your data and your model. NextBrain saves you the hassle of optimizing your data to fit the model and takes care of nuances that can yield better metric results. A Sample Model Training History (https://storage.crisp.chat/users/helpdesk/website/ac02aa4ca9b97000/modeltrainhistvvs

###### Column distribution

Distribution of column values in the training data set.

###### Predicted vs Actual

A display of each predicted value against its real counterpart, centered around a linear estimation.

###### Interpreting Data Distribution

When building machine learning models, it's essential to understand the distribution of the data you're working with. Understanding data distribution can help you choose appropriate algorithms and model parameters, identify potential biases, and evaluate model performance. In this article, we'll discuss the basics of data distribution and how to interpret it for machine learning models. What is Data Distribution? Data distribution refers to the way that data is spread o...

###### Algorithms used

The algorithms used to train the model and the workload placed on each during training.

###### Predict

Prediction in machine learning refers to the process of using a trained model to make predictions on new, unseen data. Given a set of input features, the model generates a prediction for the target variable based on the relationships it learned during the training process. The data type of the features can vary, but it's important to ensure that the data is in a format that can be processed by the model. Some common data types for features include: Numeric: Numeric data can be co...

###### Performance

How does our model perform with the current parameters compared to a simpler prediction algorithm?

###### Predictions for Python Developers

After training a new model with NextBrain, you can make brand-new predictions with Python in a few easy steps. In this tutorial, you will learn how to use NextBrain to make predictions with the Uplift modeling model in the Marketing and Sales workspace. First, upload the nextbrain module: Make sure that the set you provide to nextbrain has the same number of columns as the training data. Moreover, it must have...

###### Forecast yearly data trends

The forecast model's original data trends, grouped by month.

###### Model prediction

Forecast generated over the training data to assess its accuracy.

###### Forecast accuracy

Visualization of various error metrics as the forecast horizon expands.

###### Predictive Power Score

How much each column affects the prediction when one is generated.

###### Saturation per channel

Lines contrasting the investment/return ratios per channel.

###### Forecast Prediction

Visual representation of the future forecast generated from the model, with error margins that widen as the timeline progresses.

###### Saturation

Line contrasting investment-to-return ratios to visualize the saturation point, the theoretical income limit.

###### Original forecast data

The original values from the model's dataset.

###### Forecast data trends

Forecast data.

###### Seasonal Trend

Split of contribution per season.

###### Weekly data trend

The forecast model's original data trends, grouped by day of the week.

###### Media Contribution

Contribution per channel to the total outcome.