Report generation

When a project task completes successfully, a report containing detailed information about the job is generated.

Clicking the report opens a popup window that lists detailed reports on different aspects of the completed job.

In this example, the "Amount" and "IsFraud" columns are used for data synthesis.

The first report shows the accuracy of the synthesized data. It assesses how faithfully the synthesized data reproduces the original data, and typically includes several metrics that measure the quality of the synthesis.

How well the synthesized data captures the characteristics of the "Amount" and "IsFraud" variables can be evaluated with several metrics and comparisons, briefly explained here:

Amount Accuracy: Accuracy for the "Amount" variable is evaluated by comparing summary statistics and the distribution of transaction amounts. Metrics such as the mean, median, standard deviation, and skewness are computed for both the original and the synthesized data and then compared. Visualizations such as histograms or density plots can also be used to compare the two distributions. The more closely the statistics and the shapes of the distributions align, the more accurately the synthesized data captures the "Amount" variable.
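The comparison of summary statistics can be illustrated with a minimal sketch. It is not the product's implementation: the function names, the sample values, and the use of population (rather than sample) standard deviation and skewness are all assumptions made for the example.

```python
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return 0.0  # a constant column has no asymmetry
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


def summarize(values):
    """Collect the summary statistics named in the text."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "std": statistics.pstdev(values),
        "skewness": skewness(values),
    }


def compare_summaries(original, synthesized):
    """Absolute difference per metric; smaller means closer agreement."""
    orig = summarize(original)
    synth = summarize(synthesized)
    return {name: abs(orig[name] - synth[name]) for name in orig}


# Hypothetical transaction amounts, for illustration only.
original_amounts = [10.0, 20.0, 30.0, 40.0, 100.0]
synthesized_amounts = [12.0, 18.0, 33.0, 41.0, 96.0]

differences = compare_summaries(original_amounts, synthesized_amounts)
for name, diff in differences.items():
    print(f"{name}: {diff:.3f}")
```

Differences close to zero for every metric suggest that the synthesized "Amount" values follow the original statistics; how small is small enough depends on the scale of the column.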

IsFraud Accuracy: Accuracy for the "IsFraud" variable is evaluated by comparing the proportion of fraudulent cases (1s) to legitimate cases (0s) in the original and the synthesized data. The more closely the two proportions align, the more accurately the synthesized data replicates the fraud distribution.
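As a rough sketch of this proportion check, assuming the column is encoded as 0 and 1 as described, the fraud rate of each dataset can be computed and compared directly. The function name and the sample labels are hypothetical.

```python
def fraud_rate(labels):
    """Share of records labelled 1 (fraudulent)."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(1 for label in labels if label == 1) / len(labels)


# Hypothetical labels, for illustration only.
original_is_fraud = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
synthesized_is_fraud = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

original_rate = fraud_rate(original_is_fraud)
synthesized_rate = fraud_rate(synthesized_is_fraud)
rate_gap = abs(original_rate - synthesized_rate)

print(f"original: {original_rate:.2%}, synthesized: {synthesized_rate:.2%}")
print(f"absolute gap: {rate_gap:.2%}")
```

Because fraud is usually rare, a small absolute gap can still be a large relative one, so it may be worth looking at both.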

In addition to the initial report, an overall accuracy report gives the average accuracy of the synthesized data. Together, these reports show how well the synthesized data captures the distributions of transaction amounts and fraudulent cases. Close alignment between the original and synthesized distributions indicates higher accuracy, while significant disparities may point to limitations or biases in the synthesis process.

Correlation Matrices Report

This report examines the correlation between variables in the synthesized data. It provides a matrix of correlation coefficients for each pair of variables, which helps identify relationships and dependencies among variables and assess whether the synthesized data captures the same correlation patterns as the original data.
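One way to compare correlation patterns is to build a correlation matrix for each dataset and look at the largest entry-wise difference. The sketch below assumes Pearson correlation, which the report may or may not use, and the helper names and sample columns are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict: matrix[a][b] is the correlation between columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def max_abs_difference(matrix_a, matrix_b):
    """Largest entry-wise gap between two matrices with the same columns."""
    return max(
        abs(matrix_a[a][b] - matrix_b[a][b])
        for a in matrix_a
        for b in matrix_a[a]
    )


# Hypothetical columns, for illustration only.
original = {"Amount": [10, 20, 30, 40, 100], "IsFraud": [0, 0, 0, 0, 1]}
synthesized = {"Amount": [12, 18, 33, 41, 96], "IsFraud": [0, 0, 0, 0, 1]}

original_corr = correlation_matrix(original)
synthesized_corr = correlation_matrix(synthesized)
gap = max_abs_difference(original_corr, synthesized_corr)
print(f"largest correlation gap: {gap:.4f}")
```

A gap near zero means every pairwise correlation in the synthesized data is close to its counterpart in the original data.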

Distribution Of Amount

This report analyzes the distribution of the "Amount" variable in the synthesized data. It may include visualizations such as histograms or density plots to illustrate the distribution. By comparing it with the distribution in the original data, you can assess how well the synthesized data replicates the distribution of transaction amounts.
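To put a number on a histogram comparison, both datasets can be binned on the same edges and the shared area measured. The following is a sketch under assumptions: equal-width bins, a bin count of 5, and the overlap score itself are illustrative choices, not the report's documented method.

```python
def histogram(values, low, high, bins):
    """Share of values in each of `bins` equal-width bins over [low, high]."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # Values equal to `high` fall into the last bin.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return [c / len(values) for c in counts]


def histogram_overlap(original, synthesized, bins=5):
    """Overlap in [0, 1]: 1 means identical binned distributions."""
    low = min(min(original), min(synthesized))
    high = max(max(original), max(synthesized))
    if high == low:
        return 1.0  # every value is identical in both datasets
    h_orig = histogram(original, low, high, bins)
    h_synth = histogram(synthesized, low, high, bins)
    return sum(min(a, b) for a, b in zip(h_orig, h_synth))


# Hypothetical transaction amounts, for illustration only.
original_amounts = [10.0, 20.0, 30.0, 40.0, 100.0]
synthesized_amounts = [12.0, 18.0, 33.0, 41.0, 96.0]

overlap = histogram_overlap(original_amounts, synthesized_amounts)
print(f"histogram overlap: {overlap:.2f}")
```

Using shared bin edges matters: if each dataset were binned on its own range, the two histograms would not be directly comparable.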

Distribution Of IsFraud

This report examines the distribution of the "IsFraud" variable in the synthesized data. It provides insights into the proportion of fraudulent instances compared to legitimate instances. Comparing it with the distribution in the original data helps assess if the synthesized data captures the same distribution of fraudulent cases.

NNDR Report

NNDR stands for Nearest Neighbor Distance Ratio. This report assesses the density or clustering characteristics of the synthesized data. For each data point, the NNDR metric is the ratio of the distance to its nearest neighbor to the distance to its second nearest neighbor. This report helps evaluate whether the synthesized data maintains density patterns similar to those of the original data.
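The ratio itself is simple to compute. The sketch below assumes Euclidean distance on numeric features and measures each synthesized record against the original records; the text does not say which dataset the neighbors are drawn from, so that choice, like the function name and sample points, is an assumption.

```python
import math


def nndr(point, reference):
    """Distance to the nearest reference record divided by the distance
    to the second nearest one. The result lies in [0, 1]."""
    distances = sorted(math.dist(point, other) for other in reference)
    if len(distances) < 2:
        raise ValueError("reference needs at least two records")
    nearest, second = distances[0], distances[1]
    if second == 0:
        # Convention for this sketch: two exact matches are equally close.
        return 1.0
    return nearest / second


# Hypothetical two-feature records, for illustration only.
original_records = [(0.0, 0.0), (0.0, 5.0), (9.0, 9.0)]
synthesized_records = [(0.0, 1.0), (4.0, 4.0)]

ratios = [nndr(record, original_records) for record in synthesized_records]
for record, ratio in zip(synthesized_records, ratios):
    print(record, f"{ratio:.3f}")
```

A ratio near 0 means the record sits much closer to one original record than to any other, while a ratio near 1 means its two nearest neighbors are about equally far away. In practice, features on different scales would need to be normalized before distances are computed.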

DCR Report

In synthetic data evaluation, DCR is usually expanded as Distance to Closest Record and is reported alongside NNDR. For each synthesized record, it measures the distance to the nearest record in the original data. Very small distances suggest that synthesized records are near-copies of original records, while distances comparable to those between records within the original data suggest that the synthesized data resembles the original without reproducing it.

Pair Plot Report

This report visualizes the pairwise relationships between variables in the synthesized data, typically as scatter plots. Comparing it with the pair plot of the original data shows whether the synthesized data captures similar relationships and patterns.

