Data analytics.jpg

DATA ANALYTICS

Data analytics is the analysis of raw data to uncover actionable insights.

TYPE OF ANALYTICS - SELECTION

SEDGE is a cloud-based AI and machine learning platform that lets users choose from several capabilities, including Data Analytics, Text Analytics, Image Analytics, Video Analytics, Forecasting, and Optical Character Recognition.

analytics-1.png
 

DATA UPLOAD FILE/DATABASE

2-DATA-UPLOAD.jpg

Users can upload data files in several formats, including CSV, tab-separated (TSV), semicolon-separated, space-separated, JSON, Parquet, and Avro, as well as files stored in S3 and HDFS.
 

Additionally, tables can be loaded into SEDGE from databases such as MySQL, MariaDB, Oracle, PostgreSQL, SQL Server, Cassandra, SQLite, Presto, Redshift, and Redis.
 

The SQL query builder can be used to join multiple tables and normalize the data before it is loaded into SEDGE.

 

DATA PREVIEW

The preview screen displays the loaded columns and their data types. The data type of each column, such as categorical, numerical, Boolean, date, or text, is identified automatically.
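The source does not describe how SEDGE identifies data types. As a rough illustration of the idea only, the following sketch infers a type for a column of raw strings. The function names, the single date format, and the `max_categories` cutoff are assumptions made for this example, not SEDGE behavior.

```python
from datetime import datetime


def _is_number(text):
    """True if the string parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_date(text, fmt="%Y-%m-%d"):
    """True if the string matches the (assumed) ISO date format."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


def infer_column_type(values, max_categories=10):
    """Guess a column type from raw string values (None or blank = missing)."""
    present = [v.strip() for v in values if v is not None and v.strip()]
    if not present:
        return "unknown"
    if all(v.lower() in {"true", "false"} for v in present):
        return "boolean"
    if all(_is_number(v) for v in present):
        return "numerical"
    if all(_is_date(v) for v in present):
        return "date"
    # Few distinct values suggests a category; many suggests free text.
    if len(set(present)) <= max_categories:
        return "categorical"
    return "text"
```

The checks run from the most specific type to the least specific, so a column is only treated as text when nothing narrower fits.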

6 Data preview.jpg
 

DATA PROFILING AND SAMPLING

4 SAMPLING.jpg

The profile page provides:

  1. Detailed statistics for the dataset.

  2. A choice of sampling algorithms for sampling a large dataset.

  3. A visualization of the target variable (a field that is of interest to the user), which may be of categorical, numerical, date, or Boolean type.

  4. A summary of the dataset, such as the number of numerical, categorical, date, and text variables and the percentage of missing data. The page also raises warnings about data quality, for example for variables with more than 90% missing values and variables without any variance.
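The two data quality warnings mentioned above can be sketched in a few lines. This is a minimal illustration, assuming columns are held as plain Python lists with `None` marking a missing value; the function name and the dictionary layout are hypothetical and do not reflect how SEDGE implements profiling.

```python
def profile_warnings(columns, missing_threshold=0.9):
    """Return {column_name: [warning, ...]} for columns that look problematic.

    columns: dict mapping a column name to a list of values (None = missing).
    """
    warnings = {}
    for name, values in columns.items():
        notes = []
        missing = sum(1 for v in values if v is None)
        missing_ratio = missing / len(values) if values else 0.0
        if missing_ratio > missing_threshold:
            notes.append("more than 90% missing" if missing_threshold == 0.9
                         else "too many missing values")
        present = [v for v in values if v is not None]
        # A single distinct value means the column carries no variance.
        if present and len(set(present)) == 1:
            notes.append("no variance")
        if notes:
            warnings[name] = notes
    return warnings


example = {
    "age": [34, 41, None, 29, 52, 47, 38, 45, 31, 60],
    "country": ["DE"] * 10,
    "fax": [None] * 19 + ["555-0100"],
}
print(profile_warnings(example))
```

In this example only `country` and `fax` are flagged; `age` has one missing value out of ten and varies, so it passes both checks.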

 

GENERAL DATA PROTECTION REGULATION

GDPR is the European Union's regulation for protecting personal information. It governs how the personal information of individuals in the EU may be collected and processed.

SEDGE recognizes the importance of protecting personal information and has a built-in feature that identifies columns of data that need to be protected. The user can also choose which columns to pseudonymize. Pseudonymization shields sensitive fields from data processors while keeping them usable, so advanced analysis is not compromised by having to ignore or drop these data elements.
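The source does not say which pseudonymization technique SEDGE uses. One common approach is keyed hashing, sketched below under that assumption: each sensitive value is replaced by an HMAC-SHA-256 token, so equal inputs map to equal tokens and grouping or joining on the column still works. The function names and the row layout (a list of dictionaries) are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(value, secret_key):
    """Replace a value with a keyed SHA-256 digest (64 hex characters)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_columns(rows, sensitive_columns, secret_key):
    """Return new rows with the chosen columns replaced by tokens."""
    return [
        {
            key: pseudonymize(str(val), secret_key) if key in sensitive_columns else val
            for key, val in row.items()
        }
        for row in rows
    ]


rows = [
    {"email": "a@example.com", "spend": 10},
    {"email": "a@example.com", "spend": 5},
    {"email": "b@example.com", "spend": 7},
]
# The key below is a placeholder; a real key would be stored securely.
protected = pseudonymize_columns(rows, {"email"}, b"demo-key")
```

Because the first two rows share an email address, they also share a token, so spend can still be aggregated per customer without exposing the address.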

 
GDPR.png

STATISTICS, FEATURE CREATION & FORMULA BUILDER

Variables that have the potential to influence the target variable are automatically ranked by their importance value.

 

A wide range of functions is then available for data transformation and data cleansing, including data type conversion and string, numerical, mathematical, and date operations.

3 FORMULA BUILDER.jpg

In the Statistics section, users can view the description and statistics associated with each data type, such as Mean, Median, Missing Row Count, Missing Row %, Distinct Count, Minimum, Maximum, 1st Quartile, 3rd Quartile, Inner Fence, Outer Fence, Kurtosis, Skewness, and Outlier Data.
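Several of these statistics are related. Under the conventional Tukey definition, which is assumed here because the source does not define the terms, the inner fences lie 1.5 interquartile ranges beyond the quartiles and the outer fences lie 3 interquartile ranges beyond them; values outside the inner fences are usually treated as outliers. A minimal sketch using only the Python standard library (the function names are illustrative):

```python
from statistics import quantiles


def tukey_fences(values):
    """Compute quartiles and Tukey fences for a list of numbers."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "q3": q3,
        "inner_fence": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer_fence": (q1 - 3.0 * iqr, q3 + 3.0 * iqr),
    }


def outliers(values):
    """Return the values that fall outside the inner fences."""
    low, high = tukey_fences(values)["inner_fence"]
    return [v for v in values if v < low or v > high]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(tukey_fences(data))
print(outliers(data))
```

Different tools compute quartiles with slightly different interpolation rules, so the exact fence values may differ from those shown in SEDGE.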

 

DATA VISUALISATION

Data visualization helps the user understand what the data represents in a short amount of time. The system supports many chart types, and some chart types are better suited to certain data types.

7 Data Visualization.jpg
 

DASHPRO

DashPro.png
  • Users can easily create and customize dashboards according to their requirements. DashPro offers a range of functions for drawing insights from the data, helping users make better decisions. Artificial intelligence and business intelligence complement each other in understanding data and predicting outcomes.

  • Dashboards are easy to create and fully customizable. Users can create intuitive graphs for quick analysis.

  • The module also includes a data refresh function that automatically updates graphs with the most recent data.

 

CLUSTER ANALYSIS

Cluster analysis arranges observations so that similar data points are grouped together, allowing the user to determine the characteristics and statistics of each cluster and compare it with the others.

Clustering.png
 

DECISION TREE ANALYSIS

Decision Tree analysis.png

Decision tree analysis provides a tree-shaped diagram that illustrates the statistical probability of an outcome, which helps the user to understand the decision-making process. Each branch represents a possible outcome.

Because the target variable is predicted by inferring simple decision rules from prior data, the method is easy to understand. Its main advantage is that the user can consider all possible outcomes of a decision and choose the one that best fits the business problem.

 

AUTO ML

SEDGE provides Automated Machine Learning (Auto ML) capabilities that spare users from endless iteration on data preparation, feature selection, hyperparameter tuning, model comparison, and model selection. The Auto ML algorithm takes all of these factors into account and develops the best model for accurate predictions.

Auto ML.png
 

CUSTOM ML

Custom ML.png

The Custom ML module has options for balancing data and cross-validating results. It provides state-of-the-art algorithms, automatically selects algorithms based on the nature of the target, and runs them all at the same time. A few evaluation metrics are then used to select the optimal model.

 

MODEL EVALUATION METRICS

Models can be evaluated and the best one selected for the business problem. SEDGE assesses models in various ways, such as:

  • Accuracy, Precision, Recall, F1 score, and Log loss

  • Gain, Lift, and K-S curves

  • AUC-ROC

  • Actual vs. Predicted
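For a binary classifier, the first group of metrics follows directly from the confusion matrix and the predicted probabilities. The sketch below uses the standard textbook definitions; the function name, the 0.5 threshold, and the input layout (lists of 0/1 labels and probabilities) are assumptions for illustration and say nothing about how SEDGE computes its reports.

```python
import math


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall, F1, and log loss for binary labels (0/1)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Clip probabilities so log(0) can never occur.
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / len(y_true)

    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "log_loss": log_loss,
    }


metrics = classification_metrics(
    [1, 0, 1, 1, 0, 0],
    [0.9, 0.2, 0.4, 0.8, 0.6, 0.1],
)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, while log loss is computed from the probabilities themselves and penalizes confident wrong predictions most heavily.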

Evaluation Metrics.png
 

MODEL EXPLAINABILITY

Model explanability.png

Models are traditionally considered "black boxes," but SEDGE provides a way to explain a model's predictions and why a given observation received a certain predicted value. The explanation shows the user which variables were involved, what their values were, and how each one supported or contradicted the predicted value.

 

MODEL PERFORMANCE

The model's performance can be evaluated on new data. Predictions are provided alongside the complete data, with probabilities where applicable. Additionally, the user can view graphical representations of the predictions to better understand which classes or values are usually predicted and which are rarely predicted.

Predicted Output.png
 

CONCLUSION

Machine learning plays a leading role in generating insights. SEDGE uses algorithms to unearth the facts in the data, supporting the decisions and goals an enterprise has set.

 

SEDGE is redefining and revolutionizing the world of software and analytics, and brings the power of the future into today.