Type of Analytics - Selection
EDGE offers a cloud-based AI and machine learning platform where users choose the type of analytics they need: Data Analytics, Text Analytics, Image Analytics, Video Analytics, Forecasting, or Optical Character Recognition (OCR).
Data Upload - CSV / Database
Users can upload data files in several formats: CSV, tab-separated (TSV), semicolon-separated, space-separated, JSON, Parquet, and Avro. Files stored on S3 and HDFS are also supported.
In addition, tables can be loaded into EDGE from databases such as MySQL, MariaDB, Oracle, Microsoft SQL Server, and PostgreSQL.
A SQL query builder can also be used to join multiple tables into a single dataset that is then loaded into EDGE.
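As a rough illustration of combining tables before loading, the sketch below joins two tables with Python's built-in sqlite3 module. The table names, columns, and rows are hypothetical, and this is not EDGE's query builder; it only shows the kind of SQL such a builder might produce.

```python
import sqlite3

# In-memory database with two hypothetical tables (illustrative names only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.5);
""")

# One row per order, enriched with the customer's name.
joined_rows = conn.execute("""
    SELECT o.id, c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
conn.close()
```

The resulting rows form one flat table, which is the shape most analytics tools expect as input.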
The preview screen lists the loaded columns and the data type of each. EDGE identifies each column's type automatically from its contents: categorical, numerical, Boolean, date, or text.
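How EDGE detects data types is not described here. A minimal sketch of one possible heuristic follows; the function names, the single ISO date format, and the "half or fewer distinct values" rule for categorical columns are all assumptions made for illustration.

```python
from datetime import datetime

def _is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

def _is_date(text):
    # Assumption: only ISO dates (YYYY-MM-DD) are recognised in this sketch.
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False

def infer_column_type(values):
    """Guess a column's type from its non-empty string values."""
    cleaned = [v.strip() for v in values if v is not None and v.strip() != ""]
    if not cleaned:
        return "text"
    if all(v.lower() in {"true", "false"} for v in cleaned):
        return "boolean"
    if all(_is_number(v) for v in cleaned):
        return "numerical"
    if all(_is_date(v) for v in cleaned):
        return "date"
    # Few distinct values relative to the row count suggests a category.
    if len(set(cleaned)) <= max(1, len(cleaned) // 2):
        return "categorical"
    return "text"
```

For example, `infer_column_type(["red", "blue", "red", "blue"])` falls through the Boolean, numerical, and date checks and is classified as categorical because only two distinct values appear in four rows.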
Data Profiling and Sampling
The profile page provides the following:
1. Detailed statistical information about the dataset.
2. Sampling of large datasets using a choice of sampling algorithms.
3. Visualisation of the target variable (the field of interest to the user), which may be categorical, numerical, date, or Boolean.
4. Auto Machine Learning, which automatically cleans the data, performs data transformation, fills in missing values, and generates models for prediction.
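The sampling algorithms EDGE offers are not named here. As one common example, the sketch below shows stratified sampling, which draws the same fraction from each target class so that the sample keeps the class proportions of the full dataset. The function name and parameters are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(rows, target_key, fraction, seed=0):
    """Sample roughly `fraction` of the rows from each target class."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    groups = defaultdict(list)
    for row in rows:
        groups[row[target_key]].append(row)

    sample = []
    for members in groups.values():
        # Keep at least one row per class so rare classes are not lost.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample
```

With 80 rows labelled "no" and 20 labelled "yes", a fraction of 0.1 yields 8 "no" rows and 2 "yes" rows, preserving the 80/20 split.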
General Data Protection Regulation
The General Data Protection Regulation (GDPR) is the European Union's regulation on the protection of personal information. It defines how the personal data of individuals in the EU may be collected and processed.
EDGE recognizes the importance of protecting personal information and has a built-in feature that identifies columns of data needing protection.
The user can also select columns to be pseudonymized. This protects sensitive fields from data processors while keeping them available for advanced analysis, rather than ignoring or dropping those data elements.
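The pseudonymization method EDGE uses is not specified. One widely used approach is a keyed hash (HMAC), sketched below under that assumption; the function names, the 16-character token length, and the sample key are illustrative only. The same input always maps to the same token, so joins and group-bys still work, while the original value cannot be read from the token without the key.

```python
import hashlib
import hmac

def pseudonymize(value, secret_key):
    """Replace a string with a stable token derived from a secret key."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability (assumption)

def pseudonymize_columns(rows, columns, secret_key):
    """Return copies of the rows with the selected columns tokenised."""
    return [
        {
            key: pseudonymize(val, secret_key) if key in columns else val
            for key, val in row.items()
        }
        for row in rows
    ]
```

In practice the key would be stored separately from the data, since anyone holding it can recompute tokens for guessed values.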
Statistics, Feature Creation & Formula Builder
Variables that influence the target variable are automatically sorted by their importance value. The user can then transform and cleanse the data as needed, including changing data types and applying string, numerical, and date operations.
In the Statistics section, users can view the description and statistics for each data type, such as:
Missing Row Count
Missing Row %
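The two statistics above can be computed as in the following sketch. It assumes rows are dictionaries and treats `None` and empty strings as missing; the function name is hypothetical.

```python
def missing_stats(rows, column):
    """Return (missing row count, missing row percentage) for one column."""
    total = len(rows)
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    pct = (missing / total * 100.0) if total else 0.0
    return missing, pct
```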
Data Visualisation helps the user quickly understand what the data is revealing.
The system supports many chart types, and some chart types suit certain kinds of data better than others.
Data Balancing and Model Creation
The system offers a series of predefined models. Users select models to start the system's learning on the target variable. Once the models have processed the data in learn mode, the accuracy of each model is displayed.
Clicking on the metrics displays details such as the gain chart, lift chart, confusion matrix, area under the curve (AUC), and other charts.
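For readers unfamiliar with two of these metrics, here is a minimal sketch of how a confusion matrix and AUC can be computed for a binary target. It assumes labels of 1 (positive) and 0 (negative); the function names are hypothetical and this is not EDGE's implementation. AUC is computed as the share of positive/negative pairs in which the positive example receives the higher score, with ties counted as half.

```python
def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
    }

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

The pairwise form is quadratic in the number of rows, so it suits small examples; production libraries use a sort-based calculation instead.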
Model Saving, Deploying and Performance
Once the model with the highest accuracy and AUC has been selected, the user can save it under a name of their choice. The model is deployed by clicking the deploy button.
Once deployed, the model needs to be monitored to check whether data drift is affecting its accuracy.
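How EDGE measures drift is not stated. One common measure is the Population Stability Index (PSI), sketched below as an assumption: it bins a feature using the range of the training data, then compares the share of training and live values falling in each bin. A PSI of zero means the two distributions match; larger values indicate more drift. The function name, bin count, and floor value are illustrative.

```python
import math

def population_stability_index(expected, actual, bins=10):
    """Compare two numeric samples; 0.0 means identical binned distributions."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the training range go into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A monitoring job could compute this per feature on a schedule and flag the model for retraining when the value rises above a threshold chosen for the use case.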
Machine learning plays a leading role in generating insights. In this role, EDGE uses algorithms to surface the facts an enterprise needs to act on its predetermined decisions and goals.
EDGE is set to redefine the world of software and analytics in the near future.