2. Data Engineering

Data Engineering is a critical component of a data-driven company and an essential phase in the data analytics process. It entails extracting, processing, and loading data into a consolidated repository, which can then be used for analysis and reporting. This stage ensures that the data is cleansed, organized, and prepared for later analysis, producing accurate and insightful findings. A well-designed data pipeline keeps data secure, consistent, and easily accessible to all stakeholders, facilitating better decision-making and, ultimately, improved business outcomes.

The Data Workbench module is a comprehensive solution for data engineers, providing a range of tools for data connection, management, and governance. It includes tools that let data engineers shape and transform data in preparation for analysis: data merging, filtering, and aggregation, as well as data type conversion and text processing.

Define Tables and Columns

Once the data connectors are created, users can define the tables and columns of the data source. This step involves selecting the tables and columns needed for the analysis. Users can also define custom columns, rename columns, and perform other data preparation tasks. This is done by providing the table name, column names, data types, and other relevant information.

Load Data into the Knowledge Base (KB)

After the tables and columns are defined, the data is loaded into the KB. This step involves extracting the data from the source, cleaning and transforming it, and loading it into the KB for further analysis. This can be done with SQL commands or through the provided UI.

Data Profiling

In this step, the data is profiled to understand its quality, completeness, and integrity.
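A minimal sketch of such a profiling pass, using plain Python with hypothetical column names and data (the document does not describe the product's actual profiling scripts):

```python
# Illustrative profiling pass over a small in-memory table.
# Column names and values are hypothetical, not from the product.
from statistics import mean, stdev

rows = [
    {"order_id": 1, "amount": 120.0, "region": "east"},
    {"order_id": 2, "amount": 115.0, "region": "west"},
    {"order_id": 3, "amount": 130.0, "region": "east"},
    {"order_id": 4, "amount": 125.0, "region": None},
    {"order_id": 5, "amount": 118.0, "region": "west"},
    {"order_id": 6, "amount": None,  "region": "east"},
    {"order_id": 7, "amount": 9800.0, "region": "east"},
]

def profile(rows):
    """Report per-column types, missing-value counts, and numeric outliers."""
    report = {}
    for col in rows[0].keys():
        values = [r[col] for r in rows]
        present = [v for v in values if v is not None]
        info = {
            "missing": len(values) - len(present),
            "types": sorted({type(v).__name__ for v in present}),
        }
        # Flag numeric values more than 2 standard deviations from the mean.
        if all(isinstance(v, (int, float)) for v in present) and len(present) > 2:
            m, s = mean(present), stdev(present)
            info["outliers"] = [v for v in present if s and abs(v - m) > 2 * s]
        report[col] = info
    return report

report = profile(rows)
print(report["amount"])
# → {'missing': 1, 'types': ['float'], 'outliers': [9800.0]}
```

A report like this makes the issues concrete before any cleaning is attempted: here it surfaces one missing amount, one missing region, and a suspiciously large order value.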
This is done by analyzing the data and identifying data types, missing values, outliers, and other relevant characteristics, which lets users understand the data and spot issues that need to be addressed. Profiling can be performed with pre-built profiling scripts or through the provided UI.

Data Cleaning

Data cleaning removes inconsistencies, errors, and outliers from the data. It includes tasks such as removing duplicates, handling missing values, and standardizing the data. Cleaning can be performed with pre-built cleaning scripts or through the provided UI.
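The cleaning tasks just listed — deduplication, missing-value handling, and standardization — can be sketched as follows; the column names and imputation choice are illustrative assumptions, not the product's built-in scripts:

```python
# Illustrative cleaning pass: deduplicate, impute missing values,
# and standardize text casing. Column names are hypothetical.
def clean(rows):
    """Remove exact duplicates, fill missing values, standardize text."""
    seen, cleaned = set(), []
    for r in rows:
        key = tuple(r[k] for k in sorted(r))  # row fingerprint for dedup
        if key in seen:
            continue                          # drop exact duplicates
        seen.add(key)
        fixed = dict(r)
        if fixed.get("amount") is None:
            fixed["amount"] = 0.0             # naive imputation; a real
                                              # pipeline might use a mean
        if fixed.get("region") is not None:
            fixed["region"] = fixed["region"].strip().lower()
        cleaned.append(fixed)
    return cleaned

dirty = [
    {"amount": 10.0, "region": " East "},
    {"amount": 10.0, "region": " East "},     # exact duplicate
    {"amount": None, "region": "west"},
]
print(clean(dirty))
# → [{'amount': 10.0, 'region': 'east'}, {'amount': 0.0, 'region': 'west'}]
```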

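The extract-transform-load flow described earlier under "Load Data into the Knowledge Base (KB)" can also be sketched end to end. Here sqlite3 stands in for both the source system and the KB, since the document does not describe the KB's actual interface; table and column names are illustrative:

```python
import sqlite3

# sqlite3 is a stand-in for the source system and the KB alike;
# the sales/kb_sales schema is hypothetical.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (id INTEGER, amount REAL, region TEXT)")
source.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                   [(1, 120.0, "East"), (2, None, "West")])

kb = sqlite3.connect(":memory:")
kb.execute("CREATE TABLE kb_sales (id INTEGER, amount REAL, region TEXT)")

# Extract with a SQL command, transform in flight (fill missing values,
# normalize text casing), and load into the KB table.
for row_id, amount, region in source.execute("SELECT id, amount, region FROM sales"):
    kb.execute("INSERT INTO kb_sales VALUES (?, ?, ?)",
               (row_id, amount if amount is not None else 0.0, region.lower()))
kb.commit()

print(kb.execute("SELECT * FROM kb_sales").fetchall())
# → [(1, 120.0, 'east'), (2, 0.0, 'west')]
```

Folding light transformations into the load loop, as above, keeps the KB copy clean from the start; heavier reshaping would normally happen in the profiling and cleaning steps.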