1. Setup New Job

Step 1

To generate synthetic data, first connect to one or more databases and select the columns you want to use. Next, specify your data generation strategy. Finally, run and monitor the synthetic data jobs to confirm that they behave as expected.

Click the "New Job" icon on the Jobs page to open the job setup screen, where you enter the following details:

  • Job Name

  • Select Locale

  • Select Source Type
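As a rough illustration of the three details listed above, the sketch below models them as a small Python record with a completeness check. The class name `NewJobForm`, its field names, and the `SOURCE_TYPES` set are hypothetical and exist only for this example; they are not part of Brewdata.

```python
from dataclasses import dataclass
from typing import List

# Source types named in this guide; the identifiers themselves are assumptions.
SOURCE_TYPES = {"Database", "XML", "JSON", "CSV", "XLS"}


@dataclass
class NewJobForm:
    """Hypothetical stand-in for the three fields on the New Job screen."""
    job_name: str
    locale: str
    source_type: str

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the form is complete."""
        found = []
        if not self.job_name.strip():
            found.append("Job Name is required")
        if not self.locale.strip():
            found.append("Locale is required")
        if self.source_type not in SOURCE_TYPES:
            found.append(f"Unknown source type: {self.source_type!r}")
        return found


form = NewJobForm(job_name="customer-demo", locale="US", source_type="CSV")
print(form.problems())
```

Returning a list of problems, rather than raising on the first one, mirrors how a setup screen would usually show every missing field at once.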

Step 1 - New Job page

Job Name: the field where you enter a name for the job.

The "Select Locale" dropdown lets you choose a locale region from the available options, such as UAE, US, Germany, and France. The selected locale influences how Brewdata synthesizes data.

Select Locale

Data synthesis strategies can vary with the rules and regulations of each country or region. Each locale has its own approach to data selection, so choosing the appropriate locale ensures that the synthesized data matches the requirements and characteristics of that region.
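To make the effect of a locale concrete, here is a minimal sketch of locale-dependent value generation. It is an illustration only, not Brewdata's implementation: the `LOCALE_RULES` table, the function names, and the fixed nine-digit phone body are assumptions (real numbering plans differ by country). The calling codes and date layouts shown are the commonly used ones for each region.

```python
import random
from datetime import date, timedelta

# Illustrative locale rules only: country calling codes and common date layouts.
# The keys and structure are assumptions, not Brewdata settings.
LOCALE_RULES = {
    "US": {"calling_code": "+1", "date_format": "%m/%d/%Y"},
    "Germany": {"calling_code": "+49", "date_format": "%d.%m.%Y"},
    "UAE": {"calling_code": "+971", "date_format": "%d/%m/%Y"},
}


def synth_phone(locale: str, rng: random.Random) -> str:
    """Build a fake phone number with the locale's calling code (placeholder length)."""
    code = LOCALE_RULES[locale]["calling_code"]
    digits = "".join(str(rng.randint(0, 9)) for _ in range(9))
    return f"{code} {digits}"


def synth_date(locale: str, rng: random.Random) -> str:
    """Pick a random date from 2000 onward and format it the way the locale writes dates."""
    day = date(2000, 1, 1) + timedelta(days=rng.randrange(365 * 20))
    return day.strftime(LOCALE_RULES[locale]["date_format"])


rng = random.Random(42)  # fixed seed so repeated runs give the same values
for locale in LOCALE_RULES:
    print(locale, synth_phone(locale, rng), synth_date(locale, rng))
```

The same logical field (a phone number, a date) comes out in a different shape per locale, which is the kind of variation the locale selection is meant to control.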

The "Select Source Type" option determines the data source from which the synthesized data is derived. You can choose from several file formats or sources to extract data for your job. The available options typically include:

Database: This option allows you to connect to a database system such as MySQL, PostgreSQL, Salesforce or Oracle, and extract data directly from the tables within the database.

XML: XML (Extensible Markup Language) is a file format used for structuring and storing data in a hierarchical format. Selecting this option means you will be extracting data from an XML file.

JSON: JSON (JavaScript Object Notation) is a lightweight data interchange format commonly used for web-based APIs. If you choose this option, you will be extracting data from a JSON file or an API that returns JSON data.

CSV (Comma-Separated Values): Think of a CSV file as a list written in a text document. Each line in the file represents a row, and the values (like names, numbers, etc.) in each row are separated by commas. It's like a simple list you can easily create in a text editor.

XLS (Excel Spreadsheet): An XLS file is like a digital table with rows and columns. It's created and opened using software like Microsoft Excel. You can put various types of information in it—like names, dates, numbers—and it allows you to do calculations, create charts, and make it look nice and organized.
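The text-based formats above can carry the same records in different shapes. The sketch below parses one small, made-up dataset from CSV, JSON, and XML using only the Python standard library and checks that all three yield identical rows. The sample data and function names are invented for this example, and XLS is left out because reading it needs a third-party library.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# The same two records written in three formats (sample data is made up).
CSV_TEXT = "name,age\nAda,36\nLinus,28\n"
JSON_TEXT = '[{"name": "Ada", "age": "36"}, {"name": "Linus", "age": "28"}]'
XML_TEXT = """<people>
  <person><name>Ada</name><age>36</age></person>
  <person><name>Linus</name><age>28</age></person>
</people>"""


def rows_from_csv(text):
    """The first line is the header; each later line becomes one dict."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_json(text):
    """A JSON array of objects maps directly onto a list of dicts."""
    return json.loads(text)


def rows_from_xml(text):
    """Each child of the root is a record; its children are the fields."""
    root = ET.fromstring(text)
    return [{field.tag: field.text for field in record} for record in root]


rows = rows_from_csv(CSV_TEXT)
assert rows == rows_from_json(JSON_TEXT) == rows_from_xml(XML_TEXT)
print(rows[0])
```

Note that CSV and XML values arrive as strings, so the JSON sample stores ages as strings too; a real pipeline would convert types after parsing.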

Select Source Type

When you select a file-based source type such as XML, JSON, or CSV, Brewdata typically offers an option to upload the corresponding file. For example, if you choose XML, you can upload an XML file containing the data you want to extract; the same applies to JSON and CSV.

upload file

By selecting the appropriate source type and uploading the corresponding file, Brewdata will be able to read and extract data from that file, enabling you to work with the specific data format you've chosen.

These source types give you flexibility across data formats and sources, so you can extract data in the format that best suits the job at hand.

For a database source, select an existing DB connection from the dropdown.

Select connection option

If you select "MSSQL" from the dropdown, an option to select the schema appears. For any other selection, the schema option is not shown.
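The schema rule described here comes down to a single condition. A minimal sketch follows; the function name is hypothetical and the case-insensitive comparison is an assumption, not documented behavior.

```python
def shows_schema_selector(connection_type: str) -> bool:
    """Return True only for MSSQL, the one connection type that exposes a schema choice.

    Ignoring case and surrounding spaces is an assumption made for this sketch.
    """
    return connection_type.strip().upper() == "MSSQL"


for choice in ("MSSQL", "MySQL", "PostgreSQL", "Oracle"):
    print(choice, shows_schema_selector(choice))
```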

Configure Database Connections

Alternatively, use the "Configure New Connection" feature to set up a new connection quickly, without relying on an existing configuration.

Configure Database Connections

After filling in all the details and selecting the database sources, you have four options:

  1. Cancel: This allows you to exit without saving any changes or proceeding further.

  2. Check for PII: This option checks for Personally Identifiable Information (PII) within the data for privacy and compliance purposes.

  3. Configure Synthesis Pattern: This involves setting up a pattern for data synthesis or creation.

  4. Setup Rules: This allows the establishment of specific rules or guidelines for the data sources or their usage.
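Of the options above, "Check for PII" is the easiest to picture in code. The sketch below shows one simple way such a check could work: it flags a column when most of its values match a pattern for emails or phone numbers. This is a simplified illustration under stated assumptions, not Brewdata's detection method. The patterns, the 0.8 threshold, and the function name `flag_pii_columns` are all invented for the example.

```python
import re

# Deliberately simple patterns; real PII detection covers far more cases.
PII_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?\d[\d\s().-]{6,}\d$"),
}


def flag_pii_columns(rows, threshold=0.8):
    """Return {column: pii_kind} for columns where most values match a pattern.

    `rows` is a list of dicts (one per record). `threshold` is the share of
    non-empty values that must match before a column is flagged (assumed value).
    """
    flagged = {}
    columns = rows[0].keys() if rows else []
    for column in columns:
        values = [str(row[column]) for row in rows if row.get(column)]
        if not values:
            continue
        for kind, pattern in PII_PATTERNS.items():
            hits = sum(1 for value in values if pattern.match(value))
            if hits / len(values) >= threshold:
                flagged[column] = kind
                break  # first matching kind wins for this column
    return flagged


sample = [
    {"name": "Ada", "contact": "ada@example.com", "mobile": "+1 555 010 0199"},
    {"name": "Linus", "contact": "linus@example.org", "mobile": "+49 30 1234567"},
]
print(flag_pii_columns(sample))
```

Pattern matching of this kind misses free-text PII such as the `name` column here, which is one reason a dedicated check is more involved than this sketch.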
