Example with SAID

South African ID

In this example, a new job is created with South Africa selected as the locale. Selecting this locale customizes the job to South Africa's regional settings and preferences, and includes all columns specific to South Africa in the job configuration. With these columns included, the job can process the data according to the requirements and characteristics of South Africa.

The South African ID consists of 13 digits. The first 6 digits represent the date of birth in the format YYMMDD. The next 4 digits are a sequence number that indicates the individual's gender. The 11th digit denotes citizenship classification. The 12th digit is a legacy digit that does not carry any of this information, and the 13th (last) digit serves as a checksum to verify the accuracy of the ID number.
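The layout described above can be sketched in Python. This is a minimal illustration, not the tool's implementation: the function names `parse_said` and `luhn_check_digit` are hypothetical, and two details are assumptions that this page does not state. The first is that the checksum follows the Luhn algorithm. The second is that gender sequence values below 5000 denote female and values of 5000 or above denote male, as commonly documented for South African IDs.

```python
def luhn_check_digit(first_12: str) -> int:
    """Return the Luhn check digit for the first 12 digits (assumed scheme)."""
    total = 0
    # Walk right to left; every second digit, starting with the rightmost, is doubled.
    for i, ch in enumerate(reversed(first_12)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the doubled value
        total += d
    return (10 - total % 10) % 10


def parse_said(said: str) -> dict:
    """Split a 13-digit ID into the parts described in the text."""
    if len(said) != 13 or not (said.isascii() and said.isdigit()):
        raise ValueError("expected exactly 13 digits")
    return {
        "date_of_birth": said[0:6],            # YYMMDD
        "gender_sequence": said[6:10],         # 4-digit gender block
        # Assumption: 0000-4999 female, 5000-9999 male.
        "gender": "female" if int(said[6:10]) < 5000 else "male",
        "citizenship": said[10],
        "checksum_ok": luhn_check_digit(said[:12]) == int(said[12]),
    }


if __name__ == "__main__":
    # Made-up ID used purely for illustration.
    print(parse_said("8001015009087"))
```

Changing any single digit of an ID makes `checksum_ok` false under this scheme, which is the sense in which the last digit verifies accuracy.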

In this example, "South African ID" is selected as a column and the fixed field length option is chosen, so every entry in the dataset has the same field length allocated for that column. Here, the "said" column has a fixed length of 13 characters. Selecting a position range within the "said" column lets you assign semantics to that range only.

The first columnlet is named "DateOfBirth" and is assigned the position range 1 to 6. The "Date of Birth (YYMMDD)" semantic is assigned to this columnlet, indicating that it represents the individual's date of birth. Similarly, the second columnlet is named "Gender" and is assigned the position range 7 to 10, indicating that it holds the gender information. Splitting the column this way enables targeted synthesis or modification of data within a designated range while preserving the rest of each "said" value.

The third columnlet is named "Citizenship" and is assigned the position range 11 to 11. The fourth columnlet is named "Validation" and is assigned the position range 12 to 13. Semantics can be assigned to these columnlets either by using existing semantics or by creating new patterns specific to them. The data in each columnlet is then synthesized according to its assigned semantic, so modifications stay within the specified position ranges.
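To make the position ranges concrete, the four columnlets can be modelled as 1-based, inclusive ranges over the fixed-length value. The sketch below is only an illustration of that idea: the `COLUMNLETS` mapping and the `split_columnlets` helper are hypothetical names, not part of the product.

```python
# Columnlet name -> (start, end), 1-based and inclusive, as listed in the text.
COLUMNLETS = {
    "DateOfBirth": (1, 6),
    "Gender": (7, 10),
    "Citizenship": (11, 11),
    "Validation": (12, 13),
}


def split_columnlets(value: str, columnlets=COLUMNLETS, length: int = 13) -> dict:
    """Cut a fixed-length value into its named columnlets."""
    if len(value) != length:
        raise ValueError(f"expected a fixed length of {length} characters")
    # Convert each 1-based inclusive range to a Python slice.
    return {name: value[start - 1:end] for name, (start, end) in columnlets.items()}


if __name__ == "__main__":
    # Made-up ID used purely for illustration.
    for name, part in split_columnlets("8001015009087").items():
        print(f"{name}: {part}")
```

Because the ranges do not overlap and together cover positions 1 to 13, joining the parts in order reproduces the original value.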

Clicking the "Test" button generates a preview of the synthetic data based on the assigned semantic types and constraints. The preview shows the before and after results: in this example, only the characters in the range 1-6 have been changed to random values, while the remaining characters in the "said" field are unchanged. This lets you confirm exactly which modifications were made within the designated range and that the original data outside that range is preserved.
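The before-and-after behaviour of the preview can be illustrated with a short Python sketch. It is not the product's synthesis engine: `replace_range` and `random_yymmdd` are hypothetical helpers, and the range of generated dates is an arbitrary assumption.

```python
import random
from datetime import date, timedelta


def random_yymmdd(rng: random.Random) -> str:
    """Return a random date formatted as YYMMDD (date range is an assumption)."""
    base = date(1950, 1, 1)
    offset = timedelta(days=rng.randrange(365 * 50))
    return (base + offset).strftime("%y%m%d")


def replace_range(value: str, start: int, end: int, replacement: str) -> str:
    """Replace positions start..end (1-based, inclusive) and keep the rest."""
    if len(replacement) != end - start + 1:
        raise ValueError("replacement must match the length of the range")
    return value[:start - 1] + replacement + value[end:]


if __name__ == "__main__":
    rng = random.Random()
    before = "8001015009087"  # made-up ID used purely for illustration
    after = replace_range(before, 1, 6, random_yymmdd(rng))
    print("before:", before)
    print("after: ", after)
```

In this sketch, positions 7 to 13 of `after` are always identical to `before`, mirroring the preview described above, where only the range 1-6 changes.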

