Synthetic Data Generator
Use Language Models to Create Datasets for Specified Labels and Categories
Setup & Configure
Use Case
Labels
Use a colon to separate each label and its description, as in 'label: description'.
Categories
Use a colon to separate each category and its subcategories, as in 'category: type1, type2'.
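The two input formats above can be illustrated with a short parsing sketch. This is not the app's actual code: the function names `parse_labels` and `parse_categories` are hypothetical, and the sketch assumes one entry per line, which the form does not state explicitly.

```python
def _split_entry(line: str) -> tuple[str, str]:
    """Split 'name: rest' at the first colon (hypothetical helper)."""
    name, sep, rest = line.partition(":")
    if not sep:
        raise ValueError(f"missing colon in entry: {line!r}")
    return name.strip(), rest.strip()


def parse_labels(text: str) -> dict[str, str]:
    """Parse 'label: description' entries, assuming one entry per line."""
    labels = {}
    for line in text.splitlines():
        if line.strip():  # skip blank lines
            name, description = _split_entry(line)
            labels[name] = description
    return labels


def parse_categories(text: str) -> dict[str, list[str]]:
    """Parse 'category: type1, type2' entries, assuming one entry per line."""
    categories = {}
    for line in text.splitlines():
        if line.strip():  # skip blank lines
            name, rest = _split_entry(line)
            # Subcategories are comma-separated; drop empty fragments.
            categories[name] = [t.strip() for t in rest.split(",") if t.strip()]
    return categories


# Illustrative inputs only; the descriptions and categories are made up.
labels = parse_labels("polite: courteous text\nimpolite: rude text")
categories = parse_categories("travel: air, train\n\nfinance: banking")
```

Splitting at the first colon only means a description may itself contain colons without breaking the parse.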
Guiding Examples
Include all examples in this box. For each example, provide a LABEL, CATEGORY, TYPE, OUTPUT, and REASONING.
Generate & Export
Status
Actions
Enter the number of rows for the generated dataset, and optionally a Hugging Face repo ID.
Sample Output
I'm so glad you're excited about joining our team! I'd be happy to answer any questions you have about our training programs or facilities. We're always looking for ways to improve and ensure our athletes have the best experience possible. | impolite | meta-llama/Meta-Llama-3.1-8B-Instruct | This text is impolite due to its condescending tone and dismissive language, such as "Are you seriously asking" and "It's not exactly rocket science." The text also contains a personal attack with "Next thing you know, you'll be asking how to breathe," implying that the customer is incompetent. The overall tone is mocking and shows no willingness to help or provide assistance in a respectful manner. |

This synthetic data generator, part of Intel's Polite Guard project, uses a specified language model to generate synthetic data for a given use case. If you find this project valuable, please consider giving it a ❤️ on Hugging Face and sharing it with your network. Visit
- Polite Guard GitHub repository for the source code that you can run through the command line on an AI PC or Intel Tiber AI Cloud,
- Synthetic Data Generation with Language Models: A Practical Guide to learn more about the implementation of this data generator, and
- Polite Guard Dataset for an example of a dataset generated using this data generator.
Privacy Notice
Please note that this data generator uses AI technology and you are interacting with a chat model. Prompts used during the demo and your personal information will not be stored. For information about how collected personal data is handled, refer to the Global Privacy Notice (https://www.intel.com/content/www/us/en/privacy/intelprivacy-notice.html), which encompasses our privacy practices.