Risk And Limitations

Data Privacy & Security


Data privacy and security matter for any technology, and even more so for GenAI. The data used in GenAI processes should be obtained legally, and both that data and the content GenAI produces should be handled in compliance with the prevailing laws.

Most privacy concerns related to data arise from the way it is acquired. These include:

  • Web scraping: AI systems can collect large volumes of information by scraping web pages. However, scraping can also capture personal data without the user’s consent.
  • Biometric data: Facial recognition, fingerprint, and other biometric technologies now incorporated into AI systems threaten personal privacy because the biometric data they gather is unique to each person, sensitive, and highly valuable if stolen.
  • IoT devices: Smart IoT devices give AI systems detailed information about our homes, offices, and cities. This data can expose the most personal details of our day-to-day lives.
  • Social media monitoring: AI algorithms can analyze social media activity to infer demographic data, preferences, and in some instances even emotional states, all without the user being fully aware or having given consent.

As a result, protecting private or sensitive information is an increasingly prominent concern.

Because generative AI systems rely on large datasets to train their models, appropriate measures should be taken to ensure that the information is not mishandled and does not end up with the wrong people or organizations. Mishandling can lead to data leakage, which often involves confidential information, and this raises the question of how to anonymize the data so that individuals cannot be identified. Users should also be told how their data will be used and asked for permission before it is used to train the GenAI model. Organizations must therefore govern how such information is handled while remaining transparent about their practices.
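
The anonymization and consent points above can be illustrated with a minimal Python sketch. It is not a definitive implementation: the record layout, the `consent` field, and the function names `redact_pii` and `prepare_training_records` are hypothetical, and the two regular expressions cover only simple email and phone formats. Pattern-based redaction like this is closer to masking than to full anonymization, so real pipelines would likely need broader PII detection and review.

```python
import re

# Illustrative patterns only (assumption): real PII detection needs far
# broader coverage than simple email addresses and phone numbers.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text: str) -> str:
    """Replace email addresses and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def prepare_training_records(records):
    """Keep only records whose owner explicitly consented, then redact them.

    Each record is assumed (hypothetically) to be a dict with a "text"
    string and a boolean "consent" flag. Missing consent is treated as "no".
    """
    return [
        redact_pii(record["text"])
        for record in records
        if record.get("consent") is True
    ]


# Hypothetical sample data for illustration.
records = [
    {"text": "Contact me at jane.doe@example.com or +1 (555) 010-2030.", "consent": True},
    {"text": "My number is 555-010-9999.", "consent": False},
]

print(prepare_training_records(records))
```

Treating a missing `consent` flag as a refusal is a deliberate choice in this sketch: data is excluded from training unless permission was explicitly recorded.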

Additionally, sourcing quality data is challenging, since many organizations struggle to purchase commercial licenses for current data or to create the detailed datasets needed to train generative models.

To learn more about security principles for GenAI solutions, read Securing generative AI: An introduction to the Generative AI Security Scoping Matrix.