What is Generative AI?
OpenAI's widely discussed AI chatbot, ChatGPT, was released in late 2022, sparking the current wave of interest in Generative AI.
Generative AI tools like ChatGPT learn from existing artifacts to generate new content that reflects the characteristics of their training data.
ChatGPT is built on a Large Language Model (LLM) trained on large amounts of public (and some private) data, largely dated before 2021.
What are its limitations?
While it may feel like magic, the technology has some inherent limitations. To understand any AI technology, it's important to understand:
a) the data that it was trained on, and
b) the basics of how the technology is designed.
For ChatGPT specifically, here are the main limitations that users should keep in mind:
- Outdated Information: The LLM underlying ChatGPT was trained on data from before September 2021, and it does not learn from experience. Anyone using the tool to conduct research should be aware that it lacks recent information.*
- Hallucinations: LLMs can produce "hallucinations": responses that are not justified by the training data. In other words, they create fake, false, or unreasoned answers. This often surfaces when you ask for a citation, only to find that the cited resource does not exist or the link is dead.
- Biased or Harmful Information: ChatGPT's language model was trained on billions of words from the internet. As a result, it has absorbed the same biases and harmful content that exist on the open web. Users should keep this in mind when evaluating its responses.
- Proxied or Paywalled Content: ChatGPT cannot get past proxied links or paywalls, so it is limited to the freely available content within its training data. This means that, for the most part, it cannot reach our databases and other licensed library resources.