Sense or nonsense?

What culture do we want to produce with language models?

*Since the beginning of 2023, publications around the language model GPT, ChatGPT and GPT4 have been making waves and occupying the scientific community, the educational landscape and the public. The language model from OpenAI enables the simulation of a conversation at an astonishing level and thus further stimulates the debate about the status and possibilities of “artificial intelligence”. But what exactly can the model do, what are its limits and which scenarios are conceivable and useful for use in the field of museum education? The article provides an initial overview from the museum sector on the status and use of language models and develops first thoughts for the project “Creative User Empowerment” - the article is open for discussion and will be supplemented. *

What is GPT?

GPT is the abbreviation for Generative Pretrained Transformer an offering from the US company Open AI. The originally non-profit development was financed by Elon Musk and Microsoft, among others, which has held the rights since 2020. The goal was to develop artificial general intelligence that would benefit humanity. https://openai.com/about

ChatGPT is a variation of the model that makes it possible to enter into AI-supported chat interactions, which is calculated in real time and is not pre-scripted but based on probabilities. A web crawl up to 2021 was used as the data basis, the fundamentals of which are difficult to grasp. GPT is a so-called Large Language Models (LLM), which in version GPT 3.5 has over 175 billion parameters and 800 gigabytes of storage capacity.

GPT-4 is a further development that was published in March 2023. In addition to texts, it can also process images or audio as input and is thus a “Multimodal Large Language Model” (MLLM). In addition, a usage or usage style can be prescribed, so the AI can be used as a tutor, for example, which makes it relevant for the education sector.

In contrast to previous years, the quality of the output is astonishingly good on the grammatical, semantic and content levels. The results have surprised and fascinated 100 million users since January 2023. The company is now considered a competitor to Google.

How has ChatGPT been used so far?

ChatGPT can be used directly via the interface or connected via an API. Registration is required and for some time access was only possible via a waiting list. Plug-ins can also be registered and developed.

A wide range of applications has emerged within a very short time:
One way in which ChatGPT can be used in education is in the app Historical Figures.
Here, users can chat with historical celebrities such as Goethe, Schiller and Hannah Arendt. Users enter questions and the chat AI answers.

Also interesting are the solutions that offer tutorial help, accompanying students in their learning process. (Link follows).

In the meantime, countless applications have been created with it, up to and including improved shopping plug-ins that promise a faster and more personalised shopping experience.

What advantages does the model offer?

The model surprises and challenges. Can it solve my maths homework? Can it write code? Can it structure data? Does it know the right answer to historical questions? Can it make my work easier? Do I no longer have to do homework or even think and work for myself?
People around the world have been using ChatGPT since January 2023 and continue to develop it with every input. The model can provide grammatically and semantically convincing answers to questions, it can translate and is trained for several languages. It already provides convincing support in structuring text, in summarising or translating.

What is critical? A risk assessment

At the same time, knowledge workers and scientists quickly realise: it generates false source information, provides historically false “facts” or answers that narrowly miss the factual. The model does not know the difference between true and false answers because it sounds convincing in the first place and gives probable answers, but not necessarily factually correct ones. It is a language model, not a knowledge base. So in using it, we are challenged to think for ourselves and check results, to do source criticism and decide for ourselves what is right and what is wrong. It is necessary to deal with written text even more critically than before.

Also critical is the data basis and the resulting answers. The data basis of the model goes back to a web crawl from 2021 and its contents are difficult to verify. Racist reproduction of content cannot be ruled out either. The task here will be to develop applications that make sense for museum users on the basis of scientifically proven knowledge.

In addition, ChatGPT can only be operated with a high energy input, not only the training of the model itself, but also every query costs energy and contributes to the CO2 balance. This has to be taken into account for museums and the question of climate neutrality. It is estimated that Open AI used 1024x A100 GPUs for 34 days for training. This means that OpenAI took $4.6 million to train. Although the energy consumption has not been officially confirmed, it is estimated that the training consumed 936 MWh. This is equivalent to the consumption of almost 100,000 average households in Europe per day.

The application is currently under observation by data protection officers. In Italy, the application has already been banned due to a security vulnerability on 20.3.23. The application is also being examined in Germany.
In addition, there are demands to stop the rapid and overrunning development until politics and ethics as well as the social discussion have the opportunity to follow the developments. Instead, the time should be used to make the systems more accurate, secure, interpretable, transparent, robust, coordinated, trustworthy and loyal. Governance and certification and monitoring as well as legal issues (e.g. liability) are to be worked on. So it may well be that the technical possibilities are slowed down for the time being by politics and administration.

Which use cases have been suggested by citizens or have we tested so far?

As part of the Creative User Empowerment project, suggestions were made as early as 2021 during hackathons on how interaction AI and language models can be used for museum data. A chat with historical personalities was suggested or personalised tours based on individual interests.

In the course of internal initial tests, the following application areas in particular have proven useful: Creating social media texts, abbreviations and summaries, lists and quizzes, creating poems from object texts, translating complex curatorial texts into simple language, or even explaining the function of algorithms. At the same time, it quickly became clear that without training, the model was not suitable for content-related, technically correct answers to museum-related content. For example, incorrect facts about existing museum objects were generated or incorrect recommendations about museum visits were given. In this case, the risks of creating wrong contexts are particularly high without training and own data bases as well as quality assurance procedures.

Which use cases are conceivable for the museum and cultural sector?

How can the language models be helpful for museum operations? The American Association for Museums has already given a first assessment in a blog post: 
https://www.aam-us.org/2023/01/25/chatting-about-museums-with-chatgpt/

In a conversation with the interaction AI, suggestions are developed for how GPT can be used in the museum (creating exhibition labels, creating catalogues, creating educational materials, developing Q&A, or creating audio guide texts), how a museum visit can be fun or what a curriculum for Museum Studies should look like. In addition to lists, which can already be created excellently, the system also develops arguments in dealing with the return of cultural property as well as in dealing with NFT (non-fungible tokens). Beyond the gimmicks and supposedly philosophical conversations that are a good way to pass the time (and use electricity), GPT is already used by more than 40% of employees in the US in their daily and professional lives, so it already provides incidental support for applications, assignments, summaries or presentations. 
https://www.fishbowlapp.com/insights/70-percent-of-workers-using-chatgpt-at-work-are-not-telling-their-boss/

So far, the project has conducted some internal tests and expert workshops to better assess the opportunities and risks of ChatGPT for cultural data. An expert workshop was held at the University of Amsterdam, where experts from archives, libraries and museums conducted initial experiments. 
https://netwerkdigitaalerfgoed.nl/nieuws/chatgpt-nog-lang-niet-perfect-maar-met-potentie-voor-het-erfgoedveld/
The results of the explorations show: It is important to ask the right questions. Translations can be helpfully supported, but they have to be checked and that in turn causes a lot of effort. For less common languages (e.g. Malay) or historical languages such as early modern Dutch, 54% of the results could be used. The creation of stories based on metadata could also already achieve good results.

Read about the event:
Netwerk Digitaal Erfgoed: https://netwerkdigitaalerfgoed.nl/nieuws/chatgpt-nog-lang-niet-perfect-maar-met-potentie-voor-het-erfgoedveld/
Stadsarchief Amsterdam: https://www.amsterdam.nl/stadsarchief/organisatie/blog-bronnen-bytes/spelen-chatgpt/

It is also conceivable that the language model will help to supplement incomplete data bases, better describe images or even interact with structured ontologies such as culture-specific thesauri: https://forum.iconclass.org/t/can-chatgpt-help-us-to-describe-and-interpret-images/175

Interim conclusion

So museums, like other fields, will have the task of finding out what impact the technology can and should have on their own productions and workflows. It will be about how museums deal with their data in interaction with the model and other AI models. Not least against the background that powerful AI models rely on large amounts of “good” data, carefully curated, legally secured, and digitally provided museum data will become more important - in the best case, it contains differentiated and scientifically sound content that can be used as a basis for knowledge. At the same time, it will continue to be an important task to protect the rights of artists and scientists, to secure authorship and to find a culturally appropriate way of dealing with the new gold - data - in the race for digital business models.

Meaning or bullshit genrator?

The text model produces the appearance of meaning, not least through the way interaction is developed - through a flashing cursor and evolving letters - we are in the midst of a social perceptual shift in which meaning and significance are ascribed to computational results of the language model (and now and then the question of consciousness and humanity is happily negotiated).
Digitality is changing rapidly and with it the expectations of users and their habits. Catalogues and static environments are no longer desired; instead, the need for lively interaction is increasing.
It is not least cultural actors who serve the need for anthropomorphism by developing avatars and identification figures and thus humanising machines. It will therefore become more important to develop corresponding reflective competences internally and to transform them into media-critical educational offers.

Use of GPT in the context of Creative User Empowerment

Currently (04/2023), the opportunities and risks of the language model are being further examined. Specialist workshops help to further explore the possible application scenarios in the cultural heritage field and present them with concrete examples.
Various surveys were conducted with the AI pilots last year to investigate the use of a generative language model. The users certainly wanted support in the creation of texts, but also placed a high value on professional quality and scientifically validated content - the AI pilots at the Badisches Landesmuseum saw an important role of the museum primarily in a trustworthy knowledge institution and less in digital spaces that enable playing with false or half-true content.
For the development of the xCurator tool, it is therefore conceivable that the connection of the API and the restriction to the museum’s data basis will support the interaction of users with the collection content. Searching and story creation can also be facilitated through suggestions for suitable content. For this, it is necessary to understand the process of creating prompts and re-training the model and how the language model interacts with the museum data.
The creative possibilities that are helpful for museum users should also be further opened up: https://towardsdatascience.com/20-creative-things-to-try-out-with-gpt-3-2aacee3e2abf
For the metadata domain, especially for archives and museums, interesting experiments have also already been performed, including operations relevant to the CUE project such as the extraction of entities like names, places or others: https://thecaglereport.com/2023/03/16/nine-chatgpt-tricks-for-knowledge-graph-workers/
Furthermore, it will be necessary to understand for the education sector how GPT is already changing education and teaching and thus also the access of teachers and students to digital educational content. The question arises whether the museum with its data can also be a good experimental space in which the use of AI models can be learned and critically questioned.
Furthermore, it is already possible to support the creation of social media texts through apps such as Keybot and, for example, to translate curatorially created catalogue texts into common social media formats and automate this process.
And last but not least, it will be the task of the publicly funded project to find open source alternatives that can be used in perspective: https://www.blogmojo.de/chatgpt-alternative/

Events
On Data-Doe Dag on 2023-04-17, datasets will be enhanced together with Dutch colleagues: Data-do-day - Network for Digital Cultural Heritage (netwerkdigitaalerfgoed.nl).

Within the network AI and Museums, conceivable areas of application will be presented and identified in a workshop on 2023-04-20.

  • With: Prof. Dr. Iryna Gurevych, Head of UKP Lab, TU Darmstadt
  • Heleen Wilbrink, Het Utrechts Archief / Aincient
  • Etienne Posthumus, Allard Pierson Amsterdam & Sonja Thiel, Badisches Landesmuseum Karlsruhe
  • Experiments by: Thomas Geissel, Bot or Not & Andreas Reich, AI Tutor
  • ​​​​​​​To register: https://umfrage.landesmuseum.de/s/fvakpap

Want to help shaping AI in Museums?

Register for the AI-Pilot Program 2023