Interactive Prompt Optimization with the Human in the Loop for Natural Language Understanding Model Development and Intervention (INPROMPT)

The paradigm of few-shot or zero-shot learning for building natural language understanding models assumes that little or no annotated text is available for the problem to be solved. Methods in this area therefore aim to relax the high data requirements that the optimization of deep neural networks entails. A typical approach is to take a pre-trained neural language model and use a prompt to generate a word that describes a text instance. For example, sentiment polarity can be classified by giving the model a text instance such as "The person is very satisfied with the product." together with a prompt and checking whether the sentence "The product is good" or "The product is bad" receives the higher probability. Creating such prompts has the advantage that it does not necessarily require technical expertise, but creating good prompts is still not trivial. Existing research has approached the problem from two perspectives: (1) adapting existing language models using (few) annotated data points and manually written prompt sets, and (2) generating prompts automatically in a data-driven way. Our project combines these two research directions and starts from the typical situation in which a natural language understanding task is formulated only vaguely, a more precise specification is still missing, and no annotated texts are available, although unannotated texts are. Our goal is to develop and analyze systems that automatically guide domain experts without technical training in machine learning toward well-functioning prompts. To do this, we use optimization methods that change prompts iteratively and estimate their quality with an objective function. This estimate draws on three factors: the automatic predictions on text instances, the readability of the prompt, and the conclusiveness of an explanation of the decision-making.
In our project, the objective function based on these factors is not evaluated automatically but is replaced by a "human in the loop". However, to study the iterative optimization of prompts at a larger scale, we also simulate human decisions using automatic approximations of the human objective function. We expect that our project will significantly improve the transparency of prompt-based models and contribute to the democratization of machine learning.
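The iterative optimization described above can be illustrated with a minimal sketch. It is not the project's actual method: it assumes a simple greedy search, and every name in it (`mutate`, `optimize_prompt`, `simulated_objective`, `human_objective`, the vocabulary, and the scoring rule) is hypothetical. The point it shows is that the objective is a plain function from a prompt to a score, so an interactive human rating and an automatic approximation are interchangeable.

```python
import random
from typing import Callable, List, Tuple

# An objective maps a prompt to a quality score (higher is better).
Objective = Callable[[str], float]


def mutate(prompt: str, vocabulary: List[str], rng: random.Random) -> str:
    """Return a variant of the prompt with one word replaced, inserted, or deleted."""
    words = prompt.split()
    op = rng.choice(["replace", "insert", "delete"])
    if op == "delete" and len(words) > 1:
        del words[rng.randrange(len(words))]
    elif op == "insert" or not words:
        words.insert(rng.randrange(len(words) + 1), rng.choice(vocabulary))
    else:
        words[rng.randrange(len(words))] = rng.choice(vocabulary)
    return " ".join(words)


def optimize_prompt(
    seed_prompt: str,
    objective: Objective,
    vocabulary: List[str],
    iterations: int = 30,
    candidates_per_step: int = 4,
    seed: int = 0,
) -> Tuple[str, float]:
    """Greedy search: keep a candidate only if the objective scores it higher."""
    rng = random.Random(seed)
    best, best_score = seed_prompt, objective(seed_prompt)
    for _ in range(iterations):
        candidates = [mutate(best, vocabulary, rng) for _ in range(candidates_per_step)]
        for candidate in candidates:
            score = objective(candidate)
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score


def simulated_objective(prompt: str) -> float:
    """Toy stand-in for a human judgment (assumption, for illustration only):
    reward task-relevant words and penalize prompts longer than six words."""
    words = prompt.lower().split()
    relevance = len({"product", "good", "bad"} & set(words))
    length_penalty = 0.1 * max(0, len(words) - 6)
    return relevance - length_penalty


def human_objective(prompt: str) -> float:
    """Human in the loop: ask a person to rate the prompt interactively."""
    return float(input(f"Rate this prompt from 0 to 10: {prompt!r}\n> "))


# Hypothetical word list from which mutations draw replacement words.
VOCABULARY = ["product", "good", "bad", "is", "the", "or", "this", "review"]

if __name__ == "__main__":
    prompt, score = optimize_prompt("Tell me about it", simulated_objective, VOCABULARY)
    print(prompt, score)
```

Passing `human_objective` instead of `simulated_objective` turns the same loop into an interactive session. In that setting the number of candidates per step would need to stay small, since every score costs a human judgment.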

The project starts in July 2024 and is funded by the German Research Foundation (DFG, KL 2869/13-1).

Publications related to this project

Menchaca Resendiz, Yarik/Klinger, Roman (2023a): Emotion-Conditioned Text Generation through Automatic Prompt Optimization. In: Proceedings of the 1st Workshop on Taming Large Language Models: Controllability in the era of Interactive Assistants! Prague: Association for Computational Linguistics. pp. 24–30.

Menchaca Resendiz, Yarik/Klinger, Roman (2023b): Affective Natural Language Generation of Event Descriptions through Fine-grained Appraisal Conditions. In: Proceedings of the 16th International Natural Language Generation Conference. Prague: Association for Computational Linguistics. pp. 375–387.

Kadiķis, Emīls/Srivastav, Vaibhav/Klinger, Roman (2022): Embarrassingly Simple Performance Prediction for Abductive Natural Language Inference. In: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Seattle: Association for Computational Linguistics. pp. 6031–6037.

Plaza-del-Arco, Flor Miriam/Martín-Valdivia, María-Teresa/Klinger, Roman (2022): Natural Language Inference Prompts for Zero-shot Emotion Classification in Text across Corpora. In: Proceedings of the 29th International Conference on Computational Linguistics. Gyeongju: International Committee on Computational Linguistics. pp. 6805–6817.