
Tobias Hallmen M.Sc.

Research Assistant
Chair for Human-Centered Artificial Intelligence
Phone: +49 821 598 2322
Fax: +49 821 598 2349
Email:
Room: 2016 (N)
Open hours: Upon Request
Address: Universitätsstraße 6a, 86159 Augsburg

Research Interests

I carry out multimodal (audio, video, text) conversation analyses using methods from machine learning and artificial intelligence (AI). I am investigating whether and how these methods can be used to evaluate and assess different conversational situations.
The conversations take place in the context of psychotherapy sessions (TherapAI project), human medicine, and teacher training (KodiLL project). The aim is to find correlations and use them to make the quality of these conversations measurable and (automatically) assessable, and to improve it in the long term. This benefits both sides: the therapists, physicians, and teachers as well as the patients and parents.
For example, it is conceivable that the characteristics found could be used to intervene in therapies, or that AI-supported feedback and recommendations for action could be given to students during training so that they conduct better conversations in the future.

Final Theses

Here are topics that I envision for final theses. Ideally, the results will be implemented as a module for Nova; this makes them easy to reuse, and the different characteristics can be correlated and evaluated on existing data sets. You are also welcome to contribute your own thematically appropriate suggestions:

  • Speaker diarisation: Often there are no per-speaker audio recordings, or where there are, the other speakers are still audible (more quietly) in one's own recording. This distorts the assignment of audio-based features, e.g. transcription or emotion recognition. Existing methods rely on audio alone; it would be conceivable to supplement this modality with video, text, or other derived features (aspects) and thus improve speaker attribution.
  • Backchannel (reception) signals: While someone is speaking, listeners usually give feedback signals (yes, mhm, head nods, etc.). These indicate whether and to what extent someone is engaged in the conversation. Existing methods need to be implemented, improved, and evaluated.
  • Remote photoplethysmography: Usually the people being filmed do not wear sensors, yet some physiological values would still be interesting, e.g. “cuff-free blood pressure measurement” via video to determine heart rate and heart-rate variability. These can be signs of arousal in the conversation and useful for evaluations.
  • Language models as experts: Can (small) language models relieve people of time-consuming annotation and evaluation work, or at least support them in it? Because the data are sensitive, these models must run locally, ideally on consumer hardware.
  • Language models as training partners: Language models are often used for synthetic data generation. Can they also serve (locally) as useful training partners for practising parent conversations at different levels of difficulty?
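As a baseline for the crosstalk problem described in the first topic, one could compare per-frame energies across the synchronized per-speaker recordings and attribute each frame to the louder channel. The sketch below is a naive audio-only illustration under that assumption (all names are illustrative); the multimodal extensions proposed above would aim to improve on exactly this kind of heuristic.

```python
def assign_frames(ch_a, ch_b, frame_len):
    """Attribute each frame to the louder of two synchronized recordings.

    ch_a, ch_b: per-speaker sample sequences (each also contains the
    other speaker, more quietly, as crosstalk).
    Returns a label per frame: "A" if channel A is louder, else "B".
    """
    labels = []
    n = min(len(ch_a), len(ch_b))
    for start in range(0, n - frame_len + 1, frame_len):
        # Frame energy = sum of squared samples in the window.
        e_a = sum(s * s for s in ch_a[start:start + frame_len])
        e_b = sum(s * s for s in ch_b[start:start + frame_len])
        labels.append("A" if e_a >= e_b else "B")
    return labels
```

A real system would add voice activity detection and handle overlapping speech, which this frame-wise winner-takes-all comparison cannot represent.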
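The rPPG topic rests on the observation that the blood volume pulse causes tiny periodic colour changes in facial skin, so the heart rate appears as the dominant frequency of the mean skin-colour trace within the plausible 0.7–4 Hz band (42–240 bpm). A minimal sketch of that principle, assuming a per-frame mean green-channel trace has already been extracted (function and variable names are illustrative; real pipelines add face tracking, detrending, and band-pass filtering):

```python
import math

def estimate_heart_rate(green_signal, fps):
    """Estimate heart rate in bpm from a mean green-channel trace.

    Scans a discrete Fourier transform of the zero-mean signal over the
    physiologically plausible 0.7-4 Hz band and returns the frequency
    with the most power, converted to beats per minute.
    """
    n = len(green_signal)
    mean = sum(green_signal) / n
    x = [v - mean for v in green_signal]  # remove the DC component

    best_freq, best_power = 0.0, 0.0
    k_min = max(1, int(0.7 * n / fps))        # lowest plausible DFT bin
    k_max = min(n // 2, int(4.0 * n / fps))   # highest plausible DFT bin
    for k in range(k_min, k_max + 1):
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = k * fps / n, power
    return best_freq * 60  # Hz -> beats per minute
```

For example, a clean 1.2 Hz pulse filmed at 30 fps for 10 seconds should come back as roughly 72 bpm; heart-rate variability would additionally require locating individual beats rather than one dominant frequency.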

Publications

2024

Moritz Bauermann, Kathrin Gietl, Tobias Hallmen and Karoline Hillesheim. 2024. KI in Beratungsgesprächen: Zukunft der Kommunikation [Abstract].

Moritz Bauermann, Kathrin Gietl, Karoline Hillesheim, Tobias Hallmen and Andreas Hartinger. 2024. KI-basiertes Feedback für simulierte Elterngespräche: eine qualitative Analyse studentischer Wahrnehmung und Gestaltungsperspektiven – KI-WaGen [Abstract].

Moritz Bauermann, Ann-Kathrin Schindler, Tobias Hallmen, Miriam Kunz, Elisabeth André and Thomas Rotthoff. 2024. Studienprotokoll: "AI Effect – Untersuchung der lernwirksamen Annahme von KI-generierten und durch Avatare vermittelten Feedback und Feedforward zur ärztlichen Kommunikation bei Medizinstudierenden in einer Simulationsumgebung" [Abstract].
