If there are enough system resources (RAM/VRAM) to run both the STT model and the LLM, or if the LLM is offloaded to another endpoint, there should be an option to summarize the transcript live by calling the LLM repeatedly during the recording. The trigger could be time-based, driven by voice activity detection (VAD), or a combination of the two, selectable in settings. That way, during longer meetings, we can see what has been discussed so far and follow up on anything that is unclear or needs further clarification. To start, a single configurable "Run summarization every X minutes" option would be enough.
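A minimal sketch of the time-based trigger, assuming the STT side delivers transcript segments with a timestamp in seconds since the recording started. The names `LiveSummarizer`, `add_segment`, and `interval_minutes` are hypothetical and not part of any existing code; `summarize` stands in for whatever call reaches the local or remote LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LiveSummarizer:
    """Hypothetical helper that refreshes a running summary on a fixed interval."""

    # Assumed LLM hook: (previous_summary, new_transcript_text) -> updated summary.
    summarize: Callable[[str, str], str]
    # The "Run summarization every X minutes" setting.
    interval_minutes: float = 5.0
    summary: str = ""
    _pending: List[str] = field(default_factory=list, init=False)
    _last_run: float = field(default=0.0, init=False)

    def add_segment(self, text: str, now: float) -> bool:
        """Buffer one transcribed segment and summarize if the interval elapsed.

        `now` is seconds since the recording started.
        Returns True when the summary was refreshed.
        """
        self._pending.append(text)
        if now - self._last_run < self.interval_minutes * 60:
            return False
        # Send only the text gathered since the last run, plus the old summary,
        # so each LLM call stays small even in a long meeting.
        self.summary = self.summarize(self.summary, " ".join(self._pending))
        self._pending.clear()
        self._last_run = now
        return True
```

Passing the previous summary together with only the new text is one possible design; re-summarizing the full transcript each time would also work but costs more per call as the meeting grows. A VAD-based trigger could reuse the same buffer and call `summarize` at a detected pause instead of on the timer.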