
Live vs post-session: why in-the-moment assistance wins (internal study)
We compared live-generated suggestions against post-session summaries across 84 sessions with 12 therapists. 73 percent of post-session suggestions arrive too late to impact the session where they were needed. Results, methodology, limitations.
Disclaimer up front: this is an internal study with a convenience sample, not peer-reviewed, and the findings serve as working hypotheses, not as generalizable evidence. We share it on this blog because we believe in publishing what we see, with appropriate caveats, rather than waiting for external validation that can take two years.
That said, the numbers are interesting and worth discussing.
The question
Most assistive tools for clinical and HR professionals work post-session: they record the conversation, process it afterward, and deliver a summary or suggestions the next day.
We built CauceOS with a different strong hypothesis: live, in-session assistance has a value that is qualitatively different from post-session assistance. Not just faster — different in nature.
To put that hypothesis through something more demanding than intuition, we set up an internal study with a group of practicing therapists.
Methodology
- Participants: 12 practicing therapists (8 with individual practice, 4 focused on couples). 9 in Spanish, 3 in English. Average clinical experience: 11 years.
- Sessions: 84 clinical sessions with explicit consent from patients and therapists, de-identified before analysis.
- Protocol: each session was processed through two parallel pipelines:
  - Live: the co-pilot running during the session, generating alerts and suggestions in real time.
  - Post: the same transcript analyzed afterward, generating a summary and suggestions the next day, simulating the flow of a typical post-session tool.
- Main variable: for each suggestion or alert generated, two independent clinicians (not involved in the session) rated it on a four-level rubric:
  - A: Would have changed the direction of the session if the therapist had received it in the moment.
  - B: Useful if the therapist had received it in the moment.
  - C: Useful only for documentation or post-session reflection.
  - D: Irrelevant or false positive.
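To make the rubric analysis concrete, here is a minimal sketch of how the per-pipeline category distribution can be tallied. The `rubric_distribution` helper and the sample ratings below are illustrative assumptions, not the study's actual code or raw data; only the A–D rubric itself comes from the protocol.

```python
# Sketch: tally rated suggestions into the A-D rubric for one pipeline.
# The ratings list here is HYPOTHETICAL, chosen to match the reported split.
from collections import Counter

def rubric_distribution(ratings):
    """Return the percentage of suggestions falling into each rubric category."""
    counts = Counter(ratings)
    total = len(ratings)
    return {cat: 100 * counts[cat] / total for cat in "ABCD"}

# Hypothetical post-session ratings consistent with the reported 73/22/5 split:
post_ratings = ["A"] * 40 + ["B"] * 33 + ["C"] * 22 + ["D"] * 5
dist = rubric_distribution(post_ratings)
print(dist)  # A + B together account for 73% in this illustrative sample
```

In the study, each suggestion carried two independent ratings; disagreements were a separate dimension this sketch does not model.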
Central result
Of the 3,412 suggestions generated in total across both pipelines, 73 percent of those from the post-session pipeline fell into category A or B — suggestions that would have changed or improved the session had they arrived in the moment.
In other words: nearly three out of four suggestions generated post-session arrive too late to do what they could best do.
Suggestions actually appropriate for post-session — category C, typically documentation, administrative follow-up, next steps in the treatment plan — represented 22 percent. The remaining 5 percent were false positives or irrelevant.
When we ran the same analysis on the live pipeline, the distribution flips: 81 percent of live suggestions were used or recognized as useful by the therapist during the session. Those that were not used were, in most cases, skipped not because they were wrong, but because the therapist had already noticed the signal on their own.
Three qualitative findings
Beyond the aggregate number, three patterns emerged from video analysis and interviews with the participating therapists.
1. Intervention windows are short
In psychotherapy sessions, a relevant clinical observation has a useful window of 60 to 180 seconds from the moment the signal appears in the patient's speech. Past that window, the opportunity to intervene on that material disappears — the patient changes topics, affect drops, the depth of the moment dilutes.
A post-session suggestion, by definition, arrives outside that window. A live suggestion with sub-2-second latency lands inside it.
2. Therapists adjust their attention when they know an assistive system is running
An unexpected finding: several therapists reported that knowing the co-pilot was monitoring certain signals allowed them to relax their own vigilance on those signals and dedicate more attentional bandwidth to other dimensions of the session (therapeutic alliance, non-verbal language, dynamic formulation).
This suggests that the value of live is not only "don't miss signals" — it is redistributing the clinician's attention to where the system cannot reach. The human-machine synergy, in this case, is not competition: it is division of cognitive labor.
3. Patient-perceived quality goes up
In a sub-sample of 24 sessions where we asked the patient for feedback the next day (again, with explicit consent), the average session rating on a 1-10 scale was 8.7 when the therapist used the live co-pilot vs 7.9 when the post-session flow was used (patients did not know which condition was applied to their session).
This difference, although modest and not statistically significant at this sample size, is consistent with the qualitative report from therapists: live-assisted sessions "flowed better" because the clinician was not split between listening and taking notes.
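Why we call the 8.7 vs 7.9 difference inconclusive can be sketched with a quick Welch's t-test. The group means come from the sub-sample above; the standard deviations (1.2) and the even 12/12 split are assumptions for illustration, since the raw ratings are not published here.

```python
# Sketch: Welch's t-test on the patient-rating sub-sample.
# Means are from the study; SDs of 1.2 and the 12/12 split are ASSUMED.
import math

def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch's t statistic and approximate degrees of freedom
    for two groups with unequal variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

t, df = welch_t(8.7, 1.2, 12, 7.9, 1.2, 12)
print(f"t = {t:.2f}, df = {df:.0f}")  # t = 1.63, df = 22
```

Under these assumptions, t ≈ 1.63 sits below the two-sided critical value of roughly 2.07 at alpha = 0.05, which is exactly why we treat the rating gap as suggestive rather than conclusive.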
Limitations
We want to be explicit about what this study is not:
- Convenience sample. The 12 therapists are contacts of our personal and professional network. They are not representative of the universe of clinicians.
- Not randomized at the patient level. Each session received both analysis pipelines, but only the live one was experienced as live; the post-session pipeline was simulated retrospectively.
- Suggestion rating by uninvolved clinicians helps with objectivity, but introduces its own judgment bias.
- No comparison with unassisted sessions. We did not measure what happens with sessions where no system is used. That would be the next phase of the study.
- Individual and couples psychotherapy sessions — the findings may not transfer directly to HR contexts (interviews, 1-on-1s, performance reviews), although the short-window hypothesis should apply analogously.
Why we're publishing this now
Two reasons.
First: because the decision to buy a tool for your clinical or HR practice should be based on evidence, not just marketing. There are many tools in this space. Some are good. But the dominant model today is post-session, and our finding is that most of the value of assistance is lost when it is executed only post-session. If you are going to invest time and money in a tool, it is worth having that information before choosing.
Second: because we want to attract methodological criticism. This study has serious limitations, we have listed them, and we welcome professionals with empirical methodology training telling us what is missing. We are designing a second phase that addresses the detected issues, ideally with an external research group involved. If you want to participate, write to us.
Provisional conclusion
With all caveats, the data from this internal study supports our design hypothesis: live clinical assistance is not a faster version of post-session assistance. It is a qualitatively distinct capability, with effects the post-session flow cannot reproduce.
The 73 percent "too late" is not a latency detail — it is the majority of the value the technology could have delivered, lost to an architectural decision.
That is exactly the decision CauceOS is changing.