jqpabc123 a day ago

Expecting intelligence and accuracy to "emerge" from a statistical process is absurd.

In other words, LLMs are only clearly useful if the results don't really matter or if they can and will be externally verified.

LLMs negate a fundamental argument for computing --- instead of accurate results at low cost, we now have inaccurate results at high cost.

There is undoubtedly some utility to be had here but it is not at all clear or obvious that this will be widely transformative.

  • Ukv a day ago

    I don't think it's necessarily absurd to expect accuracy from statistical methods - in many domains (including voice transcription) they blow the accuracy of traditional non-statistical approaches out of the water, and in some areas even surpass human-level accuracy.

    Main thing is to measure the accuracy of the approach (regardless of whether it's traditional, statistical, or human) to determine if it's fit for purpose. In this case it sounds like the transcription shouldn't be solely relied on for high-risk decisions in its current state, but could be useful for something like searching through the reference audio if it were available.

    That the issue tends to be from "pauses, background sounds or music playing" also makes me suspect a lot of the cases could be relatively low-hanging fruit - check the noise gate and normalization on the microphones, or potentially have the model output a quality score for each word, so that low-confidence background noise can be displayed to the end user as smaller, fainter text, for instance, instead of as part of the conversation.
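The per-word quality-score idea above can be sketched roughly as follows. This assumes the word-level output shape produced by openai-whisper when called with `word_timestamps=True` (each segment carries a `words` list with `word` and `probability` fields); the 0.5 threshold and the sample result are illustrative, not tuned or real values.

```python
# Sketch: flag low-confidence words from a Whisper-style transcription result
# so a UI could render suspect words smaller/fainter instead of as speech.
# Assumes openai-whisper's word_timestamps=True structure; threshold is arbitrary.

def tag_low_confidence(result, threshold=0.5):
    """Return (word, is_suspect) pairs for downstream display."""
    tagged = []
    for segment in result.get("segments", []):
        for w in segment.get("words", []):
            tagged.append((w["word"].strip(), w["probability"] < threshold))
    return tagged

# Hypothetical result dict, for illustration only:
sample = {
    "segments": [
        {"words": [
            {"word": " take", "probability": 0.97},
            {"word": " two", "probability": 0.95},
            {"word": " hydroactivated", "probability": 0.21},  # likely noise
            {"word": " tablets", "probability": 0.93},
        ]}
    ]
}

for word, suspect in tag_low_confidence(sample):
    print(word, "(low confidence)" if suspect else "")
```

A real UI would map the flag to styling (opacity, font size) rather than a text label, but the gating logic is the same.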

  • pachorizons a day ago

    Isn't that what is promised, though? What is the benefit of automated transcription if every single transcription must be manually audited? Where is the cost or labor saving?

    • 39896880 a day ago

      It is much easier to correct a transcription than to generate one wholesale. Besides, the task of correcting transcriptions has long since been commoditized by the deployment of speech recognition on every smartphone.

      It’s not quite a solved problem but it’s close.

      • jqpabc123 6 hours ago

        It’s not quite a solved problem but it’s close.

        As long as the results don't really matter and no one is auditing, it appears more "solved" than it actually is.

logn a day ago

There will always be a need for both human oversight and accountability, and this is a good example. I think the net result will be, eventually, more and better jobs. It's a better job to validate the transcriptions than to actually transcribe.

Another example, in medicine: radiologists will start handling orders of magnitude more cases. But the number of scans performed might also increase exponentially as costs likewise drop.

  • jqpabc123 a day ago

    It's a better job to validate the transcriptions than to actually transcribe.

    In the real world "better" typically translates to lower cost.

    Which costs less? 1) Pay someone to transcribe a recording, or 2) pay for an LLM transcription plus pay someone to verify that transcription against the recording.

    It is far from certain or obvious that #2 is actually "better".

rahimnathwani 20 hours ago

  A machine learning engineer said he initially discovered hallucinations in about half of the over 100 hours of Whisper transcriptions he analyzed.
The '100 hours' is almost useless information. 'About half' is meaningless without knowing the sample size. Perhaps he had 5 transcripts averaging 20 hours each, and 2 of the 5 had issues. Or perhaps there were hundreds of short transcripts, in which case 'about half' would imply significance.
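The point about sample size can be made concrete with a confidence interval. A minimal sketch using the standard 95% Wilson score interval for a binomial proportion (the n=5 and n=200 scenarios are the hypothetical ones from the comment above, not data from the article):

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score interval for an observed proportion successes/n."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - half, center + half

# "About half" from 2 of 5 transcripts vs. 100 of 200:
lo5, hi5 = wilson_interval(2, 5)        # very wide interval
lo200, hi200 = wilson_interval(100, 200)  # much tighter
print(f"n=5:   {lo5:.2f}-{hi5:.2f}")
print(f"n=200: {lo200:.2f}-{hi200:.2f}")
```

With 5 transcripts the plausible true hallucination rate spans most of the 0-1 range; with 200 it is pinned near one half, which is why the headline number alone tells you little.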
notjulianjaynes 17 hours ago

I used Whisper to create an SRT file from some voice memos I made while driving, and it 'hallucinated' "subtitles by the amara.org community" at the very end. Re-ran it as txt, and what do you know, that line disappeared.
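This particular artifact (caption credits surfacing during silence, a residue of subtitle data in Whisper's training set) is well known, and one blunt mitigation is a post-processing blocklist. A minimal sketch; the phrase list here is illustrative, not exhaustive:

```python
# Sketch: drop caption-credit lines Whisper is known to hallucinate in silence.
# The blocklist below is an illustrative sample, not a complete list.

KNOWN_HALLUCINATIONS = {
    "subtitles by the amara.org community",
    "thanks for watching",
}

def drop_hallucinated_lines(lines):
    """Remove exact (case-insensitive) matches against the blocklist."""
    return [ln for ln in lines
            if ln.strip().lower() not in KNOWN_HALLUCINATIONS]

transcript = [
    "remember to refill the prescription",
    "Subtitles by the Amara.org community",
]
print(drop_hallucinated_lines(transcript))
```

Exact-match filtering only catches the recurring boilerplate phrases; it does nothing for hallucinations that invent novel content, which is the harder problem the article describes.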

AStonesThrow a day ago

This clinical service is not something that you, the patient, should want or allow.

I use a digital recorder app to record audio from my clinical consultations. It's important for me, as a patient, to have a record, because I'm alone in there, and I frequently misremember or misunderstand things that were said.

My current recorder app has a transcription feature. It's fairly good at picking out words. It's supposed to recognize and label speakers as well, but that requires a lot of manual editing after the fact.

Still, it's fantastic having my own durable record of what was said to me, and by me. There are usually a few surprises in there!

Now, I've stopped asking for permission to record, because usually they become hostile to it. Nevertheless, it's legal, and it's my right to have.

sirolimus a day ago

Well, no shit, AI isn't meant for anything as serious as medical record-keeping

  • Spivak 21 hours ago

    But it's fine for their document OCR? Dragon has been doing dictation for years and years. Either the service works to an acceptable degree or it doesn't. Audio transcription isn't some unknown quantity.