FARVEL BIG TECH
Machine translations are often brought up as a gotcha whenever I criticize LLMs.

  • gargron@mastodon.social

    Machine translations are often brought up as a gotcha whenever I criticize LLMs. It's worth pointing out two things: Machine translations existed decades before LLMs, and yes, machine translations are useful. However: I would never in my life read a machine translated book. Understanding what a social media post is talking about in rough terms? Sure. Literature? Absolutely not. Hell, have you ever seen machine translated subtitles? It's absolute garbage.

    dat@social.g33ky.de wrote (#51):
    @Gargron and then there's the question of how it's used

    see Firefox, which generated new translations and threw away human-written ones

      clement@sciences.social wrote (#52):

      @Gargron As an LLM would say to a translator: "All your job are belong to us".

      • gargron@mastodon.social

        I have the impression that primarily anglophone people don't read as much translated literature, because so much good literature already exists in their language, so this issue may not be as familiar within that demographic. As someone who did not grow up anglophone, I can tell you there is a world of difference between a good and a bad translation even when done by humans. Machine translations are not even on the scale.

        juandesant@mathstodon.xyz wrote (#53):

        @Gargron to make matters worse, at least in the UK when you buy a DVD it only comes with English audio, English audio with descriptions, and maybe the original audio, plus just English subtitles and English subtitles for the hard of hearing. That’s it. But in Spain, the same DVD, locked to the same region, carried the original audio, audio-described English audio, Spanish dubbing, German dubbing, Italian dubbing… and all those languages in subtitles, plus some more.

        So it is really difficult for them to be exposed to non-English content,

        • tubemeister@mstdn.social wrote (#54):

          @aeduna @Gargron Oh yes. Translating the story is one thing, but especially with Pratchett it’s only half the story.

          Puns are horrible to translate, you either just skip them because they just don’t work, or you go to extremes to wring some kind of joke out of them.

          There isn’t necessarily a right approach here. This particular Pratchett translation apparently skipped a lot, but I also remember a HHGTTG translation that took the “a joke at *any* cost” path and um.

          • sonikku@techhub.social

            @benroyce @Gargron back in the day I had a bootleg DVD of The Two Towers with the best subtitles China could do.

            avincentinspace@furry.engineer wrote (#55):

            @Sonikku there is absolutely no way this is real

              dudinka@mastodon.world wrote (#56):

              @Sonikku @benroyce @Gargron

              this kind of bad translation/auto-subtitle is the best comedy to my neurodivergent brain.

              i speak several languages and always try to find out where it went wrong, so it's educational too 🙂

                • aeva@mastodon.gamedev.place

                @Gargron I think anglophones experience the stark difference between good and bad translations more often through video games

                gabboman@gabboman.xyz wrote (#57):

                All your bases are belong to Us

                  juandesant@mathstodon.xyz wrote (#58):

                  @Gargron and even Netflix shows different audio options in Spain (around five audio languages, plus the original English audio for an American or British TV series, and at least the same languages in subtitles) than in the UK (just English audio, maybe with audio descriptions).

                  You need to go to your user settings *on the website* and explicitly add the languages you might be interested in. Then those audio and subtitle options appear for the titles that support them.

                    benroyce@mastodon.social wrote (#59):

                    @dudinka @Sonikku @Gargron

                    there's a universe of this stuff out there

                    my favorite is from the 2008 beijing olympics, a restaurant translating its name for foreign visitors, and dutifully announcing what the translation service fed back to them

                    https://boingboing.net/2008/07/15/chinese-restaurant-c.html

                    • tubemeister@mstdn.social wrote (#60):

                      @aeduna @Gargron From what I’ve heard it’s hit and miss.

                      But I wouldn’t know, I’ve been reading mostly in English for at least 30 years. 😉

                        gbargoud@masto.nyc wrote (#61):

                        @Gargron

                        My main use case for machine translations is spot checking words in languages I don't know as well as I should.

                        They are great for that.

                          sonikku@techhub.social wrote (#62):

                          @AVincentInSpace it literally is haha. Fellowship was just as bad.

                          I was still on dial-up back in those days, so I’d order my bootleg DVDs from a dude in Hong Kong, and I just about died laughing when I randomly turned on the subtitles

                            qgustavor@urusai.social wrote (#63):

                            @Gargron I worked with subtitle translations for years... I need to comment on this!

                            The main issue with machine-translated subtitles is that people take models built to translate a single modality (text) and apply them to a multimodal medium (video). Of course the results are horrible!

                            There is research on improving that, sure, I even did some, but we are still FAAAR from getting them any good. Translating "The nurse aided the doctor in taking care of the patient." into many languages requires guessing the gender of three people! LLMs will often default to male, female and male, due to bias.

                            But here is the sad thing we have to admit: many works of art are so unpopular that the only translations people will ever get are machine ones, from weird anime like Sazae-san to Mastodon toots.

                              • galaxis@mastodon.infra.de

                              @Gargron Machine-translated UIs are an even worse crime. LLMs don't have the slightest idea of the context of some random button, and (looking at Microsoft's German UI translations recently) seem to choose the worst possible word to drop into it.

                              tdelmas@mamot.fr wrote (#64):

                              @galaxis @Gargron Or Google. Last week I stumbled upon a Google admin interface where the checkbox with the English label "Enforcement" was translated into French as the equivalent of "Activation". It was about 2FA, and those two words don't mean the same thing at all in that context!

                                aeva@mastodon.gamedev.place wrote (#65):

                                @gabboman @Gargron somebody set up us the bomb

                                • lauerhahn@sfba.social

                                  @Gargron I minored in linguistics in college, and a lot of exciting work was being done at the time around developing syntax models of how languages worked (and different ways humans use syntax), in part to inform machine translation models. This was more than 25 years ago. No LLMs involved.
                                  I have not kept up with current developments in machine translation but I strongly suspect that it's built on the foundation of those decades of work actually understanding how languages function, and what maps or doesn't map. Which is completely different than expecting generative AI to create a model.

                                  aran@localization.cafe wrote (#66):

                                  @lauerhahn @Gargron Alas no. Most machine translation engines now are purely statistical. They don't bother with semantic analysis; they just brute-force a mathematical model with tons of data.

                                  • stuartb@social.teamb.space

                                    @Gargron Many years ago, while on holiday in Amsterdam, I bought a Dutch translation of a book by one of my favourite authors, Terry Pratchett.
                                    In it, there was an essay, in English, by Terry, about his struggles to find a translator for the book, which was only accomplished when he realised that it wasn't just a case of taking the text and replacing it with Dutch.
                                    No, large sections would have to be entirely re-written by the translator, to use concepts that a Dutch audience would find familiar.
                                    And not just in Dutch, but every language.
                                    The example he gave was one character who was experiencing the feeling of being stuck in traffic on a busy road on a Sunday afternoon, and after miles of driving, finding that the cause of the tailback was a little old lady out for her weekly drive to church in her trusty old Morris Marina, never getting above 20 MPH because it felt too fast.
                                    This is something that British people are well acquainted with, but the Dutch translator had to come up with a completely different way of explaining this, because it's not something particularly prevalent over there.
                                    It's not just about translating the words, it's translating the feelings, the emotions, to something readers in another place will understand.
                                    And LLMs fail spectacularly at that.

                                    vfrmedia@social.tchncs.de wrote (#67):

                                    @stuartb @gargron

                                    Morris Marina wasn't completely uncommon in the Netherlands - although only a handful made it over there and interestingly it seems nearly all have been preserved as oldtimers!

                                    But the old lady in NL would more likely ride her bicycle to church, probably at much less than 30 km/h and quite likely so would all the rest of the congregation - the translator would have definitely needed to find another concept to match this..

                                    I've got a DIY maintenance manual for my car which is in German (there isn't an equivalent one of the same quality in English) and I definitely won't wholly trust an LLM to translate that (instead I print the relevant pages, go through them by hand and make notes on points that don't immediately come to mind, as my German is only as good as a teenager's)

                                      mikefromlfe@cupoftea.social wrote (#68):

                                      @stuartb @Gargron@mastodon.social
                                      Exactly!
                                      I did some work for a technical translation company after I retired (not translating, I hasten to add) and the skill was making the language relevant to the target audience. To do that the translator had to have knowledge of the subject matter, the language, and what sort of person would be using the translation.
                                      And they wondered why Google wasn't good enough, and why we charged what we did.

                                      • qgustavor@urusai.social wrote (#69):

                                        @grishka @Gargron Google Translate switched to using the same tech LLMs use. Actually, it's the opposite: the transformer model that LLMs use was created for translation first.

                                        If you are going to compare the two, since the tech is pretty much the same, the main difference between them is how they are trained: people often use LLMs that are trained to behave as chatbots for translation, which creates biases that are not present in models trained only for translation. Mainly, LLMs are prone to "ignore all instructions".

                                        But the tech is pretty much the same: transformer models deal with context way better than older models.

                                          kaleissin@wandering.shop wrote (#70):

                                          @Gargron Preach!
