FARVEL BIG TECH
Here is a sad (and somewhat pathetic, I guess) fact: the new Firefox "smart window" (which is an LLM-based browser) doesn't even use a local or open model - it's literally just Google's models run via their API

Ikke-kategoriseret (Uncategorized) · 18 posts · 8 posters · 31 views

This thread has been deleted. Only users with topic-management privileges can view it.

• kstrlworks@techhub.social (#4), replying to pojntfx@mastodon.social:

  @pojntfx Mozilla will really invest everywhere except making a better browser.

• sus@timeloop.cafe (#5), replying to pojntfx@mastodon.social:

  @pojntfx that is disappointing

• freddy@social.security.plumbing (#6), replying to pojntfx@mastodon.social:

  @pojntfx @buherator Not true. The feature is currently in development and uses different models while things are still under test. This is pre-release software; behavior will change.

  Currently, everything is proxied through Mozilla infra. The model that will ship is (afaiu) not yet determined.

• oliviablob@mastodon.neat.computer (#7), replying to freddy@social.security.plumbing:

  @freddy @pojntfx @buherator Proxying this stuff doesn’t really relieve many privacy issues when someone’s shoving private personal data into it, which is what typically happens with AI-assisted browsing.

  If someone’s just using an AI feature to summarize some random news articles or something, proxying the requests is probably fine for most users.

  Confidential compute is better, but still not ideal.

• freddy@social.security.plumbing (#8), replying to oliviablob@mastodon.neat.computer:

  @oliviablob @pojntfx @buherator I agree. If you want privacy, you probably shouldn’t use an LLM hosted and controlled by someone else. I certainly wouldn’t.

• pojntfx@mastodon.social (#9), replying to freddy@social.security.plumbing:

  @freddy @buherator I hope there is at least an option of using a local LLM; heck, even GLM-4.6V is good enough for instrumenting browsers in my experience. Signing into an account (thereby tying all of my LLM context directly to my identity with Mozilla) and proxying via Mozilla infrastructure to Google (which does not anonymise anything, since the context already contains everything) seems like a terrible direction here, seriously. Especially given that there are lots of ways to run LLMs locally.

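  For illustration only, here is a minimal sketch of what "instrumenting a browser with a fully local LLM" can look like. It assumes an OpenAI-compatible server such as llama.cpp's llama-server is already running on localhost:8080, and uses Playwright purely as an example driver; the URL, port, model name, and prompt are placeholders, not anything Firefox actually does.

  ```python
  # Sketch: pull a page's text with Playwright and summarize it with a *local*
  # OpenAI-compatible server (e.g. llama.cpp's llama-server on localhost:8080).
  # Everything named here is a placeholder assumption, not Firefox's setup.
  import requests
  from playwright.sync_api import sync_playwright

  with sync_playwright() as p:
      browser = p.chromium.launch(headless=True)
      page = browser.new_page()
      page.goto("https://example.com")
      page_text = page.inner_text("body")  # visible text of the page
      browser.close()

  resp = requests.post(
      "http://localhost:8080/v1/chat/completions",
      json={
          "model": "local",  # llama-server serves whatever model it was started with
          "messages": [
              {"role": "system", "content": "Summarize the following page in three sentences."},
              {"role": "user", "content": page_text[:8000]},  # crude context-length guard
          ],
      },
      timeout=120,
  )
  print(resp.json()["choices"][0]["message"]["content"])
  ```

  Nothing in this loop leaves the machine, which is the property being argued for above.
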
• pojntfx@mastodon.social (#10), replying to kstrlworks@techhub.social:

  @kstrlworks Servo honestly seems like the only way forward.

• pojntfx@mastodon.social (#11), replying to madsenandersc@social.vivaldi.net, who wrote:

    "@pojntfx to be fair, most computers are unable to run a meaningful LLM at a speed that makes sense.

    Yes, you can run a gemma-3-4b model on a CPU, but it is really very limited and tends to hallucinate quite a lot. I don't know any open models that would do significantly better, but I would love to be proven wrong."

  @madsenandersc You're not wrong in a lot of ways. But I'll also say that recent advances in quantization (I'm using the GLM-4.6V model) and the Vulkan acceleration support in, say, llama.cpp are making a big difference. My RX 4060 and AMD 890M are more than good enough to instrument a browser with a fully local LLM now.

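  As a rough sketch of the quantized, GPU-offloaded setup being described, this is what loading a quantized GGUF locally with llama-cpp-python looks like. The model file name and prompt are placeholders, and whether offload goes through Vulkan, ROCm or CUDA depends on how the package was built.

  ```python
  from llama_cpp import Llama

  # Sketch: load a quantized GGUF and offload all layers to the GPU.
  # The file name is a placeholder; any Q4_K_M-style quant of a capable model works.
  llm = Llama(
      model_path="models/some-model-Q4_K_M.gguf",
      n_ctx=8192,
      n_gpu_layers=-1,  # -1: offload every layer the GPU can hold
  )

  out = llm.create_chat_completion(
      messages=[
          {"role": "system", "content": "You are a concise assistant."},
          {"role": "user", "content": "In two sentences, why does local inference matter for privacy?"},
      ],
      max_tokens=128,
  )
  print(out["choices"][0]["message"]["content"])
  ```
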
• kstrlworks@techhub.social (#12), replying to pojntfx@mastodon.social:

  @pojntfx Be it Servo or Ladybird, the first one to reach stable status with uBlock Origin will feed the masses.

• madsenandersc@social.vivaldi.net (#13), replying to pojntfx@mastodon.social:

  @pojntfx Oh, for sure - with that kind of hardware things start to look different.

  I'm still not sold on models below 7-12B, and even without quantization, I feel that they tend to hallucinate a bit too much. With quantization, things tend to get even more... creative. 😉

  I run gpt-oss-20b locally, and that is fine for some tasks, but the second things involve searching the web, the results become very much a mixed bag. I run it on a homelab machine with a Radeon 780M, but the great thing is that I can allocate up to 32 GB of RAM to the GPU - that makes things not very fast, but reasonably accurate.

  The second part is language. I have found very few smaller models that speak Danish well enough to be useful (gemma-3-4b is one), so again - sending the query to a remote server with a 120B or 403B model makes much more sense from a user-centric standpoint.

• pojntfx@mastodon.social (#14), replying to kstrlworks@techhub.social:

  @kstrlworks Ladybird's governance issues really make it not a viable alternative in my eyes. Solid engineering, but damn, I won't be working with someone who believes I shouldn't be working or even exist.

• pojntfx@mastodon.social (#15), replying to madsenandersc@social.vivaldi.net:

  @madsenandersc Huh, interesting - I never really deal with languages other than French, German and English, I guess, so I haven't really run into this. For web search, https://newelle.qsk.me/#home has been surprisingly good with an 18B model, even though it's slow.

  I guess one way they could implement the whole remote-server situation would be to lean on, say, an OpenAI-compatible API - which something like vLLM, llama.cpp, SGLang and so on can provide.

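  To make that suggestion concrete: the standard openai Python client can simply be pointed at such a local server, since vLLM, llama-server and SGLang all expose the same chat-completions protocol. The base URL and model name below are placeholders for whatever the local server actually serves.

  ```python
  from openai import OpenAI

  # Sketch: the unmodified OpenAI client, redirected to a local OpenAI-compatible
  # server. Nothing leaves the machine; the API key is a dummy value.
  client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

  reply = client.chat.completions.create(
      model="local-model",  # placeholder; the server's real names come from client.models.list()
      messages=[{"role": "user", "content": "Summarize this article: ..."}],
  )
  print(reply.choices[0].message.content)
  ```

  A browser feature built this way would only need a user-configurable base URL to work against a local installation instead of a hosted one.
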
• freddy@social.security.plumbing (#16), replying to pojntfx@mastodon.social:

  @pojntfx I have seen PRDs that discuss local models, but I don’t recall if they are part of the MVP. Might just be that folks can set a custom pref in about:config for now.

• tommi@pan.rent (#17), replying to pojntfx@mastodon.social:

  @pojntfx Noooo @mala you cannot allow this

• madsenandersc@social.vivaldi.net (#18), replying to pojntfx@mastodon.social:

  @pojntfx I've never heard of Newelle before - that seems really promising. It definitely does a good job of searching the web when responding to queries, better than what I get from OpenWebUI and Ollama.

  Yes, implementing a simple OpenAI-compatible interface, where the user can connect to their local AI installation, would be almost a given - it would remove a lot of worries about privacy for those who want to keep their information in-house.

  My wife works at a place where there are a lot of industry secrets, so using hosted AI is a no-go for them, even if it is just for aggregating data from the web or summarizing their own information from a lot of different documents. For them, local AI is not optional; it is a requirement.