Here is a sad (and somewhat pathetic, I guess) fact: The new Firefox "smart window" (which is an LLM-based browser) doesn't even use a local or open model; it's literally just Google's models run via their API

18 posts · 8 posters · 31 views
This thread has been deleted. Only users with topic management privileges can see it.

  • freddy@social.security.plumbing

    @pojntfx @buherator Not true. The feature is currently in development and uses different models while things are still under test. This is pre-release software; behavior will change.
    Currently, everything is proxied through Mozilla infra. The model that will ship is (as far as I understand) not yet determined.

  • pojntfx@mastodon.social (#9)

    @freddy @buherator I hope there is at least an option of using a local LLM; heck, even GLM-4.6V is good enough for instrumenting browsers in my experience. Signing into an account (thereby tying all of my LLM context directly to my identity with Mozilla) and proxying via Mozilla infrastructure to Google (which does not anonymise anything, since the context contains everything already) seems like a terrible direction here, seriously. Especially given that there are lots of ways to run LLMs locally.

  • kstrlworks@techhub.social

    @pojntfx Mozilla will really invest everywhere except making a better browser.

  • pojntfx@mastodon.social (#10)

    @kstrlworks Servo honestly seems like the only way forward.

  • madsenandersc@social.vivaldi.net

    @pojntfx To be fair, most computers are unable to run a meaningful LLM at a speed that makes sense.

    Yes, you can run a gemma-3-4b model on a CPU, but it is really very limited and tends to hallucinate quite a lot. I don't know of any open models that would do significantly better, but I would love to be proven wrong.

  • pojntfx@mastodon.social (#11)

    @madsenandersc You're not wrong in a lot of ways. But I'll also say that recent advances in quantization (I'm using the GLM-4.6V model) and the Vulkan acceleration support in, say, llama.cpp are making a big difference. My RX 4060 and AMD 890M are more than good enough to instrument a browser with a fully local LLM now.
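
    A minimal sketch of that kind of local setup with llama-cpp-python, assuming a GPU-enabled build (Vulkan, ROCm or CUDA) and a quantized GGUF file already on disk; the model path, quantization level and prompts below are placeholders:

        # Sketch: load a quantized GGUF model locally and offload all layers to the GPU.
        # The file name is a placeholder for whatever quantized model is actually on disk.
        from llama_cpp import Llama

        llm = Llama(
            model_path="models/some-model-q4_k_m.gguf",  # any quantized GGUF
            n_gpu_layers=-1,  # offload every layer to the GPU the build was compiled for
            n_ctx=8192,       # context window; raise it if VRAM allows
        )

        resp = llm.create_chat_completion(
            messages=[
                {"role": "system", "content": "You summarise web pages."},
                {"role": "user", "content": "Summarise: <page text here>"},
            ],
            max_tokens=256,
        )
        print(resp["choices"][0]["message"]["content"])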

  • kstrlworks@techhub.social (#12)

    @pojntfx Be it Servo or Ladybird, the first one to reach stable status with uBlock Origin will feed the masses.

  • madsenandersc@social.vivaldi.net (#13)

    @pojntfx Oh, for sure - with that kind of hardware things start to look different.

    I'm still not sold on models below 7-12B, and even without quantization, I feel that they tend to hallucinate a bit too much. With quantization, things tend to get even more... creative. 😉

    I run gpt-oss-20b locally, and that is fine for some tasks, but the moment things involve searching the web, the results become very much a mixed bag. I run it on a homelab with a Radeon 780M, but the great thing is that I can allocate up to 32GB of RAM to the GPU - that makes things not very fast, but reasonably accurate.

    The second part is language. I have found very few smaller models that speak Danish well enough to be useful (gemma-3-4b is one), so again - sending the query to a remote server with a 120B or 403B model makes much more sense from a user-centric standpoint.

  • pojntfx@mastodon.social (#14)

    @kstrlworks Ladybird's governance issues really make it not a viable alternative in my eyes. Solid engineering, but damn, I won't be working with someone who believes I shouldn't be working or even exist.

  • pojntfx@mastodon.social (#15)

    @madsenandersc Huh, interesting - yeah, I never really deal with languages other than French, German and English, I guess, so I haven't really run into this. For web search, https://newelle.qsk.me/#home has been surprisingly good with an 18B model, even though it's slow.

    I guess one way they could implement the whole remote-server situation would be to lean on, say, an OpenAI-compatible API - which something like vLLM, llama.cpp, SGLang and so on can provide.
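
    A minimal sketch of that integration path, assuming a local OpenAI-compatible server (for example llama.cpp's llama-server, vLLM or SGLang) is already listening on localhost; the port, API key and model name are placeholders for whatever the local server actually exposes:

        # Sketch: talk to a locally hosted OpenAI-compatible endpoint with the standard
        # openai client, so no page content or prompt leaves the machine.
        from openai import OpenAI

        client = OpenAI(
            base_url="http://localhost:8080/v1",  # local server's OpenAI-compatible endpoint
            api_key="not-needed",                 # local servers usually ignore the key
        )

        resp = client.chat.completions.create(
            model="local-model",  # whatever model name the local server registered
            messages=[{"role": "user", "content": "Summarise this page: <page text here>"}],
            max_tokens=256,
        )
        print(resp.choices[0].message.content)

    The same client code would keep working unchanged if the base_url were pointed at a hosted service instead, which is what makes the OpenAI-compatible route attractive as a user-facing setting.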

  • freddy@social.security.plumbing (#16)

    @pojntfx I have seen PRDs that discuss local models, but I don't recall if they are part of the MVP. Might just be that folks can set a custom pref in about:config for now.

  • tommi@pan.rent (#17)

    @pojntfx Noooo @mala you cannot allow this

  • madsenandersc@social.vivaldi.net (#18)

    @pojntfx

    I've never heard of Newelle before - that seems really promising. It definitely does a good job of searching the web when responding to queries, better than what I get from OpenWebUI and Ollama.

    Yes, implementing a simple OpenAI-compatible interface where the user can connect to their local AI installation would be almost a given - it would remove a lot of worries about privacy for those who want to keep their information in-house.

    My wife works at a place where there are a lot of industry secrets, so using cloud-based AI is a no-go for them, even if it is just for aggregating data from the web or summarizing their own information from a lot of different documents. For them, local AI is not an option; it is a requirement.
