• WildPalmTree@lemmy.world · 11 days ago

    ELI5 1-bit model. After three attempts, I got nothing out of it, so I assume it's a simpler, more energy-efficient model.

    • icecreamtaco@lemmy.world · 11 days ago
      It's a massive performance upgrade: it would make current-size models much cheaper to run and make tiny, phone-sized models viable. The only problem is that models need to be retrained from scratch to use it, and afaik no one significant has done that yet.
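
      To put rough numbers on "phone-sized", here's some back-of-envelope math on the weight memory alone (the ~1.58 bits per weight comes from log2(3) for ternary values; this sketch ignores activations, embeddings, and packing overhead):

      ```python
      # Rough weight-memory footprint of a 2B-parameter model
      params = 2e9
      fp16_gb = params * 16 / 8 / 1e9    # ~4.0 GB at 16 bits per weight
      b158_gb = params * 1.58 / 8 / 1e9  # ~0.4 GB at ~1.58 bits per weight
      print(f"fp16: {fp16_gb:.1f} GB, 1.58-bit: {b158_gb:.2f} GB")
      ```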

    • thickertoofan@lemm.ee (OP) · 11 days ago
      I'm not the smartest out there to explain it, but it's like this: instead of floating-point numbers as the weights, each weight is just -1, 0, or 1.
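
      In code terms, a minimal sketch of that idea (this follows the "absmean" ternary quantization described in the BitNet b1.58 paper; the function and variable names here are my own):

      ```python
      import numpy as np

      def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-8):
          """Map a float weight matrix to {-1, 0, +1} plus one scale factor."""
          gamma = np.abs(w).mean()                           # per-tensor scale
          w_q = np.clip(np.round(w / (gamma + eps)), -1, 1)  # ternary weights
          return w_q.astype(np.int8), gamma

      w = np.random.randn(4, 4).astype(np.float32)
      w_q, gamma = absmean_ternary_quantize(w)
      print(w_q)          # entries are only -1, 0, or 1
      print(gamma * w_q)  # coarse reconstruction of the original weights
      ```

      Because the weights are ternary, the matrix multiplies reduce to additions and subtractions, which is where the speed and energy savings come from.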

  • simple@lemm.ee · 11 days ago

    This wasn't already out? I've been hearing about BitNet for a while; it's just that there wasn't a good 1-bit model out there.

    • thickertoofan@lemm.ee (OP) · 11 days ago

      It was; it's just that they've now officially released a 2B model trained for the BitNet architecture.

  • hendrik@palaver.p3x.de · 11 days ago

    Nice. Any additional info on how difficult it was to train this and whether we can expect more? They show a 3B model in the demo video, but it doesn't seem like they released that… I mean, I'd like something a bit larger.