Google apologizes for ‘missing the mark’ after Gemini generated racially diverse Nazis::Google says it’s aware of historically inaccurate results for its Gemini AI image generator, following criticism that it depicted historically white groups as people of color.

  • @[email protected]
    10 months ago

    I can see the argument that it has a sort of world model, but one built purely on word relationships is a very shallow sort of model. When I am asked what happens when a glass is dropped onto concrete, I don’t just think about what I’ve heard about those words and come up with a correlation; I can also think about my experiences with those materials and with falling things and reach a conclusion about how they will interact. That’s the kind of world model it’s missing. Material properties and interactions are written about well enough that it ~~simulates~~ emulates doing this, but adding a few details can really throw it off. I asked Bing Copilot “What happens if you drop a glass of water on concrete?” and it went into excruciating detail about how the water will splash, mentioned how the water can absorb into the concrete or affect uncured concrete, and completely failed to notice that the glass itself will strike the concrete, instead describing the chemistry of how using “glass (such as from the glass of water)” as aggregate could affect the curing process. Having a purely statistical/linguistic world model leaves some pretty big holes in its “reasoning” process.