Microsoft’s World and Human Action MaskGIT Model, or WHAMM, builds on its earlier WHAM-1.6B version launched in February. Unlike its predecessor, this iteration introduces faster visual output using a MaskGIT-style architecture that generates image tokens in parallel. Moving away from the autoregressive method, which predicted tokens sequentially, WHAMM reduces latency…
Show Comment Form
Hide Comment Form