Next.js App Router + React Server Components Demo

new
past
show
ask
show
jobs
submit

▲Liquid: Language models are scalable and unified multi-modal generators (foundationvision.github.io)

73 points by pr337h4m 12 hours ago | 7 comments

Centigonal 11 hours ago [-]

I love the website for this paper! Each section asks a question, and immediately answers it with a figure and a few sentences of discussion. It's less tech-demo heavy than a lot of other paper websites (those are cool, too, in their own way), and instead focuses on characterizing multimodal model behavior in a nice, clean, disciplined way.

gwern 10 hours ago [-]

> For the first time, Liquid uncovers a scaling law that performance drop unavoidably brought by the unified training of visual and language tasks diminishes as the model size increases...No prior work has explored whether LLMs retain the power-law scaling laws observed in language tasks when extended to visual generation tasks. We prove this alignment and further show that vision can be effectively learned by LLMs as a form of language.

Does this really show much that https://arxiv.org/abs/2301.03728#facebook (uncited) and other earlier work did not?

swyx 10 hours ago [-]

hmm this is a tough name - conflicts with Liquid AI https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

skerit 28 minutes ago [-]

I thought this was about them initially. Acronyms & made-up words should make a comeback as company names.

Nijikokun 9 hours ago [-]

it performs well with composition, however it seems SD and SDXL excels in capability and quality when intermixed with pipelines and workflows, this doesn't do much to talk about that comparison and whenever i see things like this i think about the overall workflow, like cool you do good composition but you don't fit within the workflow or ecosystem that surrounds that tool and thus i have low expectations around adoption

marviel 8 hours ago [-]

The Synesthesia these models must experience has gotta be intense

taneq 5 hours ago [-]

I wonder how much of the training set features those colourful alphabet fridge magnets? :D

Rendered at 07:47:49 GMT+0000 (Coordinated Universal Time) with Vercel.