2 comments

  • adsharma an hour ago

    I wish the authors had plotted model size (number of params) against the number of triples a model can hold before memory collapse happens.

    It's hard to map the frequency of knowledge injection to a real-world sense of how much knowledge a 4B-param model can hold.

  • gdiamos 2 hours ago

    I wonder if this depends on what is inside the domain specific data.

    I’m happy to see ML papers on Hacker News.