
Facebook AI & Retrieval-Augmented Generation (RAG)

A new open-source language model released through Hugging Face Transformers in 2020

The Hive

An article on Kinsta gathers a variety of statistics about Facebook, including a section on data and usage:

“Facebook generates 4 petabytes of data per day — that’s a million gigabytes. All that data is stored in what is known as the Hive…

…which contains about 300 petabytes of data. This enormous amount of content generation is without a doubt connected to the fact that Facebook users spend more time on the site than users spend on any other social network, putting in about an hour a day.”

(Note that 4 petabytes is roughly 4 million gigabytes, so the conversion in the quote understates the figure.)

RAG framework for AI

Facebook AI has designed a framework, retrieval-augmented generation, that is intended to produce more capable natural language processing (NLP) models. The Hugging Face documentation summarizes it as follows:

“Retrieval-augmented generation (“RAG”) models combine the powers of pretrained dense retrieval (DPR) and sequence-to-sequence models. RAG models retrieve documents, pass them to a seq2seq model, then marginalize to generate outputs. The retriever and seq2seq modules are initialized from pretrained models, and fine-tuned jointly, allowing both retrieval and generation to adapt to downstream tasks.”

To understand this statement, it may be useful to look at how the paper itself describes the architecture:

“Figure 1: An overview of retrieval-augmented generation (RAG). We combine a pre-trained retriever (Query Encoder + Document Index) with a pre-trained encoder-decoder (Generator) and fine-tune end-to-end. For some query x, we use Maximum Inner Product Search (MIPS) to find the top-K most relevant documents of all documents z_i. To make the final prediction y, we treat z as a latent variable and marginalize over the encoder-decoder predictions given different documents.”

The paper evaluates RAG on knowledge-intensive tasks such as:

  1. Question answering.
  2. Fact verification.
  3. Question generation.
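
The retrieve-then-marginalize idea in the Figure 1 caption can be illustrated with a small sketch. This is not Facebook's implementation: the vectors, the document ids, the candidate answers, and the helper names (`top_k_by_inner_product`, `marginalize`) are all made up for illustration, the exhaustive scoring loop stands in for a real MIPS index, and a fixed probability table stands in for the generator. It also marginalizes over whole answers rather than token by token, which is a simplification.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_by_inner_product(query_vec, doc_vecs, k):
    """Brute-force stand-in for MIPS: score every document, keep the best k."""
    scores = [sum(q * d for q, d in zip(query_vec, vec)) for vec in doc_vecs]
    ranked = sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)[:k]
    return [(i, scores[i]) for i in ranked]

def marginalize(retrieved, answer_probs_per_doc):
    """Compute p(y|x) = sum over z of p(z|x) * p(y|x, z) for the retrieved docs."""
    doc_probs = softmax([score for _, score in retrieved])  # p(z|x) over the top k
    combined = {}
    for (doc_id, _), p_z in zip(retrieved, doc_probs):
        for answer, p_y in answer_probs_per_doc[doc_id].items():
            combined[answer] = combined.get(answer, 0.0) + p_z * p_y
    return combined

# Hypothetical toy data: one query embedding and three document embeddings.
query_vec = [1.0, 0.0]
doc_vecs = [[2.0, 0.0], [1.0, 1.0], [-1.0, 0.5]]

# Hypothetical generator output: p(answer | query, document) for each document.
answer_probs_per_doc = {
    0: {"Paris": 0.9, "Lyon": 0.1},
    1: {"Paris": 0.6, "Lyon": 0.4},
    2: {"Paris": 0.1, "Lyon": 0.9},
}

retrieved = top_k_by_inner_product(query_vec, doc_vecs, k=2)
combined = marginalize(retrieved, answer_probs_per_doc)
best_answer = max(combined, key=combined.get)
```

With this toy data the two highest-scoring documents are 0 and 1, and the combined probability of "Paris" comes out at roughly 0.82. The third document never influences the result because it falls outside the top K.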

“Our result shows that we can effectively update RAG’s behavior with new world knowledge by simply replacing its non-parametric memory.”
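
A rough way to picture what "replacing its non-parametric memory" means is shown below. The sketch assumes a drastically simplified setup: the `answer` function plays the role of the fixed, already-trained model, and the two lists play the role of an older and a newer document index. All names, vectors, and answer strings are hypothetical, and the lookup stands in for a real generator.

```python
def inner(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def answer(query_vec, index):
    """Fixed 'model': pick the best-scoring document and read off its answer.

    Nothing in this function changes between the two calls below; only the
    index passed in (the non-parametric memory) is different.
    """
    best = max(index, key=lambda doc: inner(query_vec, doc["vec"]))
    return best["answer"]

# Hypothetical embedding of one question (stand-in for a query encoder's output).
query_vec = [1.0, 0.2]

# Two hypothetical snapshots of the document index.
older_index = [
    {"vec": [0.9, 0.1], "answer": "answer from older snapshot"},
    {"vec": [-0.5, 1.0], "answer": "unrelated document"},
]
newer_index = [
    {"vec": [0.9, 0.1], "answer": "answer from newer snapshot"},
    {"vec": [-0.5, 1.0], "answer": "unrelated document"},
]

old = answer(query_vec, older_index)
new = answer(query_vec, newer_index)
```

The same function returns a different answer for the same question once the index is swapped, with no retraining step in between, which is the property the quote highlights.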

This ability to update what the model knows by swapping its document index may have been designed, in part, to counter earlier problems with adversarial AI.



Alex Moltzau

AI Policy, Governance, Ethics and International Partnerships at All views are my own.