Monday, August 21, 2023

Generative AI Foundation Models

WHAT IS GENERATIVE AI?

Generative AI is a broad term that can be used for any AI system whose primary function is to generate content. This is in contrast to AI systems that perform other functions, such as classifying data (for example, assigning labels to images), grouping data (for example, identifying customer segments with similar purchasing behavior), or choosing actions (for example, steering an autonomous vehicle).



A Foundation Model is any model that is typically pretrained on broad unlabeled data sets in a self-supervised manner, learning generalizable and adaptable data representations in the process.
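
Self-supervision means the training signal comes from the data itself, for example by masking words in a sentence and training the model to predict them, with no human labeling required. A minimal sketch of that idea in Python, assuming the Hugging Face transformers library is installed (the choice of bert-base-uncased is illustrative, not prescribed by this article):

# Masked-word prediction: the self-supervised objective behind many
# pretrained Foundation Models. No labeled data is involved.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
for candidate in fill_mask("Foundation Models are pretrained on [MASK] data."):
    print(candidate["token_str"], round(candidate["score"], 3))

Learning to make billions of such predictions is what yields the generalizable, adaptable representations described above.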

The term Foundation Model was coined in a 200+ page Stanford report, "On the Opportunities and Risks of Foundation Models," which describes how AI is undergoing a paradigm shift with the rise of this new class of models.

The authors observed two trends that led to the definition of Foundation Models:

Homogenization

As the ML community discovers techniques that work well for different kinds of problems, those techniques become part of a standardized approach to building ML systems. With Foundation Models, the model itself becomes the object of homogenization: a single pretrained model provides the foundation on top of which new models can be developed to specialize in a given domain.

Emergent behaviors

Previous generations of Machine Learning (ML) models were trained to perform specific tasks, such as answering questions (Q&A) or summarizing a body of text. Foundation Models, however, can perform tasks for which they were not explicitly trained.


Applying the emergent behaviors of Foundation Models to client use cases is a challenge. These models generate content that varies even when the inputs are identical, and their outcomes can be biased, inaccurate, or insecure, and can carry inherent copyright issues. Appropriate guardrails make it possible to quantify and mitigate the harmful business impacts of generated content by ensuring that the model and its outcomes are explainable and appropriate for the use case, and that the training data is audited. Ultimately, every model should operate with AI Governance processes and policies in place.


What are the foundation models of generative AI?

Foundation Models are a type of Generative AI trained on large amounts of unstructured data in a self-supervised manner to learn general representations that can be adapted to perform multiple tasks across different domains. They aim to provide a foundation for building many different AI applications.

Pretrained

The model is initially trained on large amounts of text data.


Fine-tuned

The model is fine-tuned for specific generative tasks.


Transformer

A type of neural network architecture that uses attention mechanisms to process and generate sequences of data, such as text.
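
As a toy illustration of the attention mechanism at the heart of this architecture, here is a simplified sketch of scaled dot-product attention in Python; real transformers add learned projections, multiple attention heads, and many stacked layers:

# Scaled dot-product attention: every position in a sequence mixes in
# information from every other position, weighted by similarity.
import numpy as np

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # pairwise similarities
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V                              # weighted mix of values

x = np.random.randn(3, 4)        # three tokens, four features each
print(attention(x, x, x).shape)  # -> (3, 4)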




Encoders and decoders

Encoders and decoders are components of the transformer architecture used to process and generate data sequences, such as text.

An encoder takes a sequence of input data, like a sentence, and converts it into a series of encoded representations. Each representation captures information about the original input data but at a different level of abstraction. The final encoded representation is typically a vector that summarizes the input sequence.

On the other hand, a decoder takes an encoded representation and uses it to generate a new sequence of data, like a translation of the original sentence into a different language. The decoder does this by predicting the next token in the sequence based on the encoded representation and the tokens generated so far.

Here's an example of how encoders and decoders might work together to translate a sentence from English to French:

Input sentence: "The cat sat on the mat."

Encoded representation: [0.2, 0.5, -0.1, 0.4, ...]

Target language: French

Decoder output: "Le chat s'est assis sur le tapis."

In this example, the encoder takes the English sentence as input and generates an encoded representation, which captures the meaning of the sentence in a lower-dimensional space. The decoder then uses this encoded representation to generate a new sequence of tokens in the target language, French. The final output is a translated sentence that captures the same meaning as the original sentence but in a different language.
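
The sketch below runs this same English-to-French translation with a small pretrained encoder-decoder model. It assumes the Hugging Face transformers library is installed; t5-small is an illustrative model choice, and its exact output may differ from the example above.

# An encoder-decoder (sequence-to-sequence) model applied to translation.
from transformers import pipeline

translator = pipeline("translation_en_to_fr", model="t5-small")
result = translator("The cat sat on the mat.")
print(result[0]["translation_text"])  # e.g., "Le chat s'est assis sur le tapis."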

Large Language Models (LLMs) are a type of AI system that works with language. An LLM aims to model language, that is, to create a simplified but useful digital representation of language. The "large" part of the term describes the trend toward training language models with more parameters.

Typical examples of LLMs include OpenAI's GPT-4, Google's PaLM, and Meta's LLaMA. There is some ambiguity about whether to refer to specific products (such as OpenAI's ChatGPT or Google's Bard) as LLMs themselves or to say that they are powered by underlying LLMs.

In practice, AI practitioners often use LLM as a catch-all term for systems that work with language.

GPT, or Generative Pre-trained Transformer, is one of these Large Language Models.
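
As a minimal sketch of what "generative" and "pretrained" mean here, the loop below repeatedly predicts the most likely next token and appends it to the sequence, the decoding behavior described in the encoder/decoder section above. It assumes PyTorch and the Hugging Face transformers library; gpt2 is an illustrative model choice.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("Generative AI is", return_tensors="pt").input_ids

# Greedy autoregressive decoding: pick the most likely next token,
# append it to the sequence, and repeat.
for _ in range(20):
    with torch.no_grad():
        logits = model(input_ids).logits
    next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
    input_ids = torch.cat([input_ids, next_token], dim=-1)

print(tokenizer.decode(input_ids[0]))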

Foundation Models

Creating Foundation Models from scratch requires a large volume of unlabeled data and large computing resources.

However, enterprises can start from pretrained Foundation Models and fine-tune them with much less labeled data and minimal computing resources, as the sketch below illustrates.
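
A rough sketch of that workflow, assuming the Hugging Face transformers and datasets libraries; bert-base-uncased and the IMDB dataset stand in for an enterprise's own pretrained model and labeled data:

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Start from pretrained weights instead of training from scratch.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

dataset = load_dataset("imdb")  # a small labeled sentiment dataset

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

train_set = dataset["train"].map(tokenize, batched=True)

# A short run on a small labeled subset is enough to adapt the
# pretrained representations to the new task.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned", num_train_epochs=1),
    train_dataset=train_set.shuffle(seed=42).select(range(1000)),
)
trainer.train()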

Foundation Models are pretrained on massive amounts of data and can be fine-tuned for specific tasks. They can generate human-like language, perform question-answering tasks, and even generate code. They represent a significant breakthrough in the field of artificial intelligence and have the potential to revolutionize a wide range of industries, including healthcare, finance, and education.

Opportunities

For IBM, there are numerous opportunities to leverage Foundation Models to improve our products, services, and customer offerings. With IBM's unmatched history of AI and automation at scale, its Garage methodology, and its open ecosystem approach, we are in a unique position to keep accelerating how we solve business problems: reducing costs, shortening turnaround times, and improving productivity.