Last Updated on March 16, 2025
Bedrock - fully managed, serverless
Sagemaker - you need to configure endpoints
Sagemaker Jumpstart - one-click pre-built solutions
-----
F1 - performance
ROUGE - text generation
-----
Human bias vs Algorithmic bias, look for the human/machine differentiator
-----
Yes, it is possible to increase both bias and variance,
but this typically leads to a model that performs poorly due to both
underfitting and overfitting
-----
think > Variance
UNDER = LOW Var (underfit)
OVER = HIGH Var (overfit)
-----
Order Of Complexity
prompt, rag, fine-tune
-----
Asynchronous Inference Endpoint
less than/up to 1gb
save on cost by autoscaling instance count to zero when no requests to process
-----
realtime - immediate
-----
batch - slowest, large datasets
-----
-----
Inspector
Audit Mananger
Config
-----
Supervised:
Linear Regression
Classification
Neural Network
kNN - classifying based on proximity
-----
Unsupervised
K-means Clustering
market segmentation, image compression
-----
Hierarchical Clustering
analyze social network data
-----
Princicpal Component Analysis
facial recognition / computer vision
-----
Anomaly Detection
-----
Dimensionality reduction
-----
Random Cut Forest model
designed to detect anomalous data points within a dataset
-----
Self Supervised
uses UNLABELED data,
generates the labels themselves
-----
Semi Supervised
Fraud detection
Sentiment Analysis
-----
Shapley
detailed. individual predictions
-----
PDP
visual relationship of specific feature
-----
Provisioned Throughput
You can use a customized model only in the Provisioned Throughput mode
-----
Few Shot Prompting
you provide examples,
data should include user-input along with the correct user intent,
and provide examples of user queries and their corresponding intent
-----
Zero-Shot Prompting
no examples provided
-----
Single-Shot Prompting
single example
-----
to ensure the model generalizes well
Regularization
Cross-validation
Pruning
-----
Validation Sets are optional
-----
Model Parameters - define a model
Hyperparameters - can be adjusted for model customization
-----
Inspector
automated security assessment
-----
Audit Manager
helps you assess internal risk
-----
Deep Learning
GRADIENT DESCENT
adjust weights and biases of a neural network
-----
Ground Truth
build large labeled datasets
-----
LLM
non-deterministic
the generated text may be different for every user that uses the same prompt
use the inference parameter Temperature, which regulates the creativity of LLMs’ responses
can do natural language processing tasks
text generation
sentiment analysis
language translation
question/answer systems
------------
Inference
the model uses its trained parameters to generate a prediction or output based on new input data
provided by the user
------------
Disadvantage of Gen AI
lack the emotional depth present in human-created content
------------
Model interpretability and transparency
come at the cost of performance
Highly interpretable models, such as linear regression and decision trees, are generally simpler
and can be easier to understand, but this simplicity can negatively affect their performance
compared to more complex models like deep neural networks
------------
Generative Models
focus on generating new data from learned patterns
learns the underlying patterns of data to create new, similar data
------------
Discriminative Models
classify data by distinguishing between different classes
focuses on decision boundaries to classify inputs
------------
SLM
based on a large language model (LLM) but is significantly smaller in size and resource requirements
optimized for deployment on edge devices is specifically designed to be lightweight, efficient,
and capable of running on devices with limited computational resources, like edge computing scenarios
------------
Foundation Model
uses Self-supervised learning to create labels from input data
-----
FMs use UNLABELED data sets for self-supervised learning
-----
machine learning or deep learning model that is trained on vast datasets
so it can be applied across a wide range of use cases.
----------
Hierarchy
Artificial Intelligence > Machine Learning > Deep Learning > Generative AI
----------
----------
----------
Machine Learning
a data scientist MANUALLY determines the set of relevant features that the software must analyze
data scientist manually determines the set of relevant features that the software must analyze
--------
algorithms often require feature extraction and can use various methods such as
decision trees
or support vector machines
--------
Three main types of machine learning:
supervised learning
unsupervised learning
deep learning
--------
Feature Extraction
reduces the number of features by transforming data into a new space
--------
Feature Selection
reduces the number of features by selecting the most relevant ones from the existing features
----------
Deep Learning
a SUBSET OF ML that uses neural networks with many layers to learn from large amounts of data
-----
the data scientist gives only RAW DATA to the software and the deep learning network derives the features by itself
-----
uses large datasets to adjust the weights and biases of a neural network through multiple iterations,
using techniques such as GRADIENT DESCENT to minimize the error
------------
------------
------------
Embedding Models (Amazon Titan)
used to create vectors
------------
Multi-modal Model
can recognize various forms of input, such as data text, images
can accept a mix of input types such as audio/text and create a mix of output types
such as video/image
this is a significant advancement in AI
------------
Unimodal models
handle a single type of data
------------
Computer Vision (CV) Models
Amazon Rekognition makes it easy to add advanced computer vision capabilities to your application
------------
Model Efficiency Metric
assess how well an AI system performs in terms of resource usage, speed, and scalability
------------
ROUGE
Recall-Oriented Understudy for Gisting Evaluation
used to assess performance of FM text generation
quick and objective measurements
------------
BLEU (Bilingual Evaluation Understudy)
a metric specifically designed to evaluate the quality of text that has been machine-translated
by comparing it with one or more reference translations
quick and objective measurements
------------
Standard performance metrics used to evaluate the effectiveness of a classification system:
Precision, Recall, and F1-Score
------------
F1
a metric used in machine learning to evaluate the PERFORMANCE of a classification model
a model PERFORMANCE metric, NOT a business metric
the score can range from 0 to 1
1 represens a model that perfectly classifies each observation into the correct class
0 representing a model that is unable to classify any observation into the correct class
------------
BERT
bidirectional encoder representations from transformers
capture the contextual meaning of words by looking at both the words that come before and after them
creates dynamic word embeddings that change depending on the surrounding text
uses deep learning techniques to predict missing words by considering the words before and after the gap
evaluate how closely the chatbot’s replies match a reference set of correct responses
based on their context and meaning
------------
Partial Dependence Plots (general)
visualization technique to understand the relationship of a specific feature and the predicted outcome of a model
shows average predicted output for a given feature, while holding all other features constant
------------
SHAP Values Shapley (detailed)
provide a more detailed understanding of how individual predictions are made by a model
SHAP values assign a value to each feature for a specific prediction, indicating the contribution
of that feature to the predicted outcome
------------
Comprehend
natural language processing (NLP) service that uses machine learning to uncover insights and relationships in text
can discover PII
provides APIs for text analysis, such as:
Custom Entity Recognition
Custom Classification
Key Phrase Extraction
Sentiment Analysis
Entity Recognition
------------
Named Entity Recognition (NER)
a subfield of natural language processing (NLP) that focuses on identifying and classifying
specific data points from textual content
------------
Kendra
document content extractor via machine learning
highly accurate, easy enterprise search service powered by ML
search data from manuals, research reports, FAQs, human resources (HR) documentation, and customer service guides
search data across various systems such as S3, SharePoint, Salesforce, ServiceNow, RDS, OneDrive
------------
------------
------------
Order Of Complexity ------
--------------------------
1. Prompt Engineering
the practice of carefully designing prompts to efficiently tap into the capabilities of FMs.
It involves the use of prompts, which are short pieces of text that guide the model to generate more accurate
and relevant responses. With prompt engineering, you can improve the performance of FMs and make
them more effective for a variety of applications.
------------
2. Retrieval Augmented Generation (RAG)
allows you to customize a model’s responses when you want the model to consider new
knowledge or up-to-date information. When your data changes frequently, like inventory or pricing,
it’s not practical to fine-tune and update the model while it’s serving user queries
------------
3. Fine-tuning
the process of taking a pre-trained FM, such as Llama 2, and further training it on a downstream
task with a dataset specific to that task
------------
------------
------------
Bedrock
can be used to fine-tune pre-trained foundation models for various tasks, including sentiment analysis
can analyze text data to determine sentiment, making it a versatile option for advanced users
who may need more customizable solutions than Amazon Comprehend
the easiest way to build and scale generative AI applications with foundation models
provides an environment to build and scale generative AI applications with FMs
fully managed service that offers a choice of high-performing FMs from leading AI companies, available via an API
----------
Model Evaluation on Amazon Bedrock
the process of preparing data, training models, selecting appropriate metrics, testing and analyzing results,
ensuring fairness and bias detection, tuning performance, and continuous monitoring
----------
Model Customization Methods
Continued Pre-training, Fine-tuning
----------
RAG - Retrieval Augmented Generation
LEAST dev effort, compared to fine tuning
uses embedding vectors
Use cases: customer service, legal research, healthcare questions
gives answers with links to resources
----------
Human Model Evaluation
for assessing qualitative aspects of the model
automatic model valuation is valuable for assessing quantitative aspects of the model
----------
K-means Clustering - unsupervised
an unsupervised machine learning algorithm used for data clustering
groups unlabeled data points into clusters based on similarity
----------
kNN - supervised
assumes that similar items are close to each other in a feature space
for both classification and regression tasks
supervised learning algorithm used for classifying data points based on their proximity to labeled examples
----------
Vector Databases
OpenSerch Service - best for RAG
built to handle search and analytics workloads
highly effective for applications that require rapid data retrieval and relevance ranking
provides fast search capabilities and supports full-text search, indexing, and similarity scoring
kNN (fastest nearest neighbor) search capability
helps store embeddings within vector databases
If you do not have an existing vector database, Amazon Bedrock creates an OpenSearch Serverless
--------
DocumentDB - (Mongo enabled) NoSql
not optimized for full-text search or similarity searches
--------
DynamoDB
does not natively support advanced search capabilities or similarity scoring needed for RAG applications
--------
Aurora
OLTP (Online Transaction Processing) workloads. While it provides advanced indexing features for
relational data, it is not optimized for full-text search
--------
RDS: relations, open source
--------
Neptune: graph db
----------
Sources: S3, Confluence, SharePoint, SalesForce, Web Pages
----------
Knowledge Bases
a way to add, test and use your data alongside a FM
--------
you provide foundation models with contextual information from your company's private data
for Retrieval Augmented Generation (RAG), enhancing response relevance and accuracy
--------
Create/manage RAG capabilities to provide FMs and agents contextual information from your company’s
private data sources to deliver more relevant, accurate, and customized responses
--------
supports popular databases for vector storage, including vector engines for:
Amazon OpenSearch Serverless
Pinecone
Redis Enterprise Cloud
Amazon Aurora
MongoDB
----------
Agents
stepped instructional agents
autonomously or semi-autonomously perform specific actions or tasks based on predefined rules or algorithms
can complete complex tasks for a wide range of use cases
deliver up-to-date answers based on proprietary knowledge sources
----------
RAG refers to querying and retrieving information from a data source to augment a generated response to a prompt,
Agent refers to an application that carries out orchestrations through cyclically interpreting inputs
and producing outputs by using a foundation model
------------
Amazon Bedrock Studio
IDE for creating apps and checking outputs
----------
Pricing
Smaller models are cheaper to use than larger models
--------
Reducing Number of Tokens
most effective way to MINIMIZE COSTS associated with the use of a generative AI model on Amazon Bedrock
--------
On demand
text models, embedding models, image models
NOT for use with CUSTOM models
--------
Batch: discounts up to 50%, multi predictions,
--------
Provisioned Throughput
You can use a customized model only in the Provisioned Throughput mode
designed for situations where there is a predictable, continuous workload, such as the intensive compute
required during the fine-tuning phase
1-6 months
throughput max numbers/tokens,
works with Base, Fine-tuned, Custom Models
--------
Temperature, Top K, Top P
no impact on pricing
--------
Model Size
Number of Input/Ouput Tokens
--------
Promt Engineering - cheap
--------
RAG - medium cost
does not change weights of the FM
external knowledge, no need to know everything
--------
Fine Tuning
changes weights of the FM
involves providing LABELED training data where each example consists of a prompt
(the input to the model) and a completion (the desired output)
--------
Instruction Based Fine Tuning
instruction-based fine-tuning
MUST be in "Prompt -> Response" text pairs format
FM is tuned with instructions - computation
--------
Domain Adaptation Fine Tuning
FM trained on domain-specific dataset
involves fine-tuning the model on domain-specific data to adapt its knowledge to that particular domain
------------
Tokenization / Tokens
converting raw text into a sequence of tokens
tokens represent words, sub-words, or characters that the model processes as discrete units of text
a sequence of characters that a model can interpret or predict as a single unit of meaning
----------
Context Window
the number of tokens a model can consider when generating text
bigger window = more info & coherence
different models have different limits
----------
Embeddings
a vector of numerical values, represents condensed information obtained by transforming input into that vector
creates vectors out of text, images or audio
vectors have high dimensionality, to capture many featues for one token, such as
sematic meaning, syntactic role, sentiment
can power search applications
words with semantic relationshps have similar embeddings
----------
Multi-modal Embedding Model
designed to represent and align different types of data (such as text and images) in a shared embedding space
allowing a chatbot to understand and interpret both forms of input simultaneously
------------
Guardrails
monitor and analyze user inputs to ensure compliance with safety policies
control interaction between users and FM's
filter 'harmful' content
-----
remove PII
can mask personally identifiable information (PII) in model responses, replacing sensitive information
with identifier tags like [NAME-1], [EMAIL-1],
-----
enhanced privacy
reduce hallucinations
use multiple guradrail rules to work together
-------
profanity, sexual, insults, 'hate',
-------
grounding and relevance (tamp hallucinations)
------------
Hallucination
responses that sound plausible and appear factual, but is incorrect or fabricated
------------
Data Drift
when the distribution or characteristics of the input data change over time, which can cause the model’s performance to degrade
------------
Cloudwatch
Model Invokation Logging
history of all that happens in bedrock
can go to S3 or CloudWatch
log groups
Metrics
ContentFilteredCount
build alarms such as guardrail breaches
------------
Model Invocation Logging
allows for detailed logging of all requests and responses during model invocations in Amazon Bedrock
allows direct tracking of model input and output, ideal for monitoring and auditing
------------
Prompt
a good prompt has: Instructions, Context, Input data, Output Indicator
Enhanced: set parameters, clear expectations
Negative: what you dont want, helps maintain focus & clarity
----------
Temperature: 0 = conservative, repetitive, focused on most likely response
1 = high creativity
----------
Top P and Top K - sampling strategies used in large language models (LLMs)
to control the randomness and coherence of the generated text
----------
Top P (percentage)
selects the smallest subset of tokens whose cumulative probability adds up to P
more adaptive than Top K, allows for flexible token selection. leading to more diverse outputs
while maintaining fluency
Influences the PERCENTAGE of most-likely candidates that the model considers for the next token
--------
0 = only consider 25% of most likely words
1 = broad range, more creative
----------
Top K
limits number of probable words
a smaller K makes the output more deterministic and focused
a larger K allows for more diversity and creativity in the generated text
----------
Response Length
max answer length
----------
Stop Sequences
tokens that signal the model to stop generating output
----------
Latency
model size, model type (llama, claude) number of tokens in output
NOT affected by all other parameter limitations
----------
----------
----------
Zero-Shot Prompting
the model is asked to perform a task or generate content without having seen examples of the task during training
less effective for tasks that require specific writing style
may struggle with inference
fully rely on models knowledge
the more capable FM, more likely for good results
----------
Few Shot Prompting
you provide examples, to help train the model on the desired style and format
-----
data should include user-input along with the correct user intent,
and provide examples of user queries and their corresponding intent
----------
Single-Shot Prompting
model is shown a single example to guide its response to a task
----------
Chain of Thought Prompting
divide task into sequence of reasoning steps leading to more structure and coherence
----------
----------
----------
RAG
combine models capability with external data sources
----------
Prompt Templates: simplify and standardize the process of generating prompts
helps process user input
orchestrates between FM, action groups and knowledge bases
formats and returns responses
----------
Prompt Template Injections: "ignore prompt template" attack
add explicit instructions to ignore unrelated or potential malicious content
----------
IAM Identity Center
to control access, users only receive responses from docs they have access to
can be configured with Google Login, Microsoft Active Directory
------------
Amazon Q
fully managed gen-ai assistant for your employees
a generative AI-powered assistant for accelerating software development and leveraging companies' internal data
does NOT allow you to choose the underlying Foundation Model
based on company knowledge and data
has guardrails, can block specific words or topics
can be configured to use only internal info
----------
Data Connectors
s3, rds, workdocs, aurora, slack, sharepint, salesforce, ms365
----------
Plugins
Jira, ServiceNow - allow you to interact with 3rd party apps with APIs
----------
Q Apps
auto generates web apps from instructions
----------
Q in Connect
contact center service from AWS
helps customer service agents provide better customer service
uses real-time conversation with the customer along with relevant company content to
automatically recommend what to say or what actions an agent should take to better assist customers
----------
Q Developer
can be used in integrated development environments (IDEs) as well as the AWS Management Console
powered by Amazon Bedrock
answers questions about AWS docs, services, account-specific cost-related questions using natural language
Understand and manage your cloud infrastructure on AWS
suggest CLI commands, analyze your bill, resolve errors, troubleshooting
AI coding companion, similar to github copilot, JS, Python, TyseScript
integrates with IDE VSCode, VS, Jetbrains
----------
Q Business
can answer questions, provide summaries, generate content, and complete tasks based on your enterprise data
also helps streamline tasks and accelerate problem-solving
can perform routine actions like submitting time-off requests and sending meeting invites
uses large language models (LLMs) and foundation models
features document enrichment which helps control what docs and doc attributes are ingested into the index
also uses RAG for optimizing the output of an LLM, so it references an authoritative knowledge base
outside of its training data sources before generating a response
----------
Quicksight Q
a way to use natural language to provide visual responses to business questions in just a few seconds
Users can ask questions in everyday language and receive precise responses with relevant visualizations
to enhance data comprehension
------------
Quicksight
a business intelligence (BI) service that allows users to easily create and share interactive dashboards
and visualizations from various data sources, enabling real-time insights and reporting
------------
PartyRock
https://partyrock.aws
anyone can build AI apps
------------
Transformer Models
use a self-attention mechanism and implement contextual embeddings, MEANING it focuses on different
parts of the input sequence, determining the importance of each token in relation to others
excels at processing sequential data, such as text, speech, and images
------------
Diffusion Models
work by first corrupting data with noise through a forward diffusion process and then learning to
reverse this process to denoise the data
------------
GPT
generative pre-trained transformer, generates human text based on prompts
------------
Principal Component Analysis (PCA)
a statistical method used for reducing the dimensions of large datasets
to simplify data while retaining most of the variance in the data
does not understand or differentiate the contextual meanings of words in natural language processing
------------
Word2Vec
uses static embeddings, cannot adjust the embedding based on context
an early embedding model that creates vector representations of words based on their co-occurrence in a given text
------------
Singular Value Decomposition (SVD)
a matrix decomposition method used in various applications like data compression and noise reduction
not designed to handle the dynamic, context-dependent meanings of words in sentences
------------
RNN
recurrent neural network, speech recognition, used for video analysis
------------
CNN
used for single image analysis
good for tasks like image classification to identify products in photos uploaded by users
suited for handling the complexities of image data
------------
ResNet
residual network, image recognition
------------
SVM
Support vector machine
------------
WaveNet
audio waveform for speech synthesis
------------
GAN
generative adversarial network
generates synthetic data, images, video, sound
------------
XGBoost
extreme gradient boosting
------------
------------
------------
Feature Engineering
TRANSFORMS RAW DATA into a more effective set of inputs
Each input comprises several attributes, known as features
-----
TRANSFORMS data and creates variables for the model
feature engineering significantly enhances their predictive accuracy and decision-making capability
-----
For Structured Data
includes tasks like normalization
handling missing values
encoding categorical variables
-----
For Unstructured Data (such as text or images)
tokenization (breaking down text into tokens)
vectorization (converting text or images into numerical vectors)
extracting features that can represent the content meaningfully
------------
Reinforcement Learning
effective for environments where responses need to be optimized based on direct user interaction
& SATISFACTION
agent interacting with an environment by taking actions and receiving rewards or penalties
reward values attached to the different steps that the algorithm must go through
-----
combines computer science, neuroscience, psychology yo determine how to map situations to actions
to maximize numerical reward signal
agents learn, and adjust appropriately
-----
Markov Decision Process
agent has a policy as a guideline for an OPTIMAL action to take at a given state
each state is associated with a "value function"
circular process: agent -> action -> environment -> state, reward -> agent
-----
ReLU (Rectified Linear Unit)
activation function used in deep learning neural networks
outputs the input directly if it is positive; otherwise, it returns zero
expressed as f(x)=max(0,x).
------------
Reinforcement Learning with Human Feedback
Data Collection
Supervised Fine-tuning
build seperate reward model
optimize the language model with reward based model
------------
Transfer Learning
a model pre-trained on one task is adapted to improve performance on a different but related task
by leveraging knowledge from the original task
------------
Incremental Training
a method that allows the chatbot to adapt over time by learning from new data
withtout forgetting previously learned information
allows a model to update itself with new data while retaining knowledge from old data
------------
Continued or Further Pre-Training
uses unlabeled data to pre-train a model
-----
training the model on a large corpus of domain-specific data
enhancing its ability to understand domain-specific terms, jargon, and context
------------
Fine-Tuning
uses labeled data to train a model
involves further training and more COSTS
fine-tuning uses less computation power as it works on a smaller dataset (cost)
------------
Pre-Training
is computationally expensive (cost)
------------
Dynamic Data Masking
a technique used to protect sensitive data from unauthorized access by masking the data in query results
does not guarantee that sensitive information is entirely removed from inference responses
------------
Semi-Supervised learning
Fraud detection
when you apply both supervised and unsupervised learning techniques to a common problem
use a small amount of labeled data and a large amount of unlabeled data to train systems
After that, the partially trained algorithm labels the unlabeled data - called pseudo-labeling
The model is then re-trained on the resulting data mix without being explicitly programmed
------------
Supervised Learning
models are supplied with LABELED, defined data to assess for correlations
most appropriate when you have a large dataset of LABELED examples
requires extensive datasets and retraining the model whenever new data is available
------------
Techniques
----------
Linear Regression
SUPERVISED
used to predict continuous numerical values based on independent features
----------
Neural Network
more complex SUPERVISED learning technique
takes some given inputs and performs one or more layers of mathematical transformation
based on adjusting data weightings
example of a neural network technique is predicting a digit from a handwritten image
consist of layers of nodes (neurons) that process input data
adjusting the weights of connections between nodes through training to recognize patterns and make predictions
------------
Self Supervised Learning
generates the labels themselves
models are provided with vast amounts of raw, UNLABELED data
------------
Unsupervised learning
you give the algorithm input data WITHOUT any labeled OUTPUT data
the model then establishes meaningful connections between the inputs and PREDETERMINED outputs
-----
the algorithm groups data based on similarities without predefined categories or labels
----
Examples:
Clustering - groups certain data inputs, so they may be categorized as a whole
Dimensionality reduction - reduces the number of features in a dataset, reduce complexity and overheads
------------
------------
------------
Confusion Matrix Metrics
Precision, Recall, F1, Accuracy
best way to evaluate the perfomance of a model that does classifications
evaluate the performance of classification models by displaying the number of
true positives, true negatives,
false positives, false negatives
provides a detailed breakdown of the model's performance across all classes, making it the most suitable
choice for evaluating a classification model's accuracy
------------
AUC-ROC
performance meteric?
------------
Math equations metrics that show the quality of regression
MAE - mean absolute error
MAPE - mean absolute percentage error
RMSE - root mean squared error
R-squared - measures variance
------------
Inferencing
the model uses its trained parameters to generate a prediction or output
based on new input data provided by the user
--------
Real time - speed is referred over accuracy
Batch - accuracy is more important
------------
Pases of ML Project
Retrain, Deployment, Monitoring, Iterations
------------
Model Parameters
values that define a model and its behavior in INTERPRETING input and GENERATING responses
controlled and updated by providers
internal variables of the model that are learned and adjusted during the training process
directly influence the output of the model for a given input
Examples include the WEIGHTS and BIASES in a neural network
------------
Hyperparameters
values that can be adjusted for model customization to control the training process
------------
Hyperparameter Tuning
define model structure, set before training
optimize accuracy, overfitting, enhanced generalization
most effective solution to avoid overfitting
use hyperparameters for model tuning
-----
Automatic Model Tuning
automatically tunes your machine learning model by running A NUMBER OF JOBS on your dataset
to find the best version of the model
-----
no settings are mandatory
-----
Adjustable Parameters:
-----
Learning Rate
updating model weights during training
-----
Batch Size
number of examples, smaller is more stable, but uses more compute time
-----
Number of Epochs
how many times the model will iterate over entire training dataset
allows the model to learn from the training data for a longer period
potentially capturing more complex patterns and relationships
-----
Bias is an error introduced by approximating a real-world problem
-----
Variance is an error introduced by the model's sensitivity to small fluctuations in the training data
-----
Bias versus variance trade-off
balancing:
complexity (variance)
incorrect assumptions (bias),
where high bias can cause underfitting and high variance can cause overfitting
-----
Cost Trade-offs
Regional coverage can affect cost due to differences in prices across regions and
also because data transfer between regions can lead to additional costs
-----
Benchmark Datasets
pre-compiled, standardized datasets specifically designed to test for BIASES and DISCRIMINATION in model outputs
minimizes administrative effort while providing reliable, comprehensive insights into biases or discrimination
to assess the accuracy (even no bias realted items) of the model to ensure it meets performance requirements
-----
Overfitting
high variance and low bias
performs well on the training data but poorly on new, unseen data
has HIGH prediction accuracy on TAINING data, LOW accuracy for NEW data
model is overly complex and captures noise or random fluctuations in the training data
rather than the underlying patterns
Fix by:
- increase training data
- early stopping of training to adjust
- data augmentation to increase diversity in the dataset
- adjust hyperparameters, but do not 'add' them
-----
Underfitting
low variance, high bias
performs poorly on both the training data and new, unseen data
cannot capture the relationship between the input and output data effectively
model is too simple
model will show high error rates on BOTH the training set and new data
-----
Decision Trees
highly interpretable models that provide clear visualization of the decision-making process
Decision Trees can handle multi-class classification problems
suitable for categorizing things into multiple distinct classes
-----
Pruning
simplifies decision trees by removing branches that have little importance
-----
Cross-validation
helps ensure the model generalizes well to unseen data by dividing data into multiple training/validation sets
-----
the training set - is used for training the model
the validation - set is used for tuning hyperparameters and model selection
the test set - is used for evaluating the final model performance
-----
Validation Set (optional)
used for tuning hyperparameters and selecting the best model during the training process
-----
Test Set
used on the final trained model to assess performance on unseen data
determines how well the model generalizes
-----
Regularization
penalize complex models to reduce overfitting
adjust balance between simple and complex model
adds constraints that penalize complexity, encouraging the model to generalize better
------------
ML - good for approximations, NOT when you want exact answers
------------------------------------
------------------------------------
------------------------------------
SageMaker
build, train, and deploy machine learning models
--------
Autopilot
designed for automating the process of finding the best hyperparameters
--------
allows data scientists and developers to quickly and confidently build, train,
and deploy ML models into a production-ready hosted environment.
--------
Pre-trained models are fully customizable for your use case with your data
--------
Evaluate, compare, and select Foundation Models quickly based on pre-defined quality and responsibility metrics
--------
Clarify
evaluates FMs for your gen AI use case with metrics such as accuracy, robustness, and toxicity
helps identify potential BIAS during data preparation without writing code
compare models, explain outputs, DETECTS BIAS
--------
Model Monitor
does NOT assist in model selection or CONTENT moderation
continuously monitor machine learning models deployed in production
detects data quality issues, concept drift, and other anomalies
see quality of a model
alerts you of inaccurate predictions
--------
Model Dashboard
view, search, and explore all of the models in your account
track which models are deployed for inference and if they are used in batch transform jobs or hosted on endpoints
--------
Model Cards
document risk and rating of the model, as well as custom information like
training details and metrics, evaluation results, and observations
-----
guidance on how each model should be used, along with an assessment of the
potential risks associated with its DEPLOYMENT
-----
call-outs such as considerations, recommendations, and custom information
centralize and standardize model documentation so you can implement ML responsibly
-----
documents critical details about your machine learning (ML) models in a single place for
streamlined governance and reporting
-----
Describes how a model should be used in a production environment
--------
AI Service Cards
provide transparency about AWS AI services' intended use, limitations, and potential impacts
a form of responsible AI documentation that provides customers with a single place to find information
on the intended use cases and limitations
--------
Model Dashbord
centralized view of all models to easily track, manage, and monitor them
overview of deployed models and ENDPOINTS so that you can track resources and model behavior violations
--------
MLflow
used to manage your iterative process of ML experimentation
track, organize, view, analyze, and compare iterative ML experimentation
gain comparative insights and register and deploy your best-performing models
--------
Data Preparation
One of the main challenges in machine learning implementation is the difficulty in collecting
and preparing high-quality data for training models
--------
Exploratory Data Analysis (EDA)
involves examining the data through statistical summaries and visualizations to identify patterns,
detect anomalies, and form hypotheses
calculating statistics and visualizing data are fundamental to EDA, helping to uncover patterns,
detect outliers, and gain insights
--------
Data Wrangler
single visual interface
explore and prep data sets
can fix bias by balancing the dataset
Fix bias by balancing the dataset
a service that simplifies the process of data preparation and feature engineering
Offers 300+ pre-configured data transformations to prepare data for ML
--------
Data Lineage
tracks how data is generated, transformed, transmitted, and used across a system over time
can ensure data privacy and compliance by tracking the flow and transformation of data
--------
End To End ML service
--------
Feature Store
fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models
can ingest data from a variety of sources, such as:
application and service logs, clickstreams, sensors,
S3, Redshift, Lake Formation, Snowflake, and Databricks Delta Lake
--------
Ground Truth
helps build large, high-quality, labeled datasets for training machine learning models
-----
uses Reinforcement Learning from Human Feedback for model grading and labeling,
harnessing human input across the ML lifecycle
to improve the accuracy and relevancy of models
-----
Self-service offering
your data annotators, content creators, and prompt engineers (in-house, vendor-managed, or leveraging
the public crowd) can use our low-code user interface to accelerate human-in-the-loop tasks
-----
AWS-managed offering
(SageMaker Ground Truth Plus), AWS handles the heavy lifting, which includes selecting and managing
the right workforce for your use case
--------
Ground Truth Plus
fully managed data labeling service that helps deliver high-quality annotations
uses a combination of human labelers and machine learning-assisted labeling to ensure accuracy
and consistency in the labels
for businesses looking to create accurate training datasets at scale while minimizing manual errors
--------
Human Evaluation
of text-related tasks using a comparitive mechanism, need the most important points on the Likert Scale
ensure the outputs are adapted to your users
--------
RLHF
a technique used to train AI models using human feedback to refine their behavior
--------
Augmented AI (Amazon A2I)
can add MULTIPLE reviewers
-----
an AWS service that provides a human review of machine learning predictions to improve
model accuracy and reliability
-----
makes it easy to build and manage human reviews for machine learning applications
provides built-in human review workflows for common machine learning use cases
helps implement human review workflows for machine learning predictions
-----
integrates human judgment into ML workflows, allowing for reviews and corrections of model predictions
for applications requiring high accuracy and accountability
--------
Jumpstart
Pre-trained models are FULLY CUSTOMIZABLE for your use case with your data
pre built solutions, evaluate, compare, and select models quickly
a machine learning hub with foundation models, built-in algorithms, and prebuilt ML solutions
deploy with just a few clicks
access pre-trained models, including FMs, to perform tasks like article summarization and image generation
easily deploy them into production with the user interface or SDK
--------
Generative AI powered summarization chatbot
generate concise summaries of text
with prompt engineering, the summarization chatbot can be specifically tailored to accurately extract
detailed key points, entities, or legal clauses from complex legal document
--------
Pipelines - CI/CD for ML
--------
Registry - central repo to manage ML versions
--------
Role Manager
access control
baseline set of permissions for ML activities
--------
Governance tools provided by Amazon SageMaker
Role Manager
Model Cards
Model Dashboard
--------
Studio - unified UI
--------
Real Time - Fastest
SYNCHRONOUS
more configuration RAM
when low latency is essential, and responses are needed immediately
fully managed and support autoscaling
--------
Asynchronous - Medium
longer processing times
ideal for payload sizes less than 1gb and immediate results are not critical
delayed responses at a lower cost
data goes into one bucket, result fed into another
near real-time latency requirements
request and response -> S3
--------
Batch - Slow
ASYNCHRONOUS
for large payloads
predictions for entier data set
also used S3
when you can tolerate some delays in receiving responses, but need a cost-effective inference method that
optimizes resource usage without sacrificing too much on turnaround time
--------
Serverless
good for workloads with unpredictable traffic or sporadic requests
scales automatically based on demand
no config, one click and go
--------
Deployment Types
Real-Time Inference: 6mb - fast, low latency, near instant preditions for web/mobile apps
Serveless Inference: 4mb - fast, low latency, sporadic, short term inference, no infrastructure,
allowing for COLD starts
Asynchronous Inference: 1gb - near real-time, large payloads longer processing
Batch Transform: 100mb - high latency, concurrent, bulk processing for large datasets
--------
Network Isolation Mode - no internet, no S3
DeepAR forecasting - time series data
--------
Canvas
No code solution to create machine-learning models
Browse, import, and join data
Gives the ability to use machine learning to generate predictions without the need to write any code
generates new relevant features, tests hundreds of prediction models
cleans data, automatically detecting and cleaning missing values
enables users to create machine learning models using a visual interface
------------------------------------
------------------------------------
------------------------------------
Trusted Advisor
advises on BEST PRACTICES, and optimize your AWS environment for cost savings on:
service limits, perfomance, security, FAULT tolerance
---------
Audit Manager
helps you assess internal risk with prebuilt frameworks that translate evidence from cloud services
into security IT audit reports
continuously audit AWS usage to simplify how you assess risk and compliance
with regulations and industry standards
an essential tool for governance in AI systems
---------
Inspector
automated security assessment service that helps improve the security and compliance of
applications deployed on AWS
for use on EC2, ECR & Lambda
does NOT track CONFIG changes
INSTALLED agents on EC2
AUTOMATED continual checks for security VULNERABILITY
CHECKS for OS vulnerabilities
can also be integrated with CI/CD tools to monitor and improve security & compliance of web apps
------------
Config
A view of the CHANGES in your RESOURCES associated with your AWS account
INCLUDING how they are configured
can generate an INVENTORY of AWS resources
------------
Artifact
provides on-demand access to AWS’ compliance reports and online agreements
------------
Exposure
the risk of exposing sensitive or confidential information to a model during training or inference.
a model can then reveal this sensitive data from their training corpus, leading to data leaks or privacy violation
-----------
Prompt Injection
influencing the outputs by embedding specific instructions within the prompts themselves
-----------
Dynamic Prompt Engineering
modifying the input prompts to the LLM to customize the chatbot's responses
-----------
Prompt Template
can guides the LLM to detect and respond appropriately to potential attack patterns
can include predefined instructions that condition the LLM to recognize malicious patterns
and avoid generating unintended or harmful responses
embedding defense mechanisms within the prompts themselves
-----------
Jailbreaking
bypassing the built-in restrictions and safety measures of AI systems to unlock restricted
functionalities or generate prohibited content
-----------
Hijacking
manipulating an AI system to serve malicious purposes or to misbehave in unintended ways
cause model to stray from its intended prompt, often producing unexpected or undesired outputs
-----------
Risk management in the Generative AI Security Scoping Matrix
identifying potential threats to generative AI solutions and recommending mitigations
-----------
Average Response Time
evaluate the runtime efficiency of a model
-----------
Amazon Transcribe Medical
an auto speech recognition (ASR) service to add medical speech-to-text capabilities to voice-enabled application
-----
Amazon Comprehend Medical
returns useful information in unstructured clinical text such as physician's notes, discharge summaries,
test results, and case notes
-----------
Trainium (chips)
for deep learning training
-----------
Inferentia (chips)
for deep learning inference
-----------
Stable Diffusion
a gen AI model that produces unique photorealistic images from text and image prompts
-----------
Classification techniques
commonly used to predict a category or class from a set of features andare well suited for
predicting customer churn, which is a binary outcome: churn or not churn
-----------
Multi-Class Classification
assigns each instance to ONE of several possible classes
(e.g., an image classified as either a cat, dog, or bird)
-----------
Multi-Label Classification
multi-label classification assigns each instance to one or more classes
(e.g., a document classified as both "science" and "technology")
-----------
Object Detection
a computer vision technique that identifies instances of objects within images and videos,
including detecting and classifying ANIMALS
-----------
DeepRacer
is a Wi-Fi enabled, physical vehicle that can drive itself on a physical track
-----------
-----------
-----------
SageMaker Studio supports IDE:
-----
JupyterLab
A popular notebook interface for data exploration and model building.
-----
Code Editor
Based on Code-OSS (Visual Studio Code - Open Source), it offers a lightweight and powerful code editor
with familiar shortcuts, terminal, debugger, and refactoring tools.
-----
RStudio
A fully managed IDE for R development, featuring a console, syntax-highlighting editor, and tools for plotting,
history, debugging, and workspace management.
-----------
Anomaly Detection System
can analyze patterns and behaviors, such as IP address access patterns,
to detect any DEVIATIONS from the norm, which could indicate suspicious or malicious activity
-----------
Area Under the Curve
one of the model performance metrics
-----------
Data Minimization
reduces privacy risk, cost, carbon footprint
-----------
-----------
The Explainable AI Principle
contrastive explanations is central to human understanding
-----
In this context, an event 'X' (the AI’s decision) needs to be contrasted with another event 'Y'
(as a baseline or what was expected).
-----
They facilitate easier debugging and optimization
-----------
When designing models for human interaction, design the model to give CONTRASTIVE explanations
-----------
Data Classification
a concept th