AWS AI Practitioner Notes

Last Updated on March 16, 2025

Bedrock - fully managed, serverless
Sagemaker - you need to configure endpoints
Sagemaker Jumpstart - one-click pre-built solutions
-----
F1 - performance
ROUGE - text generation
-----
Human bias vs Algorithmic bias, look for the human/machine differentiator
-----
Yes, it is possible to increase both bias and variance, 
  but this typically leads to a model that performs poorly due to both 
  underfitting and overfitting
-----
think > Variance
  UNDER = LOW Var (underfit)
  OVER = HIGH Var (overfit)
-----
Order Of Complexity
  prompt, rag, fine-tune
-----
Asynchronous Inference Endpoint
  less than/up to 1gb
  save on cost by autoscaling instance count to zero when no requests to process
-----
realtime - immediate
-----
batch - slowest, large datasets
-----
-----
Inspector
Audit Mananger
Config
-----
Supervised:
 Linear Regression
 Classification
 Neural Network
 kNN - classifying based on proximity
-----
Unsupervised
 K-means Clustering
   market segmentation, image compression
 -----
 Hierarchical Clustering
   analyze social network data
 -----
 Princicpal Component Analysis
   facial recognition / computer vision
 -----
 Anomaly Detection
 -----
 Dimensionality reduction
 -----
 Random Cut Forest model
  designed to detect anomalous data points within a dataset

-----
Self Supervised
 uses UNLABELED data, 
 generates the labels themselves
-----
Semi Supervised
  Fraud detection
  Sentiment Analysis
-----
Shapley
  detailed. individual predictions
-----
PDP 
  visual relationship of specific feature
-----
Provisioned Throughput
  You can use a customized model only in the Provisioned Throughput mode
-----
Few Shot Prompting
  you provide examples,
  data should include user-input along with the correct user intent, 
      and provide examples of user queries and their corresponding intent
-----
Zero-Shot Prompting
  no examples provided
-----
Single-Shot Prompting
  single example
-----
to ensure the model generalizes well
  Regularization
  Cross-validation 
  Pruning
-----
Validation Sets are optional
-----
Model Parameters - define a model
Hyperparameters - can be adjusted for model customization 
-----
Inspector
  automated security assessment
-----
Audit Manager
  helps you assess internal risk 
-----
Deep Learning
  GRADIENT DESCENT
  adjust weights and biases of a neural network 
-----
Ground Truth
  build large labeled datasets
-----
LLM
  non-deterministic
  the generated text may be different for every user that uses the same prompt
  use the inference parameter Temperature, which regulates the creativity of LLMs’ responses

can do natural language processing tasks
  text generation
  sentiment analysis
  language translation
  question/answer systems
------------
Inference
  the model uses its trained parameters to generate a prediction or output based on new input data 
    provided by the user
------------
Disadvantage of Gen AI
   lack the emotional depth present in human-created content
------------
Model interpretability and transparency 
  come at the cost of performance
  Highly interpretable models, such as linear regression and decision trees, are generally simpler 
    and can be easier to understand, but this simplicity can negatively affect their performance 
    compared to more complex models like deep neural networks
------------
Generative Models
  focus on generating new data from learned patterns
  learns the underlying patterns of data to create new, similar data
------------
Discriminative Models
  classify data by distinguishing between different classes
  focuses on decision boundaries to classify inputs
------------
SLM
  based on a large language model (LLM) but is significantly smaller in size and resource requirements
  optimized for deployment on edge devices is specifically designed to be lightweight, efficient, 
    and capable of running on devices with limited computational resources, like edge computing scenarios
------------
Foundation Model
  uses Self-supervised learning to create labels from input data
  -----
  FMs use UNLABELED data sets for self-supervised learning
  -----
  machine learning or deep learning model that is trained on vast datasets 
    so it can be applied across a wide range of use cases.
----------
Hierarchy
 Artificial Intelligence > Machine Learning > Deep Learning > Generative AI
----------
----------
----------
Machine Learning
  a data scientist MANUALLY determines the set of relevant features that the software must analyze
    data scientist manually determines the set of relevant features that the software must analyze
  --------
  algorithms often require feature extraction and can use various methods such as 
    decision trees 
    or support vector machines
  --------
  Three main types of machine learning: 
    supervised learning
    unsupervised learning
    deep learning
  --------
  Feature Extraction 
    reduces the number of features by transforming data into a new space
  --------
  Feature Selection
    reduces the number of features by selecting the most relevant ones from the existing features 
----------
Deep Learning
  a SUBSET OF ML that uses neural networks with many layers to learn from large amounts of data
  -----
  the data scientist gives only RAW DATA to the software and the deep learning network derives the features by itself
  -----
  uses large datasets to adjust the weights and biases of a neural network through multiple iterations, 
    using techniques such as GRADIENT DESCENT to minimize the error
------------
------------
------------
Embedding Models (Amazon Titan)
  used to create vectors
------------
Multi-modal Model
  can recognize various forms of input, such as data text, images
  can accept a mix of input types such as audio/text and create a mix of output types 
    such as video/image
  this is a significant advancement in AI
------------
Unimodal models
  handle a single type of data
------------
Computer Vision (CV) Models
  Amazon Rekognition makes it easy to add advanced computer vision capabilities to your application	

------------
Model Efficiency Metric
  assess how well an AI system performs in terms of resource usage, speed, and scalability

------------
ROUGE
  Recall-Oriented Understudy for Gisting Evaluation
  used to assess performance of FM text generation
  quick and objective measurements
------------
BLEU (Bilingual Evaluation Understudy)
  a metric specifically designed to evaluate the quality of text that has been machine-translated 
    by comparing it with one or more reference translations
  quick and objective measurements
------------
Standard performance metrics used to evaluate the effectiveness of a classification system:
  Precision, Recall, and F1-Score
------------
F1 
  a metric used in machine learning to evaluate the PERFORMANCE of a classification model
  a model PERFORMANCE metric, NOT a business metric
  the score can range from 0 to 1
  1 represens a model that perfectly classifies each observation into the correct class
  0 representing a model that is unable to classify any observation into the correct class
------------
BERT
  bidirectional encoder representations from transformers
  capture the contextual meaning of words by looking at both the words that come before and after them
  creates dynamic word embeddings that change depending on the surrounding text
  uses deep learning techniques to predict missing words by considering the words before and after the gap
  evaluate how closely the chatbot’s replies match a reference set of correct responses 
    based on their context and meaning
------------
Partial Dependence Plots (general)
  visualization technique to understand the relationship of a specific feature and the predicted outcome of a model
  shows average predicted output for a given feature, while holding all other features constant
------------
SHAP Values Shapley (detailed)
  provide a more detailed understanding of how individual predictions are made by a model
  SHAP values assign a value to each feature for a specific prediction, indicating the contribution 
    of that feature to the predicted outcome
------------
Comprehend
  natural language processing (NLP) service that uses machine learning to uncover insights and relationships in text
  can discover PII
  provides APIs for text analysis, such as:
    Custom Entity Recognition
    Custom Classification
    Key Phrase Extraction
    Sentiment Analysis
    Entity Recognition
------------
Named Entity Recognition (NER)
  a subfield of natural language processing (NLP) that focuses on identifying and classifying 
    specific data points from textual content
------------
Kendra
  document content extractor via machine learning
  highly accurate, easy enterprise search service powered by ML
  search data from manuals, research reports, FAQs, human resources (HR) documentation, and customer service guides
  search data across various systems such as S3, SharePoint, Salesforce, ServiceNow, RDS, OneDrive
------------
------------
------------
Order Of Complexity ------
--------------------------
1. Prompt Engineering
  the practice of carefully designing prompts to efficiently tap into the capabilities of FMs. 
  It involves the use of prompts, which are short pieces of text that guide the model to generate more accurate 
  and relevant responses. With prompt engineering, you can improve the performance of FMs and make 
  them more effective for a variety of applications.
------------
2. Retrieval Augmented Generation (RAG) 
  allows you to customize a model’s responses when you want the model to consider new 
    knowledge or up-to-date information. When your data changes frequently, like inventory or pricing, 
    it’s not practical to fine-tune and update the model while it’s serving user queries
------------
3. Fine-tuning
  the process of taking a pre-trained FM, such as Llama 2, and further training it on a downstream 
    task with a dataset specific to that task
------------
------------
------------
Bedrock
  can be used to fine-tune pre-trained foundation models for various tasks, including sentiment analysis
  can analyze text data to determine sentiment, making it a versatile option for advanced users 
    who may need more customizable solutions than Amazon Comprehend
  the easiest way to build and scale generative AI applications with foundation models
  provides an environment to build and scale generative AI applications with FMs
  fully managed service that offers a choice of high-performing FMs from leading AI companies, available via an API
  ----------
  Model Evaluation on Amazon Bedrock
    the process of preparing data, training models, selecting appropriate metrics, testing and analyzing results,
      ensuring fairness and bias detection, tuning performance, and continuous monitoring
  ----------
  Model Customization Methods
    Continued Pre-training, Fine-tuning
  ----------
  RAG - Retrieval Augmented Generation
    LEAST dev effort, compared to fine tuning
    uses embedding vectors
    Use cases:  customer service, legal research, healthcare questions
    gives answers with links to resources
  ----------
  Human Model Evaluation
    for assessing qualitative aspects of the model
    automatic model valuation is valuable for assessing quantitative aspects of the model
  ----------
  K-means Clustering - unsupervised
    an unsupervised machine learning algorithm used for data clustering
    groups unlabeled data points into clusters based on similarity
  ----------
  kNN - supervised
    assumes that similar items are close to each other in a feature space
    for both classification and regression tasks
    supervised learning algorithm used for classifying data points based on their proximity to labeled examples
  ----------
  Vector Databases
    OpenSerch Service - best for RAG
      built to handle search and analytics workloads
      highly effective for applications that require rapid data retrieval and relevance ranking
      provides fast search capabilities and supports full-text search, indexing, and similarity scoring
      kNN (fastest nearest neighbor) search capability
      helps store embeddings within vector databases
      If you do not have an existing vector database, Amazon Bedrock creates an OpenSearch Serverless
    --------
    DocumentDB - (Mongo enabled) NoSql
      not optimized for full-text search or similarity searches
    --------
    DynamoDB
      does not natively support advanced search capabilities or similarity scoring needed for RAG applications
    --------
    Aurora
      OLTP (Online Transaction Processing) workloads. While it provides advanced indexing features for
      relational data, it is not optimized for full-text search
    --------
    RDS: relations, open source
    --------
    Neptune: graph db
  ----------
  Sources: S3, Confluence, SharePoint, SalesForce, Web Pages
  ----------
  Knowledge Bases
    a way to add, test and use your data alongside a FM
    --------
    you provide foundation models with contextual information from your company's private data 
      for Retrieval Augmented Generation (RAG), enhancing response relevance and accuracy
    --------
    Create/manage RAG capabilities to provide FMs and agents contextual information from your company’s 
      private data sources to deliver more relevant, accurate, and customized responses
    --------
    supports popular databases for vector storage, including vector engines for:
      Amazon OpenSearch Serverless
      Pinecone
      Redis Enterprise Cloud
      Amazon Aurora
      MongoDB
  ----------
  Agents
    stepped instructional agents
    autonomously or semi-autonomously perform specific actions or tasks based on predefined rules or algorithms
    can complete complex tasks for a wide range of use cases
    deliver up-to-date answers based on proprietary knowledge sources
  ----------
  RAG refers to querying and retrieving information from a data source to augment a generated response to a prompt,
  Agent refers to an application that carries out orchestrations through cyclically interpreting inputs 
    and producing outputs by using a foundation model
------------
  Amazon Bedrock Studio
    IDE for creating apps and checking outputs
  ----------
  Pricing
    Smaller models are cheaper to use than larger models
    --------
    Reducing Number of Tokens
      most effective way to MINIMIZE COSTS associated with the use of a generative AI model on Amazon Bedrock
    --------
    On demand
      text models, embedding models, image models
      NOT for use with CUSTOM models
    --------
    Batch: discounts up to 50%, multi predictions, 
    --------
    Provisioned Throughput
      You can use a customized model only in the Provisioned Throughput mode
      designed for situations where there is a predictable, continuous workload, such as the intensive compute 
        required during the fine-tuning phase
      1-6 months
      throughput max numbers/tokens, 
      works with Base, Fine-tuned, Custom Models
    --------
    Temperature, Top K, Top P
      no impact on pricing
    --------
    Model Size
    Number of Input/Ouput Tokens
    --------
    Promt Engineering - cheap
    --------
    RAG - medium cost
      does not change weights of the FM
      external knowledge, no need to know everything
    --------
    Fine Tuning
      changes weights of the FM
      involves providing LABELED training data where each example consists of a prompt 
        (the input to the model) and a completion (the desired output)
    --------
    Instruction Based Fine Tuning
    instruction-based fine-tuning
      MUST be in "Prompt -> Response" text pairs format
      FM is tuned with instructions - computation
    --------
    Domain Adaptation Fine Tuning
      FM trained on domain-specific dataset
      involves fine-tuning the model on domain-specific data to adapt its knowledge to that particular domain
------------
Tokenization / Tokens
  converting raw text into a sequence of tokens
  tokens represent words, sub-words, or characters that the model processes as discrete units of text
  a sequence of characters that a model can interpret or predict as a single unit of meaning
  ----------
  Context Window
    the number of tokens a model can consider when generating text
    bigger window = more info & coherence
    different models have different limits
  ----------
  Embeddings
    a vector of numerical values, represents condensed information obtained by transforming input into that vector
    creates vectors out of text, images or audio
    vectors have high dimensionality, to capture many featues for one token, such as
      sematic meaning, syntactic role, sentiment
    can power search applications
    words with semantic relationshps have similar embeddings
  ----------
  Multi-modal Embedding Model
    designed to represent and align different types of data (such as text and images) in a shared embedding space
      allowing a chatbot to understand and interpret both forms of input simultaneously
------------
Guardrails
  monitor and analyze user inputs to ensure compliance with safety policies
  control interaction between users and FM's
  filter 'harmful' content
  -----
  remove PII
  can mask personally identifiable information (PII) in model responses, replacing sensitive information 
    with identifier tags like [NAME-1], [EMAIL-1],
   -----
  enhanced privacy
  reduce hallucinations
  use multiple guradrail rules to work together
  -------
  profanity, sexual, insults, 'hate', 
  -------
  grounding and relevance (tamp hallucinations)
------------
Hallucination
  responses that sound plausible and appear factual, but is incorrect or fabricated
------------
Data Drift
  when the distribution or characteristics of the input data change over time, which can cause the model’s performance to degrade
------------
Cloudwatch
  Model Invokation Logging
    history of all that happens in bedrock
    can go to S3 or CloudWatch
    log groups
  Metrics
    ContentFilteredCount
    build alarms such as guardrail breaches
------------
Model Invocation Logging
  allows for detailed logging of all requests and responses during model invocations in Amazon Bedrock
  allows direct tracking of model input and output, ideal for monitoring and auditing
------------
Prompt
  a good prompt has: Instructions, Context, Input data, Output Indicator
  Enhanced: set parameters, clear expectations
  Negative: what you dont want, helps maintain focus & clarity
  ----------
  Temperature: 0 = conservative, repetitive, focused on most likely response
               1 = high creativity
  ----------
  Top P and Top K - sampling strategies used in large language models (LLMs) 
    to control the randomness and coherence of the generated text
  ----------
  Top P (percentage)
    selects the smallest subset of tokens whose cumulative probability adds up to P
    more adaptive than Top K, allows for flexible token selection. leading to more diverse outputs 
      while maintaining fluency
    Influences the PERCENTAGE of most-likely candidates that the model considers for the next token
    --------
    0 = only consider 25% of most likely words
    1 = broad range, more creative
  ----------
  Top K
    limits number of probable words
    a smaller K makes the output more deterministic and focused
    a larger K allows for more diversity and creativity in the generated text
  ----------
  Response Length
    max answer length
  ----------
  Stop Sequences
    tokens that signal the model to stop generating output
  ----------
  Latency
    model size, model type (llama, claude) number of tokens in output
    NOT affected by all other parameter limitations
  ----------
  ----------
  ----------
  Zero-Shot Prompting
    the model is asked to perform a task or generate content without having seen examples of the task during training
    less effective for tasks that require specific writing style
    may struggle with inference
    fully rely on models knowledge
    the more capable FM, more likely for good results
  ----------
  Few Shot Prompting
    you provide examples, to help train the model on the desired style and format
    -----
    data should include user-input along with the correct user intent, 
      and provide examples of user queries and their corresponding intent
  ----------
  Single-Shot Prompting
    model is shown a single example to guide its response to a task
  ----------
  Chain of Thought Prompting
    divide task into sequence of reasoning steps leading to more structure and coherence
  ----------
  ----------
  ----------
  RAG
    combine models capability with external data sources
  ----------
  Prompt Templates: simplify and standardize the process of generating prompts
    helps process user input
    orchestrates between FM, action groups and knowledge bases
    formats and returns responses
  ----------
  Prompt Template Injections: "ignore prompt template" attack
    add explicit instructions to ignore unrelated or potential malicious content

----------
IAM Identity Center
  to control access, users only receive responses from docs they have access to
  can be configured with Google Login, Microsoft Active Directory
------------
Amazon Q
  fully managed gen-ai assistant for your employees
  a generative AI-powered assistant for accelerating software development and leveraging companies' internal data
  does NOT allow you to choose the underlying Foundation Model
  based on company knowledge and data
  has guardrails, can block specific words or topics
  can be configured to use only internal info
  ----------
  Data Connectors
    s3, rds, workdocs, aurora, slack, sharepint, salesforce, ms365
  ----------
  Plugins
    Jira, ServiceNow - allow you to interact with 3rd party apps with APIs
  ----------
  Q Apps
    auto generates web apps from instructions 
  ----------
  Q in Connect
    contact center service from AWS
    helps customer service agents provide better customer service
    uses real-time conversation with the customer along with relevant company content to 
      automatically recommend what to say or what actions an agent should take to better assist customers
  ----------
  Q Developer
    can be used in integrated development environments (IDEs) as well as the AWS Management Console
    powered by Amazon Bedrock
    answers questions about AWS docs, services, account-specific cost-related questions using natural language
    Understand and manage your cloud infrastructure on AWS
    suggest CLI commands, analyze your bill, resolve errors, troubleshooting
    AI coding companion, similar to github copilot, JS, Python, TyseScript
    integrates with IDE VSCode, VS, Jetbrains
  ----------
  Q Business
    can answer questions, provide summaries, generate content, and complete tasks based on your enterprise data
    also helps streamline tasks and accelerate problem-solving
    can perform routine actions like submitting time-off requests and sending meeting invites
    uses large language models (LLMs) and foundation models
    features document enrichment which helps control what docs and doc attributes are ingested into the index
    also uses RAG for optimizing the output of an LLM, so it references an authoritative knowledge base
      outside of its training data sources before generating a response
  ----------
  Quicksight Q
    a way to use natural language to provide visual responses to business questions in just a few seconds
    Users can ask questions in everyday language and receive precise responses with relevant visualizations
     to enhance data comprehension
------------
Quicksight
  a business intelligence (BI) service that allows users to easily create and share interactive dashboards 
    and visualizations from various data sources, enabling real-time insights and reporting
------------
PartyRock
  https://partyrock.aws
  anyone can build AI apps
------------
Transformer Models
  use a self-attention mechanism and implement contextual embeddings, MEANING it focuses on different 
    parts of the input sequence, determining the importance of each token in relation to others
  excels at processing sequential data, such as text, speech, and images
------------
Diffusion Models
  work by first corrupting data with noise through a forward diffusion process and then learning to 
    reverse this process to denoise the data
------------
GPT
  generative pre-trained transformer, generates human text based on prompts
------------
Principal Component Analysis (PCA)
   a statistical method used for reducing the dimensions of large datasets 
    to simplify data while retaining most of the variance in the data
   does not understand or differentiate the contextual meanings of words in natural language processing
------------
Word2Vec
  uses static embeddings, cannot adjust the embedding based on context
  an early embedding model that creates vector representations of words based on their co-occurrence in a given text
------------
Singular Value Decomposition (SVD)
  a matrix decomposition method used in various applications like data compression and noise reduction
  not designed to handle the dynamic, context-dependent meanings of words in sentences
------------
RNN
  recurrent neural network, speech recognition, used for video analysis
------------
CNN 
  used for single image analysis
  good for tasks like image classification to identify products in photos uploaded by users
  suited for handling the complexities of image data
------------
ResNet
  residual network, image recognition
------------
SVM
  Support vector machine
------------
WaveNet
  audio waveform for speech synthesis
------------
GAN
  generative adversarial network
  generates synthetic data, images, video, sound
------------
XGBoost
  extreme gradient boosting
------------
------------
------------
Feature Engineering
  TRANSFORMS RAW DATA into a more effective set of inputs
  Each input comprises several attributes, known as features
  -----
  TRANSFORMS data and creates variables for the model
  feature engineering significantly enhances their predictive accuracy and decision-making capability
  -----
  For Structured Data
    includes tasks like normalization
    handling missing values
    encoding categorical variables
  -----
  For Unstructured Data (such as text or images)
    tokenization (breaking down text into tokens)
    vectorization (converting text or images into numerical vectors)
    extracting features that can represent the content meaningfully
------------
Reinforcement Learning
  effective for environments where responses need to be optimized based on direct user interaction 
    & SATISFACTION
  agent interacting with an environment by taking actions and receiving rewards or penalties
  reward values attached to the different steps that the algorithm must go through
  -----
  combines computer science, neuroscience, psychology yo determine how to map situations to actions 
    to maximize numerical reward signal
  agents learn, and adjust appropriately
  -----
  Markov Decision Process
    agent has a policy as a guideline for an OPTIMAL action to take at a given state
    each state is associated with a "value function"
    circular process: agent -> action -> environment -> state, reward -> agent
  -----
  ReLU (Rectified Linear Unit)
    activation function used in deep learning neural networks
    outputs the input directly if it is positive; otherwise, it returns zero
    expressed as f(x)=max(0,x).
------------
Reinforcement Learning with Human Feedback
  Data Collection
  Supervised Fine-tuning
  build seperate reward model
  optimize the language model with reward based model
------------
Transfer Learning
  a model pre-trained on one task is adapted to improve performance on a different but related task 
    by leveraging knowledge from the original task
------------
Incremental Training
  a method that allows the chatbot to adapt over time by learning from new data
    withtout forgetting previously learned information
  allows a model to update itself with new data while retaining knowledge from old data
------------
Continued or Further Pre-Training
  uses unlabeled data to pre-train a model
  -----
  training the model on a large corpus of domain-specific data
    enhancing its ability to understand domain-specific terms, jargon, and context
------------
Fine-Tuning
  uses labeled data to train a model
  involves further training and more COSTS
  fine-tuning uses less computation power as it works on a smaller dataset (cost)
------------
Pre-Training 
  is computationally expensive (cost)
------------
Dynamic Data Masking 
  a technique used to protect sensitive data from unauthorized access by masking the data in query results 
  does not guarantee that sensitive information is entirely removed from inference responses
------------
Semi-Supervised learning
  Fraud detection
  when you apply both supervised and unsupervised learning techniques to a common problem
    use a small amount of labeled data and a large amount of unlabeled data to train systems
  After that, the partially trained algorithm labels the unlabeled data - called pseudo-labeling
  The model is then re-trained on the resulting data mix without being explicitly programmed
------------
Supervised Learning 
  models are supplied with LABELED, defined data to assess for correlations
  most appropriate when you have a large dataset of LABELED examples
  requires extensive datasets and retraining the model whenever new data is available
------------
  Techniques
  ----------
  Linear Regression
    SUPERVISED
    used to predict continuous numerical values based on independent features
  ----------
  Neural Network
    more complex SUPERVISED learning technique
    takes some given inputs and performs one or more layers of mathematical transformation 
    based on adjusting data weightings
    example of a neural network technique is predicting a digit from a handwritten image
    consist of layers of nodes (neurons) that process input data
      adjusting the weights of connections between nodes through training to recognize patterns and make predictions
------------
Self Supervised Learning
  generates the labels themselves
  models are provided with vast amounts of raw, UNLABELED data
------------
Unsupervised learning
  you give the algorithm input data WITHOUT any labeled OUTPUT data
    the model then establishes meaningful connections between the inputs and PREDETERMINED outputs
  -----
  the algorithm groups data based on similarities without predefined categories or labels
  ----
  Examples: 
    Clustering - groups certain data inputs, so they may be categorized as a whole
    Dimensionality reduction - reduces the number of features in a dataset, reduce complexity and overheads
------------
------------
------------
Confusion Matrix Metrics
  Precision, Recall, F1, Accuracy
  best way to evaluate the perfomance of a model that does classifications
  evaluate the performance of classification models by displaying the number of 
    true positives, true negatives, 
    false positives, false negatives
  provides a detailed breakdown of the model's performance across all classes, making it the most suitable 
   choice for evaluating a classification model's accuracy 
------------
AUC-ROC
  performance meteric?
------------
Math equations metrics that show the quality of regression
 MAE - mean absolute error
 MAPE - mean absolute percentage error
 RMSE - root mean squared error
 R-squared - measures variance
------------
Inferencing
  the model uses its trained parameters to generate a prediction or output 
    based on new input data provided by the user
  --------
  Real time - speed is referred over accuracy
  Batch - accuracy is more important
------------
Pases of ML Project
  Retrain, Deployment, Monitoring, Iterations
------------
Model Parameters
  values that define a model and its behavior in INTERPRETING input and GENERATING responses
  controlled and updated by providers
  internal variables of the model that are learned and adjusted during the training process
  directly influence the output of the model for a given input
  Examples include the WEIGHTS and BIASES in a neural network
------------
Hyperparameters
  values that can be adjusted for model customization to control the training process
------------
Hyperparameter Tuning
  define model structure, set before training
  optimize accuracy, overfitting, enhanced generalization
  most effective solution to avoid overfitting
  use hyperparameters for model tuning
  -----
  Automatic Model Tuning
    automatically tunes your machine learning model by running A NUMBER OF JOBS on your dataset 
    to find the best version of the model
    -----
    no settings are mandatory
  -----
  Adjustable Parameters:
  -----
  Learning Rate
    updating model weights during training
  -----
  Batch Size
    number of examples, smaller is more stable, but uses more compute time
  -----
  Number of Epochs
    how many times the model will iterate over entire training dataset
    allows the model to learn from the training data for a longer period
    potentially capturing more complex patterns and relationships
  -----
  Bias is an error introduced by approximating a real-world problem
  -----
  Variance is an error introduced by the model's sensitivity to small fluctuations in the training data
  -----
  Bias versus variance trade-off 
   balancing:
    complexity (variance)
    incorrect assumptions (bias), 
   where high bias can cause underfitting and high variance can cause overfitting

  -----
  Cost Trade-offs
    Regional coverage can affect cost due to differences in prices across regions and 
      also because data transfer between regions can lead to additional costs
  -----
  Benchmark Datasets
    pre-compiled, standardized datasets specifically designed to test for BIASES and DISCRIMINATION in model outputs
    minimizes administrative effort while providing reliable, comprehensive insights into biases or discrimination
    to assess the accuracy (even no bias realted items) of the model to ensure it meets performance requirements
  -----
  Overfitting
    high variance and low bias
    performs well on the training data but poorly on new, unseen data
    has HIGH prediction accuracy on TAINING data, LOW accuracy for NEW data
    model is overly complex and captures noise or random fluctuations in the training data
      rather than the underlying patterns
  Fix by:
  - increase training data
  - early stopping of training to adjust
  - data augmentation to increase diversity in the dataset
  - adjust hyperparameters, but do not 'add' them
  -----
  Underfitting
    low variance, high bias
    performs poorly on both the training data and new, unseen data
    cannot capture the relationship between the input and output data effectively
    model is too simple
    model will show high error rates on BOTH the training set and new data
  -----
  Decision Trees
    highly interpretable models that provide clear visualization of the decision-making process
    Decision Trees can handle multi-class classification problems
    suitable for categorizing things into multiple distinct classes
  -----
  Pruning 
    simplifies decision trees by removing branches that have little importance
  -----
  Cross-validation 
    helps ensure the model generalizes well to unseen data by dividing data into multiple training/validation sets
  -----
  the training set - is used for training the model
  the validation   - set is used for tuning hyperparameters and model selection
  the test set     - is used for evaluating the final model performance
  -----
  Validation Set (optional)
    used for tuning hyperparameters and selecting the best model during the training process
  -----
  Test Set
    used on the final trained model to assess performance on unseen data
    determines how well the model generalizes
  -----
  Regularization
    penalize complex models to reduce overfitting
    adjust balance between simple and complex model
    adds constraints that penalize complexity, encouraging the model to generalize better

------------
ML - good for approximations, NOT when you want exact answers
------------------------------------
------------------------------------
------------------------------------
SageMaker
  build, train, and deploy machine learning models
  --------
  Autopilot
    designed for automating the process of finding the best hyperparameters
  --------
  allows data scientists and developers to quickly and confidently build, train, 
    and deploy ML models into a production-ready hosted environment.
  --------
  Pre-trained models are fully customizable for your use case with your data
  --------
  Evaluate, compare, and select Foundation Models quickly based on pre-defined quality and responsibility metrics
  --------
  Clarify
    evaluates FMs for your gen AI use case with metrics such as accuracy, robustness, and toxicity 
    helps identify potential BIAS during data preparation without writing code
    compare models, explain outputs, DETECTS BIAS
  --------
  Model Monitor
    does NOT assist in model selection or CONTENT moderation
    continuously monitor machine learning models deployed in production
    detects data quality issues, concept drift, and other anomalies
    see quality of a model
    alerts you of inaccurate predictions
  --------
  Model Dashboard
    view, search, and explore all of the models in your account
    track which models are deployed for inference and if they are used in batch transform jobs or hosted on endpoints
  --------
  Model Cards
    document risk and rating of the model, as well as custom information like
      training details and metrics, evaluation results, and observations
    -----
    guidance on how each model should be used, along with an assessment of the 
      potential risks associated with its DEPLOYMENT
    -----
    call-outs such as considerations, recommendations, and custom information
    centralize and standardize model documentation so you can implement ML responsibly
    -----
    documents critical details about your machine learning (ML) models in a single place for 
      streamlined governance and reporting
    -----
    Describes how a model should be used in a production environment
  --------
  AI Service Cards
    provide transparency about AWS AI services' intended use, limitations, and potential impacts
    a form of responsible AI documentation that provides customers with a single place to find information 
      on the intended use cases and limitations
  --------
  Model Dashbord
    centralized view of all models to easily track, manage, and monitor them
    overview of deployed models and ENDPOINTS so that you can track resources and model behavior violations
  --------
  MLflow
    used to manage your iterative process of ML experimentation
    track, organize, view, analyze, and compare iterative ML experimentation
    gain comparative insights and register and deploy your best-performing models
  --------
  Data Preparation
    One of the main challenges in machine learning implementation is the difficulty in collecting 
      and preparing high-quality data for training models
  --------
  Exploratory Data Analysis (EDA)
    involves examining the data through statistical summaries and visualizations to identify patterns, 
      detect anomalies, and form hypotheses
    calculating statistics and visualizing data are fundamental to EDA, helping to uncover patterns, 
      detect outliers, and gain insights 
  --------
  Data Wrangler
    single visual interface
    explore and prep data sets
    can fix bias by balancing the dataset
        Fix bias by balancing the dataset
    a service that simplifies the process of data preparation and feature engineering
    Offers 300+ pre-configured data transformations to prepare data for ML
  --------
  Data Lineage
    tracks how data is generated, transformed, transmitted, and used across a system over time
    can ensure data privacy and compliance by tracking the flow and transformation of data
  --------
  End To End ML service
  --------
  Feature Store
    fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models
    can ingest data from a variety of sources, such as:
      application and service logs, clickstreams, sensors,
      S3, Redshift, Lake Formation, Snowflake, and Databricks Delta Lake
  --------
  Ground Truth
    helps build large, high-quality, labeled datasets for training machine learning models
    -----
    uses Reinforcement Learning from Human Feedback for model grading and labeling, 
      harnessing human input across the ML lifecycle
      to improve the accuracy and relevancy of models
    -----
    Self-service offering
      your data annotators, content creators, and prompt engineers (in-house, vendor-managed, or leveraging 
      the public crowd) can use our low-code user interface to accelerate human-in-the-loop tasks
    -----
    AWS-managed offering
      (SageMaker Ground Truth Plus), AWS handles the heavy lifting, which includes selecting and managing 
      the right workforce for your use case
  --------
  Ground Truth Plus
    fully managed data labeling service that helps deliver high-quality annotations
    uses a combination of human labelers and machine learning-assisted labeling to ensure accuracy 
      and consistency in the labels
    for businesses looking to create accurate training datasets at scale while minimizing manual errors
  --------
  Human Evaluation
    of text-related tasks using a comparitive mechanism, need the most important points on the Likert Scale
    ensure the outputs are adapted to your users
  --------
  RLHF
    a technique used to train AI models using human feedback to refine their behavior
  --------
  Augmented AI (Amazon A2I)
    can add MULTIPLE reviewers
    -----
    an AWS service that provides a human review of machine learning predictions to improve 
      model accuracy and reliability
    -----
    makes it easy to build and manage human reviews for machine learning applications
    provides built-in human review workflows for common machine learning use cases
    helps implement human review workflows for machine learning predictions
    -----
    integrates human judgment into ML workflows, allowing for reviews and corrections of model predictions
      for applications requiring high accuracy and accountability
  --------
  Jumpstart
    Pre-trained models are FULLY CUSTOMIZABLE for your use case with your data
    pre built solutions, evaluate, compare, and select models quickly 
    a machine learning hub with foundation models, built-in algorithms, and prebuilt ML solutions
    deploy with just a few clicks
    access pre-trained models, including FMs, to perform tasks like article summarization and image generation
    easily deploy them into production with the user interface or SDK
  --------
  Generative AI powered summarization chatbot
    generate concise summaries of text
    with prompt engineering, the summarization chatbot can be specifically tailored to accurately extract 
      detailed key points, entities, or legal clauses from complex legal document
  --------
  Pipelines - CI/CD for ML
  --------
  Registry - central repo to manage ML versions
  --------
  Role Manager
    access control
    baseline set of permissions for ML activities
  --------
  Governance tools provided by Amazon SageMaker
    Role Manager
    Model Cards
    Model Dashboard
  --------
  Studio - unified UI
  --------
  Real Time - Fastest
    SYNCHRONOUS
    more configuration RAM
    when low latency is essential, and responses are needed immediately
    fully managed and support autoscaling
  --------
  Asynchronous - Medium
    longer processing times
    ideal for payload sizes less than 1gb and immediate results are not critical
    delayed responses at a lower cost
    data goes into one bucket, result fed into another
    near real-time latency requirements
    request and response -> S3
  --------
  Batch - Slow
    ASYNCHRONOUS
    for large payloads
    predictions for entier data set
    also used S3
    when you can tolerate some delays in receiving responses, but need a cost-effective inference method that
     optimizes resource usage without sacrificing too much on turnaround time
  --------
  Serverless
    good for workloads with unpredictable traffic or sporadic requests
    scales automatically based on demand
    no config, one click and go
  --------
  Deployment Types
    Real-Time Inference: 6mb - fast, low latency, near instant preditions for web/mobile apps
    Serveless Inference: 4mb - fast, low latency, sporadic, short term inference, no infrastructure, 
      allowing for COLD starts
    Asynchronous Inference: 1gb - near real-time, large payloads longer processing
    Batch Transform: 100mb - high latency, concurrent, bulk processing for large datasets
  --------
  Network Isolation Mode - no internet, no S3
  DeepAR forecasting - time series data
  --------
  Canvas 
    No code solution to create machine-learning models
    Browse, import, and join data
    Gives the ability to use machine learning to generate predictions without the need to write any code
    generates new relevant features, tests hundreds of prediction models
    cleans data, automatically detecting and cleaning missing values
    enables users to create machine learning models using a visual interface
------------------------------------
------------------------------------
------------------------------------
Trusted Advisor
  advises on BEST PRACTICES, and optimize your AWS environment for cost savings on:
    service limits, perfomance, security, FAULT tolerance
---------
Audit Manager
  helps you assess internal risk with prebuilt frameworks that translate evidence from cloud services 
    into security IT audit reports
  continuously audit AWS usage to simplify how you assess risk and compliance
    with regulations and industry standards
  an essential tool for governance in AI systems
---------
Inspector
  automated security assessment service that helps improve the security and compliance of 
    applications deployed on AWS
  for use on EC2, ECR & Lambda
  does NOT track CONFIG changes
  INSTALLED agents on EC2 
  AUTOMATED continual checks for security VULNERABILITY
  CHECKS for OS vulnerabilities
  can also be integrated with CI/CD tools to monitor and improve security & compliance of web apps
------------
Config
  A view of the CHANGES in your RESOURCES associated with your AWS account
  INCLUDING how they are configured
  can generate an INVENTORY of AWS resources
------------
Artifact
  provides on-demand access to AWS’ compliance reports and online agreements
------------
Exposure
  the risk of exposing sensitive or confidential information to a model during training or inference. 
  a model can then reveal this sensitive data from their training corpus, leading to data leaks or privacy violation
-----------
Prompt Injection
  influencing the outputs by embedding specific instructions within the prompts themselves
-----------
Dynamic Prompt Engineering
  modifying the input prompts to the LLM to customize the chatbot's responses
-----------
Prompt Template 
  can guides the LLM to detect and respond appropriately to potential attack patterns
  can include predefined instructions that condition the LLM to recognize malicious patterns 
    and avoid generating unintended or harmful responses
  embedding defense mechanisms within the prompts themselves
-----------
Jailbreaking
  bypassing the built-in restrictions and safety measures of AI systems to unlock restricted 
    functionalities or generate prohibited content
-----------
Hijacking
  manipulating an AI system to serve malicious purposes or to misbehave in unintended ways
  cause model to stray from its intended prompt, often producing unexpected or undesired outputs
-----------
Risk management in the Generative AI Security Scoping Matrix
  identifying potential threats to generative AI solutions and recommending mitigations
-----------
Average Response Time
  evaluate the runtime efficiency of a model
-----------
Amazon Transcribe Medical
  an auto speech recognition (ASR) service to add medical speech-to-text capabilities to voice-enabled application
-----
Amazon Comprehend Medical
  returns useful information in unstructured clinical text such as physician's notes, discharge summaries, 
    test results, and case notes
-----------
Trainium (chips)
  for deep learning training
-----------
Inferentia (chips)
  for deep learning inference
-----------
Stable Diffusion 
  a gen AI model that produces unique photorealistic images from text and image prompts
-----------
Classification techniques
  commonly used to predict a category or class from a set of features andare well suited for 
    predicting customer churn, which is a binary outcome: churn or not churn
-----------
Multi-Class Classification
  assigns each instance to ONE of several possible classes
  (e.g., an image classified as either a cat, dog, or bird)
-----------
Multi-Label Classification
  multi-label classification assigns each instance to one or more classes
  (e.g., a document classified as both "science" and "technology")
-----------
Object Detection
  a computer vision technique that identifies instances of objects within images and videos, 
    including detecting and classifying ANIMALS
-----------
DeepRacer 
  is a Wi-Fi enabled, physical vehicle that can drive itself on a physical track
-----------
-----------
-----------
SageMaker Studio supports IDE:
  -----
  JupyterLab
    A popular notebook interface for data exploration and model building.
  -----
  Code Editor
    Based on Code-OSS (Visual Studio Code - Open Source), it offers a lightweight and powerful code editor 
      with familiar shortcuts, terminal, debugger, and refactoring tools.
  -----
  RStudio
    A fully managed IDE for R development, featuring a console, syntax-highlighting editor, and tools for plotting,
      history, debugging, and workspace management.
-----------
Anomaly Detection System 
  can analyze patterns and behaviors, such as IP address access patterns, 
  to detect any DEVIATIONS from the norm, which could indicate suspicious or malicious activity
-----------
Area Under the Curve
  one of the model performance metrics
-----------
Data Minimization
  reduces privacy risk, cost, carbon footprint
-----------



-----------
The Explainable AI Principle
  contrastive explanations is central to human understanding
  -----
  In this context, an event 'X' (the AI’s decision) needs to be contrasted with another event 'Y' 
    (as a baseline or what was expected).
  -----
  They facilitate easier debugging and optimization
-----------
When designing models for human interaction, design the model to give CONTRASTIVE explanations
-----------
Data Classification
  a concept th