Types of Prediction in Vertex AI

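Here are the main ways to get predictions from Google Vertex AI. The first three describe how predictions are delivered, and the last three describe where the model comes from, so the two groups combine: an AutoML model, for example, can serve either online or batch predictions.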





1) Online Predictions (Real-time)



Used when you need an immediate response from the model.


Best for:


  • Chatbots
  • Recommendation systems
  • Fraud detection
  • Real-time personalization



How it works:


  • You send a request to a deployed endpoint → Vertex AI returns a prediction in the same call, typically within milliseconds to a few seconds depending on the model.
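
The request/response flow above can be sketched in Python with the Vertex AI SDK (the google-cloud-aiplatform package). This is a minimal sketch, not a definitive implementation: the project, location, endpoint ID, and feature names are all placeholders, and the instance format your model expects may differ.

```python
from typing import Any


def build_instances(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop None-valued fields so each instance holds only the features supplied.

    Hypothetical helper: whether missing features should be dropped or filled
    depends on how your model was trained.
    """
    return [{k: v for k, v in row.items() if v is not None} for row in rows]


def predict_online(project: str, location: str, endpoint_id: str,
                   rows: list[dict[str, Any]]) -> list[Any]:
    """Send one synchronous request to a deployed endpoint and return its predictions."""
    # Imported lazily so build_instances can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    endpoint = aiplatform.Endpoint(endpoint_id)
    response = endpoint.predict(instances=build_instances(rows))
    return list(response.predictions)
```

Calling predict_online needs Google Cloud credentials and a model that is already deployed to the endpoint. The call blocks until the prediction comes back, which is what makes this mode suitable for interactive apps.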



Example use case:

You enter text into a chatbot, and it responds immediately.





2) Batch Predictions (Offline)



Used when you have a large dataset and don’t need instant results.


Best for:


  • Scoring large datasets
  • Monthly/weekly analytics
  • Data processing jobs



How it works:


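  • You provide the input data (e.g., a CSV or JSON Lines file in Cloud Storage, or a BigQuery table)
  • Vertex AI processes it in bulk as an asynchronous job, with no endpoint deployment needed
  • Results are written to Cloud Storage or BigQuery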
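
As a rough illustration of those steps, the sketch below prepares a JSON Lines input and starts a batch job with the Vertex AI SDK for Python. The job name, machine type, bucket paths, and record fields are assumptions made for the example, and to_jsonl is a hypothetical helper rather than part of the SDK.

```python
import json
from typing import Any, Iterable


def to_jsonl(records: Iterable[dict[str, Any]]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


def run_batch_job(project: str, location: str, model_id: str,
                  gcs_source: str, gcs_destination_prefix: str):
    """Start a batch prediction job and wait for it to finish."""
    # Imported lazily so to_jsonl can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    model = aiplatform.Model(model_id)
    return model.batch_predict(
        job_display_name="churn-scoring",        # placeholder name
        gcs_source=gcs_source,                   # e.g. a gs:// path to the JSONL file
        gcs_destination_prefix=gcs_destination_prefix,
        instances_format="jsonl",
        predictions_format="jsonl",
        machine_type="n1-standard-4",            # assumption; size it for your model
        sync=True,                               # block until the job completes
    )
```

The string returned by to_jsonl still has to be uploaded to Cloud Storage before the job can read it. That step is left out here to keep the sketch short.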



Example use case:

Predict churn probability for 1 million customers overnight.





3) Streaming Predictions



Used when predictions must be made continuously on incoming data.


Best for:


  • IoT (Internet of Things)
  • Real-time event processing
  • Live data feeds



How it works:


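  • Data flows in (e.g., from Pub/Sub)
  • A stream processor (e.g., Dataflow) sends each event or small batch to a Vertex AI online endpoint
  • Predictions come back in near real time, so this is a pattern built on online prediction rather than a separate Vertex AI mode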
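
One common way to build this is to group incoming events into small batches and send each batch to an online endpoint. The sketch below shows only that grouping logic in plain Python. The names micro_batches, score_stream, and predict_fn are hypothetical, and predict_fn stands in for whatever actually calls your model.

```python
from typing import Any, Callable, Iterable, Iterator


def micro_batches(events: Iterable[Any], size: int) -> Iterator[list[Any]]:
    """Group a (possibly endless) stream of events into lists of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: list[Any] = []
    for event in events:
        batch.append(event)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, partially filled batch
        yield batch


def score_stream(events: Iterable[Any],
                 predict_fn: Callable[[list[Any]], list[Any]],
                 size: int = 32) -> Iterator[tuple[Any, Any]]:
    """Yield (event, prediction) pairs as each micro-batch is scored."""
    for batch in micro_batches(events, size):
        yield from zip(batch, predict_fn(batch))
```

In a real pipeline the events would come from a Pub/Sub subscription or a Dataflow job, and predict_fn could wrap a call to a deployed endpoint. A larger batch size means fewer requests but a longer wait before each event gets its prediction.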



Example use case:

Predict equipment failure from live sensor data.





4) AutoML Predictions



Predictions from models trained using Vertex AI AutoML (no coding required).


Supported data types include:


  • Tabular data (formerly AutoML Tables)
  • Image data (formerly AutoML Vision)
  • Text data
  • Video data



Best for:

Business users or teams without deep ML expertise.





5) Custom Model Predictions



Predictions from models you train yourself (TensorFlow, PyTorch, scikit-learn, etc.).


Best for:


  • Advanced ML teams
  • Research or complex use cases
  • Highly customized AI models



You upload your trained model to Vertex AI, then either deploy it to an endpoint for online predictions or run batch jobs against it.
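
A minimal sketch of the upload-and-deploy path with the Vertex AI SDK for Python might look like this. DeploymentSpec is a hypothetical helper class, and the display name, bucket path, serving image, and machine type are placeholders you would replace with your own values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentSpec:
    """Hypothetical bundle of the settings needed to serve a custom model."""
    display_name: str
    artifact_uri: str   # Cloud Storage folder holding the saved model files
    serving_image: str  # container image that serves the model
    machine_type: str = "n1-standard-2"  # assumption; size it for your model

    def __post_init__(self) -> None:
        if not self.artifact_uri.startswith("gs://"):
            raise ValueError("artifact_uri must be a Cloud Storage path (gs://...)")


def upload_and_deploy(project: str, location: str, spec: DeploymentSpec):
    """Register the model in Vertex AI and deploy it to a new endpoint."""
    # Imported lazily so DeploymentSpec can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    model = aiplatform.Model.upload(
        display_name=spec.display_name,
        artifact_uri=spec.artifact_uri,
        serving_container_image_uri=spec.serving_image,
    )
    # deploy() creates an endpoint and returns it once the model is serving.
    return model.deploy(machine_type=spec.machine_type)
```

Deployment can take several minutes, and the endpoint is billed while it is running, so remember to undeploy it when you are done testing.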





6) Foundation Model Predictions (Generative AI)



Predictions from Google’s pretrained models, such as:


  • Gemini (text, chat, multimodal)
  • Imagen (image generation)
  • Codey (code generation; an older model family, with code tasks now typically handled by Gemini)



Examples:


  • Generate text
  • Summarize documents
  • Create images
  • Answer questions
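
To make the "summarize documents" example concrete, here is a small sketch that uses the Google Gen AI SDK (the google-genai package) in Vertex AI mode. The prompt wording and the build_summary_prompt helper are made up for illustration, and the model name is left as a parameter because available Gemini versions change over time.

```python
def build_summary_prompt(document: str, max_words: int = 50) -> str:
    """Build a plain-text summarization prompt (hypothetical wording)."""
    return (
        f"Summarize the following document in at most {max_words} words.\n\n"
        f"Document:\n{document.strip()}"
    )


def summarize(project: str, location: str, model_name: str, document: str) -> str:
    """Ask a Gemini model on Vertex AI for a short summary of `document`."""
    # Imported lazily so build_summary_prompt can be used without the SDK installed.
    from google import genai

    client = genai.Client(vertexai=True, project=project, location=location)
    response = client.models.generate_content(
        model=model_name,  # pass a Gemini model ID that is enabled in your project
        contents=build_summary_prompt(document),
    )
    return response.text
```

Unlike the custom model case, there is nothing to train or deploy first: you call a model that Google hosts and pay per request.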






Simple Summary Table


Prediction Type     Use Case                    Response Time
---------------     --------                    -------------
Online              Chatbots, real-time apps    Very fast
Batch               Large datasets              Slow
Streaming           Live data                   Continuous
AutoML              No-code ML                  Varies
Custom Model        Advanced ML                 Varies
Foundation Model    Generative AI               Fast
