A Comprehensive Overview of Segment Anything (2026)

Segment Anything (SAM) is Meta AI's open-source foundation model for image segmentation. Trained on 11 million images and 1.1 billion masks, SAM delivers zero-shot object segmentation using point, box, or text prompts. Explore SAM's features, pricing, alternatives, and developer API.

Last reviewed: March 2026 · By NemoVideo Editorial Team


Segment Anything Model (SAM) is a foundation model for image segmentation developed by Meta AI and released as open-source research in April 2023. SAM produces high-quality object masks from input prompts such as points, bounding boxes, or masks, and can automatically generate masks for every object in an image. It was trained on the SA-1B dataset containing 11 million images and over 1.1 billion masks, giving it strong zero-shot performance across diverse segmentation tasks.

The model comes in three sizes based on the Vision Transformer (ViT) backbone: ViT-B with 91 million parameters, ViT-L with 308 million parameters, and ViT-H with 636 million parameters. After precomputing the image embedding, SAM can generate segmentation masks in approximately 50 milliseconds, enabling real-time interactive use. The model is published under the Apache 2.0 license and is free for both personal and commercial use.

SAM has evolved through multiple versions: SAM 2 (2024) extended segmentation to video, becoming the first unified model for images and video. SAM 3 (November 2025) introduced text-prompt capabilities, allowing natural language segmentation of 270,000+ concepts with 48.5 AP on LVIS.

The interactive Segment Anything Playground at segment-anything.com lets anyone upload an image and segment objects by clicking points or drawing boxes, without any coding or installation required. For developers, the full source code and model checkpoints are available on GitHub at github.com/facebookresearch/segment-anything.

Best Segment Anything Alternatives

While SAM excels at zero-shot promptable segmentation, different use cases may benefit from specialized alternatives. Here are the top image segmentation tools and platforms to consider in 2026:

Mask2Former

A universal segmentation architecture achieving state-of-the-art results across panoptic, instance, and semantic segmentation with 57.8 PQ on COCO panoptic. Uses masked attention for 8x faster convergence than Mask R-CNN. Best for teams needing fine-tuned, production-grade segmentation models.

DeepLabV3+

A proven semantic segmentation model from Google Research, widely used for fine-tuning on domain-specific datasets. Ideal for production deployments where you have labeled training data and need pixel-level classification for specific object categories.

MobileSAM / FastSAM

Lightweight alternatives to SAM designed for edge and mobile deployment. MobileSAM compresses the ViT encoder while maintaining quality. FastSAM replaces the ViT backbone with a CNN architecture for significantly faster inference on resource-constrained devices.

YOLO + Ultralytics

The Ultralytics YOLO framework integrates SAM alongside its own instance segmentation models (such as YOLOv8-seg) for real-time detection and segmentation. Related real-time models such as OpenMMLab's RTMDet-Ins report over 300 FPS on an RTX 3090, making this class of model ideal for real-time production applications.

Encord

A full data labeling platform that integrates SAM as an annotation accelerator. Wraps zero-shot segmentation with project management, versioning, active learning, role-based access, and audit logs. Built for teams annotating thousands of images with dedicated staff.

Pricing of Segment Anything

Segment Anything is completely free and open source under the Apache 2.0 license. There are no subscription plans, credit systems, or usage fees from Meta. You can download the model weights and source code from GitHub and run it at no cost. The only expense is the GPU hardware needed to run the model locally.

  • SAM (Self-Hosted) -- Free (Apache 2.0). Full model weights and source code, commercial use allowed. Requires GPU hardware.
  • Segment Anything Playground -- Free. Interactive web demo at segment-anything.com; upload images and segment with clicks. Usage caps apply.
  • Hugging Face Inference -- Free / pay-per-use. Free API with rate limits; paid inference endpoints for production workloads.
  • AWS SageMaker JumpStart -- Pay for compute. SAM 2.1 available on SageMaker; pay for GPU instance hours (varies by instance type).
  • NemoVideo -- Free / Premium. AI-powered video editing with agentic workflow, smart captions, and auto-editing.

Professional image-to-video creation. Explore NemoVideo's plans and start free with AI-powered tools.

Does Segment Anything Have a Free Version?

Yes, Segment Anything is entirely free. Unlike most AI tools that offer limited free tiers, SAM is released as open-source software under the Apache 2.0 license. This means you can download, modify, and even redistribute the model for free, including for commercial purposes. There are no premium tiers, no feature gating, and no usage-based pricing from Meta.

There are several ways to use SAM for free:

  • Segment Anything Playground -- Visit segment-anything.com to use the interactive web demo. Upload any image and click to segment objects instantly, no account or installation required.
  • Self-hosted -- Download the model from GitHub (github.com/facebookresearch/segment-anything) and run it on your own machine. You need a GPU with sufficient VRAM (ViT-B runs on consumer GPUs, ViT-H needs more powerful hardware).
  • Hugging Face -- Use the free inference API on Hugging Face Hub with rate limits, or run the model in Google Colab notebooks at no cost.
  • Ultralytics YOLO -- SAM is integrated into the Ultralytics framework, which provides a simple Python API for running SAM alongside other vision models.

Turn your best images into video stories. NemoVideo's free plan offers AI editing, smart captions, and export tools. Get started free.

How to Use Segment Anything for Beginners

Getting started with Segment Anything is straightforward whether you prefer a no-code web interface or a Python development environment.

Option A: Use the Web Playground (No Code)

Visit segment-anything.com and use Meta's interactive Segment Anything Playground. Upload any image, then click on objects to generate segmentation masks in real time. You can add inclusion points (to select objects) or exclusion points (to refine the mask). No account, installation, or coding knowledge is required.

Option B: Python Setup for Developers

Install the SAM library directly from GitHub: pip install 'git+https://github.com/facebookresearch/segment-anything.git'. You also need PyTorch 1.7+ and TorchVision 0.8+. Download a model checkpoint (ViT-H is the most accurate at 636M parameters; ViT-B is the fastest at 91M parameters).

Step 1: Load the Model

Use sam_model_registry to load a checkpoint file, then create a SamPredictor instance. Set your input image with predictor.set_image(image), which precomputes the image embedding (this is the slow step, taking a few seconds).
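As a sketch of this step (following the official GitHub README; the checkpoint filenames below are the ones published in the repository, and the heavy SAM import is deferred so the helper can be defined without a GPU):

```python
# Published checkpoint filenames from the segment-anything GitHub releases.
CHECKPOINTS = {
    "vit_b": "sam_vit_b_01ec64.pth",   # 91M params, fastest
    "vit_l": "sam_vit_l_0b3195.pth",   # 308M params
    "vit_h": "sam_vit_h_4b8939.pth",   # 636M params, most accurate
}

def load_predictor(model_type="vit_b", device="cuda"):
    """Load a SAM checkpoint and return a ready-to-use SamPredictor."""
    # Deferred import: requires `pip install` of segment-anything plus PyTorch.
    from segment_anything import sam_model_registry, SamPredictor
    sam = sam_model_registry[model_type](checkpoint=CHECKPOINTS[model_type])
    sam.to(device)  # use "cpu" if no GPU is available (much slower)
    return SamPredictor(sam)

# Usage (requires a downloaded checkpoint in the working directory):
#   import cv2
#   predictor = load_predictor("vit_b")
#   image = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)
#   predictor.set_image(image)  # the slow step: embeds the image once
```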

Step 2: Generate Masks with Prompts

Provide point coordinates or bounding box coordinates as prompts. Call predictor.predict() to generate segmentation masks. SAM returns multiple mask candidates ranked by confidence score. For automatic segmentation of all objects, use SamAutomaticMaskGenerator instead.
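A minimal sketch of the prompt format: point prompts are (x, y) pixel coordinates paired with labels of 1 (include) or 0 (exclude). The helper below is an illustrative convenience, not part of the SAM API:

```python
import numpy as np

def make_point_prompts(points, labels):
    """Pack click coordinates and include/exclude labels into SAM's expected arrays."""
    coords = np.array(points, dtype=np.float32)  # shape (N, 2): (x, y) pixels
    lbls = np.array(labels, dtype=np.int64)      # shape (N,): 1=include, 0=exclude
    assert coords.shape[0] == lbls.shape[0]
    return coords, lbls

# Usage with a SamPredictor that already has an image set:
#   coords, lbls = make_point_prompts([(500, 375)], [1])
#   masks, scores, logits = predictor.predict(
#       point_coords=coords, point_labels=lbls, multimask_output=True)
#   best = masks[np.argmax(scores)]  # pick the highest-confidence candidate

coords, lbls = make_point_prompts([(500, 375), (120, 60)], [1, 0])
```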

Step 3: Export and Use Results

SAM outputs binary masks as NumPy arrays that you can overlay on images, save as PNGs, or feed into downstream pipelines. The lightweight mask decoder can also be exported to ONNX format for deployment in browsers or edge devices using ONNX Runtime.
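Since each mask is a boolean (H, W) NumPy array, a small helper can turn it into an RGBA overlay for compositing or saving; the color and alpha values below are arbitrary choices, not anything SAM prescribes:

```python
import numpy as np

def mask_to_rgba(mask, color=(30, 144, 255), alpha=153):
    """Turn a boolean (H, W) mask into an (H, W, 4) uint8 RGBA overlay."""
    h, w = mask.shape
    rgba = np.zeros((h, w, 4), dtype=np.uint8)
    rgba[mask] = (*color, alpha)  # colored, semi-transparent where mask is True
    return rgba

# Usage with Pillow: Image.fromarray(mask_to_rgba(best_mask)).save("mask.png")
demo = mask_to_rgba(np.array([[True, False], [False, True]]))
```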

Turn your edited images into videos. Just tell NemoVideo's AI Agent what you want and it produces a polished video automatically.

Best Image Editing Tools in 2026

The image segmentation and editing landscape in 2026 offers powerful tools for creators and developers at every level. Here are the standout platforms to consider:

  • Segment Anything (SAM 3) -- Meta AI's latest foundation model with text-prompt segmentation, handling 270,000+ concepts in zero-shot settings. Free and open source under Apache 2.0.
  • NemoVideo -- AI-powered agentic video editing with chat-based workflow, perfect for turning segmented images and raw footage into professional video content.
  • Mask2Former -- Universal segmentation architecture achieving 57.8 PQ on COCO panoptic. Best for fine-tuned production models across panoptic, instance, and semantic tasks.
  • Ultralytics YOLO -- Integrates SAM with real-time object detection and its own instance segmentation models for production deployments where speed matters.
  • Encord -- Enterprise annotation platform with SAM integration, project management, versioning, and active learning for teams labeling at scale.
  • MobileSAM -- Compressed SAM variant optimized for mobile and edge deployment where GPU resources are limited.
  • nnU-Net -- Specialized for medical image segmentation. Won 9 out of 10 MICCAI 2020 challenges with automatic configuration and no manual tuning required.

Does Segment Anything Have an API?

Segment Anything provides a Python library rather than a hosted REST API. You install it from the official GitHub repository and run inference locally or on your own servers. The key classes for developers are:

  • SamPredictor -- For prompted segmentation using point coordinates, bounding boxes, or existing masks as input. Set an image once, then generate masks interactively with different prompts.
  • SamAutomaticMaskGenerator -- For automatic segmentation of all objects in an image without any prompts. Returns a list of masks sorted by confidence score.
  • sam_model_registry -- For loading model checkpoints (ViT-B, ViT-L, or ViT-H) from downloaded weight files.
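As a sketch of the automatic workflow: SamAutomaticMaskGenerator returns a list of dicts with keys such as 'segmentation' (the mask) and 'predicted_iou' (a quality score). The sorting helper below is an illustrative addition, not part of the library:

```python
def top_masks(masks, k=5):
    """Return the k highest-quality masks by predicted IoU score."""
    return sorted(masks, key=lambda m: m["predicted_iou"], reverse=True)[:k]

# Usage (requires a loaded SAM model and an RGB image as a NumPy array):
#   from segment_anything import SamAutomaticMaskGenerator
#   generator = SamAutomaticMaskGenerator(sam)
#   masks = generator.generate(image)   # one dict per detected object
#   best = top_masks(masks, k=3)

# Tiny stand-in data to show the sorting behavior:
fake = [{"predicted_iou": 0.7}, {"predicted_iou": 0.95}, {"predicted_iou": 0.8}]
```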

The mask decoder component can be exported to ONNX format, allowing deployment in any environment that supports ONNX Runtime, including web browsers via JavaScript. This enables building interactive segmentation tools that run entirely client-side.

For hosted API access without managing infrastructure, third-party services offer SAM endpoints: Hugging Face provides a free inference API with rate limits, Segmind offers pay-per-use API pricing, and AWS SageMaker JumpStart supports SAM 2.1 deployment with managed GPU instances. The Ultralytics Python package also wraps SAM in a simplified API alongside YOLO models.

Frequently Asked Questions

Is Segment Anything free?

Yes, Segment Anything is completely free and open source. It is released under the Apache 2.0 license, which allows free use for both personal and commercial purposes. You can download the model weights and source code from GitHub at github.com/facebookresearch/segment-anything, or use the free interactive demo at segment-anything.com. There are no subscription fees, credit systems, or premium tiers.

What are the best alternatives to Segment Anything?

Top alternatives include Mask2Former for state-of-the-art panoptic segmentation (57.8 PQ on COCO), DeepLabV3+ for fine-tuned semantic segmentation, MobileSAM and FastSAM for lightweight edge deployment, and YOLO with Ultralytics for real-time detection and segmentation. For annotation workflows, Encord and V7 Labs integrate SAM into full labeling platforms. For turning segmented images into videos, NemoVideo offers AI-powered agentic video editing.

How do I get started with Segment Anything?

The easiest way is to visit the Segment Anything Playground at segment-anything.com, upload an image, and click on objects to segment them -- no account or installation needed. For developers, install the Python library via pip from GitHub, download a model checkpoint (ViT-B for speed or ViT-H for accuracy), and use the SamPredictor class to generate masks from point or box prompts. SAM requires Python 3.8+, PyTorch 1.7+, and a GPU for efficient inference.

Does Segment Anything have an API?

SAM provides a Python library (not a hosted REST API) with key classes like SamPredictor for prompted segmentation and SamAutomaticMaskGenerator for automatic segmentation. The mask decoder can be exported to ONNX format for browser or edge deployment. For hosted API access, third-party services like Hugging Face (free with rate limits), Segmind (pay-per-use), and AWS SageMaker JumpStart (managed GPU instances) offer SAM endpoints.

Is there a free demo I can try in the browser?

Yes, Meta offers a free SAM 3 Playground where you can upload images or videos and try the model directly in your browser. You can click on objects or type text prompts to segment them, with no coding or installation required.

What is the difference between SAM, SAM 2, and SAM 3?

SAM (original) handles image segmentation with point/box prompts. SAM 2 extended this to video with object tracking. SAM 3, released in November 2025, understands open-vocabulary text concepts and can segment all matching instances simultaneously using simple descriptions like 'yellow school bus,' eliminating the need for manual single-instance prompts.

Can SAM run in real time on limited hardware?

The original SAM model can be computationally heavy, but efficient variants have been developed. FastSAM achieves comparable performance at drastically reduced computational cost, while MobileSAM offers a 5x speedup over FastSAM and a 7x smaller model size, making near-real-time segmentation feasible.

How well does SAM handle specialized domains?

SAM demonstrates exceptional zero-shot generalization across multiple domains, but inherent complexities within domain-specific datasets present challenges in boundary refinement and precision. For specialized applications like medical imaging, fine-tuning or domain-specific adaptations of SAM are typically recommended for optimal results.
Create stunning videos with NemoVideo AI Agent, no editing skills needed. Try NemoVideo free.