Segment Anything (SAM) is Meta AI's open-source foundation model for image segmentation. Trained on 11 million images and 1.1 billion masks, SAM delivers zero-shot object segmentation using point, box, or mask prompts (with text prompts added in SAM 3). Explore SAM's features, pricing, alternatives, and developer API.
Segment Anything Model (SAM) is a foundation model for image segmentation developed by Meta AI and released as open-source research in April 2023. SAM produces high-quality object masks from input prompts such as points, bounding boxes, or masks, and can automatically generate masks for every object in an image. It was trained on the SA-1B dataset containing 11 million images and over 1.1 billion masks, giving it strong zero-shot performance across diverse segmentation tasks.
The model comes in three sizes based on the Vision Transformer (ViT) backbone: ViT-B with 91 million parameters, ViT-L with 308 million parameters, and ViT-H with 636 million parameters. After precomputing the image embedding, SAM can generate segmentation masks in approximately 50 milliseconds, enabling real-time interactive use. The model is published under the Apache 2.0 license and is free for both personal and commercial use.
SAM has evolved through multiple versions: SAM 2 (2024) extended segmentation to video, becoming the first unified model for images and video. SAM 3 (November 2025) introduced text-prompt capabilities, allowing natural language segmentation of 270,000+ concepts with 48.5 AP on LVIS.
The interactive Segment Anything Playground at segment-anything.com lets anyone upload an image and segment objects by clicking points or drawing boxes, without any coding or installation required. For developers, the full source code and model checkpoints are available on GitHub at github.com/facebookresearch/segment-anything.
While SAM excels at zero-shot promptable segmentation, different use cases may benefit from specialized alternatives. Here are the top image segmentation tools and platforms to consider in 2026:
NemoVideo is an AI-powered video editing platform that can turn your segmented images and artwork into engaging video content. With its agentic workflow, describe what you want and NemoVideo handles editing, transitions, and effects automatically. Start free today.
Mask2Former is a universal segmentation architecture achieving state-of-the-art results across panoptic, instance, and semantic segmentation, with 57.8 PQ on COCO panoptic. Its masked-attention design yields 8x faster convergence than Mask R-CNN. Best for teams needing fine-tuned, production-grade segmentation models.
DeepLab is a proven semantic segmentation model family from Google Research, widely used for fine-tuning on domain-specific datasets. It is ideal for production deployments where you have labeled training data and need pixel-level classification for specific object categories.
Lightweight alternatives to SAM designed for edge and mobile deployment. MobileSAM compresses the ViT encoder while maintaining quality. FastSAM replaces the ViT backbone with a CNN architecture for significantly faster inference on resource-constrained devices.
The Ultralytics YOLO framework integrates SAM alongside its own instance segmentation models for real-time detection and segmentation. RTMDet-Ins achieves 52.8% AP at over 300 FPS on RTX 3090, making it ideal for real-time production applications.
A full data labeling platform that integrates SAM as an annotation accelerator. Wraps zero-shot segmentation with project management, versioning, active learning, role-based access, and audit logs. Built for teams annotating thousands of images with dedicated staff.
Segment Anything is completely free and open source under the Apache 2.0 license. There are no subscription plans, credit systems, or usage fees from Meta. You can download the model weights and source code from GitHub and run it at no cost. The only expense is the GPU hardware needed to run the model locally.
| Option | Price | Details |
|---|---|---|
| SAM (Self-Hosted) | Free (Apache 2.0) | Full model weights, source code, commercial use allowed. Requires GPU hardware. |
| Segment Anything Playground | Free | Interactive web demo at segment-anything.com. Upload images and segment with clicks. Usage caps apply. |
| Hugging Face Inference | Free / Pay-per-use | Free API with rate limits. Paid inference endpoints for production workloads. |
| AWS SageMaker JumpStart | Pay for compute | SAM 2.1 available on SageMaker. Pay for GPU instance hours (varies by instance type). |
| NemoVideo | Free / Premium | AI-powered video editing with agentic workflow, smart captions, and auto-editing. |
Professional image-to-video creation. Explore NemoVideo's plans and start free with AI-powered tools.
Yes, Segment Anything is entirely free. Unlike most AI tools that offer limited free tiers, SAM is released as open-source software under the Apache 2.0 license. This means you can download, modify, and even redistribute the model for free, including for commercial purposes. There are no premium tiers, no feature gating, and no usage-based pricing from Meta.
There are several ways to use SAM for free: the Segment Anything Playground web demo at segment-anything.com, self-hosting the open-source model weights and code from GitHub, and the rate-limited Hugging Face inference API.
Turn your best images into video stories. NemoVideo's free plan offers AI editing, smart captions, and export tools. Get started free.
Getting started with Segment Anything is straightforward whether you prefer a no-code web interface or a Python development environment.
Visit segment-anything.com and use Meta's interactive Segment Anything Playground. Upload any image, then click on objects to generate segmentation masks in real time. You can add inclusion points (to select objects) or exclusion points (to refine the mask). No account, installation, or coding knowledge is required.
Install the SAM library directly from GitHub: `pip install 'git+https://github.com/facebookresearch/segment-anything.git'`. You also need PyTorch 1.7+ and TorchVision 0.8+. Download a model checkpoint (ViT-H is the most accurate at 636M parameters; ViT-B is the fastest at 91M parameters).
Use `sam_model_registry` to load a checkpoint file, then create a `SamPredictor` instance. Set your input image with `predictor.set_image(image)`, which precomputes the image embedding (this is the slow step, taking a few seconds).
Provide point coordinates or bounding box coordinates as prompts, then call `predictor.predict()` to generate segmentation masks. SAM returns multiple mask candidates ranked by confidence score. For automatic segmentation of all objects, use `SamAutomaticMaskGenerator` instead.
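The steps above can be sketched in Python. The checkpoint path, image file, and click coordinates below are placeholder assumptions, and the heavy SAM/OpenCV imports are deferred inside the function so the small mask-picking helper stands on its own:

```python
import numpy as np

def best_mask(masks: np.ndarray, scores: np.ndarray) -> np.ndarray:
    """Return the highest-confidence candidate from SAM's ranked masks."""
    return masks[int(np.argmax(scores))]

def segment_with_click(image_path: str, checkpoint_path: str, point: tuple) -> np.ndarray:
    """Segment the object under a single foreground click.

    Requires segment-anything, PyTorch, and a downloaded checkpoint;
    the paths passed in are placeholders, not bundled files.
    """
    import cv2
    from segment_anything import sam_model_registry, SamPredictor

    sam = sam_model_registry["vit_b"](checkpoint=checkpoint_path)
    predictor = SamPredictor(sam)

    # SAM expects an RGB uint8 image; OpenCV loads BGR, so convert.
    image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
    predictor.set_image(image)  # slow step: precomputes the image embedding

    masks, scores, _ = predictor.predict(
        point_coords=np.array([point]),  # one (x, y) inclusion point
        point_labels=np.array([1]),      # 1 = include, 0 = exclude
        multimask_output=True,           # return several ranked candidates
    )
    return best_mask(masks, scores)      # boolean array of shape (H, W)

# Example call (after downloading a ViT-B checkpoint):
# mask = segment_with_click("photo.jpg", "sam_vit_b.pth", (500, 375))
```

For the automatic mode mentioned above, `SamAutomaticMaskGenerator(sam).generate(image)` returns a list of mask records for every object instead of requiring a prompt.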
SAM outputs binary masks as NumPy arrays that you can overlay on images, save as PNGs, or feed into downstream pipelines. The lightweight mask decoder can also be exported to ONNX format for deployment in browsers or edge devices using ONNX Runtime.
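As one illustration of that post-processing, a hypothetical `overlay_mask` helper (the name, color, and blend factor are my own choices, not part of the SAM API) can alpha-blend a binary mask onto an RGB frame with plain NumPy:

```python
import numpy as np

def overlay_mask(image: np.ndarray, mask: np.ndarray,
                 color=(30, 144, 255), alpha=0.5) -> np.ndarray:
    """Alpha-blend a boolean (H, W) mask onto an (H, W, 3) uint8 RGB image."""
    out = image.astype(np.float32).copy()
    rgb = np.asarray(color, dtype=np.float32)
    # Blend only the pixels selected by the mask; the rest stay untouched.
    out[mask] = (1.0 - alpha) * out[mask] + alpha * rgb
    return out.astype(np.uint8)
```

The blended array can then be written out as a PNG with any image library, for example Pillow's `Image.fromarray(result).save("overlay.png")`.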
Turn your edited images into videos. Just tell NemoVideo's AI Agent what you want and it produces a polished video automatically.
Segment Anything provides a Python library rather than a hosted REST API. You install it from the official GitHub repository and run inference locally or on your own servers. The key classes for developers are `sam_model_registry` (loads a model checkpoint), `SamPredictor` (prompt-based segmentation), and `SamAutomaticMaskGenerator` (segments every object in an image).
The mask decoder component can be exported to ONNX format, allowing deployment in any environment that supports ONNX Runtime, including web browsers via JavaScript. This enables building interactive segmentation tools that run entirely client-side.
For hosted API access without managing infrastructure, third-party services offer SAM endpoints: Hugging Face provides a free inference API with rate limits, Segmind offers pay-per-use API pricing, and AWS SageMaker JumpStart supports SAM 2.1 deployment with managed GPU instances. The Ultralytics Python package also wraps SAM in a simplified API alongside YOLO models.