How to Implement Image Classification in SwiftUI
Create a VNClassifyImageRequest, hand it a VNImageRequestHandler built from a CGImage, and read back an array of VNClassificationObservation results — all on-device with no network required. Wrap the synchronous Vision call in a Swift concurrency continuation to keep SwiftUI's @Observable view model clean.
import UIKit
import Vision

func classifyImage(_ uiImage: UIImage) async throws -> [VNClassificationObservation] {
    guard let cgImage = uiImage.cgImage else { throw ClassifierError.invalidImage }
    return try await withCheckedThrowingContinuation { continuation in
        let request = VNClassifyImageRequest { req, error in
            if let error { continuation.resume(throwing: error); return }
            let results = (req.results as? [VNClassificationObservation]) ?? []
            continuation.resume(returning: results.filter { $0.confidence > 0.01 })
        }
        do {
            try VNImageRequestHandler(cgImage: cgImage).perform([request])
        } catch {
            continuation.resume(throwing: error)
        }
    }
}
Full implementation
The implementation below uses an @Observable view model to drive state, PhotosPicker for image input, and Vision's built-in VNClassifyImageRequest — which uses a bundled CoreML model so you don't need to ship your own .mlmodel for general scene classification. Results are shown as a confidence bar list, color-coded by certainty, with full VoiceOver support.
import SwiftUI
import Vision
import PhotosUI

// MARK: - Errors

enum ClassifierError: LocalizedError {
    case invalidImage

    var errorDescription: String? {
        "Could not extract CGImage from the selected photo."
    }
}
// MARK: - ViewModel

// UI state lives on the main actor so SwiftUI always reads and writes it from the main thread.
@MainActor
@Observable
final class ImageClassifierViewModel {
    var selectedPhoto: PhotosPickerItem?
    var displayImage: UIImage?
    var classifications: [VNClassificationObservation] = []
    var isClassifying = false
    var errorMessage: String?

    func loadAndClassify() async {
        guard let item = selectedPhoto else { return }
        isClassifying = true
        errorMessage = nil
        defer { isClassifying = false }
        do {
            guard let data = try await item.loadTransferable(type: Data.self),
                  let uiImage = UIImage(data: data) else {
                errorMessage = "Failed to load image data."
                return
            }
            displayImage = uiImage
            classifications = try await classify(uiImage)
        } catch {
            errorMessage = error.localizedDescription
        }
    }
    private func classify(_ image: UIImage) async throws -> [VNClassificationObservation] {
        guard let cgImage = image.cgImage else { throw ClassifierError.invalidImage }
        return try await withCheckedThrowingContinuation { continuation in
            let request = VNClassifyImageRequest { req, error in
                if let error {
                    continuation.resume(throwing: error)
                    return
                }
                let all = (req.results as? [VNClassificationObservation]) ?? []
                // Keep only meaningful results; Vision returns hundreds of low-confidence labels
                let top = Array(all.filter { $0.confidence > 0.01 }.prefix(10))
                continuation.resume(returning: top)
            }
            // perform(_:) blocks until classification finishes, so run it off the main actor.
            DispatchQueue.global(qos: .userInitiated).async {
                do {
                    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
                    try handler.perform([request])
                } catch {
                    continuation.resume(throwing: error)
                }
            }
        }
    }
}
// MARK: - Main View

struct ImageClassifierView: View {
    @State private var model = ImageClassifierViewModel()

    var body: some View {
        NavigationStack {
            ScrollView {
                VStack(spacing: 24) {
                    imagePreviewSection
                    if !model.classifications.isEmpty {
                        resultsSection
                    }
                    if let err = model.errorMessage {
                        Text(err)
                            .foregroundStyle(.red)
                            .font(.caption)
                    }
                }
                .padding()
            }
            .navigationTitle("Image Classifier")
            .toolbar {
                ToolbarItem(placement: .primaryAction) {
                    PhotosPicker(
                        selection: $model.selectedPhoto,
                        matching: .images,
                        photoLibrary: .shared()
                    ) {
                        Label("Pick Photo", systemImage: "photo.badge.plus")
                    }
                    .accessibilityLabel("Pick a photo to classify")
                }
            }
            .onChange(of: model.selectedPhoto) { _, _ in
                Task { await model.loadAndClassify() }
            }
        }
    }
    // MARK: Subviews

    @ViewBuilder
    private var imagePreviewSection: some View {
        if let image = model.displayImage {
            Image(uiImage: image)
                .resizable()
                .scaledToFit()
                .clipShape(RoundedRectangle(cornerRadius: 16))
                .overlay(alignment: .bottom) {
                    if model.isClassifying {
                        ProgressView("Classifying…")
                            .padding(8)
                            .background(.ultraThinMaterial, in: Capsule())
                            .padding(.bottom, 12)
                    }
                }
                .accessibilityLabel("Selected photo for classification")
        } else {
            ContentUnavailableView(
                "No Image Selected",
                systemImage: "photo.on.rectangle.angled",
                description: Text("Tap the button above to pick a photo from your library.")
            )
            .frame(minHeight: 240)
        }
    }

    private var resultsSection: some View {
        VStack(alignment: .leading, spacing: 12) {
            Text("Classification Results")
                .font(.headline)
                .accessibilityAddTraits(.isHeader)
            ForEach(model.classifications, id: \.identifier) { obs in
                ClassificationRow(observation: obs)
            }
        }
        .frame(maxWidth: .infinity, alignment: .leading)
    }
}
// MARK: - Classification Row

struct ClassificationRow: View {
    let observation: VNClassificationObservation

    private var confidenceColor: Color {
        switch observation.confidence {
        case 0.7...: .green
        case 0.4...: .orange
        default: .red
        }
    }

    var body: some View {
        VStack(alignment: .leading, spacing: 6) {
            HStack {
                Text(observation.identifier
                    .replacingOccurrences(of: "_", with: " ")
                    .capitalized)
                    .font(.subheadline.weight(.medium))
                Spacer()
                // Convert VNConfidence (Float) to Double so the .percent format style applies.
                Text(Double(observation.confidence),
                     format: .percent.precision(.fractionLength(1)))
                    .font(.subheadline.monospacedDigit())
                    .foregroundStyle(.secondary)
            }
            ProgressView(value: Double(observation.confidence))
                .tint(confidenceColor)
        }
        .accessibilityElement(children: .combine)
        .accessibilityLabel(
            "\(observation.identifier.replacingOccurrences(of: "_", with: " ").capitalized), \(Int(observation.confidence * 100)) percent confidence"
        )
    }
}
// MARK: - Preview
#Preview {
    ImageClassifierView()
}
How it works
1. PhotosPicker + async data transfer. PhotosPicker(selection: $model.selectedPhoto, matching: .images) returns a PhotosPickerItem. The .onChange modifier triggers loadTransferable(type: Data.self), which fetches the full-resolution image data from the Photos library asynchronously without blocking the UI thread.
2. CGImage extraction. Vision operates on CGImage, not UIImage. The guard in classify(_:) converts and throws ClassifierError.invalidImage for edge cases such as animated GIFs that have no direct CGImage backing. Note that a bare CGImage also drops the photo's EXIF orientation; see the sketch after this list for passing it to Vision explicitly.
3. Bridging Vision's callback to Swift concurrency. VNImageRequestHandler.perform(_:) is synchronous and calls its completion handler on the same thread. Wrapping it in withCheckedThrowingContinuation bridges that callback to an async throws function so the caller can await it cleanly inside the view model's Task, and dispatching the blocking perform call to a background queue inside the continuation keeps the main actor free.
4. Filtering results. VNClassifyImageRequest can return hundreds of VNClassificationObservation values (the model knows thousands of ImageNet labels). The .filter { $0.confidence > 0.01 }.prefix(10) pipeline keeps only the top 10 meaningful predictions.
5. Accessible confidence bars. ClassificationRow combines its child elements into a single VoiceOver unit via .accessibilityElement(children: .combine) and provides a natural-language label like "Cat, 92 percent confidence" so screen reader users get the full picture without hearing raw numeric identifiers.
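One detail step 2 glosses over: a CGImage carries no EXIF orientation, so portrait photos can reach the classifier sideways. Below is a minimal sketch of forwarding the orientation. The CGImagePropertyOrientation initializer is a mapping you add yourself, and makeOrientedHandler is an illustrative helper name, not part of the implementation above.

import ImageIO
import UIKit
import Vision

// Map UIKit's orientation cases onto the EXIF-style values Vision expects.
extension CGImagePropertyOrientation {
    init(_ uiOrientation: UIImage.Orientation) {
        switch uiOrientation {
        case .up: self = .up
        case .upMirrored: self = .upMirrored
        case .down: self = .down
        case .downMirrored: self = .downMirrored
        case .left: self = .left
        case .leftMirrored: self = .leftMirrored
        case .right: self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default: self = .up
        }
    }
}

// Hand Vision the orientation alongside the pixels so it classifies the photo
// the way the user sees it, not the way the sensor stored it.
func makeOrientedHandler(for uiImage: UIImage, cgImage: CGImage) -> VNImageRequestHandler {
    VNImageRequestHandler(
        cgImage: cgImage,
        orientation: CGImagePropertyOrientation(uiImage.imageOrientation),
        options: [:]
    )
}

In classify(_:), you would build the handler this way (or call the equivalent initializer inline) instead of VNImageRequestHandler(cgImage:options:).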
Variants
Using a custom CoreML model (.mlmodel)
Replace the built-in VNClassifyImageRequest with a VNCoreMLRequest backed by your own model. Drag the .mlmodel into Xcode — it auto-generates a Swift class.
import CoreML
import Vision

// Xcode auto-generates MobileNetV2 from MobileNetV2.mlmodel
func classifyWithCustomModel(_ cgImage: CGImage) async throws -> [VNClassificationObservation] {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine // use ANE for speed
    let mlModel = try MobileNetV2(configuration: config).model
    let vnModel = try VNCoreMLModel(for: mlModel)
    return try await withCheckedThrowingContinuation { continuation in
        let request = VNCoreMLRequest(model: vnModel) { req, error in
            if let error { continuation.resume(throwing: error); return }
            let results = (req.results as? [VNClassificationObservation]) ?? []
            continuation.resume(returning: Array(results.prefix(5)))
        }
        request.imageCropAndScaleOption = .centerCrop // matches model training
        do {
            try VNImageRequestHandler(cgImage: cgImage).perform([request])
        } catch {
            continuation.resume(throwing: error)
        }
    }
}
Classifying live camera frames
For real-time classification, feed CMSampleBuffer frames from AVCaptureSession into VNImageRequestHandler(cmSampleBuffer:) inside the AVCaptureVideoDataOutputSampleBufferDelegate callback. Throttle with a timestamp gate (classify at most ~10 fps, for example) to limit thermal load and keep the UI responsive. Use VNSequenceRequestHandler if you need temporal context across frames.
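A rough sketch of that delegate callback follows. FrameClassifier and handleResults are hypothetical names; it assumes you have already configured an AVCaptureSession with an AVCaptureVideoDataOutput whose sample buffer delegate is this object.

import AVFoundation
import Vision

final class FrameClassifier: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private var lastClassification = CMTime.zero
    private let minimumInterval = CMTime(value: 1, timescale: 10) // at most ~10 fps
    var handleResults: (([VNClassificationObservation]) -> Void)? // hypothetical callback into the UI

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Timestamp gate: skip frames that arrive too soon after the last one we classified.
        let timestamp = sampleBuffer.presentationTimeStamp
        guard timestamp - lastClassification >= minimumInterval else { return }
        lastClassification = timestamp

        let request = VNClassifyImageRequest { [weak self] req, _ in
            let results = (req.results as? [VNClassificationObservation]) ?? []
            DispatchQueue.main.async {
                self?.handleResults?(Array(results.filter { $0.confidence > 0.01 }.prefix(5)))
            }
        }
        // .right is typical for portrait back-camera video; adjust for your session setup.
        let handler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer,
                                            orientation: .right,
                                            options: [:])
        try? handler.perform([request])
    }
}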
Common pitfalls
⚠ iOS version floor for VNClassifyImageRequest. VNClassifyImageRequest was introduced in iOS 13, and Apple has updated the underlying model in later OS releases, so targeting iOS 17+ gets you the best built-in classifier. Always test on a physical device: the Simulator does not use a Neural Engine (Core ML falls back to CPU there), so timings and even results can differ from the device.
⚠ Calling perform(_:) on the main actor. VNImageRequestHandler.perform is blocking and can take 50–300 ms on older hardware. Never call it directly on the main thread; run it off the main actor, for example by dispatching to a background queue inside the continuation as the view model above does, to avoid freezing the UI.
⚠ Ignoring imageCropAndScaleOption for custom models. Vision letter-boxes images by default, but most CoreML classifiers were trained with center-crop preprocessing. Mismatching this option silently degrades accuracy, so set request.imageCropAndScaleOption = .centerCrop and verify it against your model's training pipeline.
⚠ Not filtering the 1,000-label result list. Without a confidence threshold, your UI receives hundreds of observations, including near-zero hits like "ear" at 0.0001. Filter by confidence > 0.01 and cap results with .prefix(N) before binding to a ForEach; a 1,000-row list will stutter even on Pro hardware.
Prompt this with Claude Code
When using Soarias or Claude Code directly to implement this:
Implement image classification in SwiftUI for iOS 17+. Use CoreML/Vision: VNClassifyImageRequest, VNImageRequestHandler, VNCoreMLRequest, and PhotosPicker for image selection. Wrap the synchronous Vision call in withCheckedThrowingContinuation. Show top-10 results as confidence bars with color coding. Make it accessible (VoiceOver labels on each result row). Add a #Preview with a realistic sample image from SF Symbols or Assets.
In Soarias's Build phase, paste this prompt into the feature scaffold step so Claude Code generates the full view, view model, and unit test stubs for the classification pipeline before you wire in your custom .mlmodel.
FAQ
Does this work on iOS 16?
VNClassifyImageRequest itself is available back to iOS 13, so the classification logic compiles fine on iOS 16. However, this guide uses @Observable (iOS 17+) and ContentUnavailableView (iOS 17+). To support iOS 16, replace @Observable with a class that conforms to ObservableObject and publishes its state with @Published (owned by the view via @StateObject), and swap ContentUnavailableView for a plain VStack placeholder, as sketched below.
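A minimal sketch of the iOS 16 fallback. LegacyImageClassifierViewModel is an illustrative name; the property names mirror the iOS 17 view model above, and the loading and classification logic is unchanged.

import SwiftUI
import PhotosUI
import Vision

// iOS 16 fallback: conform to ObservableObject and mark state @Published
// instead of using the iOS 17 @Observable macro.
@MainActor
final class LegacyImageClassifierViewModel: ObservableObject {
    @Published var selectedPhoto: PhotosPickerItem?
    @Published var displayImage: UIImage?
    @Published var classifications: [VNClassificationObservation] = []
    @Published var isClassifying = false
    @Published var errorMessage: String?

    // loadAndClassify() and classify(_:) can be copied unchanged from the
    // iOS 17 view model above.
}

// In the view, own the model with @StateObject rather than @State:
//   @StateObject private var model = LegacyImageClassifierViewModel()
// and replace ContentUnavailableView with a plain VStack { Image(systemName:); Text(...) }.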
How accurate is VNClassifyImageRequest versus a fine-tuned CoreML model?
Apple's built-in classifier is a general-purpose ImageNet model — excellent for everyday objects, animals, and scenes, but it won't know your app-specific categories (e.g., specific product SKUs, medical images, or rare bird species). For domain-specific classification, train a custom model in Create ML, export it as .mlmodel, and swap to VNCoreMLRequest as shown in the Variants section above.
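For reference, training such a model is only a few lines of Create ML on macOS (a playground or command-line tool). The folder paths and the FlowerClassifier name below are hypothetical; each subdirectory of the training folder becomes one class label.

import CreateML
import Foundation

// Each subdirectory of trainingURL holds the example images for one label.
let trainingURL = URL(fileURLWithPath: "/path/to/TrainingImages") // hypothetical path
let classifier = try MLImageClassifier(trainingData: .labeledDirectories(at: trainingURL))

// Inspect validation accuracy before shipping the model.
print(classifier.validationMetrics)

// Export the trained model; drag the resulting .mlmodel into your Xcode project.
try classifier.write(to: URL(fileURLWithPath: "/path/to/FlowerClassifier.mlmodel"))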
What is the UIKit equivalent?
In UIKit you'd use the same Vision API — there's no UIKit-specific alternative. The only difference is lifecycle management: you'd trigger classification from imagePickerController(_:didFinishPickingMediaWithInfo:) (or a PHPickerViewController delegate callback) and dispatch back to the main queue with DispatchQueue.main.async to update UILabel / UIProgressView outlets. SwiftUI's structured concurrency makes the async bridging considerably cleaner.
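A rough UIKit sketch of that flow, assuming the classifyImage(_:) helper from the top of this guide; ClassificationViewController and updateUI(with:) are hypothetical names standing in for your own view controller and outlet-updating code.

import UIKit
import PhotosUI
import Vision

final class ClassificationViewController: UIViewController, PHPickerViewControllerDelegate {
    func presentPicker() {
        var config = PHPickerConfiguration(photoLibrary: .shared())
        config.filter = .images
        let picker = PHPickerViewController(configuration: config)
        picker.delegate = self
        present(picker, animated: true)
    }

    func picker(_ picker: PHPickerViewController, didFinishPicking results: [PHPickerResult]) {
        picker.dismiss(animated: true)
        guard let provider = results.first?.itemProvider,
              provider.canLoadObject(ofClass: UIImage.self) else { return }

        provider.loadObject(ofClass: UIImage.self) { [weak self] object, _ in
            guard let self, let image = object as? UIImage else { return }
            Task {
                let observations = (try? await classifyImage(image)) ?? []
                // Hop back to the main queue before touching UILabel / UIProgressView outlets.
                await MainActor.run {
                    self.updateUI(with: observations)
                }
            }
        }
    }

    private func updateUI(with observations: [VNClassificationObservation]) {
        // Hypothetical: bind the top observations to your labels and progress views here.
    }
}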
Last reviewed: 2026-05-12 by the Soarias team.