How to Implement Image Classification in SwiftUI
Create a VNClassifyImageRequest, hand it a VNImageRequestHandler built from a CGImage, and read back an array of VNClassificationObservation results — all on-device with no network required. Wrap the synchronous Vision call in a Swift concurrency continuation to keep SwiftUI's @Observable view model clean.
import UIKit
import Vision

func classifyImage(_ uiImage: UIImage) async throws -> [VNClassificationObservation] {
    guard let cgImage = uiImage.cgImage else { throw ClassifierError.invalidImage }
    return try await withCheckedThrowingContinuation { continuation in
        let request = VNClassifyImageRequest { req, error in
            if let error { continuation.resume(throwing: error); return }
            let results = (req.results as? [VNClassificationObservation]) ?? []
            continuation.resume(returning: results.filter { $0.confidence > 0.01 })
        }
        do {
            try VNImageRequestHandler(cgImage: cgImage).perform([request])
        } catch {
            continuation.resume(throwing: error)
        }
    }
}
Full implementation
The implementation below uses an @Observable view model to drive state, PhotosPicker for image input, and Vision's built-in VNClassifyImageRequest — which uses a bundled CoreML model so you don't need to ship your own .mlmodel for general scene classification. Results are shown as a confidence bar list, color-coded by certainty, with full VoiceOver support.
import SwiftUI
import Vision
import PhotosUI

// MARK: - Errors

enum ClassifierError: LocalizedError {
    case invalidImage

    var errorDescription: String? {
        "Could not extract CGImage from the selected photo."
    }
}
// MARK: - ViewModel

// UI state lives on the main actor so SwiftUI always reads and writes it from the main thread.
@MainActor
@Observable
final class ImageClassifierViewModel {
    var selectedPhoto: PhotosPickerItem?
    var displayImage: UIImage?
    var classifications: [VNClassificationObservation] = []
    var isClassifying = false
    var errorMessage: String?

    func loadAndClassify() async {
        guard let item = selectedPhoto else { return }
        isClassifying = true
        errorMessage = nil
        defer { isClassifying = false }
        do {
            guard let data = try await item.loadTransferable(type: Data.self),
                  let uiImage = UIImage(data: data) else {
                errorMessage = "Failed to load image data."
                return
            }
            displayImage = uiImage
            classifications = try await classify(uiImage)
        } catch {
            errorMessage = error.localizedDescription
        }
    }
    private func classify(_ image: UIImage) async throws -> [VNClassificationObservation] {
        guard let cgImage = image.cgImage else { throw ClassifierError.invalidImage }
        return try await withCheckedThrowingContinuation { continuation in
            let request = VNClassifyImageRequest { req, error in
                if let error {
                    continuation.resume(throwing: error)
                    return
                }
                let all = (req.results as? [VNClassificationObservation]) ?? []
                // Keep only meaningful results; Vision returns hundreds of low-confidence labels
                let top = Array(all.filter { $0.confidence > 0.01 }.prefix(10))
                continuation.resume(returning: top)
            }
            // perform(_:) blocks until classification finishes, so run it off the main actor.
            DispatchQueue.global(qos: .userInitiated).async {
                do {
                    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
                    try handler.perform([request])
                } catch {
                    continuation.resume(throwing: error)
                }
            }
        }
    }
}
// MARK: - Main View

struct ImageClassifierView: View {
    @State private var model = ImageClassifierViewModel()

    var body: some View {
        NavigationStack {
            ScrollView {
                VStack(spacing: 24) {
                    imagePreviewSection
                    if !model.classifications.isEmpty {
                        resultsSection
                    }
                    if let err = model.errorMessage {
                        Text(err)
                            .foregroundStyle(.red)
                            .font(.caption)
                    }
                }
                .padding()
            }
            .navigationTitle("Image Classifier")
            .toolbar {
                ToolbarItem(placement: .primaryAction) {
                    PhotosPicker(
                        selection: $model.selectedPhoto,
                        matching: .images,
                        photoLibrary: .shared()
                    ) {
                        Label("Pick Photo", systemImage: "photo.badge.plus")
                    }
                    .accessibilityLabel("Pick a photo to classify")
                }
            }
            .onChange(of: model.selectedPhoto) { _, _ in
                Task { await model.loadAndClassify() }
            }
        }
    }
    // MARK: Subviews

    @ViewBuilder
    private var imagePreviewSection: some View {
        if let image = model.displayImage {
            Image(uiImage: image)
                .resizable()
                .scaledToFit()
                .clipShape(RoundedRectangle(cornerRadius: 16))
                .overlay(alignment: .bottom) {
                    if model.isClassifying {
                        ProgressView("Classifying…")
                            .padding(8)
                            .background(.ultraThinMaterial, in: Capsule())
                            .padding(.bottom, 12)
                    }
                }
                .accessibilityLabel("Selected photo for classification")
        } else {
            ContentUnavailableView(
                "No Image Selected",
                systemImage: "photo.on.rectangle.angled",
                description: Text("Tap the button above to pick a photo from your library.")
            )
            .frame(minHeight: 240)
        }
    }

    private var resultsSection: some View {
        VStack(alignment: .leading, spacing: 12) {
            Text("Classification Results")
                .font(.headline)
                .accessibilityAddTraits(.isHeader)
            ForEach(model.classifications, id: \.identifier) { obs in
                ClassificationRow(observation: obs)
            }
        }
        .frame(maxWidth: .infinity, alignment: .leading)
    }
}
// MARK: - Classification Row

struct ClassificationRow: View {
    let observation: VNClassificationObservation

    private var confidenceColor: Color {
        switch observation.confidence {
        case 0.7...: .green
        case 0.4...: .orange
        default: .red
        }
    }

    var body: some View {
        VStack(alignment: .leading, spacing: 6) {
            HStack {
                Text(observation.identifier
                    .replacingOccurrences(of: "_", with: " ")
                    .capitalized)
                    .font(.subheadline.weight(.medium))
                Spacer()
                // Convert VNConfidence (Float) to Double so the .percent format style applies.
                Text(Double(observation.confidence),
                     format: .percent.precision(.fractionLength(1)))
                    .font(.subheadline.monospacedDigit())
                    .foregroundStyle(.secondary)
            }
            ProgressView(value: Double(observation.confidence))
                .tint(confidenceColor)
        }
        .accessibilityElement(children: .combine)
        .accessibilityLabel(
            "\(observation.identifier.replacingOccurrences(of: "_", with: " ").capitalized), \(Int(observation.confidence * 100)) percent confidence"
        )
    }
}
// MARK: - Preview
#Preview {
    ImageClassifierView()
}
How it works
1. PhotosPicker + async data transfer. PhotosPicker(selection: $model.selectedPhoto, matching: .images) returns a PhotosPickerItem. The .onChange modifier triggers loadTransferable(type: Data.self), which fetches the full-resolution image data from the Photos library asynchronously without blocking the UI thread.
2. CGImage extraction. Vision operates on CGImage, not UIImage. The guard in classify(_:) converts and throws ClassifierError.invalidImage for edge cases such as animated GIFs that have no direct CGImage backing. Note that a bare CGImage also drops the photo's EXIF orientation; see the sketch after this list for passing it to Vision explicitly.
3. Bridging Vision's callback to Swift concurrency. VNImageRequestHandler.perform(_:) is synchronous and calls its completion handler on the same thread. Wrapping it in withCheckedThrowingContinuation bridges that callback to an async throws function so the caller can await it cleanly inside the view model's Task, and dispatching the blocking perform call to a background queue inside the continuation keeps the main actor free.
4. Filtering results. VNClassifyImageRequest can return hundreds of VNClassificationObservation values (the model knows thousands of ImageNet labels). The .filter { $0.confidence > 0.01 }.prefix(10) pipeline keeps only the top 10 meaningful predictions.
5. Accessible confidence bars. ClassificationRow combines its child elements into a single VoiceOver unit via .accessibilityElement(children: .combine) and provides a natural-language label like "Cat, 92 percent confidence" so screen reader users get the full picture without hearing raw numeric identifiers.
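One detail step 2 glosses over: a CGImage carries no EXIF orientation, so portrait photos can reach the classifier sideways. Below is a minimal sketch of forwarding the orientation. The CGImagePropertyOrientation initializer is a mapping you add yourself, and makeOrientedHandler is an illustrative helper name, not part of the implementation above.

import ImageIO
import UIKit
import Vision

// Map UIKit's orientation cases onto the EXIF-style values Vision expects.
extension CGImagePropertyOrientation {
    init(_ uiOrientation: UIImage.Orientation) {
        switch uiOrientation {
        case .up: self = .up
        case .upMirrored: self = .upMirrored
        case .down: self = .down
        case .downMirrored: self = .downMirrored
        case .left: self = .left
        case .leftMirrored: self = .leftMirrored
        case .right: self = .right
        case .rightMirrored: self = .rightMirrored
        @unknown default: self = .up
        }
    }
}

// Hand Vision the orientation alongside the pixels so it classifies the photo
// the way the user sees it, not the way the sensor stored it.
func makeOrientedHandler(for uiImage: UIImage, cgImage: CGImage) -> VNImageRequestHandler {
    VNImageRequestHandler(
        cgImage: cgImage,
        orientation: CGImagePropertyOrientation(uiImage.imageOrientation),
        options: [:]
    )
}

In classify(_:), you would build the handler this way (or call the equivalent initializer inline) instead of VNImageRequestHandler(cgImage:options:).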
Variants
Using a custom CoreML model (.mlmodel)
Replace the built-in VNClassifyImageRequest with a VNCoreMLRequest backed by your own model. Drag the .mlmodel into Xcode — it auto-generates a Swift class.
import CoreML
import Vision

// Xcode auto-generates MobileNetV2 from MobileNetV2.mlmodel
func classifyWithCustomModel(_ cgImage: CGImage) async throws -> [VNClassificationObservation] {
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine // use ANE for speed
    let mlModel = try MobileNetV2(configuration: config).model
    let vnModel = try VNCoreMLModel(for: mlModel)
    return try await withCheckedThrowingContinuation { continuation in
        let request = VNCoreMLRequest(model: vnModel) { req, error in
            if let error { continuation.resume(throwing: error); return }
            let results = (req.results as? [VNClassificationObservation]) ?? []
            continuation.resume(returning: Array(results.prefix(5)))
        }
        request.imageCropAndScaleOption = .centerCrop // matches model training
        do {
            try VNImageRequestHandler(cgImage: cgImage).perform([request])
        } catch {
            continuation.resume(throwing: error)
        }
    }
}
Classifying live camera frames
For real-time classification, feed CMSampleBuffer frames from AVCaptureSession into VNImageRequestHandler(cmSampleBuffer:) inside the AVCaptureVideoDataOutputSampleBufferDelegate callback. Throttle with a timestamp gate (classify at most ~10 fps, for example) to limit thermal load and keep the UI responsive. Use VNSequenceRequestHandler if you need temporal context across frames.
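A rough sketch of that delegate callback follows. FrameClassifier and handleResults are hypothetical names; it assumes you have already configured an AVCaptureSession with an AVCaptureVideoDataOutput whose sample buffer delegate is this object.

import AVFoundation
import Vision

final class FrameClassifier: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private var lastClassification = CMTime.zero
    private let minimumInterval = CMTime(value: 1, timescale: 10) // at most ~10 fps
    var handleResults: (([VNClassificationObservation]) -> Void)? // hypothetical callback into the UI

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Timestamp gate: skip frames that arrive too soon after the last one we classified.
        let timestamp = sampleBuffer.presentationTimeStamp
        guard timestamp - lastClassification >= minimumInterval else { return }
        lastClassification = timestamp

        let request = VNClassifyImageRequest { [weak self] req, _ in
            let results = (req.results as? [VNClassificationObservation]) ?? []
            DispatchQueue.main.async {
                self?.handleResults?(Array(results.filter { $0.confidence > 0.01 }.prefix(5)))
            }
        }
        // .right is typical for portrait back-camera video; adjust for your session setup.
        let handler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer,
                                            orientation: .right,
                                            options: [:])
        try? handler.perform([request])
    }
}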
Common pitfalls
⚠ iOS version floor for VNClassifyImageRequest. VNClassifyImageRequest was introduced in iOS 13, and Apple has updated the underlying model in later OS releases, so targeting iOS 17+ gets you the best built-in classifier. Always test on a physical device: the Simulator does not use a Neural Engine (Core ML falls back to CPU there), so timings and even results can differ from the device.
⚠ Calling perform(_:) on the main actor. VNImageRequestHandler.perform is blocking and can take 50–300 ms on older hardware. Never call it directly on the main thread; run it off the main actor, for example by dispatching to a background queue inside the continuation as the view model above does, to avoid freezing the UI.
⚠ Ignoring imageCropAndScaleOption for custom models. Vision letter-boxes images by default, but most CoreML classifiers were trained with center-crop preprocessing. Mismatching this option silently degrades accuracy, so set request.imageCropAndScaleOption = .centerCrop and verify it against your model's training pipeline.
⚠ Not filtering the 1,000-label result list. Without a confidence threshold, your UI receives hundreds of observations, including near-zero hits like "ear" at 0.0001. Filter by confidence > 0.01 and cap results with .prefix(N) before binding to a ForEach; a 1,000-row list will stutter even on Pro hardware.
Prompt this with Claude Code
When using Soarias or Claude Code directly to implement this:
Implement image classification in SwiftUI for iOS 17+. Use CoreML/Vision: VNClassifyImageRequest, VNImageRequestHandler, VNCoreMLRequest, and PhotosPicker for image selection. Wrap the synchronous Vision call in withCheckedThrowingContinuation. Show top-10 results as confidence bars with color coding. Make it accessible (VoiceOver labels on each result row). Add a #Preview with a realistic sample image from SF Symbols or Assets.
In Soarias's Build phase, paste this prompt into the feature scaffold step so Claude Code generates the full view, view model, and unit test stubs for the classification pipeline before you wire in your custom .mlmodel.
FAQ
Does this work on iOS 16?
VNClassifyImageRequest itself is available back to iOS 13, so the classification logic compiles fine on iOS 16. However, this guide uses @Observable (iOS 17+) and ContentUnavailableView (iOS 17+). To support iOS 16, replace @Observable with a class that conforms to ObservableObject and publishes its state with @Published (owned by the view via @StateObject), and swap ContentUnavailableView for a plain VStack placeholder, as sketched below.
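A minimal sketch of the iOS 16 fallback. LegacyImageClassifierViewModel is an illustrative name; the property names mirror the iOS 17 view model above, and the loading and classification logic is unchanged.

import SwiftUI
import PhotosUI
import Vision

// iOS 16 fallback: conform to ObservableObject and mark state @Published
// instead of using the iOS 17 @Observable macro.
@MainActor
final class LegacyImageClassifierViewModel: ObservableObject {
    @Published var selectedPhoto: PhotosPickerItem?
    @Published var displayImage: UIImage?
    @Published var classifications: [VNClassificationObservation] = []
    @Published var isClassifying = false
    @Published var errorMessage: String?

    // loadAndClassify() and classify(_:) can be copied unchanged from the
    // iOS 17 view model above.
}

// In the view, own the model with @StateObject rather than @State:
//   @StateObject private var model = LegacyImageClassifierViewModel()
// and replace ContentUnavailableView with a plain VStack { Image(systemName:); Text(...) }.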
How accurate is VNClassifyImageRequest versus a fine-tuned CoreML model?
Apple's built-in classifier is a general-purpose ImageNet model — excellent for everyday objects, animals, and scenes, but it won't know your app-specific categories (e.g., specific product SKUs, medical images, or rare bird species). For domain-specific classification, train a custom model in Create ML, export it as .mlmodel, and swap to VNCoreMLRequest as shown in the Variants section above.
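For reference, training such a model is only a few lines of Create ML on macOS (a playground or command-line tool). The folder paths and the FlowerClassifier name below are hypothetical; each subdirectory of the training folder becomes one class label.

import CreateML
import Foundation

// Each subdirectory of trainingURL holds the example images for one label.
let trainingURL = URL(fileURLWithPath: "/path/to/TrainingImages") // hypothetical path
let classifier = try MLImageClassifier(trainingData: .labeledDirectories(at: trainingURL))

// Inspect validation accuracy before shipping the model.
print(classifier.validationMetrics)

// Export the trained model; drag the resulting .mlmodel into your Xcode project.
try classifier.write(to: URL(fileURLWithPath: "/path/to/FlowerClassifier.mlmodel"))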
What is the UIKit equivalent?
In UIKit you'd use the same Vision API — there's no UIKit-specific alternative. The only difference is lifecycle management: you'd trigger classification from imagePickerController(_:didFinishPickingMediaWithInfo:) (or a PHPickerViewController delegate callback) and dispatch back to the main queue with DispatchQueue.main.async to update UILabel / UIProgressView outlets. SwiftUI's structured concurrency makes the async bridging considerably cleaner.
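A rough UIKit sketch of that flow, assuming the classifyImage(_:) helper from the top of this guide; ClassificationViewController and updateUI(with:) are hypothetical names standing in for your own view controller and outlet-updating code.

import UIKit
import PhotosUI
import Vision

final class ClassificationViewController: UIViewController, PHPickerViewControllerDelegate {
    func presentPicker() {
        var config = PHPickerConfiguration(photoLibrary: .shared())
        config.filter = .images
        let picker = PHPickerViewController(configuration: config)
        picker.delegate = self
        present(picker, animated: true)
    }

    func picker(_ picker: PHPickerViewController, didFinishPicking results: [PHPickerResult]) {
        picker.dismiss(animated: true)
        guard let provider = results.first?.itemProvider,
              provider.canLoadObject(ofClass: UIImage.self) else { return }

        provider.loadObject(ofClass: UIImage.self) { [weak self] object, _ in
            guard let self, let image = object as? UIImage else { return }
            Task {
                let observations = (try? await classifyImage(image)) ?? []
                // Hop back to the main queue before touching UILabel / UIProgressView outlets.
                await MainActor.run {
                    self.updateUI(with: observations)
                }
            }
        }
    }

    private func updateUI(with observations: [VNClassificationObservation]) {
        // Hypothetical: bind the top observations to your labels and progress views here.
    }
}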
Last reviewed: 2026-05-12 by the Soarias team.