Experimenting with the proposed Cross-Origin Storage API in Transformers.js

By AI Maestro, June 23, 2026
Transformers.js developers face a storage problem with cross-origin models

A developer running an automatic speech recognition pipeline on one website cannot reuse a cached model downloaded by a different website, even if both sites use the exact same Hugging Face model.

The cache challenge

Transformers.js lets web developers run inference in the browser by calling the pipeline() function, which returns a ready-to-use pipeline for a given task. A developer might set up an automatic speech recognition task like this:

import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@4.2.0';

const asr = await pipeline(
  'automatic-speech-recognition',
  'Xenova/whisper-tiny.en',
  { device: 'webgpu' },
);
const result = await asr('jfk.wav');
console.log(result);

The code specifies Xenova/whisper-tiny.en as the model. This is a standard choice for English speech recognition and is the default model for this task in Transformers.js.

Model resources

When you run the example in a browser, Transformers.js downloads and caches the necessary model resources and WebAssembly (Wasm) files. Chrome DevTools shows these files in the Cache storage section. On a page reload, the browser serves the resources from the Cache API, and the model returns results almost instantly.

However, Xenova/whisper-tiny.en is a popular model, and many different apps might use it. If you visit a different origin running the same example, the browser must download and cache all the model resources again, even if they are byte-for-byte identical to what another site already has. In this test, the duplicate download and storage added up to 177 MB. This waste grows with every additional origin that uses the model.
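To make that growth concrete, here is a back-of-the-envelope sketch. It assumes that 177 MB is the size of one full copy and that every origin stores its own copy; the helper name and the origin counts are illustrative, not measurements.

```javascript
// Size of one copy of the model resources, taken from the test above (assumption: per copy).
const MODEL_RESOURCES_MB = 177;

// Hypothetical helper: estimate storage when each origin keeps its own copy.
function estimateDuplication(originCount, sizeMb = MODEL_RESOURCES_MB) {
  const totalMb = originCount * sizeMb;
  // With a single shared copy, only one copy's worth of storage would be needed.
  const redundantMb = Math.max(originCount - 1, 0) * sizeMb;
  return { totalMb, redundantMb };
}

for (const origins of [1, 2, 5, 10]) {
  const { totalMb, redundantMb } = estimateDuplication(origins);
  console.log(`${origins} origin(s): ${totalMb} MB stored, ${redundantMb} MB redundant`);
}
```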

Wasm runtime resources

The issue worsens when you add a second pipeline, such as sentiment analysis. This task uses the Xenova/distilbert-base-uncased-finetuned-sst-2-english model by default. Transformers.js selects this automatically if you do not specify a model.

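// No model is specified, so Transformers.js falls back to the task default.
const classifier = await pipeline('sentiment-analysis');
// Classify the transcript produced by the speech recognition pipeline above.
const sentiment = await classifier(result.text);
// Assumes `pre` references an output element on the demo page.
pre.append('\n\n' + JSON.stringify(sentiment, null, 2));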

Both models rely on the same 4,733 kB ort-wasm-simd-threaded.asyncify.wasm WebAssembly runtime file from the underlying ONNX Runtime library. Open the extended demo on a different origin, and the Network tab shows that the Wasm runtime is downloaded and cached again.

Even if apps do not share the same AI models, your browser makes redundant requests for shared Wasm resources you already have. The browser also caches them again, consuming space on your hard disk.

Cache isolation

How AI model resources are served

AI model resources are served from the Hugging Face Hub and, ultimately, from the Hugging Face CDN. The browser requests a resource like https://huggingface.co/Xenova/distilbert-base-uncased-finetuned-sst-2-english/resolve/main/config.json. This redirects to a final CDN URL like https://huggingface.co/api/resolve-cache/models/Xenova/distilbert-base-uncased-finetuned-sst-2-english/0b6928efcb76139cae2c6881d49cda67fe119f42/config.json?%2FXenova%2Fdistilbert-base-uncased-finetuned-sst-2-english%2Fresolve%2Fmain%2Fconfig.json=&etag=%223c36342ef1f74de2797d667c68c6b7b988d0b87c%22.
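The first URL follows the Hub's resolve pattern: model ID, then revision, then file path. A minimal sketch of building such a URL follows. The helper name hubResolveUrl is hypothetical, and this is not necessarily how Transformers.js constructs its requests internally.

```javascript
// Hypothetical helper: build a Hugging Face Hub "resolve" URL for a model file.
function hubResolveUrl(modelId, filePath, revision = 'main') {
  // Encode each path segment separately so slashes in nested file paths stay intact.
  const encodedPath = filePath.split('/').map(encodeURIComponent).join('/');
  return `https://huggingface.co/${modelId}/resolve/${encodeURIComponent(revision)}/${encodedPath}`;
}

console.log(
  hubResolveUrl('Xenova/distilbert-base-uncased-finetuned-sst-2-english', 'config.json'),
);
```

The redirect target additionally embeds a commit hash, so two requests for main can end up at the same immutable file even though the request URL itself names only a branch.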

How Wasm runtime resources are served

The Wasm runtime resources are served from the jsDelivr CDN by default. The file ort-wasm-simd-threaded.asyncify.wasm comes from https://cdn.jsdelivr.net/npm/onnxruntime-web@1.26.0-dev.20260416-b7804b056c/dist/ort-wasm-simd-threaded.asyncify.wasm at the time of writing.

One might assume that if different apps serve resources from the same CDN URLs, caching should work. Browsers have not worked this way for a long time. Caches are isolated by origin to prevent timing attacks. The time a website takes to respond to HTTP requests can reveal that the browser has accessed the same resource in the past, creating security and privacy leaks. The article Gaining security and privacy by partitioning the cache explains the details.

Chrome’s implementation

Chrome caches resources using a Network Isolation Key in addition to the resource URL. The key is composed of the top-level site and the current-frame site. Consider the toy examples hosted on https://googlechrome.github.io and https://rawcdn.rawgit.net. If both use the Wasm runtime from https://cdn.jsdelivr.net/npm/onnxruntime-web@1.26.0-dev.20260416-b7804b056c/dist/ort-wasm-simd-threaded.asyncify.wasm, their cache keys differ.

Network Isolation Key: top-level site | Network Isolation Key: current-frame site | Resource URL
https://googlechrome.github.io | https://googlechrome.github.io | https://cdn.jsdelivr.net/npm/onnxruntime-web@1.26.0-dev.20260416-b7804b056c/dist/ort-wasm-simd-threaded.asyncify.wasm
https://rawcdn.rawgit.net | https://rawcdn.rawgit.net | https://cdn.jsdelivr.net/npm/onnxruntime-web@1.26.0-dev.20260416-b7804b056c/dist/ort-wasm-simd-threaded.asyncify.wasm

Even if the resource URLs are identical, the Network Isolation Keys do not match. There is no cache hit, leading to duplicate downloads and storage. This is the problem the Cross-Origin Storage proposal solves.
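The effect of the key can be modeled in a few lines. This is a simplified sketch that treats the cache as a Map keyed by all three values; the cacheKey function and the key format are assumptions for illustration, not Chrome's actual implementation.

```javascript
// Simplified model of a partitioned cache key (not Chrome's real key format).
function cacheKey(topLevelSite, frameSite, resourceUrl) {
  return JSON.stringify([topLevelSite, frameSite, resourceUrl]);
}

const WASM_URL =
  'https://cdn.jsdelivr.net/npm/onnxruntime-web@1.26.0-dev.20260416-b7804b056c/dist/ort-wasm-simd-threaded.asyncify.wasm';

const cache = new Map();

// The first site downloads the runtime and caches it under its own key.
const keyA = cacheKey('https://googlechrome.github.io', 'https://googlechrome.github.io', WASM_URL);
cache.set(keyA, '<cached bytes>');

// The second site asks for the same URL, but its key differs.
const keyB = cacheKey('https://rawcdn.rawgit.net', 'https://rawcdn.rawgit.net', WASM_URL);

console.log(cache.has(keyA)); // true: same site, cache hit
console.log(cache.has(keyB)); // false: different site, cache miss
```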

Enter the Cross-Origin Storage API

💡 Note: The Cross-Origin Storage API is an early-stage proposal that isn’t final. While the proposed API is not yet natively implemented in any browser, you don’t have to wait to experiment with it. Install the Cross-Origin Storage extension to inject the navigator.crossOriginStorage polyfill on all pages and test the complete flow.

The proposed Cross-Origin Storage (COS) API introduces a dedicated navigator.crossOriginStorage interface. Web apps can store and retrieve large files across origin boundaries. Identification uses a cryptographic hash rather than a URL or origin.

Identifying files by hash is the key. The ort-wasm-simd-threaded.asyncify.wasm runtime downloaded while visiting https://googlechrome.github.io is recognized as identical to the one https://rawcdn.rawgit.net requests, regardless of where either origin fetched it.

const hash = {
  algorithm: 'SHA-256',
  value: '8f434346648f6b96df89dda901c5176b10a6d83961dd3c1ac88b59b2dc327aa4',
};

try {
  const handle = await navigator.crossOriginStorage.requestFileHandle(hash);
  // Cache hit! Get the file as a Blob and use it directly.
  const fileBlob = await handle.getFile();
} catch (err) {
  // Cache miss. Download from network, then store for next time.
  const fileBlob = await fetch('https://cdn.jsdelivr.net/.../ort-wasm-simd-threaded.asyncify.wasm')
    .then(r => r.blob());
  const handle = await navigator.crossOriginStorage.requestFileHandle(
    hash,
    { create: true, origins: '*' },
  );
  const writableStream = await handle.createWritable();
  await writableStream.write(fileBlob);
  await writableStream.close();
}

If the resource is in COS, you get back a FileSystemFileHandle to read the blob directly via getFile(). The resulting File inherits from Blob. If the resource is not in COS, you fall back to the network and write the resource into COS for the next app that needs it. That next app could be your own or an unrelated one on a different origin.

The API is modeled on the File System Standard’s FileSystemDirectoryHandle.getFileHandle() method, known from the Origin Private File System (OPFS). The hash parameter plays the role of the name parameter in OPFS, uniquely identifying a resource. The options.create flag behaves similarly: leave it absent or false for read-only access, and set it to true when you intend to write.

Control who can read what

Not every resource should be globally shared. COS gives developers precise control over visibility through the origins option when storing a file.

  • Setting origins: '*' makes a file globally available. Any origin can find it by hash. This suits AI model resources or the Wasm runtime in the Transformers.js example, where every web app benefits from a single cached copy.
  • Passing a specific list of origins, like origins: ['https://write.example.com', 'https://calculate.example.com'], restricts access to those sites. This works for proprietary resources shared across a company’s properties that should not be discoverable by others, such as a proprietary proofreading AI model used in a commercial office suite.
  • Omitting origins entirely makes the file available only to same-site origins. This is a sensible default for resources shared across an organization’s subdomains.
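The three cases can be summarized as a toy decision function. This is only a model of the rules as described above, not the proposal's actual algorithm: canRead and naiveSite are hypothetical names, and the same-site check uses a naive "last two host labels" shortcut where a real browser would consult the Public Suffix List.

```javascript
// Naive "site" helper: scheme plus the last two host labels.
// Illustration only; real browsers use the Public Suffix List.
function naiveSite(origin) {
  const { protocol, hostname } = new URL(origin);
  return `${protocol}//${hostname.split('.').slice(-2).join('.')}`;
}

// Toy model of the visibility rules for a stored file.
// `origins` is the value that was passed when the file was stored.
function canRead(requestingOrigin, storingOrigin, origins) {
  if (origins === '*') {
    return true; // Globally shared: any origin can find the file by hash.
  }
  if (Array.isArray(origins)) {
    // Allowlist. Whether the storing origin is implicitly included is not
    // covered above, so this model checks the list only.
    return origins.includes(new URL(requestingOrigin).origin);
  }
  // `origins` omitted: same-site only.
  return naiveSite(requestingOrigin) === naiveSite(storingOrigin);
}

const allowlist = ['https://write.example.com', 'https://calculate.example.com'];
console.log(canRead('https://calculate.example.com', 'https://write.example.com', allowlist)); // true
console.log(canRead('https://other.example.org', 'https://write.example.com', allowlist)); // false
console.log(canRead('https://docs.example.com', 'https://write.example.com', undefined)); // true
```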