coreml : set convert_to="mlprogram" in convert #3060

danbev · 2025-04-18T19:23:15Z

This commit adds the `skip_model_load` argument to the `convert_encoder` and `convert_decoder` functions in the `convert-whisper-to-coreml.py` file.

The motivation is that loading the converted model is only needed if one intends to perform inference on it after conversion. Skipping the load also seems to avoid an issue with larger models where the following error is thrown:

Running MIL backend_neuralnetwork pipeline: 100%|█████████| 9/9 [00:00<00:00, 35.44 passes/s]
Translating MIL ==> NeuralNetwork Ops: 100%|███████████| 5641/5641 [03:31<00:00, 26.65 ops/s]
Traceback (most recent call last):
  File "/Users/danbev/work/ai/whisper-work/models/convert-whisper-to-coreml.py", line 322, in <module>
    encoder = convert_encoder(hparams, encoder, quantize=args.quantize)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/danbev/work/ai/whisper-work/models/convert-whisper-to-coreml.py", line 255, in convert_encoder
    model = ct.convert(
            ^^^^^^^^^^^
  File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.11/site-packages/coremltools/converters/_converters_entry.py", line 635, in convert
    mlmodel = mil_convert(
              ^^^^^^^^^^^^
  File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 186, in mil_convert
    return _mil_convert(
           ^^^^^^^^^^^^^
  File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.11/site-packages/coremltools/converters/mil/converter.py", line 245, in _mil_convert
    return modelClass(
           ^^^^^^^^^^^
  File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.11/site-packages/coremltools/models/model.py", line 489, in __init__
    self.__proxy__, self._spec, self._framework_error = self._get_proxy_and_spec(
                                                        ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.11/site-packages/coremltools/models/model.py", line 550, in _get_proxy_and_spec
    _MLModelProxy(
ValueError: basic_string

Refs: #3012
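
For illustration, a minimal sketch of how the flag might be passed to the converter. It assumes the script calls `ct.convert` (as the traceback suggests) and relies on `skip_model_load` being a keyword of `coremltools.convert`. The helper names `conversion_kwargs` and `convert_without_loading` are hypothetical and are not taken from `convert-whisper-to-coreml.py`.

```python
def conversion_kwargs(skip_model_load=True):
    """Build keyword arguments for ct.convert (hypothetical helper)."""
    # skip_model_load=True asks coremltools not to compile and load the
    # converted model; loading is only required for running predictions.
    return {"skip_model_load": skip_model_load}


def convert_without_loading(traced_module, inputs):
    """Convert a traced PyTorch module without loading the result."""
    # Imported lazily so that conversion_kwargs() has no dependencies.
    import coremltools as ct

    return ct.convert(traced_module, inputs=inputs, **conversion_kwargs())
```

The trade-off is that a model converted this way cannot be used for `predict` calls in the same Python session, which does not matter when the script only saves the result to disk.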

This commit updates the conversion process for Whisper models to use the "mlprogram" format instead of "neuralnetwork".

The motivation for this change is that the "neuralnetwork" format produces a protobuf-based model, and my understanding is that this format has limitations, such as on the size of strings and the complexity of the model. Currently, converting larger models such as large-v3 fails, while converting smaller models succeeds.

The "mlprogram" format is a more recent addition to CoreML and is designed to be more flexible and powerful, allowing for more complex models and larger data types. It seems to work for larger and smaller models alike, and unless there are considerations that I'm not aware of, I think this is what we should use moving forward.
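
A rough sketch of what selecting the format could look like, assuming the conversion goes through `ct.convert` with its `convert_to` keyword. ML programs are typically saved as `.mlpackage` bundles, whereas neural network models are saved as `.mlmodel` files, so the output path has to follow the chosen format. The names `SUFFIX_BY_FORMAT`, `output_path`, and `convert_and_save` are hypothetical and not part of the script.

```python
# Assumed mapping from target format to the file suffix used when saving.
SUFFIX_BY_FORMAT = {
    "mlprogram": ".mlpackage",
    "neuralnetwork": ".mlmodel",
}


def output_path(stem, convert_to="mlprogram"):
    """Return the save path for a model converted to the given format."""
    return stem + SUFFIX_BY_FORMAT[convert_to]


def convert_and_save(traced_module, inputs, stem, convert_to="mlprogram"):
    """Convert a traced PyTorch module and save it under a matching suffix."""
    # Imported lazily so that output_path() has no dependencies.
    import coremltools as ct

    model = ct.convert(traced_module, inputs=inputs, convert_to=convert_to)
    path = output_path(stem, convert_to)
    model.save(path)
    return path
```

For example, `output_path("encoder")` yields `encoder.mlpackage`, while `output_path("encoder", "neuralnetwork")` yields `encoder.mlmodel`.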

danbev mentioned this pull request Apr 19, 2025

generate coreml model ValueError: basic_string #3012

Open

ggerganov approved these changes Apr 20, 2025

danbev merged commit 8b92060 into ggml-org:master Apr 23, 2025
51 checks passed

danbev changed the title ~~coreml : skip model load in convert-whisper-to-coreml.py~~ * coreml : skip model load in convert-whisper-to-coreml.py Apr 23, 2025

danbev changed the title ~~* coreml : skip model load in convert-whisper-to-coreml.py~~ coreml : set convert_to="mlprogram" in convert Apr 23, 2025
