Error loading a GGUF model in a Flutter app that uses the llama_cpp_dart package

I am trying to run a GGUF model with the llama_cpp_dart package (^0.2.2), but it reports that the model cannot be loaded. The initialize function below is what does the loading. I dug into the package source (lib/src of llama_cpp_dart in the pub cache): at one point the binding calls llama_load_model_from_file with the model-path pointer and the model params, and that call returns null, which produces the corresponding LlamaException in the stack trace.
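A quick way to rule out a corrupted or truncated copy of the model file (a minimal sketch; the helper name looksLikeGguf is only for illustration, and valid GGUF files start with the four ASCII magic bytes "GGUF"):

import 'dart:io';

/// Reads the first four bytes of the file and checks the GGUF magic.
/// If this returns false, the copied file is truncated or corrupted and
/// the native loader will refuse it.
Future<bool> looksLikeGguf(String path) async {
  final raf = await File(path).open();
  try {
    final header = await raf.read(4);
    return header.length == 4 && String.fromCharCodes(header) == 'GGUF';
  } finally {
    await raf.close();
  }
}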

Thank you.

Code used:

import 'dart:io';

import 'package:flutter/foundation.dart';
import 'package:flutter/services.dart';
import 'package:llama_cpp_dart/llama_cpp_dart.dart';
import 'package:path_provider/path_provider.dart';

Future<void> initialize() async {
    if (_state == ModelState.loading || _state == ModelState.ready) return;
    _set(ModelState.loading, p: 0.0);

    try {
      final nativeLibDir = await _getNativeLibDir();
      debugPrint('Native lib dir: $nativeLibDir');
      Llama.libraryPath = '$nativeLibDir/libllama.so';

      const model = 'model.gguf';

      final dir = await getApplicationSupportDirectory();
      final filePath = '${dir.path}/$model';
      final fileExists = await File(filePath).exists();
      if (!fileExists) {
        // Copy the bundled asset to app storage in a single write and flush
        // it, so the native loader never sees a partially written file.
        final data = await rootBundle.load('assets/models/$model');
        final file = File(filePath);
        await file.writeAsBytes(
          data.buffer.asUint8List(data.offsetInBytes, data.lengthInBytes),
          mode: FileMode.write,
          flush: true, // force the OS to finish the write before moving on
        );
      }

      _set(ModelState.loading, p: 0.90);

      final cmd = LlamaLoad(
        path: filePath,
        modelParams: ModelParams()..nGpuLayers = 0,
        contextParams: ContextParams()
          ..nCtx = 512
          ..nThreads = 4,
        samplingParams: SamplerParams()..temp = 0.8,
        verbose: true,
      );

      _parent = LlamaParent(cmd);
      _parent!.stream.listen(
        (token) => debugPrint('TOKEN: $token'),
        onError: (e) => debugPrint('ISOLATE ERROR: $e'),
        onDone: () => debugPrint('ISOLATE DONE'),
        cancelOnError: false,
      );

      debugPrint('Calling init()...');
      await _parent!.init().timeout(const Duration(minutes: 5), onTimeout: () {
        throw Exception('init() timed out');
      });

      _set(ModelState.ready, p: 1.0);
    } catch (e, st) {
      _errorMsg = e.toString();
      debugPrint('TinyLlama init error: $e\n$st');
      _set(ModelState.error);
    }
  }

  void _set(ModelState s, {double? p}) {
    _state = s;
    if (p != null) _loadProgress = p;
    notifyListeners();
  }

  static const _channel = MethodChannel('native_lib_path');

  Future<String> _getNativeLibDir() async {
    return await _channel.invokeMethod<String>('getPath') ?? '';
  }
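A related check, reusing the model and filePath names from initialize above, is to compare the asset size to what actually landed on disk (a minimal sketch; a truncated copy would make the native loader return null exactly like this):

// Minimal sketch: compare the bundled asset size to the file on disk.
// Reuses `model` and `filePath` from initialize() above.
final data = await rootBundle.load('assets/models/$model');
final written = await File(filePath).length();
debugPrint('asset: ${data.lengthInBytes} bytes, on disk: $written bytes');
if (written != data.lengthInBytes) {
  throw StateError(
      'Model copy truncated: $written of ${data.lengthInBytes} bytes');
}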

The logcat output with the error is given below:

I/flutter (22081): Calling init()...
I/Quality (22081): Skipped: false 12 cost 201.68422 refreshRate 16603314 bit true processName com.example.frontend
D/VRI[MainActivity](22081): registerCallbacksForSync syncBuffer=false
D/BLASTBufferQueue(22081): [VRI[MainActivity]#0](f:0,a:1) acquireNextBufferLocked size=1080x2400 mFrameNumber=1 applyTransaction=true mTimestamp=1129899983992346(auto) mPendingTransactions.size=0 graphicBufferId=94837172862991 transform=0
D/VRI[MainActivity](22081): Received frameCommittedCallback lastAttemptedDrawFrameNum=1 didProduceBuffer=true syncBuffer=false
W/Parcel  (22081): Expecting binder but got null!
D/VRI[MainActivity](22081):  debugCancelDraw  cancelDraw=false,count = 247,android.view.ViewRootImpl@5f45cfc
D/VRI[MainActivity](22081): draw finished.
D/VRI[MainActivity](22081): onFocusEvent true
I/flutter (22081): CPU : NEON = 1 | ARM_FMA = 1 | LLAMAFILE = 1 | REPACK = 1 |
I/flutter (22081): modelPath: /data/user/0/com.example.frontend/files/model.gguf
I/flutter (22081): libraryPath: /data/app/~~GJjJDHgHXAOuGkVJm8rXmA==/com.example.frontend-rxovsU9_2IHs-z_cK78-Nw==/lib/arm64/libllama.so
I/flutter (22081): TinyLlama init error: LlamaException: Error loading model: LlamaException: Failed to initialize Llama (LlamaException: Could not load model at /data/user/0/com.example.frontend/files/model.gguf)
I/flutter (22081):
I/Quality (22081): Skipped: false 1 cost 16.622011 refreshRate 16605110 bit true processName com.example.frontend
D/ProfileInstaller(22081): Installing profile for com.example.frontend
W/OnBackInvokedCallback(22081): OnBackInvokedCallback is not enabled for the application.
W/OnBackInvokedCallback(22081): Set 'android:enableOnBackInvokedCallback="true"' in the application manifest.
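From the log, libllama.so itself loads (the "CPU : NEON = 1 | ..." line is printed by the native library once it is up), so the failure appears to be in opening or parsing the model file at the printed path rather than in the library setup.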
