Using transformers on Kaggle

PeteBleackley · July 29, 2025, 12:31pm

In this Kaggle kernel I’m using a ModernBERT base model with a custom head. The resulting model inherits from transformers.PreTrainedModel.
This works fine in the draft session, but it throws the error below when I try to save and commit.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py in __getattr__(self, name)
   2044             try:
-> 2045                 module = self._get_module(self._class_to_module[name])
   2046                 value = getattr(module, name)

/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
   2074         except Exception as e:
-> 2075             raise e
   2076 

/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
   2072         try:
-> 2073             return importlib.import_module("." + module_name, self.__name__)
   2074         except Exception as e:

/usr/lib/python3.11/importlib/__init__.py in import_module(name, package)
    125             level += 1
--> 126     return _bootstrap._gcd_import(name[level:], package, level)
    127 

/usr/lib/python3.11/importlib/_bootstrap.py in _gcd_import(name, package, level)

/usr/lib/python3.11/importlib/_bootstrap.py in _find_and_load(name, import_)

/usr/lib/python3.11/importlib/_bootstrap.py in _find_and_load_unlocked(name, import_)

/usr/lib/python3.11/importlib/_bootstrap.py in _load_unlocked(spec)

/usr/lib/python3.11/importlib/_bootstrap_external.py in exec_module(self, module)

/usr/lib/python3.11/importlib/_bootstrap.py in _call_with_frames_removed(f, *args, **kwds)

/usr/local/lib/python3.11/dist-packages/transformers/modeling_utils.py in <module>
     72 )
---> 73 from .loss.loss_utils import LOSS_MAPPING
     74 from .pytorch_utils import (  # noqa: F401

/usr/local/lib/python3.11/dist-packages/transformers/loss/loss_utils.py in <module>
     20 
---> 21 from .loss_d_fine import DFineForObjectDetectionLoss
     22 from .loss_deformable_detr import DeformableDetrForObjectDetectionLoss, DeformableDetrForSegmentationLoss

/usr/local/lib/python3.11/dist-packages/transformers/loss/loss_d_fine.py in <module>
     20 from ..utils import is_vision_available
---> 21 from .loss_for_object_detection import (
     22     box_iou,

/usr/local/lib/python3.11/dist-packages/transformers/loss/loss_for_object_detection.py in <module>
     31 if is_vision_available():
---> 32     from transformers.image_transforms import center_to_corners_format
     33 

/usr/local/lib/python3.11/dist-packages/transformers/image_transforms.py in <module>
     20 
---> 21 from .image_utils import (
     22     ChannelDimension,

/usr/local/lib/python3.11/dist-packages/transformers/image_utils.py in <module>
     58     if is_torchvision_available():
---> 59         from torchvision.transforms import InterpolationMode
     60 

/usr/local/lib/python3.11/dist-packages/torchvision/__init__.py in <module>
      9 from .extension import _HAS_OPS  # usort:skip
---> 10 from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils  # usort:skip
     11 

/usr/local/lib/python3.11/dist-packages/torchvision/_meta_registrations.py in <module>
    162 
--> 163 @torch.library.register_fake("torchvision::nms")
    164 def meta_nms(dets, scores, iou_threshold):

/usr/local/lib/python3.11/dist-packages/torch/library.py in register(func)
   1022             use_lib = lib
-> 1023         use_lib._register_fake(op_name, func, _stacklevel=stacklevel + 1)
   1024         return func

/usr/local/lib/python3.11/dist-packages/torch/library.py in _register_fake(self, op_name, fn, _stacklevel)
    213 
--> 214         handle = entry.fake_impl.register(func_to_register, source)
    215         self._registration_handles.append(handle)

/usr/local/lib/python3.11/dist-packages/torch/_library/fake_impl.py in register(self, func, source)
     30             )
---> 31         if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
     32             raise RuntimeError(

RuntimeError: operator torchvision::nms does not exist

The above exception was the direct cause of the following exception:

ModuleNotFoundError                       Traceback (most recent call last)
/tmp/ipykernel_26/2863755039.py in <cell line: 0>()
----> 1 class EncoderModel(transformers.PreTrainedModel):
      2 
      3     def __init__(self,path):
      4         """
      5         Creates the encoder model

/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py in __getattr__(self, name)
   2046                 value = getattr(module, name)
   2047             except (ModuleNotFoundError, RuntimeError) as e:
-> 2048                 raise ModuleNotFoundError(
   2049                     f"Could not import module '{name}'. Are this object's requirements defined correctly?"
   2050                 ) from e

ModuleNotFoundError: Could not import module 'PreTrainedModel'. Are this object's requirements defined correctly?

Has anyone else encountered this issue, or does anyone have an idea how to solve it?

John6666 · July 29, 2025, 10:46pm

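Perhaps the Kaggle Docker image has an outdated version of torchvision that is incompatible with the installed torch?
I think it might work if you pin torchvision to a version that matches torch or, if you don't need it, uninstall it altogether… The traceback shows that transformers only imports torchvision when is_torchvision_available() is true, so removing the package should skip the failing import.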
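To check whether the versions really are mismatched, something like the sketch below could be run in both the draft session and the committed run. It reads the installed versions from package metadata, so it does not trigger the failing import. The helper names are made up for illustration, and the pairing rule (torch 2.N goes with torchvision 0.(N+15)) is an assumption based on recent releases; please verify it against the compatibility table in the torchvision README.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version):
    """Parse a string such as '2.6.0+cu124' into (2, 6), ignoring the build suffix."""
    release = version.split("+")[0]
    major, minor = release.split(".")[:2]
    return int(major), int(minor)


def versions_look_paired(torch_version, torchvision_version):
    """Heuristic check (an assumption, not an official API):
    torch 2.N is expected to ship alongside torchvision 0.(N+15)."""
    t_major, t_minor = major_minor(torch_version)
    v_major, v_minor = major_minor(torchvision_version)
    if t_major != 2:
        return None  # the heuristic does not cover this release line
    return v_major == 0 and v_minor == t_minor + 15


if __name__ == "__main__":
    # Print what is installed without importing torch or torchvision.
    for name in ("torch", "torchvision", "transformers"):
        print(name, installed_version(name))

    torch_v = installed_version("torch")
    vision_v = installed_version("torchvision")
    if torch_v and vision_v:
        print("versions look paired:", versions_look_paired(torch_v, vision_v))
```

If the check reports a mismatch and the notebook does not need image models, running `!pip uninstall -y torchvision` in the first cell, before importing transformers, is one way to try the uninstall route.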

github.com/ultralytics/ultralytics

NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend.

Issue opened 23 Sep 2023 by chri002 (label: bug), closed 24 Sep 2023. The reporter hit this error when calling torchvision.ops.nms through ultralytics YOLO inference in Jupyter on Windows 11 (Python 3.11.3, torch 2.0.1+cu11.8); the same code worked when run as a script from the terminal.
