Bug: RuntimeError: CUDA error: an illegal memory access was encountered
Description:
If a GPU index other than 0 is specified for forced alignment, the following error occurs:
Traceback (most recent call last):
File "/home/jovyan/popov_ilya/audio_framework/main.py", line 110, in <module>
main()
File "/home/jovyan/popov_ilya/audio_framework/main.py", line 83, in main
processor()
File "/home/jovyan/popov_ilya/audio_framework/src/preprocessing/aligner.py", line 324, in __call__
sentence_spans = self.__compute_alignments(emission, texts)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jovyan/popov_ilya/audio_framework/src/preprocessing/aligner.py", line 369, in __compute_alignments
alignments, scores = forced_align(emission, targets, blank=self.config.blank_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jovyan/popov_ilya/audio_framework/.venv/lib/python3.12/site-packages/torchaudio/functional/_alignment.py", line 72, in forced_align
paths, scores = torch.ops.torchaudio.forced_align(log_probs, targets, input_lengths, target_lengths, blank)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/jovyan/popov_ilya/audio_framework/.venv/lib/python3.12/site-packages/torch/_ops.py", line 1061, in __call__
return self_._op(*args, **(kwargs or {}))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Solution
To select a specific GPU, set the CUDA_VISIBLE_DEVICES environment variable so that the chosen card is the only one visible to the process.
TODO
Set CUDA_VISIBLE_DEVICES from within the code.
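
A minimal sketch of how this TODO might look, not the project's actual implementation. The helper name `select_gpu` and its `gpu_index` parameter are hypothetical. It assumes the variable is set before CUDA is initialized, which is safest before `import torch`. Once a single card is exposed it is renumbered, so the process should address it as cuda:0 regardless of its physical index.

```python
import os


def select_gpu(gpu_index: int) -> str:
    """Expose only one physical GPU to this process (hypothetical helper).

    Call this before CUDA is initialized, most safely before `import torch`.
    Returns the device string to use afterwards.
    """
    if gpu_index < 0:
        raise ValueError("gpu_index must be non-negative")
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    # The single visible GPU is renumbered, so it is addressed as cuda:0.
    return "cuda:0"


device = select_gpu(1)  # physical GPU 1 becomes cuda:0 inside this process
```

With this approach the aligner would always run on `device` ("cuda:0"), and the physical card would be chosen only through the index passed to `select_gpu`.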