Failed to get convolution algorithm : CRASH
Hi Team,
Here I am again, this time reporting a crash. The crash report is below.
Please help me!
Thanks again,
Vicky
08/01/2019 11:43:26 MainProcess training_0 _base __init__ DEBUG Initialized Trainer
08/01/2019 11:43:26 MainProcess training_0 train load_trainer DEBUG Loaded Trainer
08/01/2019 11:43:26 MainProcess training_0 train run_training_cycle DEBUG Running Training Cycle
08/01/2019 11:43:26 MainProcess training_0 training_data minibatch DEBUG Launching minibatch generator for queue (side: 'a', is_display: False)
08/01/2019 11:43:26 MainProcess training_0 _base generate_preview DEBUG Generating preview
08/01/2019 11:43:26 MainProcess training_0 _base set_preview_feed DEBUG Setting preview feed: (side: 'a')
08/01/2019 11:43:26 MainProcess training_0 _base load_generator DEBUG Loading generator: a
08/01/2019 11:43:26 MainProcess training_0 _base load_generator DEBUG input_size: 64, output_shapes: [(64, 64, 3)]
08/01/2019 11:43:26 MainProcess training_0 training_data __init__ DEBUG Initializing TrainingDataGenerator: (model_input_size: 64, model_output_shapes: [(64, 64, 3)], training_opts: {'alignments': {'a': '/home/vicky/Vicky/Projects/facial/faceswap/dataset/cageO/alignments.json', 'b': '/home/vicky/Vicky/Projects/facial/faceswap/dataset/trumpO/alignments.json'}, 'preview_scaling': 0.5, 'warp_to_landmarks': False, 'augment_color': True, 'no_flip': False, 'pingpong': False, 'snapshot_interval': 25000, 'training_size': 256, 'no_logs': False, 'mask_type': None, 'coverage_ratio': 0.625}, landmarks: False, config: {'mask_type': None, 'icnr_init': False, 'conv_aware_init': False, 'subpixel_upscaling': False, 'reflect_padding': False, 'dssim_loss': True, 'penalized_mask_loss': True, 'preview_images': 14, 'zoom_amount': 5, 'rotation_range': 10, 'shift_range': 5, 'flip_chance': 50, 'color_lightness': 30, 'color_ab': 8, 'color_clahe_chance': 50, 'color_clahe_max_size': 4})
08/01/2019 11:43:26 MainProcess training_0 training_data set_mask_class DEBUG Mask class: None
08/01/2019 11:43:26 MainProcess training_0 training_data __init__ DEBUG Initializing ImageManipulation: (input_size: 64, output_shapes: [(64, 64, 3)], coverage_ratio: 0.625, config: {'mask_type': None, 'icnr_init': False, 'conv_aware_init': False, 'subpixel_upscaling': False, 'reflect_padding': False, 'dssim_loss': True, 'penalized_mask_loss': True, 'preview_images': 14, 'zoom_amount': 5, 'rotation_range': 10, 'shift_range': 5, 'flip_chance': 50, 'color_lightness': 30, 'color_ab': 8, 'color_clahe_chance': 50, 'color_clahe_max_size': 4})
08/01/2019 11:43:26 MainProcess training_0 training_data __init__ DEBUG Output sizes: [64]
08/01/2019 11:43:26 MainProcess training_0 training_data __init__ DEBUG Initialized ImageManipulation
08/01/2019 11:43:26 MainProcess training_0 training_data __init__ DEBUG Initialized TrainingDataGenerator
08/01/2019 11:43:26 MainProcess training_0 training_data minibatch_ab DEBUG Queue batches: (image_count: 319, batchsize: 14, side: 'a', do_shuffle: True, is_preview, True, is_timelapse: False)
08/01/2019 11:43:26 MainProcess training_0 training_data make_queues DEBUG ['preview_a_in', 'preview_a_out']
08/01/2019 11:43:26 MainProcess training_0 queue_manager get_queue DEBUG QueueManager getting: 'preview_a_in'
08/01/2019 11:43:26 MainProcess training_0 queue_manager add_queue DEBUG QueueManager adding: (name: 'preview_a_in', maxsize: 0)
08/01/2019 11:43:26 MainProcess training_0 queue_manager add_queue DEBUG QueueManager added: (name: 'preview_a_in')
08/01/2019 11:43:26 MainProcess training_0 queue_manager get_queue DEBUG QueueManager got: 'preview_a_in'
08/01/2019 11:43:26 MainProcess training_0 queue_manager get_queue DEBUG QueueManager getting: 'preview_a_out'
08/01/2019 11:43:26 MainProcess training_0 queue_manager add_queue DEBUG QueueManager adding: (name: 'preview_a_out', maxsize: 0)
08/01/2019 11:43:26 MainProcess training_0 queue_manager add_queue DEBUG QueueManager added: (name: 'preview_a_out')
08/01/2019 11:43:26 MainProcess training_0 queue_manager get_queue DEBUG QueueManager got: 'preview_a_out'
08/01/2019 11:43:26 MainProcess training_0 training_data minibatch_ab DEBUG Batch shapes: [(14, 256, 256, 3), (14, 64, 64, 3), (14, 64, 64, 3)]
08/01/2019 11:43:26 MainProcess training_0 multithreading __init__ DEBUG Initializing FixedProducerDispatcher: (method: '<bound method TrainingDataGenerator.load_batches of <lib.training_data.TrainingDataGenerator object at 0x7f13347c59b0>>', shapes: [(14, 256, 256, 3), (14, 64, 64, 3), (14, 64, 64, 3)], ctype: <class 'ctypes.c_float'>, workers: 1, buffers: None)
08/01/2019 11:43:26 MainProcess training_0 multithreading __init__ DEBUG Initialized FixedProducerDispatcher
08/01/2019 11:43:26 MainProcess training_0 training_data minibatch_ab DEBUG Batching to queue: (side: 'a', is_display: True)
08/01/2019 11:43:26 MainProcess training_0 _base set_preview_feed DEBUG Set preview feed. Batchsize: 14
08/01/2019 11:43:26 MainProcess training_0 training_data minibatch DEBUG Launching minibatch generator for queue (side: 'a', is_display: True)
08/01/2019 11:43:26 SpawnProcess-4 MainThread multithreading _runner DEBUG FixedProducerDispatcher worker for <bound method TrainingDataGenerator.load_batches of <lib.training_data.TrainingDataGenerator object at 0x7f6ebee575c0>> started
08/01/2019 11:43:26 SpawnProcess-4 MainThread training_data load_batches DEBUG Loading batch: (image_count: 319, side: 'a', is_display: True, do_shuffle: True)
08/01/2019 11:43:26 MainProcess training_0 _base largest_face_index DEBUG 0
08/01/2019 11:43:26 MainProcess training_0 deprecation new_func WARNING From /home/vicky/miniconda3/envs/env_faceswap/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\nInstructions for updating:\nUse tf.cast instead.
08/01/2019 11:43:29 MainProcess training_0 training_data join_subprocess DEBUG Joining FixedProducerDispatcher
08/01/2019 11:43:29 SpawnProcess-2 MainThread training_data load_batches DEBUG Finished batching: (epoch: 128, side: 'a', is_display: False)
08/01/2019 11:43:29 SpawnProcess-2 MainThread multithreading _runner DEBUG FixedProducerDispatcher worker for <bound method TrainingDataGenerator.load_batches of <lib.training_data.TrainingDataGenerator object at 0x7f60dff6a550>> shutdown
08/01/2019 11:43:29 MainProcess training_0 training_data join_subprocess DEBUG Joined FixedProducerDispatcher
08/01/2019 11:43:29 MainProcess training_0 training_data join_subprocess DEBUG Joining FixedProducerDispatcher
08/01/2019 11:43:29 SpawnProcess-3 MainThread training_data load_batches DEBUG Finished batching: (epoch: 128, side: 'b', is_display: False)
08/01/2019 11:43:29 SpawnProcess-3 MainThread multithreading _runner DEBUG FixedProducerDispatcher worker for <bound method TrainingDataGenerator.load_batches of <lib.training_data.TrainingDataGenerator object at 0x7f919a390550>> shutdown
08/01/2019 11:43:29 MainProcess training_0 training_data join_subprocess DEBUG Joined FixedProducerDispatcher
08/01/2019 11:43:29 MainProcess training_0 multithreading run DEBUG Error in thread (training_0): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.\n [[{{node encoder/conv_0_conv2d/convolution}}]]\n [[{{node decoder_a/face_out/Sigmoid-2-0-TransposeNCHWToNHWC-LayoutOptimizer}}]]
08/01/2019 11:43:29 MainProcess MainThread train monitor DEBUG Thread error detected
08/01/2019 11:43:29 MainProcess MainThread train monitor DEBUG Closed Monitor
08/01/2019 11:43:29 MainProcess MainThread train end_thread DEBUG Ending Training thread
08/01/2019 11:43:29 MainProcess MainThread train end_thread CRITICAL Error caught! Exiting...
08/01/2019 11:43:29 MainProcess MainThread multithreading join DEBUG Joining Threads: 'training'
08/01/2019 11:43:29 MainProcess MainThread multithreading join DEBUG Joining Thread: 'training_0'
08/01/2019 11:43:29 MainProcess MainThread multithreading join ERROR Caught exception in thread: 'training_0'
Traceback (most recent call last):
File "/home/vicky/Vicky/Projects/facial/faceswap/lib/cli.py", line 122, in execute_script
process.process()
File "/home/vicky/Vicky/Projects/facial/faceswap/scripts/train.py", line 98, in process
self.end_thread(thread, err)
File "/home/vicky/Vicky/Projects/facial/faceswap/scripts/train.py", line 124, in end_thread
thread.join()
File "/home/vicky/Vicky/Projects/facial/faceswap/lib/multithreading.py", line 460, in join
raise thread.err[1].with_traceback(thread.err[2])
File "/home/vicky/Vicky/Projects/facial/faceswap/lib/multithreading.py", line 391, in run
self._target(*self._args, **self._kwargs)
File "/home/vicky/Vicky/Projects/facial/faceswap/scripts/train.py", line 150, in training
raise err
File "/home/vicky/Vicky/Projects/facial/faceswap/scripts/train.py", line 140, in training
self.run_training_cycle(model, trainer)
File "/home/vicky/Vicky/Projects/facial/faceswap/scripts/train.py", line 222, in run_training_cycle
trainer.train_one_step(viewer, timelapse)
File "/home/vicky/Vicky/Projects/facial/faceswap/plugins/train/trainer/_base.py", line 211, in train_one_step
raise err
File "/home/vicky/Vicky/Projects/facial/faceswap/plugins/train/trainer/_base.py", line 176, in train_one_step
loss[side] = batcher.train_one_batch(do_preview)
File "/home/vicky/Vicky/Projects/facial/faceswap/plugins/train/trainer/_base.py", line 276, in train_one_batch
loss = self.model.predictors[self.side].train_on_batch(*batch)
File "/home/vicky/miniconda3/envs/env_faceswap/lib/python3.6/site-packages/keras/engine/training.py", line 1217, in train_on_batch
outputs = self.train_function(ins)
File "/home/vicky/miniconda3/envs/env_faceswap/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2715, in __call__
return self._call(inputs)
File "/home/vicky/miniconda3/envs/env_faceswap/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2675, in _call
fetched = self._callable_fn(*array_vals)
File "/home/vicky/miniconda3/envs/env_faceswap/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1439, in __call__
run_metadata_ptr)
File "/home/vicky/miniconda3/envs/env_faceswap/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node encoder/conv_0_conv2d/convolution}}]]
[[{{node decoder_a/face_out/Sigmoid-2-0-TransposeNCHWToNHWC-LayoutOptimizer}}]]
============ System Information ============
encoding: UTF-8
git_branch: master
git_commits: c3adc93 Update GUI Graph + Stats when model has finished saving. 2610eff Bugfix: GUI: Progress bar on times over 1 hour (extract/convert). c1c60a9 bugfix: Clip output from scaling in convert. 8b2f166 Update helptext for CA Initialization. b6c830c Bugfix: Alignments tool: Correctly set items attribute on Check job
gpu_cuda: 10.1
gpu_cudnn: 7.6.0
gpu_devices: GPU_0: GeForce RTX 2080 Ti
gpu_devices_active: GPU_0
gpu_driver: 418.56
gpu_vram: GPU_0: 10986MB
os_machine: x86_64
os_platform: Linux-4.18.0-25-generic-x86_64-with-debian-buster-sid
os_release: 4.18.0-25-generic
py_command: /home/vicky/Vicky/Projects/facial/faceswap/faceswap.py train -A /home/vicky/Vicky/Projects/facial/faceswap/dataset/cageO -B /home/vicky/Vicky/Projects/facial/faceswap/dataset/trumpO -m /home/vicky/Vicky/Projects/facial/faceswap/dataset/trump-cage-model -t original -s 100 -ss 25000 -bs 64 -it 1000000 -g 1 -ps 50 -L INFO -gui
py_conda_version: conda 4.7.10
py_implementation: CPython
py_version: 3.6.6
py_virtual_env: True
sys_cores: 8
sys_processor: x86_64
sys_ram: Total: 32102MB, Available: 20682MB, Used: 10020MB, Free: 2429MB
=============== Pip Packages ===============
absl-py==0.7.1
astor==0.7.1
astroid==2.2.5
certifi==2019.6.16
cloudpickle==1.2.1
cycler==0.10.0
cytoolz==0.10.0
dask==2.1.0
decorator==4.4.0
fastcluster==1.1.25
ffmpy==0.2.2
gast==0.2.2
google-pasta==0.1.7
grpcio==1.14.1
h5py==2.9.0
imageio==2.5.0
imageio-ffmpeg==0.3.0
isort==4.3.21
joblib==0.13.2
Keras==2.2.4
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
kiwisolver==1.1.0
lazy-object-proxy==1.4.1
Markdown==3.1.1
matplotlib==2.2.2
mccabe==0.6.1
mock==3.0.5
networkx==2.3
numpy==1.16.2
nvidia-ml-py3==7.352.1
olefile==0.46
opencv-python==4.1.0.25
pathlib==1.0.1
Pillow==5.1.0
protobuf==3.8.0
psutil==5.6.3
pylint==2.3.1
pyparsing==2.4.0
python-dateutil==2.8.0
pytz==2019.1
PyWavelets==1.0.3
PyYAML==5.1.1
scikit-image==0.15.0
scikit-learn==0.21.2
scipy==1.3.0
six==1.12.0
tensorboard==1.13.1
tensorflow==1.13.1
tensorflow-estimator==1.13.0
termcolor==1.1.0
toolz==0.10.0
toposort==1.5
tornado==6.0.3
tqdm==4.32.1
typed-ast==1.4.0
Werkzeug==0.15.4
wrapt==1.11.2
============== Conda Packages ==============
# packages in environment at /home/vicky/miniconda3/envs/env_faceswap:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
_tflow_select 2.1.0 gpu
absl-py 0.7.1 py36_0
astor 0.7.1 py36_0
blas 1.0 openblas
bzip2 1.0.8 h516909a_0 conda-forge
c-ares 1.15.0 h7b6447c_1001
ca-certificates 2019.5.15 0
cairo 1.14.12 h77bcde2_0
certifi 2019.6.16 py36_1
cloudpickle 1.2.1 py_0
cudatoolkit 10.0.130 0
cudnn 7.6.0 cuda10.0_0
cupti 10.0.130 0
cycler 0.10.0 py36_0
cytoolz 0.10.0 py36h7b6447c_0
dask-core 2.1.0 py_0
dbus 1.13.2 hc3f9b76_0
decorator 4.4.0 py36_1
expat 2.2.5 he1b5a44_1003 conda-forge
ffmpeg 4.0 h04d0a96_0
fontconfig 2.12.6 h49f89f6_0
freetype 2.8 hab7d2ae_1
gast 0.2.2 py36_0
gettext 0.19.8.1 hc5be6a0_1002 conda-forge
giflib 5.1.9 h516909a_0 conda-forge
glib 2.53.6 h5d9569c_2
gmp 6.1.2 hf484d3e_1000 conda-forge
gnutls 3.6.5 hd3a4fd2_1002 conda-forge
google-pasta 0.1.7 py_0
graphite2 1.3.13 hf484d3e_1000 conda-forge
grpcio 1.14.1 py36h9ba97e2_0
gst-plugins-base 1.12.4 h33fb286_0
gstreamer 1.12.4 hb53b477_0
h5py 2.9.0 pypi_0 pypi
harfbuzz 1.7.6 hc5b324e_0
hdf5 1.10.2 hba1933b_1
icu 58.2 h9c2bf20_1
imageio 2.5.0 py36_0
jasper 1.900.1 h07fcdf6_1006 conda-forge
jpeg 9c h14c3975_1001 conda-forge
keras 2.2.4 0
keras-applications 1.0.8 py_0
keras-base 2.2.4 py36_0
keras-preprocessing 1.1.0 py_1
kiwisolver 1.1.0 py36he6710b0_0
lame 3.100 h14c3975_1001 conda-forge
libblas 3.8.0 10_openblas conda-forge
libcblas 3.8.0 10_openblas conda-forge
libedit 3.1.20181209 hc058e9b_0
libffi 3.2.1 hd88cf55_4
libgcc-ng 9.1.0 hdf63c60_0
libgfortran-ng 7.3.0 hdf63c60_0
libiconv 1.15 h516909a_1005 conda-forge
liblapack 3.8.0 10_openblas conda-forge
liblapacke 3.8.0 10_openblas conda-forge
libopenblas 0.3.6 h6e990d7_6 conda-forge
libopus 1.3 h7b6447c_0
libpng 1.6.37 hed695b0_0 conda-forge
libprotobuf 3.8.0 hd408876_0
libstdcxx-ng 9.1.0 hdf63c60_0
libtiff 4.0.10 h57b8799_1003 conda-forge
libuuid 2.32.1 h14c3975_1000 conda-forge
libvpx 1.7.0 h439df22_0
libwebp 1.0.2 h576950b_1 conda-forge
libxcb 1.13 h14c3975_1002 conda-forge
libxml2 2.9.9 hea5a465_1
lz4-c 1.8.3 he1b5a44_1001 conda-forge
markdown 3.1.1 py36_0
matplotlib 2.2.2 py36h0e671d2_1
mock 3.0.5 py36_0
ncurses 6.1 he6710b0_1
nettle 3.4.1 h1bed415_1002 conda-forge
networkx 2.3 py_0
numpy 1.16.4 py36h95a1406_0 conda-forge
olefile 0.46 py36_0
openblas 0.3.6 h6e990d7_6 conda-forge
opencv 3.4.1 py36h6fd60c2_1
openh264 1.8.0 hdbcaa40_1000 conda-forge
openssl 1.0.2s h7b6447c_0
pathlib 1.0.1 py36_1
pcre 8.41 hf484d3e_1003 conda-forge
pillow 5.1.0 py36h3deb7b8_0
pip 19.1.1 py36_0
pixman 0.38.0 h516909a_1003 conda-forge
protobuf 3.8.0 py36he6710b0_0
pthread-stubs 0.4 h14c3975_1001 conda-forge
pyparsing 2.4.0 py_0
pyqt 5.9.2 py36h751905a_0
python 3.6.6 h6e4f718_2
python-dateutil 2.8.0 py36_0
pytz 2019.1 py_0
pywavelets 1.0.3 py36hdd07704_1
pyyaml 5.1.1 py36h7b6447c_0
qt 5.9.4 h4e5bff0_0
readline 7.0 h7b6447c_5
scikit-image 0.15.0 py36he6710b0_0
scipy 1.3.0 py36he2b7bc3_0
setuptools 41.0.1 py36_0
sip 4.19.8 py36hf484d3e_0
six 1.12.0 py36_0
sqlite 3.29.0 h7b6447c_0
tensorboard 1.13.1 py36hf484d3e_0
tensorflow 1.13.1 gpu_py36h3991807_0
tensorflow-base 1.13.1 gpu_py36h8d69cac_0
tensorflow-estimator 1.13.0 py_0
tensorflow-gpu 1.13.1 h0d30ee6_0
termcolor 1.1.0 py36_1
tk 8.6.8 hbc83047_0
toolz 0.10.0 py_0
tornado 6.0.3 py36h7b6447c_0
tqdm 4.32.1 py_0
werkzeug 0.15.4 py_0
wheel 0.33.4 py36_0
wrapt 1.11.2 py36h7b6447c_0
x264 1!152.20180806 h14c3975_0 conda-forge
xorg-kbproto 1.0.7 h14c3975_1002 conda-forge
xorg-libice 1.0.10 h516909a_0 conda-forge
xorg-libsm 1.2.3 h84519dc_1000 conda-forge
xorg-libx11 1.6.8 h516909a_0 conda-forge
xorg-libxau 1.0.9 h14c3975_0 conda-forge
xorg-libxdmcp 1.1.3 h516909a_0 conda-forge
xorg-libxext 1.3.4 h516909a_0 conda-forge
xorg-libxrender 0.9.10 h516909a_1002 conda-forge
xorg-renderproto 0.11.1 h14c3975_1002 conda-forge
xorg-xextproto 7.3.0 h14c3975_1002 conda-forge
xorg-xproto 7.0.31 h14c3975_1007 conda-forge
xz 5.2.4 h14c3975_4
yaml 0.1.7 had09818_2
zlib 1.2.11 h7b6447c_3
zstd 1.4.0 h3b9ef0a_0 conda-forge