2017-03-14

TensorFlow kernel crashes when I try to use trace_level = tf.RunOptions.FULL_TRACE

I am trying to monitor the usage of my TensorFlow models with Timeline. This answer explains how to use it: https://stackoverflow.com/a/37774470/6716760. The minimal example from it is:

import tensorflow as tf 
from tensorflow.python.client import timeline 

x = tf.random_normal([1000, 1000]) 
y = tf.random_normal([1000, 1000]) 
res = tf.matmul(x, y) 

# Run the graph with full trace option 
with tf.Session() as sess: 
    run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE) 
    run_metadata = tf.RunMetadata() 
    sess.run(res, options=run_options, run_metadata=run_metadata) 

    # Create the Timeline object, and write it to a json 
    tl = timeline.Timeline(run_metadata.step_stats) 
    ctf = tl.generate_chrome_trace_format() 
    with open('timeline.json', 'w') as f:
        f.write(ctf)

Unfortunately, I get the following error message when I try to run the script:

An error ocurred while starting the kernel 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally 
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations. 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. 
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties: 
name: TITAN X (Pascal) 
major: 6 minor: 1 memoryClockRate (GHz) 1.531 
pciBusID 0000:0a:00.0 
Total memory: 11.90GiB 
Free memory: 11.61GiB 
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x28f93b0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 1 with properties: 
name: TITAN X (Pascal) 
major: 6 minor: 1 memoryClockRate (GHz) 1.531 
pciBusID 0000:09:00.0 
Total memory: 11.90GiB 
Free memory: 11.75GiB 
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x2c976b0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 2 with properties: 
name: TITAN X (Pascal) 
major: 6 minor: 1 memoryClockRate (GHz) 1.531 
pciBusID 0000:06:00.0 
Total memory: 11.90GiB 
Free memory: 11.75GiB 
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x2ba5d80 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 3 with properties: 
name: TITAN X (Pascal) 
major: 6 minor: 1 memoryClockRate (GHz) 1.531 
pciBusID 0000:05:00.0 
Total memory: 11.89GiB 
Free memory: 11.52GiB 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 1 2 3 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y Y Y Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 1: Y Y Y Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 2: Y Y Y Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 3: Y Y Y Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) ‑> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:0a:00.0) 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:1) ‑> (device: 1, name: TITAN X (Pascal), pci bus id: 0000:09:00.0) 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:2) ‑> (device: 2, name: TITAN X (Pascal), pci bus id: 0000:06:00.0) 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:3) ‑> (device: 3, name: TITAN X (Pascal), pci bus id: 0000:05:00.0) 
I tensorflow/stream_executor/dso_loader.cc:126] Couldn't open CUDA library libcupti.so.8.0. LD_LIBRARY_PATH: /usr/local/cuda/lib64 
F tensorflow/core/platform/default/gpu/cupti_wrapper.cc:59] Check failed: ::tensorflow::Status::OK() == (::tensorflow::Env::Default()‑>GetSymbolFromLibrary(GetDsoHandle(), kName, &f)) (OK vs. Not found: /home/sysgen/anaconda3/lib/python3.5/site‑packages/tensorflow/python/_pywrap_tensorflow.so: undefined symbol: cuptiActivityRegisterCallbacks)could not find cuptiActivityRegisterCallbacksin libcupti DSO 

The actual error is hidden in the last line. But what does it mean, and how can I fix it?

This thread might offer some helpful information: https://github.com/tensorflow/tensorflow/issues/2626

Answer


You need to run:

sudo apt install libcupti-dev 

and add this to your bashrc/zshrc:

export LD_LIBRARY_PATH=/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH 

Hope it helps.
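To confirm that the fix took effect before re-running the profiling script, one option is to try loading the library directly from Python. The log in the question shows that libcupti.so.8.0 could not be opened and that the symbol cuptiActivityRegisterCallbacks was then missing, so the sketch below checks exactly those two things. The helper name cupti_symbol_available is made up for this illustration and is not part of TensorFlow or CUDA. Note that LD_LIBRARY_PATH is read when a process starts, so run this from a new shell after editing bashrc/zshrc.

```python
import ctypes


def cupti_symbol_available(library="libcupti.so.8.0",
                           symbol="cuptiActivityRegisterCallbacks"):
    """Return True if `library` can be loaded and exports `symbol`.

    Hypothetical helper for illustration; the default names are taken
    from the error log in the question.
    """
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        # The dynamic loader could not find or open the library.
        return False
    # ctypes looks up exported symbols on attribute access and raises
    # AttributeError when the symbol is missing, so hasattr() is enough.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    print("CUPTI usable:", cupti_symbol_available())
```

If this prints False, the loader still cannot see the library, and the FULL_TRACE run will most likely fail the same way.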

This happened to me too, and the reason was that the file cupti64_80.dll could not be found. CUDA 8 installs this file into the C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\extras\CUPTI\libx64 folder, which is not on the PATH. So copy the DLL to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin, and the lib file to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64
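A quick way to see whether the DLL is visible after copying it is to search the directories listed in PATH. This is only a rough sketch: find_on_path is a hypothetical helper written for this answer, and it only looks at PATH, which is one of several places Windows searches for DLLs.

```python
import os


def find_on_path(filename, path_value=None):
    """Return the directories in `path_value` that contain `filename`.

    Hypothetical helper for illustration. If `path_value` is None,
    the current PATH environment variable is used.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        # Skip empty entries left by stray separators.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # DLL name taken from the answer above.
    print(find_on_path("cupti64_80.dll"))
```

An empty list means none of the PATH directories contain the file, so either the copy did not land where expected or the target folder is not on the PATH.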
