
Tensorflow basic example error: CUBLAS_STATUS_NOT_INITIALIZED

Hello, I am trying to install and run TensorFlow 1.0.

I am following this guide: https://www.tensorflow.org/get_started/mnist/beginners

However, when I run the mnist_softmax.py file, I get the following error:

python3 mnist_softmax.py 
Extracting /tmp/tensorflow/mnist/input_data/train-images-idx3-ubyte.gz 
Extracting /tmp/tensorflow/mnist/input_data/train-labels-idx1-ubyte.gz 
Extracting /tmp/tensorflow/mnist/input_data/t10k-images-idx3-ubyte.gz 
Extracting /tmp/tensorflow/mnist/input_data/t10k-labels-idx1-ubyte.gz 
2017-05-03 14:25:28.243213: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. 
2017-05-03 14:25:28.243234: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations. 
2017-05-03 14:25:28.243238: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations. 
2017-05-03 14:25:28.243241: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations. 
2017-05-03 14:25:28.243244: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations. 
2017-05-03 14:25:28.436478: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce GTX 1080 Ti 
major: 6 minor: 1 memoryClockRate (GHz) 1.582 
pciBusID 0000:02:00.0 
Total memory: 10.91GiB 
Free memory: 349.06MiB 
2017-05-03 14:25:28.436501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0 
2017-05-03 14:25:28.436505: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0: Y 
2017-05-03 14:25:28.436510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0) 
2017-05-03 14:25:30.507057: E tensorflow/stream_executor/cuda/cuda_blas.cc:365] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED 
2017-05-03 14:25:30.507091: W tensorflow/stream_executor/stream.cc:1550] attempting to perform BLAS operation using StreamExecutor without BLAS support 
Traceback (most recent call last): 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1039, in _do_call 
    return fn(*args) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1021, in _run_fn 
    status, run_metadata) 
    File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__ 
    next(self.gen) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status 
    pywrap_tensorflow.TF_GetCode(status)) 
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(100, 784), b.shape=(784, 10), m=100, n=10, k=784 
    [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_Placeholder_0/_9, Variable/read)]] 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "mnist_softmax.py", line 79, in <module> 
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run 
    _sys.exit(main(_sys.argv[:1] + flags_passthrough)) 
    File "mnist_softmax.py", line 66, in main 
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 778, in run 
    run_metadata_ptr) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 982, in _run 
    feed_dict_string, options, run_metadata) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run 
    target_list, options, run_metadata) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call 
    raise type(e)(node_def, op, message) 
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(100, 784), b.shape=(784, 10), m=100, n=10, k=784 
    [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_Placeholder_0/_9, Variable/read)]] 

Caused by op 'MatMul', defined at: 
    File "mnist_softmax.py", line 79, in <module> 
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run 
    _sys.exit(main(_sys.argv[:1] + flags_passthrough)) 
    File "mnist_softmax.py", line 43, in main 
    y = tf.matmul(x, W) + b 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/ops/math_ops.py", line 1801, in matmul 
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_math_ops.py", line 1263, in _mat_mul 
    transpose_b=transpose_b, name=name) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 768, in apply_op 
    op_def=op_def) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op 
    original_op=self._default_original_op, op_def=op_def) 
    File "/home/fernando/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__ 
    self._traceback = _extract_stack() 

InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(100, 784), b.shape=(784, 10), m=100, n=10, k=784 
    [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](_recv_Placeholder_0/_9, Variable/read)]] 

I am not sure why I am getting this error. I also cannot run the matrixMulCUBLAS CUDA sample; it fails with the following error:

./matrixMulCUBLAS 
[Matrix Multiply CUBLAS] - Starting... 
GPU Device 0: "GeForce GTX 1080 Ti" with compute capability 6.1 

MatrixA(640,480), MatrixB(480,320), MatrixC(640,320) 
CUDA error at matrixMulCUBLAS.cpp:277 code=1(CUBLAS_STATUS_NOT_INITIALIZED) "cublasCreate(&handle)" 

None of the CUDA samples that use CUBLAS work; I am not sure whether this is related to my TensorFlow error.
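
For what it's worth, the same cuBLAS GEMM path can be exercised outside the tutorial with a minimal TF 1.x snippet (the shapes below are arbitrary, chosen only to mirror the failing MatMul); if cuBLAS cannot initialize, this raises the same InternalError:

import tensorflow as tf

# A single matmul pinned to the GPU goes through the same cuBLAS GEMM
# call that fails in mnist_softmax.py.
with tf.device('/gpu:0'):
    a = tf.random_normal([100, 784])
    b = tf.random_normal([784, 10])
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print(sess.run(c).shape)  # prints (100, 10) if the GEMM launches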

I get the same error with a script I am trying to run. Would someone please explain what the error "tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed" means? – Teancum

Answer

@FernandoMM I ran my script and got the same error. In my case I had external displays running on my GPU, and they had eaten up all of the GPU RAM. I disconnected all the displays and restarted Python (in my case I was using a Jupyter server), and it worked. It looks like you only have 'Free memory: 349.06MiB'. Maybe freeing up some GPU memory will work for you too? I am still not sure why this worked for me or how it relates to the error, so maybe someone else can enlighten us :).
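
If disconnecting the displays is not an option, TF 1.x also lets the session allocate GPU memory lazily instead of reserving almost all of it at start-up; a minimal sketch (the commented-out fraction value is just an example):

import tensorflow as tf

# Allocate GPU memory on demand rather than grabbing nearly all of it
# up front; optionally cap the fraction this process may use.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
# config.gpu_options.per_process_gpu_memory_fraction = 0.3  # optional cap

sess = tf.Session(config=config)

This only helps if enough free memory remains for cuBLAS to create its handle, so freeing the memory held by the displays is still the first thing to try.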