Ressource erschöpfte Fehler OOM beim Zuordnen von Tensor mit Form [384,192]

Ich habe es jetzt mit einem medizinischen Bildklassifikationsproblem zu tun. Die Bilder sind mehrere Schichten des Gehirns des Patienten. Die Daten wurden bereits bereinigt. Ich habe 150 AD-Patienten, 150 MCI-Patienten (milde Kognitionsstörung), 150 NC-Patienten (normal). Jeder Patient hat 96 Dicom-Dateien oder gleichlautende Aussagen, 96 Schnitte. Jede Scheibe ist 160 * 160.Ressource erschöpfte Fehler OOM beim Zuordnen von Tensor mit Form [384,192]

Ich benutze den Tensorflow cifar10 Code als meine Vorlage und um den Code arbeiten zu können muss ich seinen read_cifar10 Teil ändern. Ich ändere den Code, wie der folgende Link suggeriert.

Attach a queue to a numpy array in tensorflow for data fetch instead of files?

Zunächst ändere ich die Daten in binären Dateien mit dem meinem eigenen Python-Modul load_img_M.py. Um die Datenmenge zu verringern, wähle ich nur die mittleren Scheiben, von 30 bis 60.

import numpy as np 
    import dicom 
    import os      
    ad_path = '/home/zmz/Pictures/AD' 
    mci_path = '/home/zmz/Pictures/MCI' 
    nc_path = '/home/zmz/Pictures/NC' 
    data_path =['/home/zmz/Pictures/%s' %i for i in ['NC','MCI','AD']] 
    KDOWN = 29 
    KUP = 61 
    SLICESNUM = KUP - KDOWN + 1 
    RECORDBYTES = 160*160*SLICESNUM + 1 
    #load image from the directory and save to binary file 
    def img2binary():   
     train_arr = np.zeros([100,SLICESNUM*160*160+1]) 
     test_arr = np.zeros([50,SLICESNUM*160*160+1]) 
     for p in range(len(data_path)): 
      Patientlist = os.listdir(data_path[p])   
      for q in range(len(Patientlist)): 
       Dicompath = os.path.join(data_path[p],Patientlist[q]) 
       Dicomlist = os.listdir(Dicompath) 
       if q<100: 
        train_arr[q,0] = p 
       else: 
        test_arr[q-100,0] = p    
       for k in range(len(Dicomlist)): 
        if k>KDOWN and k<KUP:#select the middle slices which have the most information 
         Picturepath = os.path.join(Dicompath,Dicomlist[k]) 
         img = dicom.read_file(Picturepath) 
         #print(type(img.pixel_array)) 
         imgpixel = img.pixel_array.reshape(25600) 
         if q <100: 
          # assign the label of the picture 
          train_arr[q,(1+(k-KDOWN-1)*25600):(1+(k-KDOWN)*25600)] = imgpixel #assign the pixel 
         else:       
          test_arr[q-100,(1+(k-KDOWN-1)*25600):(1+(k-KDOWN)*25600)] = imgpixel             
      train_arr.tofile("/home/zmz/Pictures/tmp/images/train%s.bin"%p) 
      test_arr.tofile("/home/zmz/Pictures/tmp/images/test%s.bin"%p)

Und die binären Dateien wie folgt aussehen:

How binary files look like

Als nächstes ändere ich das cifar10_input Modul:

"""Routine for decoding the Alzeheimer dicom format""" 

from __future__ import absolute_import 
from __future__ import division 
from __future__ import print_function 
import load_img_M 
import os 
import numpy as np 
import dicom 

from six.moves import xrange # pylint: disable=redefined-builtin 
import tensorflow as tf 

# Process images of this size. Note that this differs from the original CIFAR 
# image size of 32 x 32. If one alters this number, then the entire model 
# architecture will change and any model would need to be retrained. 
IMAGE_SIZE = 160 

# Global constants describing the ADNI data set. 
IMAGE_HEIGHT = 160 
IMAGE_WIDTH = 160 
IMAGE_CHANNEL = 1 
SLICES_NUM = load_img_M.SLICESNUM 
NUM_CLASSES = 3 
NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = 300 
NUM_EXAMPLES_PER_EPOCH_FOR_EVAL = 150 
#define a dicom reader to read a record 


def read_ADNI(filename_queue): 
    """Reads and parses examples from ADNI data files. 

    Recommendation: if you want N-way read parallelism, call this function 
    N times. This will give you N independent Readers reading different 
    files & positions within those files, which will give better mixing of 
    examples. 

    Args: 
    filename_queue: A queue of strings with the filenames to read from. 

    Returns: 
    An object representing a single example, with the following fields: 
     height: number of rows in the result (160) 
     width: number of columns in the result (160) 
     channels: number of color channels in the result (1) 
     key: a scalar string Tensor describing the filename & record number 
     for this example. 
     label: an int32 Tensor with the label in the range 0,1,2. 
     uint8image: a [slice, height, width, channels] uint8 Tensor with the image data 
    """ 

    class ADNIRecord(object): 
    pass#do nothing the class is vacant 
    result = ADNIRecord() 


    label_bytes = 1 
    result.height = IMAGE_HEIGHT 
    result.width = IMAGE_WIDTH 
    result.depth = IMAGE_CHANNEL 
    result.slice = SLICES_NUM 
    image_bytes = result.height * result.width * result.depth * result.slice 
    record_bytes = label_bytes + image_bytes 

    assert record_bytes == load_img_M.RECORDBYTES 

    reader = tf.FixedLengthRecordReader(record_bytes=record_bytes) 
    result.key, value = reader.read(filename_queue) 

    # Convert from a string to a vector of uint8 that is record_bytes long. 
    record_bytes = tf.decode_raw(value, tf.uint8) 

    # The first bytes represent the label, which we convert from uint8->int32. 
    result.label = tf.cast(
     tf.strided_slice(record_bytes, [0], [label_bytes]), tf.int32) 

    # The remaining bytes after the label represent the image, which we reshape 
    # from [depth * height * width] to [depth, height, width]. 
    depth_major = tf.reshape(
     tf.strided_slice(record_bytes, [label_bytes], 
         [label_bytes + image_bytes]), 
     [result.slice,result.height, result.width, result.depth]) 
    # Convert from [depth, height, width] to [height, width, depth]. 
    #result.uint8image = tf.transpose(depth_major, [1, 2, 0]) 
    result.uint8image = depth_major 
    return result

Schließlich ändere ich den verzerrten Eingang. Ich lösche diese Blöcke für die Bildvorverarbeitung wie Beschneiden und Spiegeln:

Dieses Problem ist ein 3d Conv Netzwerkproblem. Ich benutze tf.conv3d und Änderungen vornehmen, damit der Code funktionieren könnte:

def inference(images):#core code 
    """Build the ADNI model. 

    Args: 
    images: Images returned from distorted_inputs() or inputs(). 

    Returns: 
    Logits. 
    """ 
    # We instantiate all variables using tf.get_variable() instead of 
    # tf.Variable() in order to share variables across multiple GPU training runs. 
    # If we only ran this model on a single GPU, we could simplify this function 
    # by replacing all instances of tf.get_variable() with tf.Variable(). 
    # 
    # conv1 
    with tf.variable_scope('conv1') as scope: 
    kernel = _variable_with_weight_decay('weights', 
             shape=[3, 3, 3, 1, 64], 
             stddev=5e-2, 
             wd=0.0) 
    conv = tf.nn.conv3d(images, kernel, [1, 1, 1, 1, 1], padding='SAME') 
    biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.0)) 
    pre_activation = tf.nn.bias_add(conv, biases) 
    conv1 = tf.nn.relu(pre_activation, name=scope.name) 
    _activation_summary(conv1) 

    # pool1 
    pool1 = tf.nn.max_pool3d(conv1, ksize=[1, 3, 3, 3, 1], strides=[1, 1, 2, 2, 1], 
         padding='SAME', name='pool1') 
    # norm1 
    #norm1 = tf.nn.lrn3d(pool1, 4, bias=1.0, alpha=0.001/9.0, beta=0.75,name='norm1') 
    norm1 = pool1 

    # conv2 
    with tf.variable_scope('conv2') as scope: 
    kernel = _variable_with_weight_decay('weights', 
             shape=[3, 3, 3, 64, 64], 
             stddev=5e-2, 
             wd=0.0) 
    conv = tf.nn.conv3d(norm1, kernel, [1, 1, 1, 1, 1], padding='SAME') 
    biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.1)) 
    pre_activation = tf.nn.bias_add(conv, biases) 
    conv2 = tf.nn.relu(pre_activation, name=scope.name) 
    _activation_summary(conv2) 

    # norm2 
    #norm2 = tf.nn.lrn3d(conv2, 4, bias=1.0, alpha=0.001/9.0, beta=0.75,name='norm2') 
    norm2 = conv2 
    # pool2 
    pool2 = tf.nn.max_pool3d(norm2, ksize=[1, 3, 3 ,3, 1], 
         strides=[1, 1, 2, 2, 1], padding='SAME', name='pool2') 

    # local3 
    with tf.variable_scope('local3') as scope: 
    # Move everything into depth so we can perform a single matrix multiply. 
    reshape = tf.reshape(pool2, [FLAGS.batch_size, -1]) 
    dim = reshape.get_shape()[1].value 
    weights = _variable_with_weight_decay('weights', shape=[dim, 384], 
              stddev=0.04, wd=0.004) 
    biases = _variable_on_cpu('biases', [384], tf.constant_initializer(0.1)) 
    local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name) 
    _activation_summary(local3) 

    # local4 
    with tf.variable_scope('local4') as scope: 
    weights = _variable_with_weight_decay('weights', shape=[384, 192], 
              stddev=0.04, wd=0.004) 
    biases = _variable_on_cpu('biases', [192], tf.constant_initializer(0.1)) 
    local4 = tf.nn.relu(tf.matmul(local3, weights) + biases, name=scope.name) 
    _activation_summary(local4) 

    # linear layer(WX + b), 
    # We don't apply softmax here because 
    # tf.nn.sparse_softmax_cross_entropy_with_logits accepts the unscaled logits 
    # and performs the softmax internally for efficiency. 
    with tf.variable_scope('softmax_linear') as scope: 
    weights = _variable_with_weight_decay('weights', [192, NUM_CLASSES], 
              stddev=1/192.0, wd=0.0) 
    biases = _variable_on_cpu('biases', [NUM_CLASSES], 
           tf.constant_initializer(0.0)) 
    softmax_linear = tf.add(tf.matmul(local4, weights), biases, name=scope.name) 
    _activation_summary(softmax_linear) 

    return softmax_linear

Ich habe Angst vor dem Computer nicht diese große Menge an data.So umgehen konnte ich die Losgröße 1 gewählt hatte auf dem ersten zu sein, in der Hoffnung der Code könnte mindestens laufen.Noch ich stoße auf das Problem im Titel. Eigentlich habe ich sehr gute Linux-Workstation.12G GPU Titan xp, 64G Speicher. Aber ich teile es mit meinen Klassenkameraden. Daher sind die Ressourcen, die meinem Konto zugewiesen sind, möglicherweise niedrig. My Linux GPU parameters

Und es ist noch besser, wenn Sie eine Berechnung durchführen können, um zu demonstrieren, warum die Ressource erschöpft ist.

Quelle

2017-10-10 oliverzzm

Selbst für Losgröße 1 ist das Netzwerk zu groß. Das Problem kann in der Anzahl der benötigten Gewichte gefunden werden, nicht in der Charge. Versuchen Sie ein paar Schichten der Convonet zu entfernen oder reduzieren Sie die Pixel der 3D-Bilder und sehen Sie, wo es Ihre Grenze ist.

Quelle

2017-10-10 12:07:23

Gibt es bessere Möglichkeiten, die Dicom-Daten einzugeben? Werden bei der Einreihung dieser Binärdateien mit 16 Threads viele RAMs benötigt? – oliverzzm

Ich glaube nicht, dass Ihr Problem die Daten sind. Der Speicher ist zu klein, um die Gewichte Ihres Faltungsnetzwerkes zu speichern. Die Daten anders zu füttern wird in Ihrem Fall keinen signifikanten Unterschied machen. –

Ich ändere die Größe auf 80 * 80, und keine erschöpften Ressourcen Problem. – oliverzzm

Ressource erschöpfte Fehler OOM beim Zuordnen von Tensor mit Form [384,192]

Antwort

Verwandte Themen