Image classification in Caffe always returns the same class

I have a problem with image classification in Caffe. I am using the ImageNet model (from the Caffe tutorial) to classify data that I created myself, but I always get the same classification result (the same class, namely class 3). This is how I proceed:

I use Caffe for Windows, with Python as the interface.

(1) I collect the data. My sample images (training and testing) are 5x5x3 (RGB) uint8 images, so the pixel values range from 0 to 255.
(2) I resize them to the size the ImageNet model requires: 256x256x3. For this I use the resize function in Matlab (nearest-neighbor interpolation).
(3) I create a LevelDB and the image_mean.
(4) I train my network (3000 iterations). The only parameters I change in the ImageNet definition are the paths to the mean image and to the LevelDBs.

The results I get:

I0428 12:38:04.350100 3236 solver.cpp:245]  Train net output #0: loss = 1.91102 (* 1 = 1.91102 loss) 
I0428 12:38:04.350100 3236 sgd_solver.cpp:106] Iteration 2900, lr = 0.0001 
I0428 12:38:30.353361 3236 solver.cpp:229] Iteration 2920, loss = 2.18008 
I0428 12:38:30.353361 3236 solver.cpp:245]  Train net output #0: loss = 2.18008 (* 1 = 2.18008 loss) 
I0428 12:38:30.353361 3236 sgd_solver.cpp:106] Iteration 2920, lr = 0.0001 
I0428 12:38:56.351630 3236 solver.cpp:229] Iteration 2940, loss = 1.90925 
I0428 12:38:56.351630 3236 solver.cpp:245]  Train net output #0: loss = 1.90925 (* 1 = 1.90925 loss) 
I0428 12:38:56.351630 3236 sgd_solver.cpp:106] Iteration 2940, lr = 0.0001 
I0428 12:39:22.341891 3236 solver.cpp:229] Iteration 2960, loss = 1.98917 
I0428 12:39:22.341891 3236 solver.cpp:245]  Train net output #0: loss = 1.98917 (* 1 = 1.98917 loss) 
I0428 12:39:22.341891 3236 sgd_solver.cpp:106] Iteration 2960, lr = 0.0001 
I0428 12:39:48.334151 3236 solver.cpp:229] Iteration 2980, loss = 2.45919 
I0428 12:39:48.334151 3236 solver.cpp:245]  Train net output #0: loss = 2.45919 (* 1 = 2.45919 loss) 
I0428 12:39:48.334151 3236 sgd_solver.cpp:106] Iteration 2980, lr = 0.0001 
I0428 12:40:13.040398 3236 solver.cpp:456] Snapshotting to binary proto file Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.caffemodel 
I0428 12:40:15.080418 3236 sgd_solver.cpp:273] Snapshotting solver state to binary proto file Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.solverstate 
I0428 12:40:15.820426 3236 solver.cpp:318] Iteration 3000, loss = 2.08741 
I0428 12:40:15.820426 3236 solver.cpp:338] Iteration 3000, Testing net (#0) 
I0428 12:41:50.398375 3236 solver.cpp:406]  Test net output #0: accuracy = 0.11914 
I0428 12:41:50.398375 3236 solver.cpp:406]  Test net output #1: loss = 2.71476 (* 1 = 2.71476 loss) 
I0428 12:41:50.398375 3236 solver.cpp:323] Optimization Done. 
I0428 12:41:50.398375 3236 caffe.cpp:222] Optimization Done.

(5) I run the following Python code to classify a single image:

# set up Python environment: numpy for numerical routines, and matplotlib for plotting 
import numpy as np 
import matplotlib.pyplot as plt 
# display plots in this notebook 


# set display defaults 
plt.rcParams['figure.figsize'] = (10, 10)  # large images 
plt.rcParams['image.interpolation'] = 'nearest' # don't interpolate: show square pixels 
plt.rcParams['image.cmap'] = 'gray' # use grayscale output rather than a (potentially misleading) color heatmap 

# The caffe module needs to be on the Python path; 
# we'll add it here explicitly. 
import sys 
caffe_root = '../' # this file should be run from {caffe_root}/examples (otherwise change this line) 
sys.path.insert(0, caffe_root + 'python') 

import caffe 
# If you get "No module named _caffe", either you have not built pycaffe or you have the wrong path. 


caffe.set_mode_cpu() 

model_def = 'C:/Caffe/caffe-windows-master/models/bvlc_reference_caffenet/deploy.prototxt' 
model_weights = 'Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.caffemodel' 

net = caffe.Net(model_def,  # defines the structure of the model 
       model_weights, # contains the trained weights 
       caffe.TEST)  # use test mode (e.g., don't perform dropout) 

#load mean image file and convert it to a .npy file-------------------------------- 
blob = caffe.proto.caffe_pb2.BlobProto() 
data = open('Z:/DeepLearning/S1S2/Stockholm/S1S2train256.binaryproto',"rb").read() 
blob.ParseFromString(data) 
nparray = caffe.io.blobproto_to_array(blob) 
f = file('Z:/DeepLearning/PythonCalssification/imgmean.npy',"wb") 
np.save(f,nparray) 

f.close() 


# load the mean ImageNet image (as distributed with Caffe) for subtraction 
mu1 = np.load('Z:/DeepLearning/PythonCalssification/imgmean.npy') 
mu1 = mu1.squeeze() 
mu = mu1.mean(1).mean(1) # average over pixels to obtain the mean (BGR) pixel values 
print 'mean-subtracted values:', zip('BGR', mu) 
print 'mean shape: ',mu1.shape 
print 'data shape: ',net.blobs['data'].data.shape 

# create transformer for the input called 'data' 
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape}) 

transformer.set_transpose('data', (2,0,1)) # move image channels to outermost dimension 
transformer.set_mean('data', mu)   # subtract the dataset-mean value in each channel 
transformer.set_raw_scale('data', 255)  # rescale from [0, 1] to [0, 255] 
transformer.set_channel_swap('data', (2,1,0)) # swap channels from RGB to BGR 

# set the size of the input (we can skip this if we're happy 
# with the default; we can also change it later, e.g., for different batch sizes) 
net.blobs['data'].reshape(50,  # batch size 
          3,   # 3-channel (BGR) images 
          227, 227) # image size is 227x227 

#load image 
image = caffe.io.load_image('Z:/DeepLearning/PythonCalssification/380.tiff') 
transformed_image = transformer.preprocess('data', image) 
#plt.imshow(image) 

# copy the image data into the memory allocated for the net 
net.blobs['data'].data[...] = transformed_image 

### perform classification 
output = net.forward() 

output_prob = output['prob'][0] # the output probability vector for the first image in the batch 

print 'predicted class is:', output_prob.argmax()

It does not matter which input image I use; I always get class "3" as the classification result. Here is a sample image that I train on and classify:

I would be very glad if anyone has an idea what is wrong. Thanks in advance!


2016-04-28 Mr M

How much data are you using? How many classes, and how many examples per class?

If you always get the same class, it means that the NN was not trained properly.
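The training log in the question is consistent with this. A softmax classifier that spreads its probability uniformly over K classes has a cross-entropy loss of ln K, and uniform guessing is right 1/K of the time. With the 8 classes the asker reports in the comments below, the logged values at iteration 3000 sit close to those chance levels. A minimal sketch of the arithmetic (the function name is illustrative, not part of Caffe):

```python
import math

def chance_level(num_classes):
    """Return (loss, accuracy) for a classifier that outputs a uniform
    distribution over num_classes and guesses uniformly at random."""
    loss = math.log(num_classes)     # cross-entropy of a uniform prediction
    accuracy = 1.0 / num_classes     # expected accuracy of uniform guessing
    return loss, accuracy

# Values copied from the training log in the question.
logged_loss = 2.08741
logged_accuracy = 0.11914

# 8 classes, as stated by the asker in the comments.
loss, accuracy = chance_level(8)
print("chance-level loss:     %.4f (logged: %.4f)" % (loss, logged_loss))
print("chance-level accuracy: %.4f (logged: %.4f)" % (accuracy, logged_accuracy))
```

A loss that hovers around ln K after thousands of iterations suggests the network has learned little beyond a constant output.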

Make sure the training set is balanced. When a classifier always predicts the same class, it is often because one class is overrepresented relative to the others. Suppose, for example, that you have two classes, the first represented by 95 instances and the second by 5. If the classifier assigns everything to the first class, it already reaches 95% accuracy.
One obvious thing is that you should normalize the inputs as image/255.0 - 0.5; this centers the input and reduces the standard deviation.
After that, make sure your training set contains at least 4 times as many samples as your NN has weights.
Last but not least, make sure the training set is properly shuffled.
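The checks above can be sketched in plain Python. This is only an illustration under simple assumptions (labels held in a list, pixel values as uint8 numbers); the helper names are hypothetical and none of this is Caffe API:

```python
import random
from collections import Counter

def majority_baseline(labels):
    """Accuracy reached by always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / float(len(labels))

def normalize_pixel(value):
    """Map a uint8 pixel value in [0, 255] to the range [-0.5, 0.5]."""
    return value / 255.0 - 0.5

def shuffle_together(samples, labels, seed=0):
    """Shuffle samples and labels with the same permutation,
    so that every sample keeps its own label."""
    paired = list(zip(samples, labels))
    random.Random(seed).shuffle(paired)
    shuffled_samples, shuffled_labels = zip(*paired)
    return list(shuffled_samples), list(shuffled_labels)

# The two-class example from the answer: 95 instances versus 5.
labels = [0] * 95 + [1] * 5
print("majority-class baseline: %.2f" % majority_baseline(labels))

print("normalized range: %.1f to %.1f" % (normalize_pixel(0), normalize_pixel(255)))

# Sample i is just the index i here, which makes the pairing easy to verify.
samples = list(range(len(labels)))
shuffled_samples, shuffled_labels = shuffle_together(samples, labels)
```

If the reported accuracy is no better than the majority-class baseline, the network may simply be predicting the most frequent class.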


2016-04-28 21:03:04 FiReTiTi

I will try to go through your suggestions step by step: 1) I have 8 classes. They are represented by the following sample sizes: class 1: 918, class 2: 897, class 3: 922, class 4: 799, class 5: 69, class 6: 277, class 7: 718, class 8: 691.
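For reference, these counts can be summarized with a few lines of Python. This is only a sketch of the arithmetic; the thread does not say whether the counts cover training data only or training and test data together:

```python
# Samples per class, copied from the comment above.
class_counts = {1: 918, 2: 897, 3: 922, 4: 799, 5: 69, 6: 277, 7: 718, 8: 691}

total = sum(class_counts.values())
largest = max(class_counts, key=class_counts.get)
smallest = min(class_counts, key=class_counts.get)

largest_share = 100.0 * class_counts[largest] / total
smallest_share = 100.0 * class_counts[smallest] / total
ratio = class_counts[largest] / float(class_counts[smallest])

print("total samples: %d" % total)
print("largest class: %d (%.1f%% of the data)" % (largest, largest_share))
print("smallest class: %d (%.1f%% of the data)" % (smallest, smallest_share))
print("largest/smallest ratio: %.1f" % ratio)
```

Class 3, the one that is always predicted, is the largest class, but it holds only about 17% of the 5291 samples, so imbalance alone would not explain much. Class 5 is clearly underrepresented. Measured against the rule of thumb about data versus weights, a few thousand samples is far fewer than the weight count of CaffeNet, which is on the order of tens of millions (an approximate figure, not taken from this thread).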

2) As far as I understand, the ImageNet model requires image normalization using the image/pixel mean. That is why the following steps are performed in the Python code above:
transformer.set_transpose('data', (2,0,1)) # move image channels to outermost dimension
transformer.set_mean('data', mu) # subtract the dataset-mean value in each channel
transformer.set_raw_scale('data', 255) # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (2,1,0)) # swap channels from RGB to BGR

In these steps the image mean is subtracted, the image is scaled to 0-255, the channels are swapped because they are loaded in the reverse order, and finally the transpose operation is performed (I am not 100% sure why this one is needed, though).
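On the transpose: caffe.io.load_image returns an array laid out as height x width x channels (RGB, floats in [0, 1]), while Caffe blobs are laid out as channels x height x width, so the axes have to be reordered. Also, as far as I understand pycaffe, the order of the set_* calls does not determine the order of execution: Transformer.preprocess resizes first, then transposes, swaps channels, applies the raw scale, and only then subtracts the mean. The NumPy sketch below mimics that order without the resize step. The function name and the toy values are illustrative assumptions, not part of Caffe:

```python
import numpy as np

def preprocess_like_transformer(image, mean_bgr, raw_scale=255.0):
    """Mimic the assumed order of pycaffe's Transformer.preprocess
    (resize omitted). `image` is H x W x 3, RGB, floats in [0, 1];
    `mean_bgr` holds one mean value per channel in BGR order."""
    data = image.astype(np.float32)
    data = data.transpose((2, 0, 1))    # H x W x C -> C x H x W
    data = data[[2, 1, 0], :, :]        # RGB -> BGR
    data = data * raw_scale             # [0, 1] -> [0, 255]
    data = data - mean_bgr[:, np.newaxis, np.newaxis]  # per-channel mean
    return data

# Toy input: a 2x2 image that is pure red in RGB.
image = np.zeros((2, 2, 3), dtype=np.float32)
image[..., 0] = 1.0

# Hypothetical per-channel means (B, G, R), chosen only for illustration.
mean_bgr = np.array([10.0, 20.0, 30.0], dtype=np.float32)

out = preprocess_like_transformer(image, mean_bgr)
print(out.shape)      # channels first
print(out[:, 0, 0])   # B, G, R values of the top-left pixel
```

Because the mean is subtracted after the raw scale has been applied, the mean values must be on the 0-255 scale, which matches a mean computed from uint8 training images.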
