Caffe always gives the same prediction in Python, but training accuracy is good
I trained a model with Caffe (from the command line) and reached an accuracy of 63% according to the log files. However, when I run a Python script to test the accuracy, every prediction falls into the same class, with prediction values that are very similar but not quite identical. My goal is to compute the per-class accuracy.
Here are some example predictions:
[ 0.20748076 0.20283087 0.04773897 0.28503627 0.04591063 0.21100247] (label 0)
[ 0.21177764 0.20092578 0.04866471 0.28302929 0.04671735 0.20888527] (label 4)
[ 0.19711637 0.20476575 0.04688895 0.28988105 0.0465695 0.21477833] (label 3)
[ 0.21062914 0.20984225 0.04802448 0.26924771 0.05020727 0.21204917] (label 1)
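A minimal sketch of the per-class accuracy computation, using only the four vectors and labels quoted above. The helper name `per_class_accuracy` is hypothetical and not part of Caffe; classes with no samples are reported as NaN rather than 0.

```python
import numpy as np

# The four probability vectors and true labels quoted above.
probs = np.array([
    [0.20748076, 0.20283087, 0.04773897, 0.28503627, 0.04591063, 0.21100247],
    [0.21177764, 0.20092578, 0.04866471, 0.28302929, 0.04671735, 0.20888527],
    [0.19711637, 0.20476575, 0.04688895, 0.28988105, 0.0465695, 0.21477833],
    [0.21062914, 0.20984225, 0.04802448, 0.26924771, 0.05020727, 0.21204917],
])
labels = np.array([0, 4, 3, 1])


def per_class_accuracy(probs, labels, num_classes):
    """Fraction of correct predictions per true class (hypothetical helper).

    Classes that have no samples in `labels` get NaN.
    """
    predicted = probs.argmax(axis=1)
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            acc[c] = (predicted[mask] == c).mean()
    return acc


predicted = probs.argmax(axis=1)
acc = per_class_accuracy(probs, labels, num_classes=6)
print(predicted)  # every row picks class 3
print(acc)
```

On these four samples the argmax is class 3 every time, so only the sample whose true label is 3 counts as correct.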
Here is the prediction script (only the part that produces the predictions for a given image):
import numpy as np
import matplotlib.pyplot as plt
import sys
import os
import caffe
caffe.set_device(0)
caffe.set_mode_gpu()
# Prepare Network
MODEL_FILE = 'finetune_deploy.prototxt'
PRETRAINED = 'finetune_lr3_iter_25800.caffemodel.h5'
MEAN_FILE = 'balanced_dataset_256/Training/labels_mean/trainingMean_original.binaryproto'
# Load the training mean from the binaryproto file and reduce it to one value per channel
blob = caffe.proto.caffe_pb2.BlobProto()
with open(MEAN_FILE, 'rb') as f:
    blob.ParseFromString(f.read())
dataMeanArray = np.array(caffe.io.blobproto_to_array(blob))
mu = dataMeanArray[0].mean(1).mean(1)
net = caffe.Classifier(MODEL_FILE, PRETRAINED,
                       mean=mu,
                       channel_swap=(2,1,0),
                       raw_scale=255,
                       image_dims=(256, 256))
PREFIX='balanced_dataset_256/PrivateTest/'
LABEL = '1'
imgName = '33408.jpg'
IMAGE_PATH = PREFIX + LABEL + '/' + imgName
input_image = caffe.io.load_image(IMAGE_PATH)
plt.imshow(input_image)
# predict takes any number of images and formats them for the Caffe net automatically
prediction = net.predict([input_image])
print('prediction shape: {}'.format(prediction[0].shape))
plt.plot(prediction[0])
print('predicted class: {}'.format(prediction[0].argmax()))
print(prediction[0])
The input data is grayscale, but I convert it to RGB by duplicating the channels.
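For illustration, a small NumPy sketch of what duplicating the channels amounts to, assuming the conversion is a plain copy of the single channel. The 2x2 array is a made-up stand-in for a real image, with values in [0, 1].

```python
import numpy as np

# Hypothetical 2x2 grayscale image with values in [0, 1].
gray = np.array([[0.0, 0.25], [0.5, 1.0]], dtype=np.float32)

# Copy the single channel three times along a new last axis: (H, W) -> (H, W, 3).
rgb = np.repeat(gray[:, :, np.newaxis], 3, axis=2)

print(rgb.shape)
# All three channels hold the same values, so reordering them changes nothing.
print(np.array_equal(rgb, rgb[:, :, ::-1]))
```

Because the three channels are identical, reordering them with channel_swap=(2,1,0) leaves the image itself unchanged.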
Here is the architecture file finetune_deploy.prototxt:
name: "FlickrStyleCaffeNetTest"
layer {
  name: "data"
  type: "Input"
  top: "data"
  # top: "label"
  input_param { shape: { dim: 1 dim: 3 dim: 256 dim: 256 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  # Note that lr_mult can be set to 0 to disable any fine-tuning of this, and any other, layer
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8_flickr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_flickr"
  # lr_mult is set to higher than for other layers, because this layer is starting from random while the others are already trained
  param {
    lr_mult: 10
    decay_mult: 1
  }
  param {
    lr_mult: 20
    decay_mult: 0
  }
  inner_product_param {
    num_output: 6
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8_flickr"
  top: "prob"
}
How did you obtain the caffemodel file in .h5 format? – malreddysid
It was generated that way during training, by specifying snapshot_format: HDF5 in the solver – alinafdima
Are you sure that this format can be read by the Python script? Can you check the weights to see whether they are correct? – malreddysid
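One possible way to check the weights, sketched under assumptions: `summarize_weights` is a hypothetical helper, and the arrays below are stand-ins so the snippet runs without Caffe. With pycaffe, the same mapping could be built from the loaded net through `net.params`, as noted in the comment.

```python
import numpy as np


def summarize_weights(params):
    """Return {layer: (shape, mean, std, all_zero)} for each layer's weight blob.

    `params` maps layer names to lists of NumPy arrays, weights first
    (hypothetical helper, not part of Caffe).
    """
    summary = {}
    for name, blobs in params.items():
        w = np.asarray(blobs[0])
        summary[name] = (w.shape, float(w.mean()), float(w.std()), bool(not w.any()))
    return summary


# With pycaffe, the mapping could be built from a loaded net like this:
#   params = {name: [b.data for b in blobs] for name, blobs in net.params.items()}
# Stand-in arrays (assumed values) are used here instead.
rng = np.random.RandomState(0)
params = {
    "conv1": [rng.normal(0.0, 0.01, size=(96, 3, 11, 11)), np.zeros(96)],
    "fc8_flickr": [np.zeros((6, 4096)), np.zeros(6)],
}
summary = summarize_weights(params)
for name, stats in sorted(summary.items()):
    print(name, stats)
```

Weights that are all zero, or whose spread still matches the filler values in the prototxt (for example a standard deviation near 0.01 for fc8_flickr), might suggest that the snapshot was not actually loaded.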