This code uses MATLAB's built-in 'kmeans' function. You need to modify it to use your own k-means algorithm. It shows how to compute the cluster centroids and the sum of squared errors (also called the distortion).
clc; close all; clear all;
data = readtable('data.txt'); % Importing the data-set
d1 = table2array(data(:, 2)); % Data in first dimension
d2 = table2array(data(:, 3)); % Data in second dimension
d3 = table2array(data(:, 4)); % Data in third dimension
d4 = table2array(data(:, 5)); % Data in fourth dimension
X = [d1, d2, d3, d4]; % Combining the data into a matrix
k = 3; % Number of clusters
idx = kmeans(X, k); % Applying k-means using the built-in function
%% Separating the data by cluster in each dimension
d1_1 = d1(idx == 1); % d1 for the data in cluster 1
d2_1 = d2(idx == 1); % d2 for the data in cluster 1
d3_1 = d3(idx == 1); % d3 for the data in cluster 1
d4_1 = d4(idx == 1); % d4 for the data in cluster 1
%==============================
d1_2 = d1(idx == 2); % d1 for the data in cluster 2
d2_2 = d2(idx == 2); % d2 for the data in cluster 2
d3_2 = d3(idx == 2); % d3 for the data in cluster 2
d4_2 = d4(idx == 2); % d4 for the data in cluster 2
%==============================
d1_3 = d1(idx == 3); % d1 for the data in cluster 3
d2_3 = d2(idx == 3); % d2 for the data in cluster 3
d3_3 = d3(idx == 3); % d3 for the data in cluster 3
d4_3 = d4(idx == 3); % d4 for the data in cluster 3
%% Finding the co-ordinates of the cluster centroids
c1_d1 = mean(d1_1); % d1 value of the centroid for cluster 1
c1_d2 = mean(d2_1); % d2 value of the centroid for cluster 1
c1_d3 = mean(d3_1); % d3 value of the centroid for cluster 1
c1_d4 = mean(d4_1); % d4 value of the centroid for cluster 1
%====================================
c2_d1 = mean(d1_2); % d1 value of the centroid for cluster 2
c2_d2 = mean(d2_2); % d2 value of the centroid for cluster 2
c2_d3 = mean(d3_2); % d3 value of the centroid for cluster 2
c2_d4 = mean(d4_2); % d4 value of the centroid for cluster 2
%====================================
c3_d1 = mean(d1_3); % d1 value of the centroid for cluster 3
c3_d2 = mean(d2_3); % d2 value of the centroid for cluster 3
c3_d3 = mean(d3_3); % d3 value of the centroid for cluster 3
c3_d4 = mean(d4_3); % d4 value of the centroid for cluster 3
%% Calculating the distortion
distortion = 0; % Initialization
for n1 = 1 : length(d1_1)
distortion = distortion + (((c1_d1 - d1_1(n1)).^2) + ((c1_d2 - d2_1(n1)).^2) + ...
((c1_d3 - d3_1(n1)).^2) + ((c1_d4 - d4_1(n1)).^2));
end
for n2 = 1 : length(d1_2)
distortion = distortion + (((c2_d1 - d1_2(n2)).^2) + ((c2_d2 - d2_2(n2)).^2) + ...
((c2_d3 - d3_2(n2)).^2) + ((c2_d4 - d4_2(n2)).^2));
end
for n3 = 1 : length(d1_3)
distortion = distortion + (((c3_d1 - d1_3(n3)).^2) + ((c3_d2 - d2_3(n3)).^2) + ...
((c3_d3 - d3_3(n3)).^2) + ((c3_d4 - d4_3(n3)).^2));
end
fprintf('The unnormalized sum of squared errors is %f\n', distortion);
fprintf('The centroid of cluster 1 is \t d1 = %f, d2 = %f, d3 = %f, d4 = %f\n', c1_d1, c1_d2, c1_d3, c1_d4);
fprintf('The centroid of cluster 2 is \t d1 = %f, d2 = %f, d3 = %f, d4 = %f\n', c2_d1, c2_d2, c2_d3, c2_d4);
fprintf('The centroid of cluster 3 is \t d1 = %f, d2 = %f, d3 = %f, d4 = %f\n', c3_d1, c3_d2, c3_d3, c3_d4);
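The code above still relies on the built-in function for the cluster assignment. If you have to write that step yourself, the following is a minimal sketch of Lloyd's algorithm, not a definitive implementation. The function name myKmeans, the maxIter argument, and the random-row initialisation are my own assumptions, and the subtraction X - C(j, :) relies on implicit expansion (R2016b or newer). Empty clusters simply keep their previous centroid here, which is a simplification.

```matlab
function [idx, C, distortion] = myKmeans(X, k, maxIter)
% myKmeans  Minimal Lloyd's-algorithm sketch (hypothetical helper, not a MATLAB built-in).
%   X: n-by-p data matrix, k: number of clusters, maxIter: iteration cap (assumed default 100).
    if nargin < 3
        maxIter = 100;
    end
    n = size(X, 1);
    C = X(randperm(n, k), :);          % initialise centroids with k distinct random rows
    idx = zeros(n, 1);
    for iter = 1:maxIter
        % Squared Euclidean distance from every point to every centroid (n-by-k)
        D = zeros(n, k);
        for j = 1:k
            D(:, j) = sum((X - C(j, :)).^2, 2);
        end
        [~, newIdx] = min(D, [], 2);   % assign each point to its nearest centroid
        if isequal(newIdx, idx)
            break;                     % assignments unchanged: converged
        end
        idx = newIdx;
        for j = 1:k
            members = (idx == j);
            if any(members)
                C(j, :) = mean(X(members, :), 1);  % move centroid to the cluster mean
            end                        % an empty cluster keeps its previous centroid
        end
    end
    % Distortion: sum of squared distances from each point to its assigned centroid
    distortion = sum(sum((X - C(idx, :)).^2, 2));
end
```

With this saved as myKmeans.m, the call idx = kmeans(X, 3) can be swapped for [idx, C, distortion] = myKmeans(X, k). Row j of C then holds the four centroid coordinates of cluster j, so the per-dimension mean and loop code is no longer needed. Like the built-in, the result depends on the random start, so the cluster numbering and sometimes the distortion can differ between runs.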
Are you having trouble with the built-in 'kmeans' function, or are you building one from scratch? –
@LeanderMoesinger Thank you for your comment. Actually, I am able to use the built-in kmeans function, but from the example in the MATLAB help I could not work out how to calculate the mean, the center, and the size of each cluster, or the list of data points assigned to each cluster. – Bilgin
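For the quantities asked about here (center, size, and members of each cluster), the built-in kmeans can return most of them directly through its extra outputs. Below is a small sketch under stated assumptions: the matrix X is synthetic stand-in data I made up so the snippet runs on its own (in the thread it comes from columns 2 to 5 of data.txt), and the variable names clusterSize and members are mine. Note that with k-means the "mean" and the "center" of a cluster are the same thing, namely the centroid.

```matlab
% Hypothetical stand-in data; replace with the X built from data.txt.
rng(1);                                    % fix the seed so the run is repeatable
X = [randn(20, 4); randn(20, 4) + 5; randn(20, 4) - 5];
k = 3;

% kmeans (Statistics and Machine Learning Toolbox) can return more than idx:
%   C    - k-by-p matrix of centroid coordinates (one row per cluster)
%   sumd - k-by-1 within-cluster sums of point-to-centroid distances
%          (squared Euclidean with the default 'Distance' setting)
[idx, C, sumd] = kmeans(X, k);

clusterSize = accumarray(idx, 1, [k 1]);   % number of points in each cluster
distortion  = sum(sumd);                   % total sum of squared errors

for j = 1:k
    members = find(idx == j);              % row numbers of the points in cluster j
    fprintf('Cluster %d: %d points, centroid = [%s]\n', ...
        j, clusterSize(j), num2str(C(j, :), ' %.3f'));
    % X(members, :) is the list of data points assigned to cluster j
end
fprintf('Total distortion: %f\n', distortion);
```

The cluster numbers are arbitrary and can change from run to run, so compare clusters by their centroids rather than by their index.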