Gradient descent & linear regression - code does not converge

I am trying to implement the gradient descent algorithm from scratch on a toy problem. My code always returns a vector of NaNs:

from sklearn.linear_model import LinearRegression 
import numpy as np 
import matplotlib.pyplot as plt 

np.random.seed(45) 
x = np.linspace(0, 1000, num=1000) 
y = 3*x + 2 + np.random.randn(len(x)) 

# sklearn output - This works (returns intercept = 1.6, coef = 3) 
lm = LinearRegression() 
lm.fit(x.reshape(-1, 1), y.reshape(-1, 1)) 
print("Intercept = {:.2f}, Coef = {:.2f}".format(lm.intercept_[0], lm.coef_[0][0])) 

# BGD output 
theta = np.array((0, 0)).reshape(-1, 1) 
X = np.hstack([np.ones_like(x.reshape(-1, 1)), x.reshape(-1, 1)]) # [1, x] 
Y = y.reshape(-1, 1) # Column vector 
alpha = 0.05 
for i in range(100): 
    # Update: theta <- theta - alpha * ([X.T][X][theta] - [X.T][Y]) 
    h = np.dot(X, theta) # Hypothesis 
    loss = h - Y 
    theta = theta - alpha*np.dot(X.T, loss) 
theta

The sklearn part runs fine, so I must be doing something wrong in the for loop. I have tried various alpha values and none of them converge.

The problem is that theta keeps getting bigger and bigger and eventually becomes too large for Python to store.
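One way to see why theta blows up: the update in the loop is plain gradient descent on the quadratic cost 0.5 * ||X theta - Y||^2, and for such a cost the iteration is stable only while alpha stays below 2 divided by the largest eigenvalue of X.T X. The sketch below is an illustration added here, not part of the original post; the names `alpha_max` and `alpha_max_mean` are hypothetical. It estimates that bound for the same x, with and without dividing the gradient by the number of samples.

```python
import numpy as np

x = np.linspace(0, 1000, num=1000)
X = np.column_stack([np.ones_like(x), x])  # design matrix [1, x]

# Hessian of the un-normalized cost 0.5 * ||X theta - Y||^2
H = X.T @ X
# Largest step size for which theta <- theta - alpha * X.T (X theta - Y) stays stable
alpha_max = 2.0 / np.linalg.eigvalsh(H).max()

# Same bound when the gradient is divided by the number of samples
H_mean = H / len(x)
alpha_max_mean = 2.0 / np.linalg.eigvalsh(H_mean).max()

print("un-normalized bound:", alpha_max)
print("normalized bound:   ", alpha_max_mean)
```

Both bounds come out several orders of magnitude below alpha = 0.05, which would be consistent with theta growing at every step until it overflows.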

Here is a contour plot of the cost function:

J = np.dot((np.dot(X, theta) - y).T, (np.dot(X, theta) - y)) 
plt.contour(J)

Obviously there is no minimum here. Where did I go wrong?
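A possible reason the plot shows no minimum: `J` above is evaluated for a single theta, and because `y` is one-dimensional, `np.dot(X, theta) - y` broadcasts to a 1000 x 1000 array, so the contour is not a cost surface over the parameters. The sketch below is one way to evaluate the cost over a grid of candidate (intercept, slope) pairs instead. The grid ranges and the names `theta0_grid`, `theta1_grid`, `best_intercept` and `best_slope` are assumptions made for this illustration.

```python
import numpy as np

np.random.seed(45)
x = np.linspace(0, 1000, num=1000)
y = 3 * x + 2 + np.random.randn(len(x))
X = np.column_stack([np.ones_like(x), x])  # design matrix [1, x]

# Hypothetical grid of candidate parameters (ranges chosen for illustration)
theta0_grid = np.linspace(-50, 50, 51)  # intercept candidates
theta1_grid = np.linspace(2, 4, 51)     # slope candidates
T0, T1 = np.meshgrid(theta0_grid, theta1_grid)

# One column per candidate theta: shape (2, n_candidates)
thetas = np.vstack([T0.ravel(), T1.ravel()])
residuals = X @ thetas - y[:, None]  # shape (1000, n_candidates)

# Mean squared error cost for every grid point, reshaped back to the grid
J = (residuals ** 2).sum(axis=0).reshape(T0.shape) / (2 * len(y))

# Grid point with the lowest cost
i, j = np.unravel_index(np.argmin(J), J.shape)
best_intercept, best_slope = T0[i, j], T1[i, j]
print(best_intercept, best_slope)
```

Passing the grid explicitly, as in `plt.contour(T0, T1, J)`, would then draw the cost as a function of the two parameters.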

Thanks

2016-12-29 user78655

In the theta update, the second term should be divided by the size of the training set. More details are here: gradient descent using python and numpy
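A minimal sketch of that change follows, with the gradient divided by the number of training samples `m`. One caveat: with x ranging up to 1000, the normalized update may still need a much smaller alpha than 0.05, so this sketch also standardizes x before the loop and maps the result back afterwards. The standardization step, the 1000 iterations, and the names `x_scaled`, `coef` and `intercept` are assumptions added here and are not part of the answer.

```python
import numpy as np

np.random.seed(45)
x = np.linspace(0, 1000, num=1000)
y = 3 * x + 2 + np.random.randn(len(x))

# Assumption added for this sketch: standardize x so alpha = 0.05 is stable
mu, sigma = x.mean(), x.std()
x_scaled = (x - mu) / sigma

X = np.column_stack([np.ones_like(x_scaled), x_scaled])  # [1, x_scaled]
Y = y.reshape(-1, 1)                                     # column vector
m = len(Y)                                               # training-set size

theta = np.zeros((2, 1))
alpha = 0.05
for _ in range(1000):
    loss = X @ theta - Y
    gradient = X.T @ loss / m          # divide by the training-set size
    theta = theta - alpha * gradient

# Map the parameters back to the original (unscaled) x
coef = theta[1, 0] / sigma
intercept = theta[0, 0] - coef * mu
print("Intercept = {:.2f}, Coef = {:.2f}".format(intercept, coef))
```

The recovered values can be compared against the `LinearRegression` fit from the question or against `np.polyfit(x, y, 1)`.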

2016-12-29 18:27:59
