  1. optimization - Batch gradient descent versus stochastic gradient ...

    Additionally, batch gradient descent, given an annealed learning rate, will eventually find the minimum located in its basin of attraction. Stochastic gradient descent (SGD) computes the gradient using a …
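
    A minimal sketch of the contrast the answer draws, assuming a feature matrix X (n x d), targets y, weights w, and an illustrative learning rate: batch gradient descent takes one step from the gradient over the whole dataset, while SGD updates after every single example.

    ```python
    import numpy as np

    def batch_step(X, y, w, lr=0.01):
        # One update using the gradient over the *entire* dataset.
        grad = X.T @ (X @ w - y) / len(y)
        return w - lr * grad

    def sgd_epoch(X, y, w, lr=0.01, rng=np.random.default_rng(0)):
        # One pass over the data, updating after every single example.
        for i in rng.permutation(len(y)):
            xi, yi = X[i], y[i]
            grad = xi * (xi @ w - yi)   # gradient of the per-example loss
            w = w - lr * grad
        return w
    ```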

  2. machine learning - why gradient descent when we can solve linear ...

    Aug 12, 2013 · What is the benefit of using Gradient Descent in the linear regression space? It looks like we can solve the problem (finding theta0-n that minimize the cost function) with an analytical method, so …
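
    For reference, a sketch of the analytical solution the question alludes to, the normal equation theta = (X^T X)^(-1) X^T y, computed with a linear solve rather than an explicit inverse (variable names are illustrative):

    ```python
    import numpy as np

    def normal_equation(X, y):
        # Closed-form least-squares solution; solve is preferred over inv().
        return np.linalg.solve(X.T @ X, X.T @ y)
    ```

    Gradient descent tends to be preferred when the number of features is large or X^T X is ill-conditioned, which is where the cost comparison in result 9 below comes in.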

  3. Why use gradient descent with neural networks?

    Jul 8, 2017 · When training a neural network using the back-propagation algorithm, the gradient descent method is used to determine the weight updates. My question is: Rather than using gradient descent …
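
    A rough sketch of the mechanism the question describes, assuming a one-hidden-layer network with a sigmoid hidden layer and squared-error loss: back-propagation computes the gradients, and gradient descent uses them to update the weights.

    ```python
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_step(X, y, W1, W2, lr=0.1):
        # Forward pass: X (n, d), W1 (d, hidden), W2 (hidden, 1), y (n, 1).
        h = sigmoid(X @ W1)
        y_hat = h @ W2
        # Backward pass: chain rule gives the gradients backprop computes.
        err = (y_hat - y) / len(y)
        grad_W2 = h.T @ err
        grad_W1 = X.T @ ((err @ W2.T) * h * (1 - h))
        # Gradient-descent weight update using those gradients.
        return W1 - lr * grad_W1, W2 - lr * grad_W2
    ```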

  4. gradient descent using python and numpy - Stack Overflow

    Jul 22, 2013 · Below you can find my implementation of gradient descent for a linear regression problem. First, you calculate the gradient as X.T * (X * w - y) / N and update your current theta with this …
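
    A runnable sketch of the update the post describes: compute the gradient X.T @ (X @ w - y) / N and step the weights against it. The learning rate, iteration count, and stopping tolerance are illustrative choices.

    ```python
    import numpy as np

    def gradient_descent(X, y, lr=0.01, n_iters=1000, tol=1e-8):
        N, d = X.shape
        w = np.zeros(d)
        for _ in range(n_iters):
            grad = X.T @ (X @ w - y) / N
            w_new = w - lr * grad
            if np.linalg.norm(w_new - w) < tol:   # stop once updates stall
                return w_new
            w = w_new
        return w
    ```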

  5. What is the difference between Gradient Descent and Newton's …

    I understand what Gradient Descent does. Basically it tries to move towards the locally optimal solution by slowly moving down the curve. I am trying to understand what the actual difference is between the …
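
    A one-dimensional illustration of the two update rules, on an arbitrary example function f(x) = x**4 - 3*x**2 + x: gradient descent steps against the first derivative scaled by a learning rate, while Newton's method also divides by the second derivative (the curvature).

    ```python
    def f_prime(x):  return 4*x**3 - 6*x + 1    # f'(x)
    def f_double(x): return 12*x**2 - 6         # f''(x)

    def gd_step(x, lr=0.05):
        return x - lr * f_prime(x)

    def newton_step(x):
        # Note: undefined where f''(x) == 0, and it can jump to maxima
        # or saddle points if the curvature is negative.
        return x - f_prime(x) / f_double(x)
    ```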

  6. Why is Newton's method not widely used in machine learning?

    Dec 29, 2016 · Gradient descent maximizes a function using knowledge of its derivative. Newton's method, a root-finding algorithm, maximizes a function using knowledge of its second derivative. That …
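
    A back-of-envelope calculation of why second-derivative information becomes impractical at neural-network scale (the parameter count below is illustrative): for a model with d parameters, the Hessian has d**2 entries.

    ```python
    d = 10_000_000                      # illustrative parameter count
    hessian_bytes = d**2 * 8            # float64 entries
    print(hessian_bytes / 1e12, "TB")   # 800.0 TB, before even solving the d x d system
    ```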

  7. Gradient Descent with constraints (lagrange multipliers)

    Since the gradient descent algorithm is designed to find local minima, it fails to converge when you give it a problem with constraints. There are typically three solutions: Use a numerical method which is …
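
    One common workaround for simple constraints (not necessarily the one this answer goes on to recommend) is projected gradient descent: take an ordinary gradient step, then project back onto the feasible set. A sketch for a box constraint lo <= w <= hi, with illustrative defaults:

    ```python
    import numpy as np

    def projected_gd(grad_fn, w0, lo, hi, lr=0.01, n_iters=500):
        w = np.clip(np.asarray(w0, dtype=float), lo, hi)
        for _ in range(n_iters):
            w = w - lr * grad_fn(w)    # unconstrained gradient step
            w = np.clip(w, lo, hi)     # projection onto the box
        return w
    ```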

  8. Adam Optimizer vs Gradient Descent - Stack Overflow

    Aug 25, 2018 · AdamOptimizer applies the Adam algorithm, which adapts the step size for each parameter. It is an adaptive method compared to gradient descent, which maintains a single learning rate for all …
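
    A minimal numpy version of the Adam update with its usual default hyperparameters (beta1=0.9, beta2=0.999, eps=1e-8), to contrast with plain gradient descent's single global learning rate:

    ```python
    import numpy as np

    def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        # t is the 1-based step count; m and v start as zero arrays like w.
        m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
        v = beta2 * v + (1 - beta2) * grad**2       # second-moment estimate
        m_hat = m / (1 - beta1**t)                  # bias corrections
        v_hat = v / (1 - beta2**t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps) # per-parameter scaled step
        return w, m, v
    ```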

  9. optimization - What is the computational cost of gradient descent vs ...

    May 12, 2019 · Gradient descent has a time complexity of O(ndk), where d is the number of features and n is the number of rows. So, when d and n are large, it is better to use gradient descent.
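
    A rough comparison of the two costs, assuming the k in O(ndk) is the number of iterations (the snippet does not define it) and taking the normal equation as the alternative, which costs about O(n*d**2) to form X^T X plus O(d**3) to solve it:

    ```python
    def ops_gd(n, d, k):
        return n * d * k              # k gradient evaluations, each ~ n*d

    def ops_normal_eq(n, d):
        return n * d**2 + d**3        # form X^T X, then solve the d x d system

    n, d, k = 1_000_000, 10_000, 100
    print(ops_gd(n, d, k), ops_normal_eq(n, d))  # GD is ~100x cheaper here
    ```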

  10. machine learning - Gradient descent convergence How to decide ...

    Jun 25, 2013 · I learnt gradient descent through online resources (namely Machine Learning on Coursera). However, the information provided only said to repeat gradient descent until it converges. …
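
    One common way to make "until it converges" concrete is to stop when the cost function stops decreasing by more than a small tolerance between iterations. A sketch for linear regression with mean-squared-error cost (tolerance and learning rate are illustrative):

    ```python
    import numpy as np

    def gd_until_converged(X, y, lr=0.01, tol=1e-6, max_iters=100_000):
        w = np.zeros(X.shape[1])
        prev_cost = np.inf
        for _ in range(max_iters):
            resid = X @ w - y
            cost = (resid @ resid) / (2 * len(y))
            if prev_cost - cost < tol:     # cost no longer decreasing: stop
                break
            prev_cost = cost
            w = w - lr * X.T @ resid / len(y)
        return w
    ```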