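When can we use the normal equation instead of gradient descent to minimize J(theta)?
- When the number of features is small, roughly fewer than 10,000, because solving the normal equation requires inverting an n x n matrix, which becomes slow as the number of features n grows.
- When X^T X is invertible; if it is not (for example, redundant features or more features than training examples), remove features, add regularization, or use the pseudoinverse.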
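As a rough illustration of the answer above, here is a minimal sketch of the normal equation theta = (X^T X)^(-1) X^T y using NumPy. The function name `normal_equation` and the toy data are assumptions made for this example, not part of the original card. The sketch uses `np.linalg.pinv` rather than a plain inverse so that it still returns a solution when X^T X is singular.

```python
import numpy as np


def normal_equation(X, y):
    """Closed-form least-squares fit: theta = pinv(X^T X) @ X^T @ y.

    X is the (m, n) design matrix (including a column of ones for the
    intercept) and y is the (m,) target vector.
    """
    # pinv is used instead of inv so a non-invertible X^T X does not raise.
    return np.linalg.pinv(X.T @ X) @ X.T @ y


# Toy data generated from y = 1 + 2x (hypothetical, for illustration only).
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # prepend the intercept column
y = 1.0 + 2.0 * x

theta = normal_equation(X, y)
print(theta)  # approximately [1. 2.]
```

With only two features this is instantaneous; the matrix being inverted is n x n, so the cost grows quickly with the number of features, which is why gradient descent tends to be preferred when n is very large.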
Google Interview
This flashcard deck, made by jwasham, covers material for preparing for a Google interview. For more details, see https://github.com/jwasham/google-interview-university