Lecture 🧑‍🏫/Coursera

[Machine Learning] Machine Learning Algorithm Application

Prioritizing What to Work On. System Design Example: when building a spam classifier, given a set of emails we construct a feature vector for each email. Each entry of the vector represents a word. The vector normally contains 10,000 to 50,000 entries, gathered from the words found most frequently in the dataset. If a word is found in the email, its entry is set to 1; otherwise it is set to 0. Once all the $x$ vectors are prepared, we train the algorithm and finally use it to classify whether an email is spam or not. How could we improve the accuracy of this classifier? Collect lots of data; use sophisticated features (e.g., spam..
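The word-presence encoding described above can be sketched in Python; the five-word vocabulary and the sample email below are toy stand-ins, not from the lecture:

```python
import re

# Tiny stand-in vocabulary; a real classifier would use the
# 10,000-50,000 most frequent words of the dataset.
vocabulary = ["buy", "deal", "discount", "andrew", "now"]

def email_to_vector(email_text, vocab):
    # Entry j is 1 if vocabulary word j occurs in the email, else 0.
    words = set(re.findall(r"[a-z]+", email_text.lower()))
    return [1 if w in words else 0 for w in vocab]

x = email_to_vector("Buy now! Exclusive deal", vocabulary)  # [1, 1, 0, 0, 1]
```

Lower-casing and stripping punctuation before the lookup keeps "Buy" and "now!" from missing their vocabulary entries.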

[Machine Learning] Bias vs Variance

Diagnosing Bias vs Variance. In this section we examine the relationship between the degree d of the polynomial and underfitting or overfitting of the hypothesis. We need to distinguish whether bias or variance is the problem contributing to bad predictions: high bias causes underfitting and high variance causes overfitting, so we need to find a golden mean between the two. As the degree d of the polynomial increases, the training error tends to decrease. At the same time, the cross validation error tends to decrease up to a point and then to increase as d grows, forming a convex curve (min..
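A toy sketch of this diagnosis; the error values below are made up purely to reproduce the shapes the lecture describes (training error falling, cross-validation error convex), and the thresholds in `diagnose` are illustrative:

```python
# Made-up errors for polynomial degrees 1..6 (illustration only).
degrees   = [1, 2, 3, 4, 5, 6]
train_err = [8.0, 4.0, 2.0, 1.0, 0.5, 0.2]   # keeps decreasing with d
cv_err    = [9.0, 5.0, 3.0, 3.5, 5.0, 8.0]   # convex: dips, then rises

# Pick the degree that minimizes cross-validation error.
best_d = degrees[cv_err.index(min(cv_err))]  # 3

def diagnose(d):
    i = degrees.index(d)
    # High bias: training error itself is high (underfitting).
    if train_err[i] > 6.0:
        return "high bias (underfitting)"
    # High variance: large gap between CV and training error.
    if cv_err[i] - train_err[i] > 4.0:
        return "high variance (overfitting)"
    return "good fit"
```

The low-degree end diagnoses as high bias, the high-degree end as high variance, with the golden mean in between.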

[Machine Learning] Evaluating a Learning Algorithm

Evaluating a Hypothesis. Errors in predictions can be troubleshooted by: getting more training examples; trying smaller sets of features; trying additional features; trying polynomial features; and increasing or decreasing $\lambda$. Now let's look at how to evaluate a new hypothesis. A hypothesis may have a low error on the training examples but still be inaccurate (because of overfitting). To evaluate a hypothesis, we therefore split the given dataset of training examples into two sets: a training set and a test set. Typically the training set..
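The split described above might look like this in Python; the 70/30 ratio is a common convention assumed here, and shuffling first keeps the split from inheriting any ordering in the data:

```python
import random

def train_test_split(examples, train_frac=0.7, seed=0):
    # Shuffle before splitting so the split is not biased by the
    # order in which the data was collected, then cut at train_frac.
    data = list(examples)
    random.Random(seed).shuffle(data)
    cut = int(len(data) * train_frac)
    return data[:cut], data[cut:]

train, test = train_test_split(range(10))  # 7 training, 3 test examples
```

The hypothesis is then fit on `train` only, and its error measured on the held-out `test` set.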

[Machine Learning] Backpropagation in Practice

Gradient Checking. Gradient checking assures that backpropagation works as intended. We can approximate the derivative of the cost function using the following idea: with multiple theta matrices, the derivative with respect to $\theta_{j}$ can be approximated by perturbing only that component. A small value for $\epsilon$ such as $\epsilon = 10^{-4}$ guarantees that the math works out properly; if $\epsilon$ is too small, we can end up with numerical problems. Hence we only add or subtract epsilon to the $\theta_{j}$ matrix. Earlier we saw how to compute the deltaVector. Once we can compute gradApprox, $gradAppr..
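The two-sided approximation can be sketched as follows; `grad_approx` is an illustrative name, and the quadratic cost `J` is a toy whose analytic gradient ($2\theta_j$ for $J = \sum_j \theta_j^2$) we can compare against, just as gradApprox is compared against the deltaVector:

```python
EPSILON = 1e-4  # small, but not so small that rounding error dominates

def grad_approx(J, theta):
    # Two-sided difference (J(theta + eps) - J(theta - eps)) / (2 eps),
    # perturbing one component theta_j at a time.
    grads = []
    for j in range(len(theta)):
        theta_plus  = list(theta); theta_plus[j]  += EPSILON
        theta_minus = list(theta); theta_minus[j] -= EPSILON
        grads.append((J(theta_plus) - J(theta_minus)) / (2 * EPSILON))
    return grads

# Toy cost J = sum(theta_j^2), whose true gradient is 2 * theta_j.
J = lambda t: sum(x * x for x in t)
approx = grad_approx(J, [1.0, -2.0, 3.0])  # close to [2.0, -4.0, 6.0]
```

Once the check passes, it is turned off: the approximation is far too slow to use during actual training.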

[Machine Learning] Cost Function & Backpropagation

Cost Function. Let's first define a few variables we will use when describing the cost function: $L$ is the total number of layers in the network; $s_l$ is the number of units in layer $l$; $K$ is the number of output units/classes. Picturing these variables in a neural network, we will have many output nodes. $h_{\theta}(x)_{k}$ denotes the hypothesis that results in the $k$-th output. The cost function for neural networks is a generalization of the one we used for logistic regression. Recall the cost function for regularized logistic regression; for neural networks its form changes slightly. To account for multiple output nodes..
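A minimal sketch of this generalized cost, assuming the hypothesis outputs `H` and one-hot labels `Y` have already been computed by forward propagation, and the weights are passed as a flat list (the full version would exclude bias weights from the penalty):

```python
import math

def nn_cost(H, Y, thetas, lam):
    # H[i][k] is the k-th output h_theta(x^(i))_k for example i;
    # Y[i][k] is 1 if example i belongs to class k, else 0.
    # Cross-entropy is summed over all K output units per example,
    # then averaged over the m examples; the regularization term
    # penalizes every weight in `thetas`.
    m = len(H)
    cost = 0.0
    for h_row, y_row in zip(H, Y):
        for h, y in zip(h_row, y_row):
            cost += -y * math.log(h) - (1 - y) * math.log(1 - h)
    reg = lam / (2 * m) * sum(t * t for t in thetas)
    return cost / m + reg
```

The double sum over examples and output units is exactly what distinguishes this from the single-output logistic regression cost.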

[Machine Learning] Neural Networks

Model Representation I. Let's think about how we will represent a hypothesis function using neural networks. At a very simple level, a neuron can be thought of as a computational unit that takes inputs as electrical signals and channels them to an output. In terms of our model, the inputs are the features $x_1, \cdots, x_n$ and the output is the result of the hypothesis function. In this model the $x_0$ input node is sometimes called the bias unit; it always takes the value 1. In neural networks we use the same logistic function as in classification, $\frac{1}{1+e^{-\theta^{T}x}}$, also called the sigmoid activation function. In this setting, the theta parameters are also called weights..
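The sigmoid activation and a single neuron of this model can be sketched as (function names are my own):

```python
import math

def sigmoid(z):
    # Logistic / sigmoid activation: g(z) = 1 / (1 + e^(-z)).
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, theta):
    # One neuron: the weighted sum theta^T x of the inputs, where
    # x[0] is the bias unit (always 1), passed through the sigmoid.
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)
```

With zero weights the neuron outputs exactly 0.5, the midpoint of the sigmoid.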

[Machine Learning] Solving the Problem of Overfitting

The Problem of Overfitting. Consider the problem of predicting $y$ from $x \in \mathbb{R}$. The leftmost figure below shows the result of fitting $y = \theta_{0} + \theta_{1}x$ to a dataset. Looking at the figure, we can see that the line does not pass exactly through the data points, so the fit is not very good. If we add an extra feature, we get $y = \theta_{0} + \theta_{1}x + \theta_{2}x^{2}$ and obtain a function that fits the data a little better. From this, one might conjecture that adding more features always produces a better result. However, that..
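The polynomial hypotheses above can be written with a small helper (names are my own) that expands a scalar $x$ into the features $1, x, x^2, \dots$:

```python
def poly_features(x, degree):
    # Expand scalar x into [1, x, x^2, ..., x^degree].
    return [x ** d for d in range(degree + 1)]

def h(theta, x):
    # h(x) = theta0 + theta1 * x + ... + theta_d * x^d; the degree
    # is implied by the length of theta.
    return sum(t * f for t, f in zip(theta, poly_features(x, len(theta) - 1)))
```

Passing a longer `theta` is exactly the "add another feature" step: the same data, fit by an ever more flexible curve, which is where overfitting enters.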

[Machine Learning] Multiclass Classification

Multiclass Classification: One-vs-all. From now on we will use this approach to classify data with more than two categories. Instead of $y = \{0,1\}$, we expand our definition to $y = \{0,1,\dots,n\}$. Since $y = \{0,1,\dots,n\}$, we divide the problem into $n+1$ binary classification problems; in each one we predict the probability that $y$ is a member of one of our classes. Basically, we choose one class and lump all the other classes into a single second class. Doing this repeatedly, we apply binary logistic regression to each case and then use the hypothesis that returned the highest value as our prediction. The following figure shows how one dataset is classified into 3 classes..
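A sketch of the final prediction step, with hard-coded toy hypotheses standing in for the $n+1$ trained logistic regression classifiers (in practice each would be a fitted $h_\theta$):

```python
def one_vs_all_predict(classifiers, x):
    # `classifiers` maps each class label c to a hypothesis h_c(x)
    # estimating P(y = c | x); predict the label whose hypothesis
    # returns the highest value.
    return max(classifiers, key=lambda c: classifiers[c](x))

# Toy stand-ins for 3 trained one-vs-all classifiers.
classifiers = {
    0: lambda x: 0.1,
    1: lambda x: 0.7,
    2: lambda x: 0.2,
}
pred = one_vs_all_predict(classifiers, x=None)  # class 1 wins
```

Each classifier only ever answers a binary question ("is it my class or not?"); the `max` over their outputs turns those answers into a multiclass prediction.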

[Machine Learning] Classification & Representation

Classification. One way to attempt classification is to use linear regression and map all predictions greater than 0.5 as 1 and all less than 0.5 as 0. However, this method doesn't work well, because classification is not actually a linear function problem. The classification problem is just like the regression problem, except that the values we want to predict take on only a small number of discrete values. For now we will focus on the binary classification problem, in which $y$ can take on only two values, 0 and 1. For example, if we are building a spam classifier, $x_{i}$ would be the features of an email, and $y$ would be 1 if it is spam and 0 otherwise. Hence $y \in \{0, 1\}$. 0 is called the negative class and 1 the positiv..
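The 0.5 thresholding described above is a one-liner; the function name and the convention of mapping exactly 0.5 to the positive class are my own choices:

```python
def classify(h_value, threshold=0.5):
    # Map the hypothesis output to the positive class (1) when it is
    # at least the threshold, otherwise to the negative class (0).
    return 1 if h_value >= threshold else 0
```

The thresholding itself is fine; what fails is feeding it linear regression outputs, which are unbounded and get dragged around by outliers, which is why logistic regression is used instead.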

[Machine Learning] Computing Parameters Analytically

Normal Equation. Gradient descent is one way of minimizing the cost function $J$. Let's look at another way of doing so! This second way minimizes $J$ explicitly, without resorting to an iterative algorithm. The "Normal Equation" minimizes $J$ by explicitly taking its derivatives with respect to the $\theta_{j}$'s and setting them to zero. This allows us to find the optimum $\theta$ without iteration. The normal equation formula is given by: $\theta = (X^{T}X)^{-1}X^{T}y$. With the normal equation, there is no need to do feature scaling. The following table compares gradient descent and the normal equation. G..
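For a single feature, $\theta = (X^{T}X)^{-1}X^{T}y$ with $X$ having columns $[1, x]$ can be worked out by hand into the closed form below; `normal_equation_1d` is a hypothetical helper name, and the data is a toy line:

```python
def normal_equation_1d(xs, ys):
    # Single-feature normal equation: inverting the 2x2 matrix X^T X
    # by hand gives the familiar closed form for theta1 and theta0.
    # No iteration and no feature scaling needed.
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    theta1 = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
              / sum((x - x_mean) ** 2 for x in xs))
    theta0 = y_mean - theta1 * x_mean
    return theta0, theta1

# Points lying exactly on y = 1 + 2x are recovered exactly.
theta0, theta1 = normal_equation_1d([0, 1, 2], [1, 3, 5])
```

With many features, the same formula is computed with matrix routines instead; the $O(n^3)$ cost of inverting $X^{T}X$ is the reason gradient descent wins for very large feature counts.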

Cartinoe
Posts in the 'Lecture 🧑‍🏫/Coursera' category