9.4.4 Hessian-Vector Product

Hessian-vector vTHv^TH๋Š” ๋งŽ์€ ์•Œ๊ณ ๋ฆฌ์ฆ˜์—์„œ ์œ ์šฉํ•˜๊ฒŒ ์‚ฌ์šฉํ•˜๋Š” ๊ฐ’์ค‘ ํ•˜๋‚˜์ž…๋‹ˆ๋‹ค. DiCE๋ฅผ ์‚ฌ์šฉํ•˜๋ฉด, vTHv^TH๋Š” ์ „์ฒด Hessian์„ ๊ณ„์‚ฐํ•˜์ง€์•Š๊ณ ๋„ ํšจ๊ณผ์ ์œผ๋กœ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๋Š” vv๊ฐ€ ฮธ\theta์— ์˜์กด์ ์ด์ง€์•Š๋‹ค๊ณ  ๊ฐ€์ •ํ•˜๊ณ  ์‹œ์ž‘ํ•ฉ๋‹ˆ๋‹ค.

vTH=vTโˆ‡2Lโ–ก v^TH = v^T\nabla^2\mathcal{L}_\square

=vT(โˆ‡Tโˆ‡Lโ–ก)= v^T(\nabla^T\nabla\mathcal{L}_\square)

=โˆ‡T(vTโˆ‡Lโ–ก)= \nabla^T(v^T\nabla\mathcal{L}_\square)

์ด ๋•Œ, vTโˆ‡Lโ–ก v^T\nabla\mathcal{L}_\square์€ scalar์ด๋ฏ€๋กœ auto-diff library๋ฅผ ์ด์šฉํ•˜๋ฉด ๊ทธ๋Œ€๋กœ loss๋กœ ์‚ฌ์šฉํ•ด ๊ตฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Last updated