7.3.5 Self-Consistent Beliefs

사실 이전에 근사했던 factorization을 다시 생각해보면, 내가 들고 있는 어떤 카드들에 의해 상대방이 들고 있을 카드는 영향을 받을 수 밖에 없습니다. 하지만 이를 무시한 채 어쩔 수 없는 근사를 하였습니다.

게다가 이런 근사를 통해선 결과적으로 marginal하지 않은 $f^{\mathrm{pri}}$ 를 구하게 될 수도 있습니다. 그 것을 해결하기 위해 이 후로는 BAD의 메인 아이디어는 아니지만, 어떻게 marginal하게 $\mathcal{B}_t$ 를 업데이트할 수 있을지에 대한 iterative procedure를 제안합니다.

$\mathcal{B}^0 = \mathcal{B}_t$

$\mathcal{B}^{k+1}(f[i]) = \sum_{f[-i]}\mathcal{B}^k(f[-i])P(f[i]|f[-i],f^{\mathrm{pub}}_{\leq t}, u^a_{\leq t}, \hat{\pi}_{\leq t})$

$\propto \mathbb{E}_{f[-i] \sim\mathcal{B}^k}[\mathcal{L}_t(f[i])P(f[i]|f[-i],f^{\mathrm{pub}}_t)]$

마지막 텀은 expectation 형태로 변환함으로써 sampling을 가능하도록 하였습니다.

Previous7.3.4 Factorized Belief Updates Next7.4 Experiments and Results

Last updated 5 years ago