Weight decay in deep learning

醫學數據科學 2021-04-21

What is weight decay?

Weight decay is a regularization technique that adds a small penalty, usually the L2 norm of the weights (all the weights of the model), to the loss function.

loss = loss + weight decay parameter * L2 norm of the weights
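
A minimal sketch of this penalty written out by hand in PyTorch (the model, data, and variable names here are hypothetical; the squared L2 norm is used, which is what optimizer-level weight decay effectively corresponds to, up to a constant factor in the gradient):

import torch
import torch.nn as nn

model = nn.Linear(10, 1)                        # hypothetical model
x, y = torch.randn(32, 10), torch.randn(32, 1)  # hypothetical batch

weight_decay = 1e-4  # the weight decay parameter (lambda)

# Data loss plus the L2 penalty over all model parameters.
data_loss = nn.functional.mse_loss(model(x), y)
l2_penalty = sum(p.pow(2).sum() for p in model.parameters())
loss = data_loss + weight_decay * l2_penalty
loss.backward()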

Some people prefer to apply weight decay only to the weights and not to the biases. By default, PyTorch applies weight decay to every parameter passed to the optimizer, weights and biases alike.
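
If you prefer to exclude the biases, PyTorch optimizers accept parameter groups with per-group options; a sketch, assuming a generic model:

import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # hypothetical model

# Put weights and biases in separate groups so that only the
# weights receive the weight decay penalty.
decay, no_decay = [], []
for name, param in model.named_parameters():
    (no_decay if name.endswith("bias") else decay).append(param)

optimizer = torch.optim.SGD(
    [{"params": decay, "weight_decay": 1e-4},
     {"params": no_decay, "weight_decay": 0.0}],
    lr=1e-3,
)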

Why do we use weight decay?

  • To prevent overfitting.

  • To keep the weights small and avoid exploding gradients. Because the L2 norm of the weights is added to the loss, each training iteration tries to minimize the model weights in addition to the data loss. This keeps the weights as small as possible and prevents them from growing out of control, which helps avoid exploding gradients (the SGD update written out below makes this concrete).
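
Written out as a plain SGD step (a sketch of the math, not the exact kernel an optimizer runs), the extra term shrinks each weight a little on every update:

import torch

lr, weight_decay = 1e-3, 1e-4
w = torch.tensor([2.0])     # a single weight, for illustration
grad = torch.tensor([0.5])  # gradient of the data loss w.r.t. w

# The weight_decay * w term pulls w toward zero on every step.
w = w - lr * (grad + weight_decay * w)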

How do we use weight decay?

To use weight decay, we can simply set the weight_decay parameter in the torch.optim.SGD or torch.optim.Adam optimizer. Here we use 1e-4 as a default for weight_decay.

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=1e-4)
# or, with Adam:
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
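
Either optimizer then drops into a standard training step unchanged; a minimal sketch with a hypothetical model and batch:

import torch
import torch.nn as nn

model = nn.Linear(10, 1)                        # hypothetical model
x, y = torch.randn(32, 10), torch.randn(32, 1)  # hypothetical batch
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=1e-4)

optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
optimizer.step()  # the weight decay penalty is applied inside step()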
