Commit Graph

29 Commits

Author SHA1 Message Date
Zchen
6c7abfcca8 f 2025-10-17 10:53:58 +08:00
Zchen
7ede7b5f12 f 2025-10-17 02:09:14 +08:00
Zchen
ca8c615505 f 2025-10-17 02:01:48 +08:00
Zchen
49700456b8 f 2025-10-17 01:58:28 +08:00
Zchen
8ee09b6b5e f 2025-10-17 01:54:32 +08:00
Zchen
a5a3179ca6 f 2025-10-17 01:49:03 +08:00
Zchen
59fb73ee9f f 2025-10-17 01:36:08 +08:00
Zchen
0a72143513 legacy adam 2025-10-17 01:26:02 +08:00
Zchen
7df78244e6 adamw to adam 2025-10-17 01:07:01 +08:00
Zchen
a96e272f7b fix twice gradient cut 2025-10-17 00:51:53 +08:00
Zchen
7a43ebfb71 refactor: streamline model building and ensure dtype consistency in L2 loss calculation 2025-10-16 23:06:09 +08:00
Zchen
7efa33d730 f 2025-10-16 22:42:33 +08:00
Zchen
982d2dc256 f 2025-10-16 22:20:08 +08:00
Zchen
bd61136f93 f 2025-10-16 22:02:11 +08:00
Zchen
6f94ad5fae f 2025-10-16 21:51:43 +08:00
Zchen
eefff1ce5e fix 2025-10-16 21:40:43 +08:00
Zchen
426b72ef25 fix 2025-10-16 21:26:00 +08:00
Zchen
dde6378481 fixed 2025-10-16 21:13:42 +08:00
Zchen
a0b59c6987 fix 2025-10-16 21:06:01 +08:00
Zchen
ed6e21bfe4 fix 'NoneType' object has no attribute 'extended' 2025-10-16 20:57:40 +08:00
Zchen
1e7077bba7 adamw fix 2025-10-16 20:44:55 +08:00
Zchen
c2661550ef memory leak fix 2025-10-16 20:26:32 +08:00
Zchen
1b9e0d9bdf adjust batch_size 2025-10-16 17:37:59 +08:00
Zchen
be578f2e1d fix data loader inefficiency 2025-10-16 17:14:06 +08:00
Zchen
a545cc5648 tpu maintenance 2025-10-16 13:39:05 +08:00
Zchen
5a1e446219 HBM 2025-10-16 11:42:56 +08:00
Zchen
0ff6634192 simple fix 2025-10-16 10:53:42 +08:00
Zchen
69a7285886 multithreaded data loader speedup 2025-10-16 01:17:36 +08:00
Zchen
3b242b908d trainer 2025-10-15 19:04:42 +08:00