Zchen
|
0a72143513
|
legacy adam
|
2025-10-17 01:26:02 +08:00 |
|
Zchen
|
7df78244e6
|
adamw to adam
|
2025-10-17 01:07:01 +08:00 |
|
Zchen
|
a96e272f7b
|
fix twice gradient cut
|
2025-10-17 00:51:53 +08:00 |
|
Zchen
|
7a43ebfb71
|
refactor: streamline model building and ensure dtype consistency in L2 loss calculation
|
2025-10-16 23:06:09 +08:00 |
|
Zchen
|
9453b70fad
|
remove quick test script for TensorFlow implementation fixes
|
2025-10-16 23:05:53 +08:00 |
|
Zchen
|
7efa33d730
|
f
|
2025-10-16 22:42:33 +08:00 |
|
Zchen
|
982d2dc256
|
f
|
2025-10-16 22:20:08 +08:00 |
|
Zchen
|
bd61136f93
|
f
|
2025-10-16 22:02:11 +08:00 |
|
Zchen
|
6f94ad5fae
|
f
|
2025-10-16 21:51:43 +08:00 |
|
Zchen
|
eefff1ce5e
|
fix
|
2025-10-16 21:40:43 +08:00 |
|
Zchen
|
426b72ef25
|
fix
|
2025-10-16 21:26:00 +08:00 |
|
Zchen
|
dde6378481
|
fixed
|
2025-10-16 21:13:42 +08:00 |
|
Zchen
|
a0b59c6987
|
fix
|
2025-10-16 21:06:01 +08:00 |
|
Zchen
|
ed6e21bfe4
|
fix 'NoneType' object has no attribute 'extended'
|
2025-10-16 20:57:40 +08:00 |
|
Zchen
|
1e7077bba7
|
fix adamw
|
2025-10-16 20:44:55 +08:00 |
|
Zchen
|
c2661550ef
|
fix memory leak
|
2025-10-16 20:26:32 +08:00 |
|
Zchen
|
1b9e0d9bdf
|
adjust batch_size
|
2025-10-16 17:37:59 +08:00 |
|
Zchen
|
be578f2e1d
|
fix data loader inefficiency
|
2025-10-16 17:14:06 +08:00 |
|
Zchen
|
a545cc5648
|
tpu maintenance
|
2025-10-16 13:39:05 +08:00 |
|
Zchen
|
5a1e446219
|
HBM
|
2025-10-16 11:42:56 +08:00 |
|
Zchen
|
0ff6634192
|
simple fix
|
2025-10-16 10:53:42 +08:00 |
|
Zchen
|
df4a914bbd
|
small parameters
|
2025-10-16 09:22:25 +08:00 |
|
Zchen
|
25561a7615
|
very large batch_size
|
2025-10-16 01:57:19 +08:00 |
|
Zchen
|
69a7285886
|
speed up data loader with multithreading
|
2025-10-16 01:17:36 +08:00 |
|
Zchen
|
f84d6254e3
|
tf environment issue
|
2025-10-16 00:53:42 +08:00 |
|
Zchen
|
f9d3f47d20
|
fixed : tf call cuda
|
2025-10-15 23:37:24 +08:00 |
|
Zchen
|
01024678c1
|
tpu not found
|
2025-10-15 23:29:32 +08:00 |
|
Zchen
|
ec8509ad31
|
fix
|
2025-10-15 23:21:06 +08:00 |
|
Zchen
|
6c400a066c
|
fixed:'str' object has no attribute 'base_dtype'
|
2025-10-15 23:13:34 +08:00 |
|
Zchen
|
83621f91f0
|
fixed:'str' object has no attribute 'base_dtype'
|
2025-10-15 23:11:02 +08:00 |
|
Zchen
|
e8f0308fef
|
tpu
|
2025-10-15 20:45:25 +08:00 |
|
Zchen
|
3b242b908d
|
trainer
|
2025-10-15 19:04:42 +08:00 |
|
Zchen
|
7965f7dbfe
|
TPU
|
2025-10-15 16:55:52 +08:00 |
|
Zchen
|
b466e97463
|
tpu test
|
2025-10-15 15:22:13 +08:00 |
|
Zchen
|
082018cd46
|
tpu-test
|
2025-10-15 15:14:01 +08:00 |
|
Zchen
|
7bdfc0d257
|
tpu
|
2025-10-15 14:38:56 +08:00 |
|
Zchen
|
e7947f310c
|
tpu
|
2025-10-15 14:33:49 +08:00 |
|
Zchen
|
56fa336af0
|
tpu
|
2025-10-15 14:26:11 +08:00 |
|
Zchen
|
11ee6ebc51
|
tpu
|
2025-10-15 00:44:08 +08:00 |
|
Zchen
|
5dcbf28c96
|
tpu
|
2025-10-15 00:30:56 +08:00 |
|
Zchen
|
9025267400
|
tpu without bf16
|
2025-10-15 00:25:39 +08:00 |
|
Zchen
|
603bb12220
|
tpu
|
2025-10-15 00:18:05 +08:00 |
|
Zchen
|
4a3d3f35ec
|
Merge branch 'dev2' of http://ecs.zchens.cn:3000/zchen/b2txt25 into dev2
|
2025-10-15 00:08:56 +08:00 |
|
Zchen
|
aef96f5646
|
tpu
|
2025-10-15 00:08:51 +08:00 |
|
Zchen
|
ec4f6a25ef
|
tpu
|
2025-10-14 23:54:53 +08:00 |
|
Zchen
|
4b6d680283
|
tpu
|
2025-10-14 23:35:42 +08:00 |
|
Zchen
|
cd52ba51ba
|
tpu
|
2025-10-14 23:22:59 +08:00 |
|
Zchen
|
989ba67618
|
tpu
|
2025-10-14 23:11:54 +08:00 |
|
Zchen
|
f67ed2b820
|
fix bug where model B was not enabled
|
2025-10-14 22:48:28 +08:00 |
|
Zchen
|
9288bde126
|
Merge branch 'dev2' of http://ecs.zchens.cn:3000/zchen/b2txt25 into dev2
|
2025-10-14 13:31:28 +08:00 |
|