Are Straight-Through gradients and Soft-Thresholding all you need for Sparse Training? Paper • 2212.01076 • Published Dec 2, 2022