- Designing a Synchronization-reducing Clustering Method on Manycores: Some Issues and Improvements. Weijian Zheng, Fengguang Song, Lan Lin.
- An In-depth Performance Characterization of CPU- and GPU-based DNN Training on Modern Architectures. Ammar Ahmad Awan, Hari Subramoni, Dhabaleswar K. Panda.
- Evolving Deep Networks Using HPC. Steven R. Young, Derek C. Rose, Travis Johnston, William T. Heller, Thomas P. Karnowski, Thomas E. Potok, Robert M. Patton, Gabriel N. Perdue, Jonathan Miller.
- Optimizing Convolutional Neural Networks for Cloud Detection. Travis Johnston, Steven R. Young, David Hughes, Robert M. Patton, Devin White.
- BlazingText: Scaling and Accelerating Word2Vec using Multiple GPUs. Saurabh Gupta, Vineet Khare.
- Training distributed deep recurrent neural networks with mixed precision on GPU clusters. Alexey Svyatkovskiy, Julian Kates-Harbeck, William Tang.
- TensorQuant: A Simulation Toolbox for Deep Neural Network Quantization. Dominik Marek Loroch, Franz-Josef Pfreundt, Norbert Wehn, Janis Keuper.
- Towards Scalable Parallel Training of Deep Neural Networks. Sam Ade Jacobs, Nikoli Dryden, Roger A. Pearce, Brian Van Essen.
- Accelerating deep neural network learning for speech recognition on a cluster of GPUs. Guojing Cong, Brian Kingsbury, Soumyadip Gosh, George Saon, Fan Zhou.
- An Efficient Task-based All-Reduce for Machine Learning Applications. Zhenyu Li, James Davis, Stephen Jarvis.