1. 65 GOPS/neuron Photonic Tensor Core with Thin-film Lithium Niobate Photonics
- Author
Lin, Zhongjin, Shastri, Bhavin J., Yu, Shangxuan, Song, Jingxiang, Zhu, Yuntao, Safarnejadian, Arman, Cai, Wangning, Lin, Yanmei, Ke, Wei, Hammood, Mustafa, Wang, Tianye, Xu, Mengyue, Zheng, Zibo, Al-Qadasi, Mohammed, Esmaeeli, Omid, Rahim, Mohamed, Pakulski, Grzegorz, Schmid, Jens, Barrios, Pedro, Jiang, Weihong, Morison, Hugh, Mitchell, Matthew, Qiang, Xiaogang, Guan, Xun, Jaeger, Nicolas A. F., Rusch, Leslie A., Shekhar, Sudip, Shi, Wei, Yu, Siyuan, Cai, Xinlun, and Chrostowski, Lukas
- Abstract
Photonics offers a transformative approach to artificial intelligence (AI) and neuromorphic computing by providing low-latency, high-bandwidth, and energy-efficient computation. Here, we introduce a photonic tensor core processor enabled by time-multiplexed inputs and charge-integrated outputs. This fully integrated processor, comprising only two thin-film lithium niobate (TFLN) modulators, a III-V laser, and a charge-integration photoreceiver, can implement an entire layer of a neural network. It can execute 65 billion operations per second (GOPS) per neuron, including simultaneous weight updates, a hitherto unachieved speed. Our processor stands out from conventional photonic processors, which have static weights set during training, as it supports fast "hardware-in-the-loop" training and can dynamically adjust the inputs (fan-in) and outputs (fan-out) within a layer, thereby enhancing its versatility. Our processor can perform large-scale dot-product operations with vector dimensions up to 131,072. Furthermore, it successfully classifies (supervised learning) and clusters (unsupervised learning) 112×112-pixel images after "hardware-in-the-loop" training. To handle "hardware-in-the-loop" training for clustering AI tasks, we provide a solution for multiplications involving two negative numbers based on our processor.
- Comment
19 pages, 6 figures
- Published
- 2023
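The abstract's core operation can be sketched numerically: each time slot drives the two cascaded modulators with one input element and one weight element, the photodetector current is proportional to their product, and the integrator accumulates charge over the whole vector to yield a dot product. This is a minimal illustrative simulation only; the function names are our own, intensities are treated as ideal and noiseless, and the signed-number handling shown is a generic positive/negative decomposition, not necessarily the paper's specific solution for multiplying two negative numbers.

```python
import numpy as np

def charge_integrated_dot(x, w):
    """Time-multiplexed multiply-accumulate with non-negative intensities.

    Illustrative model (not the paper's implementation): per time slot,
    the two modulators set transmissions proportional to x[t] and w[t],
    the photocurrent is their product, and the charge integrator sums
    over all slots to produce the dot product.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    if np.any(x < 0) or np.any(w < 0):
        raise ValueError("optical intensities are non-negative")
    charge = 0.0
    for xt, wt in zip(x, w):   # one symbol per time slot
        charge += xt * wt      # photocurrent integrated on a capacitor
    return charge

def signed_dot(x, w):
    """Signed dot product from four non-negative passes.

    Generic offset trick (an assumption for illustration): split each
    vector into positive and negative parts and recombine the four
    unsigned charge-integrated products.
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    xp, xn = np.maximum(x, 0), np.maximum(-x, 0)
    wp, wn = np.maximum(w, 0), np.maximum(-w, 0)
    return (charge_integrated_dot(xp, wp) + charge_integrated_dot(xn, wn)
            - charge_integrated_dot(xp, wn) - charge_integrated_dot(xn, wp))
```

With this decomposition, a product of two negative elements lands in the `xn * wn` pass with a positive sign, which is one generic way to realize the signed multiplications the abstract refers to.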