The following publications are possibly variants of this publication:
- CSS: A Large-scale Cross-schema Chinese Text-to-SQL Medical Dataset. Hanchong Zhang, Jieyu Li, Lu Chen, Ruisheng Cao, Yunyan Zhang, Yu Huang, Yefeng Zheng 0001, Kai Yu 0004. ACL 2023: 6970-6983
- DuSQL: A Large-Scale and Pragmatic Chinese Text-to-SQL Dataset. Lijie Wang, Ao Zhang, Kun Wu, Ke Sun 0005, Zhenghua Li, Hua Wu 0003, Min Zhang 0005, Haifeng Wang. EMNLP 2020: 6923-6935
- Chase: A Large-Scale and Pragmatic Chinese Dataset for Cross-Database Context-Dependent Text-to-SQL. Jiaqi Guo, Ziliang Si, Yu Wang, Qian Liu, Ming Fan, Jian-Guang Lou, Zijiang Yang, Ting Liu. ACL 2021: 2316-2331
- CATS: A Pragmatic Chinese Answer-to-Sequence Dataset with Large Scale and High Quality. Liang Li, Ruiying Geng, Chengyang Fang, Bing Li, Can Ma, Rongyu Cao, Binhua Li, Fei Huang, Yongbin Li. ACL 2023: 2983-3000
- A Large-Scale Chinese Short-Text Conversation Dataset. Yida Wang, Pei Ke, Yinhe Zheng, Kaili Huang, Yong Jiang, Xiaoyan Zhu 0001, Minlie Huang. NLPCC 2020: 91-103
- A Large Chinese Text Dataset in the Wild. Tai-Ling Yuan, Zhe Zhu, Kun Xu, Cheng-Jun Li, Tai-Jiang Mu, Shi-Min Hu. JCST, 34(3):509-521, 2019.
- TaiSu: A 166M Large-scale High-Quality Dataset for Chinese Vision-Language Pre-training. Yulong Liu, Guibo Zhu, Bin Zhu, Qi Song, Guojing Ge, Haoran Chen, Guanhui Qiao, Ru Peng, Lingxiang Wu, Jinqiao Wang. NIPS 2022.