Papers on Language and Linguistics (Evaluating ChatGPT and Large Language Models)
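The following papers on language and linguistics (evaluating ChatGPT and large language models) were collected by the editors of 好好范文网 and are provided for reference only.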
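What are the real capabilities of language models such as ChatGPT, where do their limits lie, and what are their shortcomings? A good way to find out is to look at the papers that major research institutions have published on the subject in authoritative journals over the past three years.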
The papers we have collected are listed in detail below.
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity, by Yejin Bang, Samuel Cahyawijaya, Nayeon Lee, Wenliang Dai, Dan Su, Bryan Wilie, Holy Lovenia, Ziwei Ji et al.
Is ChatGPT a General-Purpose Natural Language Processing Task Solver?, by Chengwei Qin, Aston Zhang, Zhuosheng Zhang, Jiaao Chen, Michihiro Yasunaga and Diyi Yang
ChatGPT versus Traditional Question Answering for Knowledge Graphs: Current Status and Future Directions Towards Knowledge Graph Chatbots, by Reham Omar, Omij Mangukiya, Panos Kalnis and Essam Mansour
Mathematical Capabilities of ChatGPT, by Simon Frieder, Luca Pinchetti, Ryan-Rhys Griffiths, Tommaso Salvatori, Thomas Lukasiewicz, Philipp Christian Petersen, Alexis Chevalier and Julius Berner
Exploring the Limits of ChatGPT for Query or Aspect-based Text Summarization, by Xianjun Yang, Yan Li, Xinlu Zhang, Haifeng Chen and Wei Cheng
On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective, by Jindong Wang, Xixu Hu, Wenxin Hou, Hao Chen, Runkai Zheng, Yidong Wang, Linyi Yang, Haojun Huang et al.
ChatGPT is not all you need. A State of the Art Review of large Generative AI models, by Roberto Gozalo-Brizuela and Eduardo C. Garrido-Merchán
Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT, by Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du and Dacheng Tao
Evaluation of ChatGPT as a Question Answering System for Answering Complex Questions, by Yiming Tan, Dehai Min, Yu Li, Wenbo Li, Nan Hu, Yongrui Chen and Guilin Qi
Holistic Evaluation of Language Models, by Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan et al.
Evaluating the Text-to-SQL Capabilities of Large Language Models, by Nitarshan Rajkumar, Raymond Li and Dzmitry Bahdanau
Are Visual-Linguistic Models Commonsense Knowledge Bases?, by Hsiu-Yu Yang and Carina Silberer
Is GPT-3 a Psychopath? Evaluating Large Language Models from a Psychological Perspective, by Xingxuan Li, Yutong Li, Linlin Liu, Lidong Bing and Shafiq R. Joty
GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models, by Da Yin, Hritik Bansal, Masoud Monajatipoor, Liunian Harold Li and Kai-Wei Chang
RobustLR: A Diagnostic Benchmark for Evaluating Logical Robustness of Deductive Reasoners, by Soumya Sanyal, Zeyi Liao and Xiang Ren
Evaluating Large Language Models Trained on Code, by Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pondé de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda et al.
GLGE: A New General Language Generation Evaluation Benchmark, by Dayiheng Liu, Yu Yan, Yeyun Gong, Weizhen Qi, Hang Zhang, Jian Jiao, Weizhu Chen, Jie Fu et al.
Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews, by Mohammad Abdul Hadi and Fatemeh H. Fard
Do Language Models Perform Generalizable Commonsense Inference?, by Peifeng Wang, Filip Ilievski, Muhao Chen and Xiang Ren
RICA: Evaluating Robust Inference Capabilities Based on Commonsense Axioms, by Pei Zhou, Rahul Khanna, Seyeon Lee, Bill Yuchen Lin, Daniel Ho, Jay Pujara and Xiang Ren
Evaluation of Text Generation: A Survey, by Asli Celikyilmaz, Elizabeth Clark and Jianfeng Gao
Neural Language Generation: Formulation, Methods, and Evaluation
BERTScore: Evaluating Text Generation with BERT
The paper list was compiled from ; the accompanying screenshots and summaries were put together by the author.