Conghui He
heconghui@sensetime.com
heconghui@pjlab.org.cn
Senior Research Director, SenseTime
Research Scientist & PI, Shanghai AI Laboratory
I am currently a Research Director at SenseTime Inc., as well as a Research Scientist and PI at the Shanghai AI Laboratory. Prior to this, I worked at WeChat as a Senior Researcher, where I initiated and developed the high-performance graph computing framework, Plato. Before joining WeChat, I earned my PhD degree (2013-2018) from the Department of Computer Science at Tsinghua University under the supervision of Prof. Haohuan Fu, and my Bachelor’s degree (2009-2013) from the Department of Software Engineering at Sun Yat-Sen University.
My research interests include High Performance Computing, Computer Vision, and Large Language Models. In 2017, I was honored with the Gordon Bell Prize, which is the highest distinction in the high-performance computing application domain. Currently, I lead the OpenDataLab team, which aims to build an influential open dataset platform that facilitates the development, analysis, and research of Artificial General Intelligence (AGI). Additionally, I oversee a data team that collects and curates massive datasets for large language models.
At SenseTime and the Shanghai AI Laboratory, we are actively hiring PhDs, postdocs, interns, and full-time researchers. If you’re interested in joining our team, please feel free to reach out to me via email.
You can check out my CV here.
news
Mar 19, 2024 | We release Wanjuan-CC, a safe and high-quality Webtext dataset. |
---|---|
Feb 27, 2024 | 3 papers are accepted by CVPR 2024. |
Sep 09, 2023 | We release InternLM2. See arXiv for details. |
Aug 21, 2023 | We release Wanjuan 1.0, a large-scale multi-modal dataset for pretraining. |
Jun 03, 2023 | VIGC is accepted by AAAI 2024. |
Jun 03, 2023 | We release InternLM. You can find technical report here. |
Mar 21, 2022 | We launch OpenDataLab, an open data platform that empowers AGI. |
selected publications
- VIGC: Visual instruction generation and correction. In Proceedings of the AAAI Conference on Artificial Intelligence, 2024
- WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset. arXiv preprint arXiv:2402.19282, 2024
- Wanjuan: A comprehensive multimodal dataset for advancing English and Chinese large models. arXiv preprint arXiv:2308.10755, 2023
- MMBench: Is your multi-modal model an all-around player? arXiv preprint arXiv:2307.06281, 2023
- 18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2017