On June 19, the 2021 ActivityNet competition, held at CVPR, the top international academic conference in computer vision, came to an end. The team of doctoral students Teng Wang and Zhichao Lu from the EMI research group performed outstandingly. Led by Ran Cheng, the team overcame a number of challenges and finished as runner-up worldwide in the dense event captioning track with a score of 10.00.
ActivityNet, a large-scale video action recognition competition, has been held six times since 2016. It is the largest and most influential event in the field of video understanding. The organizers come from KAUST, DeepMind, Stanford University and other institutions. Previous contestants have come from many well-known institutions around the world, including Stanford, MSRA, POSTECH, Shanghai Jiao Tong University, Renmin University of China, Tencent, Alibaba and Baidu. The ActivityNet dense event captioning track requires participating teams to use the large-scale video dataset ActivityNet Captions to accurately localize all events in a long video and to generate natural-language descriptions of the human activities in them. The team members were also runner-up in the 2020 competition and were invited to give a talk at the conference workshop.