It is based on tracking objects in a video. The RetinaNet framework is used for detection and the V-IoU tracker for tracking. Each tracked object is assigned a unique tracking ID, and the number of distinct IDs gives the count of objects in the video.
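The counting-by-tracking-ID idea can be illustrated with a minimal sketch. It assumes the detector output is already available as a list of bounding boxes per frame, and it replaces V-IoU with a simplified greedy IoU association: the visual tracking that V-IoU uses to bridge missed detections is omitted. The names `iou`, `count_objects`, and `iou_threshold` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed detector output


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_objects(frames: List[List[Box]], iou_threshold: float = 0.5) -> int:
    """Count objects as the number of unique track IDs issued.

    Simplified greedy IoU association (an assumption, not the full V-IoU
    tracker): a track that finds no matching detection in a frame is dropped.
    """
    active: Dict[int, Box] = {}  # track ID -> last known box
    next_id = 0
    for detections in frames:
        updated: Dict[int, Box] = {}
        unmatched = list(detections)
        for track_id, last_box in active.items():
            if not unmatched:
                break
            # Pick the detection that overlaps most with the track's last box.
            best = max(unmatched, key=lambda d: iou(last_box, d))
            if iou(last_box, best) >= iou_threshold:
                updated[track_id] = best
                unmatched.remove(best)
        # Every detection left over starts a new track with a fresh ID.
        for det in unmatched:
            updated[next_id] = det
            next_id += 1
        active = updated
    return next_id


if __name__ == "__main__":
    frames = [
        [(0, 0, 10, 10), (50, 50, 60, 60)],
        [(1, 1, 11, 11), (51, 51, 61, 61)],
        [(2, 2, 12, 12), (52, 52, 62, 62), (100, 100, 110, 110)],
    ]
    print(count_objects(frames))  # 3
```

In this toy input, two objects persist across all three frames and a third appears in the last frame, so three IDs are issued. Because the sketch drops a track as soon as one detection is missed, a reappearing object would receive a new ID and be counted twice; bridging such gaps is the kind of case that V-IoU's visual tracking is meant to address.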

