This repository contains the code and datasets for our research on machine learning models that mimic human visual motion perception. While state-of-the-art computer vision (CV) models, such as ...
We present SegLLM, a novel multi-round interactive segmentation model that leverages conversational memory of both visual and textual outputs to reason over previously segmented objects and past ...