Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Trans
