| license: apache-2.0 | |
| pipeline_tag: any-to-any | |
| This repository contains the model checkpoints, dataset files, and evaluation files for the paper [Generalized Decoding for Pixel, Image, and Language](https://huggingface.co/papers/2212.11270). | |
| GitHub: https://github.com/microsoft/X-Decoder | |
| ***Click a file name below to download it.*** | |
| ## -> Models | |
| *Focal-T:* <br/> | |
| [xdecoder_focalt_last_novg.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focalt_last_novg.pt) <br/> | |
| [xdecoder_focalt_last.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focalt_last.pt) <br/> | |
| [xdecoder_focalt_best_openseg.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focalt_best_openseg.pt) <br/> | |
| *Focal-L:* <br/> | |
| [xdecoder_focall_last.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focall_last.pt) <br/> | |
| [xdecoder_focall_bestseg.pt](https://huggingface.co/xdecoder/X-Decoder/resolve/main/xdecoder_focall_bestseg.pt) <br/> | |
| ## -> Datasets | |
| [caption_class_similarity.pth](https://huggingface.co/xdecoder/X-Decoder/resolve/main/caption_class_similarity.pth) <br/> | |
| [captions_train2017_filtrefgumdval_filtvlp.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/captions_train2017_filtrefgumdval_filtvlp.json) <br/> | |
| [grounding_train2017_filtrefgumdval_filtvlp.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/grounding_train2017_filtrefgumdval_filtvlp.json) <br/> | |
| [panoptic_train2017_filtrefgumdval_filtvlp.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/panoptic_train2017_filtrefgumdval_filtvlp.json) <br/> | |
| [refcocog_umd_val.json](https://huggingface.co/xdecoder/X-Decoder/resolve/main/refcocog_umd_val.json) <br/> | |
| ## -> Evaluations | |
| [coco_caption.zip](https://huggingface.co/xdecoder/X-Decoder/resolve/main/coco_caption.zip) <br/> |
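
The files listed above can also be fetched from a script instead of by clicking. The following is a minimal sketch using only the Python standard library. It assumes the `resolve/main` URL pattern shown in the links above; the helper names `file_url` and `download` and the `checkpoints` target directory are hypothetical, chosen for illustration only.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL pattern copied from the download links in this README.
REPO_URL = "https://huggingface.co/xdecoder/X-Decoder/resolve/main"


def file_url(filename: str) -> str:
    """Return the direct-download URL for a file in this repository."""
    return f"{REPO_URL}/{filename}"


def download(filename: str, target_dir: str = "checkpoints") -> Path:
    """Download one file into target_dir, skipping it if already present.

    `target_dir` is a hypothetical default, not a path the X-Decoder
    code base is known to expect.
    """
    target = Path(target_dir) / filename
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(file_url(filename), str(target))
    return target


if __name__ == "__main__":
    # Example: fetch the Focal-T checkpoint listed under "Models".
    print(download("xdecoder_focalt_last.pt"))
```

Any other file name from the lists above can be passed to `download` in the same way. If the `huggingface_hub` package is installed, its `hf_hub_download` function is an alternative that adds local caching.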