-
Notifications
You must be signed in to change notification settings - Fork 149
models Models documentation
-
bytetrack_yolox_x_crowdhuman_mot17-private-half
bytetrack_yolox_x_crowdhuman_mot17-private-half
model is from OpenMMLab's MMTracking library. This model is <a href="https://github.com/open-mmlab/mmtracking/blob/master/configs/mot/bytetrack/metafile.yml#L24" t... -
The Model Card for DeciCoder 1B provides details about a 1 billion parameter decoder-only code completion model developed by Deci. The model was trained on Python, Java, and JavaScript subsets of Starcoder Training Dataset and uses Grouped Query Attention with a context window of 2048 tokens. It ...
-
DeciLM-7B is a decoder-only text generation model with 7.04 billion parameters, released by Deci under the Apache 2.0 license. It is the top-performing 7B base language model on the Open LLM Leaderboard and uses variable Grouped-Query Attention (GQA) to achieve a superior balance between accuracy...
-
DeciLM-7B-instruct is a model for short-form instruction following, built by LoRA fine-tuning on the SlimOrca dataset. It is a derivative of the recently released DeciLM-7B language model, a pre-trained, high-efficiency generative text model with 7 billion parameters. DeciLM-7B-instruct is one of...
-
deformable_detr_twostage_refine_r50_16x2_50e_coco
deformable_detr_twostage_refine_r50_16x2_50e_coco
model is from OpenMMLab's MMDetection library. This model is reported to obtain <a href="https://github.com/open-mmlab/mmdetection/blob/e9cae2d0787cd5c2fc6165a6... -
The DistilBERT model is a smaller, faster version of the BERT model for Transformer-based language modeling with 40% fewer parameters and 60% faster run time while retaining 95% of BERT's performance on the GLUE language understanding benchmark. This English language question answering model has ...
-
The BART model is a transformer encoder-encoder model trained on English language data, and fine-tuned on CNN Daily Mail. It is used for text summarization and has been trained to reconstruct text that has been corrupted using an arbitrary noising function. The model is effective for text generat...
-
Summary: camembert-ner is a NER model fine-tuned from camemBERT on the Wikiner-fr dataset and was validated on email/chat data. It shows better performance on entities that do not start with an uppercase. The model has four classes: O, MISC, PER, ORG and LOC. The model can be loaded using Hugging...
-
Phi-1.5 is a Transformer-based language model with 1.3 billion parameters. It was trained on a combination of data sources, including an additional source of NLP synthetic texts. Phi-1.5 performs exceptionally well on benchmarks testing common sense, language understandi...
The phi-2 is a language model with 2.7 billion parameters. The phi-2 model was trained using the same data sources as phi-1, augmented with a new data source that consists of various NLP synthetic texts and filtered websites (for safety and educational value). When assesse...
-
Whisper is an OpenAI pre-trained speech recognition model with potential applications for ASR solutions for developers. However, due to weak supervision and large-scale noisy data, it should be used with caution in high-risk domains. The model has been trained on 680k hours of audio data represen...
-
Whisper is a model that can recognize and translate speech using deep learning. It was trained on a large amount of data from different sources and languages. Whisper models can handle various tasks and domains without needing to adjust the model.
Whisper large-v3 is similar to the previous larg...
-
RoBERTa Base OpenAI Detector is a language model developed by OpenAI that is fine-tuned using outputs from the 1.5B GPT-2 model. It is designed to detect text generated by GPT-2 and is not meant to be used for malicious purposes or to evade detection. The main focus of the model is to aid in synt...
-
The RoBERTa Large model is a large transformer-based language model that was developed by the Hugging Face team. It is pre-trained on masked language modeling and can be used for tasks such as sequence classification, token classification, or question answering. Its primary usage is as a fine-tun...