---
title: ModelScope - Open-source pre-trained AI model hub
weight: 3

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Before you begin

To follow the instructions in this Learning Path, you will need an Arm server running Ubuntu 22.04 LTS or later, with at least 8 cores, 16 GB of RAM, and 30 GB of disk storage.

## Introducing ModelScope
[ModelScope](https://github.com/modelscope/modelscope/) is an open-source platform that makes it easy to use AI models in your applications.
It provides a wide variety of pre-trained models for tasks such as image recognition, natural language processing, and audio analysis. With ModelScope, you can integrate these models into your projects with just a few lines of code.

Key benefits of ModelScope include:

* **Model Diversity:**
  Access a wide range of models for various tasks, including automatic speech recognition (ASR), natural language processing, and computer vision.

* **Ease of Use:**
  ModelScope provides a user-friendly interface and APIs for seamless model integration.

* **Community Support:**
  Benefit from a vibrant community of developers and researchers contributing to and supporting ModelScope.

## Arm CPU Acceleration
ModelScope fully supports PyTorch 1.8+ and other machine learning frameworks, which can be deployed efficiently on Arm Neoverse CPUs to take advantage of Arm's performance and power-efficiency characteristics.

Arm provides optimized software and tools, such as Kleidi, to accelerate AI inference on Arm-based platforms. This makes Arm Neoverse CPUs a good choice for running ModelScope models in edge devices and other resource-constrained environments.

You can learn more in [Faster PyTorch Inference using Kleidi on Arm Neoverse](https://community.arm.com/arm-community-blogs/b/servers-and-cloud-computing-blog/posts/faster-pytorch-inference-kleidi-arm-neoverse) on the Arm community website.


## Installing ModelScope

First, ensure your system is up to date and install the required tools and libraries:

```bash
sudo apt-get update -y
sudo apt-get install -y curl git wget python3 python3-pip python3-venv python-is-python3
```

Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate
```

Install the required Python packages, including ModelScope itself:
```bash
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip3 install modelscope
pip3 install numpy packaging addict datasets simplejson sortedcontainers transformers ffmpeg
```
{{% notice Note %}}
This Learning Path runs models on Arm Neoverse CPUs, so you only need to install the CPU build of PyTorch.
{{% /notice %}}
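
To verify the installation, you can run a short check. This is a minimal sketch: it confirms that the CPU-only PyTorch build is active and that the oneDNN (MKLDNN) backend, which PyTorch uses for optimized CPU inference, is compiled in.

```python
import torch

# The CPU-only wheel reports no CUDA support
print('PyTorch version:', torch.__version__)
print('CUDA available:', torch.cuda.is_available())

# oneDNN (MKLDNN) is the backend PyTorch uses for optimized CPU inference
print('oneDNN available:', torch.backends.mkldnn.is_available())
```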

## Create a sample example

After completing the installation, you can use an example of Chinese semantic understanding to see how ModelScope works.

There is a fundamental difference between Chinese and English writing.
The relationship between Chinese characters and their meanings is somewhat analogous to the relationship between words and phrases in English.
Some Chinese characters, like English words, have clear meanings on their own, such as “人” (person), “山” (mountain), and “水” (water).

More often, however, Chinese characters must be combined with other characters to express a complete meaning, much like phrases in English.
For example, “祝福” (blessing) can be broken down into “祝” (wish) and “福” (good fortune); “分享” (share) can be broken down into “分” (divide) and “享” (enjoy); and “生成” (generate) is composed of “生” (produce) and “成” (become).

For a computer to understand Chinese sentences, it must apply the rules of Chinese characters, vocabulary, and grammar to interpret and express meaning accurately.

Here is a simple example using a general-domain Chinese [word segmentation model](https://www.modelscope.cn/models/iic/nlp_structbert_word-segmentation_chinese-base), which breaks Chinese sentences down into individual words so that a computer can analyze and understand them.

```python
from modelscope.pipelines import pipeline

# Load the general-domain Chinese word segmentation model
word_segmentation = pipeline('word-segmentation', model='damo/nlp_structbert_word-segmentation_chinese-base')
text = '生成一段新年祝福的文字跟所有人分享'
result = word_segmentation(text)

print(result)
```

The output will be like this:
```output
2025-01-28 00:30:29,692 - modelscope - WARNING - Model revision not specified, use revision: v1.0.3
Downloading Model to directory: /home/ubuntu/.cache/modelscope/hub/damo/nlp_structbert_word-segmentation_chinese-base
2025-01-28 00:30:32,828 - modelscope - WARNING - Model revision not specified, use revision: v1.0.3
2025-01-28 00:30:33,332 - modelscope - INFO - initiate model from /home/ubuntu/.cache/modelscope/hub/damo/nlp_structbert_word-segmentation_chinese-base
2025-01-28 00:30:33,333 - modelscope - INFO - initiate model from location /home/ubuntu/.cache/modelscope/hub/damo/nlp_structbert_word-segmentation_chinese-base.
2025-01-28 00:30:33,334 - modelscope - INFO - initialize model from /home/ubuntu/.cache/modelscope/hub/damo/nlp_structbert_word-segmentation_chinese-base
You are using a model of type bert to instantiate a model of type structbert. This is not supported for all configurations of models and can yield errors.
2025-01-28 00:30:35,522 - modelscope - WARNING - No preprocessor field found in cfg.
2025-01-28 00:30:35,522 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2025-01-28 00:30:35,522 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/home/ubuntu/.cache/modelscope/hub/damo/nlp_structbert_word-segmentation_chinese-base'}. trying to build by task and model information.
2025-01-28 00:30:35,527 - modelscope - INFO - cuda is not available, using cpu instead.
2025-01-28 00:30:35,529 - modelscope - WARNING - No preprocessor field found in cfg.
2025-01-28 00:30:35,529 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2025-01-28 00:30:35,529 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/home/ubuntu/.cache/modelscope/hub/damo/nlp_structbert_word-segmentation_chinese-base', 'sequence_length': 512}. trying to build by task and model information.
/home/ubuntu/venv/lib/python3.10/site-packages/transformers/modeling_utils.py:1044: FutureWarning: The `device` argument is deprecated and will be removed in v5 of Transformers.
  warnings.warn(
{'output': ['生成', '一', '段', '新年', '祝福', '的', '文字', '跟', '所有', '人', '分享']}
```
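
The pipeline returns a plain Python dictionary, so its token list is easy to post-process. The sketch below uses a hard-coded copy of the `output` list from the run above, so it runs without downloading the model:

```python
# Hard-coded copy of the pipeline result shown above
result = {'output': ['生成', '一', '段', '新年', '祝福', '的', '文字', '跟', '所有', '人', '分享']}

tokens = result['output']
print(len(tokens))         # prints 11
print(' / '.join(tokens))  # tokens joined with a visible separator
```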

The segmentation model has correctly identified the following words:

- 生成 (generate): This is a verb meaning "to generate" or "to produce."

- 一 (one): This is a numeral.

- 段 (piece): This is a measure word used for text.

- 新年 (New Year): This is a noun phrase meaning "New Year."

- 祝福 (blessings): This is a noun meaning "blessings" or "good wishes."

- 的 (of): This is a possessive particle.

- 文字 (text): This is a noun meaning "text" or "written words."

- 跟 (with): This is a preposition meaning "with."

- 所有 (all): This is a quantifier meaning "all."

- 人 (people): This is a noun meaning "people."

- 分享 (share): This is a verb meaning "to share."


The segmentation model has successfully identified the word boundaries and separated the sentence into meaningful units, which is essential for further natural language processing tasks such as machine translation and sentiment analysis.
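
To see why explicit word boundaries matter, you can compare a naive character-level split of the sentence with the model's word-level output. The token list below is copied from the run above:

```python
sentence = '生成一段新年祝福的文字跟所有人分享'

# Naive approach: treat every character as its own token
chars = list(sentence)

# Word-level tokens produced by the segmentation model above
words = ['生成', '一', '段', '新年', '祝福', '的', '文字', '跟', '所有', '人', '分享']

print(len(chars), 'characters vs', len(words), 'words')

# The word tokens exactly cover the original sentence
assert ''.join(words) == sentence
```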