# Security Protection of Models

After training and optimizing models locally, AI service providers
deploy the models on third-party platforms (such as mobile devices, edge
devices, and cloud servers) to provide inference services. Designing and
training an AI model requires a large amount of time, data, and
computing power, which is why model and service providers need to
protect the model's intellectual property (including its structure and
parameters) from being stolen during transfer, storage, and execution in
the deployment phase.

## Overview

The security protection of models can be divided into static protection
and dynamic protection. Static protection refers to protecting models
during transfer and storage. At present, it is widely implemented using
file encryption: AI model files are transferred and stored in ciphertext
and are decrypted in memory before being used for inference. However,
throughout the inference process, the models remain in plaintext in
memory and therefore stay vulnerable to theft.

Dynamic protection refers to protecting models at runtime. Currently
available dynamic protection methods fall into three categories. The
first is trusted execution environment-based (TEE-based) protection.
TEEs are secure zones isolated on trusted hardware; AI model files are
stored and transferred in non-secure zones and are decrypted and run
only inside the secure zones. Although this method adds only a small
inference latency on the CPU, it requires specific trusted hardware,
making it difficult to deploy widely. In addition, because of hardware
resource constraints, protecting large-scale deep models is difficult,
and heterogeneous hardware acceleration remains challenging. The second
is cryptographic computing-based protection, which keeps models in
ciphertext during transfer, storage, and execution by means of
cryptographic techniques (such as homomorphic encryption and secure
multi-party computation). Although this method is free from hardware
constraints, it incurs large computation or communication overheads and
cannot protect model structure information. The third is
obfuscation-based protection. This method scrambles the computational
logic of models with fake nodes, so that attackers cannot understand
the models even if they obtain them. Compared with the former two
methods, obfuscation-based protection incurs a smaller performance
overhead and negligible accuracy loss. Furthermore, it is
hardware-agnostic and can protect very large models. The rest of this
section focuses on obfuscation-based protection.
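
To make the static-protection workflow above concrete, the following is
a minimal sketch of file-level encryption, assuming a symmetric key that
the service provider manages out of band and the Python `cryptography`
package; the file names are purely illustrative. It also shows the gap
that dynamic protection aims to close: the decrypted model lives in
plaintext in memory during inference.

```python
# Minimal sketch of static protection: the model file is stored and
# transferred as ciphertext and decrypted only in memory before inference.
# Assumes the `cryptography` package; key management is out of scope here.
from cryptography.fernet import Fernet

def encrypt_model(plain_path: str, cipher_path: str, key: bytes) -> None:
    with open(plain_path, "rb") as f:
        ciphertext = Fernet(key).encrypt(f.read())
    with open(cipher_path, "wb") as f:
        f.write(ciphertext)

def load_model_bytes(cipher_path: str, key: bytes) -> bytes:
    # The returned plaintext bytes are exposed to memory dumps while the
    # model runs -- exactly the weakness static protection cannot remove.
    with open(cipher_path, "rb") as f:
        return Fernet(key).decrypt(f.read())

key = Fernet.generate_key()                      # provisioned securely in practice
encrypt_model("model.onnx", "model.enc", key)    # illustrative file names
model_bytes = load_model_bytes("model.enc", key) # fed to the inference runtime
```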

## Model Obfuscation

Model obfuscation automatically obfuscates the computational logic of a
plaintext AI model, preventing attackers from understanding the model
even if they obtain it during transfer or storage. In addition, the
model can run while still obfuscated, thereby ensuring its
confidentiality at runtime. Obfuscation does not affect the inference
results and incurs only a low performance overhead.

:label:`ch-deploy/model_obfuscate`

Figure :numref:`ch-deploy/model_obfuscate` depicts the model obfuscation
procedure, which is described as follows.

1. **Interpret the given model as a computational graph:** Based on
    the structure of the trained model, parse the model file into a
    graph representation (computational graph) of the model's
    computational logic for subsequent operations. The resulting
    computational graph contains information such as node identifiers,
    node operator types, node parameters, and the network structure.

2. **Scramble the network structure of the computational graph[^1]:**
    Scramble the relationships between nodes in the computational graph
    using graph compression, graph augmentation, and other techniques in
    order to conceal the true computational logic. In graph compression,
    key subgraph structures are matched by scanning the entire graph,
    and each matched subgraph is compressed and replaced with a single
    new computing node. Graph augmentation then adds new input/output
    edges to the compressed graph in order to further conceal the
    dependencies between nodes; such an edge either comes from or points
    to an existing node in the graph, or comes from or points to a new
    obfuscation node added in this step (a minimal graph-scrambling
    sketch is given after this list).

3. **Anonymize nodes in the computational graph:** Traverse the
    computational graph processed in Step (2) and select the nodes to be
    protected. For each node to be protected, replace the node
    identifier, operator type, and any other attributes that describe
    the computational logic of the model with non-semantic symbols. For
    node identifier anonymization, the anonymized identifier must be
    unique in order to distinguish different nodes. For operator type
    anonymization, to avoid an explosion of operator types when
    anonymizing large-scale computational graphs, nodes with the same
    operator type can be divided into several disjoint sets, and the
    operator type of all nodes in the same set is replaced with the same
    symbol (see the anonymization sketch after this list). Step (5)
    ensures that the model can still be identified and executed after
    node anonymization.

4. **Scramble the weights of the computational graph:** Add random
    noise and mapping functions to the weights to be protected; the
    noise and mapping functions can vary from weight to weight. Step (6)
    ensures that the weight noise does not change the model execution
    result (see the weight-scrambling sketch after this list). The
    computational graph processed by Steps (2), (3), and (4) is then
    saved as a model file for subsequent operations.

5. **Transform operator interfaces:** Steps (5) and (6) transform the
    operators to be protected in order to generate candidate obfuscated
    operators. One original operator may correspond to multiple
    obfuscated operators; the number of candidates depends on how many
    sets the nodes are grouped into in Step (3). In this step, the
    operator interfaces are transformed based on the anonymized operator
    types and the operator input/output relationships obtained after
    Steps (2), (3), and (4). The transformation can be implemented by
    changing the input, the output, or the interface name. Changing the
    input and output involves modifying the input and output data, which
    makes the form of the obfuscated operator different from that of the
    original operator; the added data includes the data dependencies
    introduced by graph augmentation in Step (2) and the random noise
    introduced by weight obfuscation in Step (4). The operator name is
    changed to the anonymized operator name obtained in Step (3), which
    ensures that the model can still be identified and executed after
    the nodes are anonymized and that the operator name does not reveal
    the computational logic.

6. **Transform the operator implementation:** Transform the operator
    code implementation by encrypting strings, adding redundant code,
    and employing other code obfuscation techniques, keeping the
    computational logic of the obfuscated operator consistent with that
    of the original operator while making the logic harder to
    understand. Different combinations of code obfuscation techniques
    may be applied to different operators. In addition to such
    equivalent code transformations, the obfuscated operators implement
    some additional computational logic. For example, because noise was
    added to the weights of an operator in Step (4), the obfuscated
    operator also implements the inverse mapping function of the weight
    noise, dynamically eliminating the noise during operator execution
    and ensuring that the computation result is the same as that of the
    original model (the weight-scrambling sketch after this list also
    illustrates this inverse mapping). The generated obfuscated
    operators are then saved as a library file for subsequent
    operations.

7. **Deploy the model and operator library:** Deploy the obfuscated
    model and corresponding operator library file on the desired
    device.

8. **Load the obfuscated model:** Parse the obfuscated model file and
    obtain the graph representation of the model's computational logic,
    that is, the obfuscated computational graph produced by Steps (2),
    (3), and (4).

9. **Initialize the computational graph:** Initialize the computational
    graph to generate an execution task sequence, according to the
    security configuration options. If the model needs to be protected
    at runtime, the obfuscated graph is initialized directly to generate
    the execution task sequence, in which each compute unit corresponds
    to the execution of one obfuscated operator or original operator. If
    security protection is required only during model transfer and
    storage, the obfuscated graph is first restored in memory to the
    original graph, which is then initialized to generate the execution
    task sequence; in this case each unit corresponds to the execution
    of an original operator, further reducing the performance overhead
    during inference.

10. **Execute inference tasks:** The model executes the compute units
    sequentially on the input of the AI application in order to obtain
    the inference result. If a compute unit corresponds to an obfuscated
    operator, the obfuscated operator library is invoked; otherwise, the
    original operator library is invoked (a minimal dispatch sketch is
    given after this list).
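
The following is a minimal, framework-agnostic sketch of the graph
scrambling in Step (2). The dictionary-based graph format, the node
names, and the `ObfuscatedOp` type are illustrative assumptions, not the
API of any particular framework.

```python
# Illustrative sketch of Step (2): compress a matched subgraph into one new
# node, then augment the graph with an obfuscation node and a redundant edge.
graph = {
    "nodes": {"conv1": "Conv2D", "bn1": "BatchNorm", "relu1": "ReLU",
              "fc1": "Dense"},
    "edges": [("conv1", "bn1"), ("bn1", "relu1"), ("relu1", "fc1")],
}

def compress_subgraph(graph, subgraph_ids, new_id):
    """Replace all nodes in `subgraph_ids` with a single new computing node."""
    for node_id in subgraph_ids:
        graph["nodes"].pop(node_id)
    graph["nodes"][new_id] = "ObfuscatedOp"
    # Re-route edges crossing the subgraph boundary; drop internal edges.
    graph["edges"] = [
        (new_id if src in subgraph_ids else src,
         new_id if dst in subgraph_ids else dst)
        for src, dst in graph["edges"]
        if not (src in subgraph_ids and dst in subgraph_ids)
    ]

def augment_graph(graph, fake_id, attach_to):
    """Add an obfuscation node and a redundant edge to hide real dependencies."""
    graph["nodes"][fake_id] = "ObfuscatedOp"
    graph["edges"].append((attach_to, fake_id))

compress_subgraph(graph, {"conv1", "bn1", "relu1"}, "block_0")
augment_graph(graph, "fake_0", "block_0")
```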
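
Below is a minimal sketch of the node anonymization in Step (3),
continuing the same illustrative graph representation: node identifiers
are replaced with unique non-semantic names, and same-type nodes are
split into disjoint sets that each receive one anonymous operator-type
symbol. The set size of 4 is an arbitrary assumption.

```python
# Illustrative sketch of Step (3): anonymize node identifiers and operator
# types. Identifiers stay unique; operator types are replaced set by set.
from collections import defaultdict
from itertools import count

def anonymize_nodes(graph, set_size=4):
    id_counter, type_counter = count(), count()
    id_map, type_map = {}, {}
    by_type = defaultdict(list)                      # group nodes by op type
    for node_id, op_type in graph["nodes"].items():
        by_type[op_type].append(node_id)
    for op_type, node_ids in by_type.items():
        # Split same-type nodes into disjoint sets; one anonymous type per set.
        for i in range(0, len(node_ids), set_size):
            anon_type = f"t_{next(type_counter)}"
            for node_id in node_ids[i:i + set_size]:
                id_map[node_id] = f"n_{next(id_counter)}"   # unique identifier
                type_map[node_id] = anon_type
    graph["nodes"] = {id_map[n]: type_map[n] for n in graph["nodes"]}
    graph["edges"] = [(id_map.get(s, s), id_map.get(d, d))
                      for s, d in graph["edges"]]
    return id_map    # kept by the obfuscation tool, never shipped with the model
```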
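
The next sketch illustrates Steps (4) and (6) together: a weight is
stored with random affine noise applied, and the obfuscated operator
applies the inverse mapping while it executes, so the inference result
is unchanged. NumPy, the affine form of the mapping, and passing the
mapping parameters explicitly are all simplifying assumptions; in a real
implementation the inverse mapping is hidden inside the obfuscated
operator's code.

```python
# Illustrative sketch of Steps (4) and (6): weight noise plus its inverse
# mapping applied on the fly by the obfuscated operator.
import numpy as np

rng = np.random.default_rng(0)

def scramble_weight(w):
    """Step (4): scramble a weight tensor with a random affine mapping."""
    scale = rng.uniform(0.5, 2.0)
    shift = rng.uniform(-1.0, 1.0)
    return w * scale + shift, (scale, shift)     # obfuscated weight + secret

def obfuscated_matmul(x, w_obf, secret):
    """Step (6): undo the noise during execution, then compute as usual."""
    scale, shift = secret
    w = (w_obf - shift) / scale                  # inverse mapping
    return x @ w                                 # same result as the original op

w = rng.standard_normal((4, 3))
x = rng.standard_normal((2, 4))
w_obf, secret = scramble_weight(w)
assert np.allclose(x @ w, obfuscated_matmul(x, w_obf, secret))
```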
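
Finally, a minimal sketch of the execution dispatch described in
Steps (9) and (10): each compute unit in the task sequence invokes
either the obfuscated operator library or the original one. The
task-sequence format and library lookup are illustrative placeholders
rather than a real runtime interface.

```python
# Illustrative sketch of Steps (9) and (10): run the task sequence, picking
# the obfuscated or original kernel library per compute unit.
def run_inference(task_sequence, inputs, obfuscated_lib, original_lib):
    data = inputs
    for unit in task_sequence:                       # one unit per operator
        lib = obfuscated_lib if unit["obfuscated"] else original_lib
        kernel = lib[unit["op_type"]]                # look up the kernel
        data = kernel(data, **unit.get("attrs", {}))
    return data
```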

[^1]: Scrambling refers to adding noise to the computational graph.
    Common methods include adding redundant nodes and edges and merging
    some subgraphs.