
Commit 77c6c4f

add figure title
1 parent 97333b4 commit 77c6c4f

1 file changed (+5 / -5 lines)

app/projects/hypformer/page.mdx

Lines changed: 5 additions & 5 deletions
@@ -119,31 +119,31 @@ By defining these operations through HRC (and HTC for linear transformations), w
119   Framework of Hypformer. Input data (text, images, graphs) are projected onto the Lorentz model, then transformed via HTC. The result passes through the hyperbolic linear attention block with positional encoding, followed by a feedforward layer (built with HTC) and LayerNorm (built with HRC). This serves as an encoder, which can optionally incorporate a GNN.
120   For classification tasks in this study, the decoder is a fully connected layer. Dropout, activation, and residual connections are omitted for brevity.
121
122 - ![framework](./assets/framework2024.jpg)
122 + ![Framework](./assets/framework2024.jpg)
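
As a rough illustration of the encoder flow described above, here is a minimal PyTorch-style sketch. The HTC feedforward, HRC LayerNorm, and hyperbolic linear attention are replaced by plain Euclidean stand-ins (`nn.Linear`, `nn.LayerNorm`, `nn.MultiheadAttention`); it shows only the dataflow, not Hypformer's actual hyperbolic operations.

```python
import torch
import torch.nn as nn

class HypformerEncoderSketch(nn.Module):
    """Dataflow-only sketch: the real blocks are hyperbolic (HTC/HRC); these are Euclidean stand-ins."""
    def __init__(self, dim=64, hidden=256, gnn=None):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)  # stand-in for hyperbolic linear attention (+ positional encoding)
        self.ffn = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))  # stand-in for the HTC feedforward
        self.norm = nn.LayerNorm(dim)  # stand-in for the HRC LayerNorm
        self.gnn = gnn                 # optional GNN module

    def forward(self, x):
        # x: features assumed already projected onto the Lorentz model and transformed via HTC
        h, _ = self.attn(x, x, x)      # attention over all nodes/tokens
        h = self.norm(self.ffn(h))     # feedforward (HTC) then LayerNorm (HRC)
        return self.gnn(h) if self.gnn is not None else h

# For classification, the decoder is a fully connected layer.
encoder, decoder = HypformerEncoderSketch(), nn.Linear(64, 10)
logits = decoder(encoder(torch.randn(1, 8, 64)))  # (batch, nodes, dim) -> (batch, nodes, classes)
```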
123
124
125
126   ## 3. Experiments
127   ### 3.1 Experiments on Large-scale Graphs
128
129   We first evaluate Hypformer on diverse large-scale graphs for node classification, ranging in scale from millions of nodes to billions of edges, including ogbn-arxiv, ogbn-proteins, and ogbn-papers100M.
130 - ![title|scale=0.5](./assets/exp1_large_scale.png)
130 + ![Experiments on Large-scale Graphs|scale=0.5](./assets/exp1_large_scale.png)
131
132   Hypformer consistently outperforms other models across various large-scale graph datasets, demonstrating substantial improvements. It is worth noting that models such as GraphFormer, GraphTrans, GraphGPS, HAN, HNN++, and F-HNN have difficulty operating effectively on large-scale graph data.
133   In addition, our method significantly outperforms recent approaches such as SGFormer and NodeFormer across all tested scenarios, highlighting its superior effectiveness. Importantly, Hypformer exhibits robust scalability, maintaining its performance advantage even on the largest dataset, ogbn-papers100M, where previous Transformer-based models have encountered limitations.
134
135   ### 3.2 Experiments on Medium/Small-scale Graphs
136   To complement our large-scale evaluations, we assessed Hypformer on small- and medium-scale graph datasets. This additional testing allows for a more comprehensive comparison against current state-of-the-art models, including GNNs, graph transformers, and hyperbolic approaches that may not scale effectively to larger datasets. By expanding our evaluation scope, we aim to isolate Hypformer's effectiveness in graph learning from its scalability advantages.
137
138 - ![title|scale=0.5](./assets/exp2_medium_scale.png)
138 + ![Experiments on Medium/Small-scale Graphs|scale=0.5](./assets/exp2_medium_scale.png)
139
140   Our findings suggest that the proposed method surpasses both standard GNNs and hyperbolic GNN models by a substantial margin.
141   Importantly, the method is effective not only on hyperbolic datasets (like Disease and Airport) but also on non-hyperbolic datasets (like Cora, CiteSeer, and PubMed).
142
143   ### 3.3 Comparisons on Text and Vision Datasets
144   Additionally, we apply our model to semi-supervised image and text classification tasks on the Mini-ImageNet and 20News-Groups datasets. We also construct a graph using k-NN over the input node features so that graph models can be applied. These experiments closely follow the setup used in NodeFormer.
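
The k-NN construction referred to here could look roughly like the following scikit-learn sketch; the value of k and the connectivity settings are illustrative, not the exact configuration used in the experiments.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

# Illustrative k-NN graph over raw node features (e.g. image or text embeddings).
features = np.random.rand(1000, 128)                      # (num_nodes, feature_dim)
adj = kneighbors_graph(features, n_neighbors=5,           # link each node to its 5 nearest neighbors
                       mode="connectivity", include_self=False)
adj = adj.maximum(adj.T)                                  # symmetrize -> undirected graph
edge_index = np.vstack(adj.nonzero())                     # (2, num_edges) edge list for a graph model
```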
145
146 - ![title|scale=0.8](./assets/exp3_image_text.png)
146 + ![Comparisons on Text and Vision Datasets|scale=0.8](./assets/exp3_image_text.png)
147
148   Hypformer outperforms the baselines in seven out of eight cases. The performance of competing baseline models varies significantly with different k values, whereas our method demonstrates greater stability.
149
@@ -152,7 +152,7 @@ Hypformer outperforms in seven out of eight cases. In contrast, the performance
152   **Scalability**
153   We conducted additional tests of the model's scalability with respect to the number of nodes in a single batch. We used the Amazon2M dataset and randomly selected subsets of nodes ranging from 10K to 200K. We compared the softmax attention defined by Equation (3) with linear attention, keeping all other parameters the same. As depicted in Figure 5, the memory usage of the proposed method increases linearly with the size of the graph. When the node count exceeds 40K, softmax attention runs out of memory (OOM), while the proposed method continues to function effectively, resulting in a 10X reduction in GPU cost.
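
For intuition only, the sketch below contrasts a softmax attention that materializes the full n x n score matrix with a generic kernelized linear attention that aggregates a d x d key-value summary first; it is a standard linear-attention pattern, not Hypformer's hyperbolic formulation or its Equation (3).

```python
import torch
import torch.nn.functional as F

def softmax_attention(q, k, v):
    # builds an (n, n) score matrix -> memory grows quadratically with node count
    scores = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
    return scores @ v

def linear_attention(q, k, v, eps=1e-6):
    # kernel trick: phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1) -> memory grows linearly with node count
    phi = lambda x: F.elu(x) + 1                                   # a positive feature map (illustrative)
    q, k = phi(q), phi(k)
    kv = k.transpose(-2, -1) @ v                                   # (d, d) summary, independent of n
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps    # (n, 1) normalizer
    return (q @ kv) / z

n, d = 100_000, 64
q, k, v = (torch.randn(n, d) for _ in range(3))
out = linear_attention(q, k, v)   # fine at 100K nodes; softmax_attention would need an n x n matrix
```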
154
155 - ![title|scale=0.5](./assets/gpucost.png)
155 + ![Scalability|scale=0.5](./assets/gpucost.png)
156
157
158   ## 4. Conclusion
