
Commit aaadc93

Merge pull request #2314 from pareenaverma/content_review

spellcheck fixes

2 parents f653bf9 + a9d21a0

10 files changed: +104 -14 lines changed

.wordlist.txt

Lines changed: 91 additions & 1 deletion

@@ -4667,7 +4667,7 @@ Sommelier
 chromeos
 linuxcontainers
 XPS
-NIC's
+NIC’s
 offlines
 passthrough
 SLOs
@@ -4722,4 +4722,94 @@ ATtestation
 CoCo
 procedureS
 NIC’s
+httpbin
+proxying
+OpenBMC
+PoC
+PoCs
+evb
+ipmitool
+openbmc
+poc
+IPMI
+integrators
+KCS
+PLDM
+MCTP
+Redfish
+hyperscalers
+BMCs
+OEM
+NetFn
+RDv
+CSSv
+penBmc
+BMC's
+socat
+ZooKeeper
+IRQs
+IRQS
+Friedt
+namespaces
+atlascli
+benchmarkDB
+cursorTest
+replset
+testCollection
+Namespaces
+mongotop
+Mongotop
+baselineDB
+ef
+netstat
+tulnp
+mongostat
+arw
+conn
+getmore
+qrw
+vsize
+conn
+WiredTiger
+GLE
+getLastError
+createIndex
+getMore
+getmore
+RoT
+lkvm
+JMH
+jmh
+UseG
+Xmx
+Xms
+JavaServer
+servlets
+RMSNorm
+RoPE
+FFN
+ukernel
+libstreamline
+prefill
+OpenCL
+subgraphs
+threadpool
+worksize
+Zhilong
+Denoiser
+RGGB
+denoised
+YGGV
+Mohamad
+Najem
+kata
+svl
+svzero
+anf
+DynamIQ
+Zena
+learnt
+lof
+BalenaOS
+balenaCloud

content/learning-paths/mobile-graphics-and-gaming/ai-camera-pipelines/5-performances.md

Lines changed: 1 addition & 1 deletion

@@ -87,7 +87,7 @@ INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
 Total run time over 20 iterations: 2030.5525 ms
 ```
 
-Re-run the Low Light Enhancment benchmark:
+Re-run the Low Light Enhancement benchmark:
 
 ```bash
 bin/low_light_image_enhancement_benchmark 20 resources/HDRNetLIME_lr_coeffs_v1_1_0_mixed_low_light_perceptual_l1_loss_float32.tflite

content/learning-paths/servers-and-cloud-computing/envoy-gcp/baseline-testing.md

Lines changed: 1 addition & 1 deletion

@@ -149,6 +149,6 @@ A successful test shows HTTP/1.1 200 OK with a JSON body from httpbin.org, for e
 - **Successful connection:** The `curl` command successfully connected to the Envoy proxy on `localhost:10000`.
 - **Correct status code:** Envoy forwards the request and receives a successful `200 OK` response from the upstream.
 - **Host header rewrite:** Envoy rewrites `Host` to `httpbin.org` as configured.
-- **End-to-end Success:** The proxy is operational; requests are received, processed, and forwarded to the ackend.
+- **End-to-end Success:** The proxy is operational; requests are received, processed, and forwarded to the backend.
 
 To stop Envoy in the first terminal, press **Ctrl+C**. This confirms the end-to-end flow with Envoy server is working correctly.
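For reference, the end-to-end check this hunk documents reduces to one request through the Envoy listener. A minimal sketch, assuming the `10000` listener port and the httpbin route from the learning path's config:

```bash
# Request through the local Envoy listener; expect HTTP/1.1 200 OK,
# with Envoy rewriting the Host header to httpbin.org per the route config.
curl -i http://localhost:10000/get
```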

content/learning-paths/servers-and-cloud-computing/irq-tuning-guide/checking.md

Lines changed: 3 additions & 3 deletions

@@ -6,7 +6,7 @@ weight: 2
 layout: learningpathall
 ---
 
-First you should run the following command to identify all IRQs on the system. Identify the NIC IRQs and adjust the system by experirmenting and seeing how performance improves.
+First, run the following command to identify all IRQs on the system. Identify the NIC IRQs, then adjust the system by experimenting and observing how performance changes.
 
 ```
 grep '' /proc/irq/*/smp_affinity_list | while IFS=: read path cpus; do
@@ -47,7 +47,7 @@ IRQ 104 -> CPUs 12 -> Device ens34-Tx-Rx-5
 IRQ 105 -> CPUs 5 -> Device ens34-Tx-Rx-6
 IRQ 106 -> CPUs 10 -> Device ens34-Tx-Rx-7
 ```
-This can potential hurt performance. Suggestions and patterns to expertiment with will be on the next step.
+This can potentially hurt performance. Suggestions and patterns to experiment with are covered in the next step.
 
 ### reset
 
@@ -69,4 +69,4 @@ done
 
 ### Saving these changes
 
-Any changes you make to IRQs will be reset at reboot. You will need to change your systems settings to make your changes permenant.
+Any changes you make to IRQs will be reset at reboot. You will need to change your system settings to make your changes permanent.
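The listing loop in the first hunk is truncated by the diff context. A plausible completion, assuming the guide resolves device names through `/sys/kernel/irq/<n>/actions`:

```bash
# For each IRQ, print its CPU affinity and the owning device,
# producing lines like "IRQ 104 -> CPUs 12 -> Device ens34-Tx-Rx-5".
grep '' /proc/irq/*/smp_affinity_list | while IFS=: read path cpus; do
  irq=$(basename "$(dirname "$path")")
  dev=$(cat "/sys/kernel/irq/$irq/actions" 2>/dev/null)
  [ -n "$dev" ] && echo "IRQ $irq -> CPUs $cpus -> Device $dev"
done
```

To re-pin an IRQ by hand, write a CPU list to its affinity file (for example, `echo 2 > /proc/irq/104/smp_affinity_list`); as the last hunk notes, such writes do not survive a reboot.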

content/learning-paths/servers-and-cloud-computing/llama_cpp_streamline/Analyzing_token_generation_at_Prefill_and_Decode_stage.md

Lines changed: 3 additions & 3 deletions

@@ -90,7 +90,7 @@ then add the Annotation Marker generation code here,
 }
 ```
 
-A string is added to the Annotation Marker to record the position of input tokens and numbr of tokens to be processed.
+A string is added to the Annotation Marker to record the position of the input tokens and the number of tokens to be processed.
 
 ### Step 3: Build llama-cli executable
 For convenience, llama-cli is static linked.
@@ -181,7 +181,7 @@ By monitoring other PMU events, Backend Stall Cycles and Backend Stall Cycles du
 We can see that at Prefill stage, Backend Stall Cycles due to Memory stall are only about 10% of total Backend Stall Cycles. However, at Decode stage, Backend Stall Cycles due to Memory stall are around 50% of total Backend Stall Cycles.
 All those PMU event counters indicate that it is compute-bound at Prefill stage and memory-bound at Decode stage.
 
-Now, let us further profile the code execution with Streamline. In the ‘Call Paths’ view of Streamline, we can see the percentage of running time of functions that are orginized in form of call stack.
+Now, let us further profile the code execution with Streamline. In the ‘Call Paths’ view of Streamline, we can see the percentage of running time of functions, organized in the form of a call stack.
 
 ![text#center](images/annotation_prefill_call_stack.png "Figure 12. Call stack")
 
@@ -201,4 +201,4 @@ As we can see, the function, graph_compute, takes the largest portion of the run
 
 * There is a result_output linear layer in the Qwen1_5-0_5b-chat-q4_0 model; the weights are in the Q6_K data type. The layer computes a huge [1, 1024] x [1024, 151936] GEMV operation, where 1024 is the embedding size and 151936 is the vocabulary size. This operation cannot be handled by KleidiAI yet; it is handled by the ggml_vec_dot_q6_K_q8_K function in the ggml-cpu library.
 * The tensor nodes for the computation of Multi-Head attention are presented as three-dimensional matrices with the FP16 data type (the KV cache also holds FP16 values); they are computed by the ggml_vec_dot_f16 function in the ggml-cpu library.
-* The computation of RoPE, Softmax, RMSNorm layers does not take significant portion of the running time.
+* The computation of the RoPE, Softmax, and RMSNorm layers does not take a significant portion of the running time.
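As a point of reference for Step 3 in the first hunk ("llama-cli is static linked"), a build along these lines is typical; the CMake options shown are an assumption, not the learning path's verified command:

```bash
# Build a statically linked llama-cli so a single self-contained binary
# can be copied to the target device for Streamline profiling.
cmake -B build -DBUILD_SHARED_LIBS=OFF -DGGML_STATIC=ON
cmake --build build --target llama-cli -j"$(nproc)"
```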

content/learning-paths/servers-and-cloud-computing/llama_cpp_streamline/Conclusion.md

Lines changed: 1 addition & 1 deletion

@@ -9,5 +9,5 @@ layout: learningpathall
 # Conclusion
 By leveraging the Streamline tool together with a good understanding of the llama.cpp code, the execution process of the LLM model can be visualized, which helps analyze code efficiency and investigate potential optimization.
 
-Note that addtional annotation code in llama.cpp and gatord might somehow affect the performance.
+Note that the additional annotation code in llama.cpp and gatord might slightly affect performance.
content/learning-paths/servers-and-cloud-computing/llama_cpp_streamline/_index.md

Lines changed: 1 addition & 1 deletion

@@ -7,7 +7,7 @@ cascade:
 
 minutes_to_complete: 50
 
-who_is_this_for: Engineers who want to learn LLM inference on CPU or proflie and optimize llama.cpp code.
+who_is_this_for: Engineers who want to learn LLM inference on CPU or profile and optimize llama.cpp code.
 
 learning_objectives:
 - Be able to use Streamline to profile llama.cpp code

content/learning-paths/servers-and-cloud-computing/mongodb-on-azure/benchmarking.md

Lines changed: 1 addition & 1 deletion

@@ -9,7 +9,7 @@ layout: learningpathall
 ## Benchmark MongoDB with **mongotop** and **mongostat**
 
 In this section, you will measure MongoDB's performance in real time.
-You will install the official MongoDB database tools, start MongoDB and run a script to simulate heavy load. With the script running you will then meassure the database's live performance using **mongotop** and **mongostat**.
+You will install the official MongoDB database tools, start MongoDB, and run a script to simulate heavy load. With the script running, you will then measure the database's live performance using **mongotop** and **mongostat**.
 
 1. Install MongoDB Database Tools
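As context for the tools this hunk names, both are typically left polling against the loaded database; a sketch assuming a local, unauthenticated mongod on the default port:

```bash
# Poll once per second. mongotop reports per-collection read/write time;
# mongostat reports server-wide counters (inserts, queries, conn, vsize, ...).
mongotop 1
mongostat 1
```

Run each in its own terminal so the samples stay readable while the load script executes.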

content/learning-paths/servers-and-cloud-computing/mongodb-on-azure/create-instance.md

Lines changed: 1 addition & 1 deletion

@@ -43,4 +43,4 @@ Creating a virtual machine based on Azure Cobalt 100 is no different from creati
 
 ![Azure portal VM creation — Azure Cobalt 100 Arm64 virtual machine (D4ps_v6) alt-text#center](images/final-vm.png "Figure 5: VM deployment confirmation in Azure portal")
 
-While the virtual machine ready, proceed to the next section to delpoy MongoDB on your running instance.
+With the virtual machine ready, proceed to the next section to deploy MongoDB on your running instance.

content/learning-paths/servers-and-cloud-computing/neoverse-rdv3-swstack/3_rdv3_sw_build.md

Lines changed: 1 addition & 1 deletion

@@ -28,7 +28,7 @@ git config --global user.email "<[email protected]>"
 
 ## Step 2: Fetch the source code
 
-The RD‑V3 platform firmware stack consists of multiple components, most maintained in separate Git respositories, such as:
+The RD‑V3 platform firmware stack consists of multiple components, most maintained in separate Git repositories, such as:
 
 - TF‑A
 - SCP/MCP
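For context, multi-repository firmware stacks like this one are commonly fetched with Google's repo tool against a platform manifest; the sketch below uses placeholders rather than the learning path's actual manifest URL:

```bash
# Initialize a workspace from a manifest that pins TF-A, SCP/MCP, and the
# other components to matching revisions, then fetch everything in parallel.
repo init -u <manifest-repo-url> -m <rd-v3-manifest>.xml
repo sync -j"$(nproc)"
```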
