diff --git a/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/_index.md b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/_index.md new file mode 100644 index 0000000000..bcdd9f3272 --- /dev/null +++ b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/_index.md @@ -0,0 +1,58 @@ +--- +title: Deploy TensorFlow on Google Cloud C4A (Arm-based Axion VMs) + +minutes_to_complete: 30 + +who_is_this_for: This learning path is intended for software developers deploying and optimizing TensorFlow workloads on Linux/Arm64 environments, specifically using Google Cloud C4A virtual machines powered by Axion processors. + +learning_objectives: + - Provision an Arm-based SUSE SLES virtual machine on Google Cloud (C4A with Axion processors) + - Install TensorFlow on a SUSE Arm64 (C4A) instance + - Verify TensorFlow by running basic computation and model training tests on Arm64 + - Benchmark TensorFlow using tf.keras to evaluate inference speed and model performance on Arm64 systems. 
+ +prerequisites: + - A [Google Cloud Platform (GCP)](https://cloud.google.com/free) account with billing enabled + - Basic familiarity with [TensorFlow](https://www.tensorflow.org/) + +author: Pareena Verma + +##### Tags +skilllevels: Introductory +subjects: ML +cloud_service_providers: Google Cloud + +armips: + - Neoverse + +tools_software_languages: + - TensorFlow + - Python + - tf.keras + +operatingsystems: + - Linux + +# ================================================================================ +# FIXED, DO NOT MODIFY +# ================================================================================ +further_reading: + - resource: + title: Google Cloud documentation + link: https://cloud.google.com/docs + type: documentation + + - resource: + title: TensorFlow documentation + link: https://www.tensorflow.org/learn + type: documentation + + - resource: + title: Phoronix Test Suite (PTS) documentation + link: https://www.phoronix-test-suite.com/ + type: documentation + +weight: 1 +layout: "learningpathall" +learning_path_main_page: "yes" +--- diff --git a/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/_next-steps.md b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/_next-steps.md new file mode 100644 index 0000000000..c3db0de5a2 --- /dev/null +++ b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/_next-steps.md @@ -0,0 +1,8 @@ +--- +# ================================================================================ +# FIXED, DO NOT MODIFY THIS FILE +# ================================================================================ +weight: 21 # Set to always be larger than the content in this path to be at the end of the navigation. +title: "Next Steps" # Always the same, html page title. +layout: "learningpathall" # All files under learning paths have this same wrapper for Hugo processing. 
+--- diff --git a/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/background.md b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/background.md new file mode 100644 index 0000000000..bb3cf5b347 --- /dev/null +++ b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/background.md @@ -0,0 +1,24 @@ +--- +title: Getting started with TensorFlow on Google Axion C4A (Arm Neoverse-V2) + +weight: 2 + +layout: "learningpathall" +--- + +## Google Axion C4A Arm instances in Google Cloud + +Google Axion C4A is a family of Arm-based virtual machines built on Google’s custom Axion CPU, which is based on Arm Neoverse-V2 cores. Designed for high-performance and energy-efficient computing, these virtual machines offer strong performance for modern cloud workloads such as CI/CD pipelines, microservices, media processing, and general-purpose applications. + +The C4A series provides a cost-effective alternative to x86 virtual machines while leveraging the scalability and performance benefits of the Arm architecture in Google Cloud. + +To learn more about Google Axion, refer to the [Introducing Google Axion Processors, our new Arm-based CPUs](https://cloud.google.com/blog/products/compute/introducing-googles-new-arm-based-cpu) blog. + +## TensorFlow + +[TensorFlow](https://www.tensorflow.org/) is an **open-source machine learning and deep learning framework** developed by **Google**. It helps developers and researchers **build, train, and deploy AI models** efficiently across **CPUs, GPUs, and TPUs**. + +With support for **neural networks**, **natural language processing (NLP)**, and **computer vision**, TensorFlow is widely used for **AI research and production**. +Its **flexibility** and **scalability** make it ideal for both **cloud** and **edge environments**. + +To learn more, visit the [official TensorFlow website](https://www.tensorflow.org/). 
diff --git a/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/baseline.md b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/baseline.md
new file mode 100644
index 0000000000..e4f27097d5
--- /dev/null
+++ b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/baseline.md
@@ -0,0 +1,95 @@
+---
+title: TensorFlow Baseline Testing on Google Axion C4A Arm Virtual Machine
+weight: 5
+
+### FIXED, DO NOT MODIFY
+layout: learningpathall
+---
+
+## TensorFlow Baseline Testing on GCP SUSE VMs
+This section helps you check that TensorFlow is properly installed and working on your **Google Axion C4A Arm64 VM**. You will run small tests to confirm that your CPU can perform TensorFlow operations correctly.
+
+
+### Verify Installation
+This command checks that TensorFlow can be imported and prints its version number.
+
+```console
+python -c "import tensorflow as tf; print(tf.__version__)"
+```
+### List Available Devices
+This command shows which hardware devices TensorFlow can use, such as CPU or GPU. On most VMs, you’ll see only CPU listed.
+
+```console
+python -c "import tensorflow as tf; print(tf.config.list_physical_devices())"
+```
+
+You should see an output similar to:
+```output
+[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
+```
+
+### Run a Simple Computation
+This test multiplies two 1000×1000 matrices to check that TensorFlow computations run on your CPU and measures how long the multiplication takes.
+
+```console
+python -c "import tensorflow as tf; import time;
+a = tf.random.uniform((1000,1000)); b = tf.random.uniform((1000,1000));
+start = time.time(); c = tf.matmul(a,b); end = time.time();
+print('Computation time:', end - start, 'seconds')"
+```
+- This gives a rough indication of **CPU speed** and confirms that a basic operation runs without errors.
+- Note the **computation time** as your baseline.
+ +You should see an output similar to: +```output +Computation time: 0.008263111114501953 seconds +``` +### Test Neural Network Execution +Create a new file for testing a simple neural network: + +```console +vi test_nn.py +``` +This opens a new Python file where you’ll write a short TensorFlow test program. +Paste the code below into the `test_nn.py` file: + +```python +import tensorflow as tf +from tensorflow.keras.models import Sequential +from tensorflow.keras.layers import Dense +import numpy as np + +# Dummy data +x = np.random.rand(1000, 20) +y = np.random.rand(1000, 1) + +# Define the model +model = Sequential([ + Dense(64, activation='relu', input_shape=(20,)), + Dense(1) +]) + +# Compile the model +model.compile(optimizer='adam', loss='mse') + +# Train for 1 epoch +model.fit(x, y, epochs=1, batch_size=32) +``` +This script creates and trains a simple neural network using random data — just to make sure TensorFlow’s deep learning functions work properly. + +**Run the Script** + +Execute the script with Python: + +```console +python test_nn.py +``` + +**Output** + +TensorFlow will print training progress, like: +```output +32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.1024 +``` + +This confirms that TensorFlow is working properly on your Arm64 VM. diff --git a/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/benchmarking.md b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/benchmarking.md new file mode 100644 index 0000000000..c41bbf04ff --- /dev/null +++ b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/benchmarking.md @@ -0,0 +1,131 @@ +--- +title: TensorFlow Benchmarking +weight: 6 + +### FIXED, DO NOT MODIFY +layout: learningpathall +--- + + +## TensorFlow Benchmarking with tf.keras +This guide benchmarks multiple TensorFlow models (ResNet50, MobileNetV2, and InceptionV3) using dummy input data. It measures average inference time and throughput for each model running on the CPU. 
+ +`tf.keras` is **TensorFlow’s high-level API** for building, training, and benchmarking deep learning models. It provides access to **predefined architectures** such as **ResNet**, **MobileNet**, and **Inception**, making it easy to evaluate model performance on different hardware setups like **CPU**, **GPU**, or **TPU**. + +### Activate your TensorFlow virtual environment +This step enables your isolated Python environment (`tf-venv`) where TensorFlow is installed. It ensures that all TensorFlow-related packages and dependencies run in a clean, controlled setup without affecting system-wide Python installations. + +```console +source ~/tf-venv/bin/activate +python -c "import tensorflow as tf; print(tf.__version__)" +``` +### Install required packages +Here, you install TensorFlow 2.20.0 and NumPy, the core libraries needed for model creation, computation, and benchmarking. NumPy supports efficient numerical operations, while TensorFlow handles deep learning workloads. + +```console +pip install tensorflow==2.20.0 numpy +``` + +### Create a Python file named tf_cpu_benchmark.py: +This step creates a Python script (`tf_cpu_benchmark.py`) that will run TensorFlow model benchmarking tests. 
+
+```console
+vi tf_cpu_benchmark.py
+```
+
+Paste the following code:
+```python
+import tensorflow as tf
+import time
+
+# List of models to benchmark
+models = {
+    "ResNet50": tf.keras.applications.ResNet50,
+    "MobileNetV2": tf.keras.applications.MobileNetV2,
+    "InceptionV3": tf.keras.applications.InceptionV3
+}
+
+batch_size = 32
+num_runs = 50
+
+for name, constructor in models.items():
+    print(f"\nBenchmarking {name}...")
+    # Create model without pretrained weights
+    model = constructor(weights=None, input_shape=(224,224,3))
+    # Generate dummy input
+    dummy_input = tf.random.uniform([batch_size, 224, 224, 3])
+    # Warm-up (untimed) so one-time setup cost is excluded
+    _ = model(dummy_input)
+    # Benchmark
+    start = time.time()
+    for _ in range(num_runs):
+        _ = model(dummy_input)
+    end = time.time()
+    avg_time = (end - start) / num_runs
+    throughput = batch_size / avg_time
+    print(f"{name} average inference time per batch: {avg_time:.4f} seconds")
+    print(f"{name} throughput: {throughput:.2f} images/sec")
+```
+- **Import libraries** – Loads TensorFlow for model creation and `time` for timing.
+- **Define models** – Lists three TensorFlow Keras models: **ResNet50**, **MobileNetV2**, and **InceptionV3**.
+- **Set parameters** – Configures `batch_size = 32` and runs each model **50 times** so the average is less sensitive to run-to-run variation.
+- **Create model instances** – Initializes each model with random weights (`weights=None`), so no pretrained weights are downloaded.
+- **Generate dummy input** – Creates random data shaped like real images **(224×224×3)** for inference. InceptionV3 defaults to 299×299 inputs, so its result here reflects the smaller input size.
+- **Warm-up phase** – Runs one untimed inference so that **one-time setup costs** are excluded from the measurement.
+- **Benchmark loop** – Measures total time for 50 runs and calculates **average inference time per batch**.
+- **Compute throughput** – Divides the batch size by the average time to get **images per second**.
+- **Print results** – Displays **average inference time and throughput** for each model.
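The two metrics reported by the script are simple arithmetic, and can be checked without TensorFlow. The following is a minimal sketch under stated assumptions: the helper name `summarize_benchmark` and the timing values are hypothetical and are not part of `tf_cpu_benchmark.py`. It only applies the same two formulas the script uses.

```python
def summarize_benchmark(total_seconds, num_runs, batch_size):
    """Return (average seconds per batch, images per second).

    Hypothetical helper for illustration; mirrors the formulas in the
    benchmark script: avg_time = total / runs, throughput = batch / avg_time.
    """
    if total_seconds <= 0 or num_runs <= 0 or batch_size <= 0:
        raise ValueError("all arguments must be positive")
    avg_time = total_seconds / num_runs
    throughput = batch_size / avg_time
    return avg_time, throughput


# Made-up example: 50 batches of 32 images processed in 25 seconds in total.
avg_time, throughput = summarize_benchmark(
    total_seconds=25.0, num_runs=50, batch_size=32
)
print(f"{avg_time:.4f} seconds per batch, {throughput:.2f} images/sec")
```

Because throughput is derived from the average batch time, the two numbers always move together: halving the time per batch doubles the images per second for a fixed batch size.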
+ +### Run the benchmark +Execute the benchmarking script: + +```console +python tf_cpu_benchmark.py +``` + +You should see an output similar to: +```output +Benchmarking ResNet50... +ResNet50 average inference time per batch: 1.2051 seconds +ResNet50 throughput: 26.55 images/sec + +Benchmarking MobileNetV2... +MobileNetV2 average inference time per batch: 0.2909 seconds +MobileNetV2 throughput: 110.02 images/sec + +Benchmarking InceptionV3... +InceptionV3 average inference time per batch: 0.8971 seconds +InceptionV3 throughput: 35.67 images/sec +``` + +### Benchmark Metrics Explanation + +- **Average Inference Time per Batch (seconds):** Measures how long it takes to process one batch of input data. Lower values indicate faster inference performance. +- **Throughput (images/sec):** Indicates how many images the model can process per second. Higher throughput means better overall efficiency. +- **Model Type:** Refers to the neural network architecture used for testing (e.g., ResNet50, MobileNetV2, InceptionV3). Each model has different computational complexity. 
+
+### Benchmark summary on x86_64
+For comparison, the same benchmark was run on a `c4-standard-4` (4 vCPUs, 15 GB memory) x86_64 VM in GCP running SUSE:
+
+| **Model**        | **Average Inference Time per Batch (seconds)** | **Throughput (images/sec)** |
+|------------------|-----------------------------------------------:|-----------------------------:|
+| **ResNet50**     | 1.3690                                         | 23.37                        |
+| **MobileNetV2**  | 0.4274                                         | 74.87                        |
+| **InceptionV3**  | 0.8799                                         | 36.37                        |
+
+### Benchmark summary on Arm64
+Results from the earlier run on the `c4a-standard-4` (4 vCPUs, 16 GB memory) Arm64 VM in GCP running SUSE:
+
+| **Model**        | **Average Inference Time per Batch (seconds)** | **Throughput (images/sec)** |
+|------------------|-----------------------------------------------:|-----------------------------:|
+| **ResNet50**     | 1.2051                                         | 26.55                        |
+| **MobileNetV2**  | 0.2909                                         | 110.02                       |
+| **InceptionV3**  | 0.8971                                         | 35.67                        |
+
+### TensorFlow benchmarking comparison on Arm64 and x86_64
+
+- **MobileNetV2** shows the largest difference: the Arm64 VM reaches **110.02 images/sec** compared with 74.87 images/sec on x86_64, about 47% higher throughput.
+- **ResNet50** is also faster on Arm64, at **26.55 images/sec** compared with 23.37 images/sec, about 14% higher throughput.
+- **InceptionV3** is essentially on par: 35.67 images/sec on Arm64 compared with 36.37 images/sec on x86_64, about 2% lower.
+- These figures come from randomly initialized models and dummy input, so they reflect **CPU inference speed only**. They do not measure model accuracy or power consumption.
+- **Overall**, the results show that **TensorFlow inference runs competitively on Arm-based C4A VMs**, making them a viable alternative to comparable x86_64 VMs for CPU inference and model prototyping.
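The relative difference between the two platforms can be worked out directly from the throughput columns of the two summary tables. The sketch below is illustrative only: the throughput values are copied from those tables, while the function name `relative_throughput` is a hypothetical helper introduced here.

```python
# Throughput figures (images/sec) copied from the two summary tables.
arm64 = {"ResNet50": 26.55, "MobileNetV2": 110.02, "InceptionV3": 35.67}
x86_64 = {"ResNet50": 23.37, "MobileNetV2": 74.87, "InceptionV3": 36.37}


def relative_throughput(arm, x86):
    """Return Arm64 throughput divided by x86_64 throughput for each model."""
    return {model: arm[model] / x86[model] for model in arm}


ratios = relative_throughput(arm64, x86_64)
for model, ratio in ratios.items():
    # A ratio above 1.0 means the Arm64 VM processed more images per second.
    print(f"{model}: {ratio:.2f}x ({(ratio - 1) * 100:+.1f}%)")
```

A ratio is easier to compare across models than raw images/sec, because the three architectures differ widely in how much computation each image requires.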
diff --git a/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/images/gcp-vm.png b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/images/gcp-vm.png new file mode 100644 index 0000000000..0d1072e20d Binary files /dev/null and b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/images/gcp-vm.png differ diff --git a/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/installation.md b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/installation.md new file mode 100644 index 0000000000..54ab1d1d4b --- /dev/null +++ b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/installation.md @@ -0,0 +1,72 @@ +--- +title: Install TensorFlow +weight: 4 + +### FIXED, DO NOT MODIFY +layout: learningpathall +--- + +## TensorFlow Installation on GCP SUSE VM +TensorFlow is a widely used **open-source machine learning library** developed by Google, designed for building and deploying ML models efficiently. On Arm64 SUSE VMs, TensorFlow can run on CPU natively, or on GPU if available. + +### System Preparation +Update the system and install Python3 and pip: + +```console +sudo zypper refresh +sudo zypper update -y +sudo zypper install -y python3 python3-pip python3-venv +``` +This ensures your system is up-to-date and installs Python with the essential tools required for TensorFlow setup. + +**Verify Python version:** + +Confirm that Python and pip are correctly installed and identify their versions to ensure compatibility with TensorFlow requirements. + +```console +python3 --version +pip3 --version +``` + +### Create a Virtual Environment (Recommended) +Set up an isolated Python environment (`tf-venv`) so that TensorFlow and its dependencies don’t interfere with system-wide packages or other projects. + +```console +python3 -m venv tf-venv +source tf-venv/bin/activate +``` +Create and activate an isolated Python environment to keep TensorFlow dependencies separate from system packages. 
+
+### Upgrade pip
+Upgrade pip to the latest version so that current Arm64 wheels are resolved correctly during installation.
+
+```console
+pip install --upgrade pip
+```
+
+### Install TensorFlow
+Install TensorFlow 2.20.0, the version used throughout this Learning Path:
+
+```console
+pip install tensorflow==2.20.0
+```
+
+{{% notice Note %}}
+TensorFlow 2.18.0 introduced compatibility with NumPy 2.0, incorporating its updated type promotion rules and improved numerical precision.
+For details, see the [TensorFlow 2.18 release notes](https://blog.tensorflow.org/2024/10/whats-new-in-tensorflow-218.html).
+
+The [Arm Ecosystem Dashboard](https://developer.arm.com/ecosystem-dashboard/) lists TensorFlow 2.18.0 as the minimum recommended version on Arm platforms.
+{{% /notice %}}
+
+### Verify installation
+Run a quick Python command to check that TensorFlow was installed successfully and to print the installed version number.
+
+```console
+python -c "import tensorflow as tf; print(tf.__version__)"
+```
+
+You should see an output similar to:
+```output
+2.20.0
+```
+TensorFlow installation is complete. You can now go ahead with the baseline testing of TensorFlow in the next section.
diff --git a/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/instance.md b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/instance.md
new file mode 100644
index 0000000000..2b93bc950d
--- /dev/null
+++ b/content/learning-paths/servers-and-cloud-computing/tensorflow-gcp/instance.md
@@ -0,0 +1,31 @@
+---
+title: Create a Google Axion C4A Arm virtual machine on GCP
+weight: 3
+
+### FIXED, DO NOT MODIFY
+layout: learningpathall
+---
+
+## Overview
+
+In this section, you will learn how to provision a Google Axion C4A Arm virtual machine on Google Cloud Platform (GCP) using the `c4a-standard-4` (4 vCPUs, 16 GB memory) machine type in the Google Cloud Console.
+ +{{% notice Note %}} +For support on GCP setup, see the Learning Path [Getting started with Google Cloud Platform](https://learn.arm.com/learning-paths/servers-and-cloud-computing/csp/google/). +{{% /notice %}} + +## Provision a Google Axion C4A Arm VM in Google Cloud Console + +To create a virtual machine based on the C4A instance type: +- Navigate to the [Google Cloud Console](https://console.cloud.google.com/). +- Go to **Compute Engine > VM Instances** and select **Create Instance**. +- Under **Machine configuration**: + - Populate fields such as **Instance name**, **Region**, and **Zone**. + - Set **Series** to `C4A`. + - Select `c4a-standard-4` for machine type. + + ![Create a Google Axion C4A Arm virtual machine in the Google Cloud Console with c4a-standard-4 selected alt-text#center](images/gcp-vm.png "Creating a Google Axion C4A Arm virtual machine in Google Cloud Console") + +- Under **OS and Storage**, select **Change**, then choose an Arm64-based OS image. For this Learning Path, use **SUSE Linux Enterprise Server**. Pick the preferred version for your Operating System. Ensure you select the **Arm image** variant. Click **Select**. +- Under **Networking**, enable **Allow HTTP traffic**. +- Click **Create** to launch the instance.