Description
Describe the bug
Linux allows mounting a disk through a device alias (symlink), but CWAgent is not able to resolve the EBS VolumeId for a device mounted this way.
Steps to reproduce
- Launch a Linux t3 instance (with the required instance profile)
- Install, configure, and start the CloudWatch agent:
yum install -y amazon-cloudwatch-agent
cat <<'EOF' > /tmp/amazon-cloudwatch-agent-config.json
{
  "agent": {
    "metrics_collection_interval": 60
  },
  "metrics": {
    "aggregation_dimensions": [
      ["VolumeId"]
    ],
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "disk": {
        "append_dimensions": {
          "VolumeId": "${aws:VolumeId}"
        },
        "ignore_file_system_types": ["devtmpfs", "overlay", "shm", "sysfs", "tmpfs"],
        "measurement": ["used_percent"],
        "metrics_collection_interval": 60,
        "resources": ["*"]
      }
    }
  }
}
EOF
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -s -c file:/tmp/amazon-cloudwatch-agent-config.json
- Confirm the base metrics are reporting and have VolumeId populated
- Create a new EBS volume and attach it to the instance as /dev/xvdz
- Format the EBS volume: mkfs.xfs /dev/xvdz
- Mount the EBS volume using /dev/xvdz as the source device via a direct syscall:
cat <<'EOF' > ~/mount.py
#!/usr/bin/env python3
import ctypes
import ctypes.util
import os
libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
libc.mount.argtypes = (ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p, ctypes.c_ulong, ctypes.c_char_p)
def mount(source, target, fs, options=""):
  ret = libc.mount(source.encode(), target.encode(), fs.encode(), 0, options.encode())
  if ret < 0:
    errno = ctypes.get_errno()
    raise OSError(errno, f"Error mounting {source} ({fs}) on {target} with options '{options}': {os.strerror(errno)}")
mount("/dev/xvdz", "/mnt/data-xvdz", "xfs", "")
EOF
mkdir -p /mnt/data-xvdz
python3 ~/mount.py
- The mounted volume will show up as /dev/xvdz when running df -h and cat /proc/mounts. Running the mount command will show the resolved device symlink name.
- Check whether metrics are reported for the newly mounted EBS volume and whether VolumeId is populated
What did you expect to see?
Expected to see VolumeId populated for all disk mount points.
What did you see instead?
The VolumeId is not populated.
What version did you use?
Version: CWAgent/1.300054.1 (go1.23.8; linux; amd64)
What config did you use?
{
  "agent": {
    "metrics_collection_interval": 60
  },
  "metrics": {
    "aggregation_dimensions": [
      ["VolumeId"]
    ],
    "append_dimensions": {
      "InstanceId": "${aws:InstanceId}"
    },
    "metrics_collected": {
      "disk": {
        "append_dimensions": {
          "VolumeId": "${aws:VolumeId}"
        },
        "ignore_file_system_types": ["devtmpfs", "overlay", "shm", "sysfs", "tmpfs"],
        "measurement": ["used_percent"],
        "metrics_collection_interval": 60,
        "resources": ["*"]
      }
    }
  }
}
Environment
OS: Amazon Linux 2 (amazon/amzn2-ami-ecs-hvm-2.0.20250610-x86_64-ebs)
Additional context
I use the Rexray EBS plugin to handle creation and mounting of EBS volumes for ECS services. It turns out that the Rexray EBS plugin calls the mount syscall without resolving the symlink that the nvme driver creates. As a result, the kernel mounts the block device under its xvd* name:
[root@ip-10-0-91-150 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p1  100T  9.7G  90.3T   10% /
/dev/xvdp      50G   25G  25G   50% /var/lib/docker/plugins/cfbcd2009d193760d0b441f622a2385bde857b3f4e1b66c827467e6b47fae543/propagated-mount/volumes/my-app-data
[root@ip-10-0-91-150 ~]# cat /proc/mounts
/dev/nvme0n1p1 / xfs rw,noatime,attr2,inode64,noquota 0 0
/dev/nvme0n1p1 /var/lib/docker/plugins/cfbcd2009d193760d0b441f622a2385bde857b3f4e1b66c827467e6b47fae543/propagated-mount xfs rw,noatime,attr2,inode64,noquota 0 0
/dev/nvme0n1p1 /var/lib/docker/plugins/399504751ea4753b38a6931240b4f1ae63be57bf6edaa50bf3535e11aae9ee34/propagated-mount xfs rw,noatime,attr2,inode64,noquota 0 0
/dev/xvdp /var/lib/docker/plugins/cfbcd2009d193760d0b441f622a2385bde857b3f4e1b66c827467e6b47fae543/propagated-mount/volumes/my-app-data xfs rw,relatime,nouuid,attr2,inode64,noquota 0 0
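For illustration, here is a minimal Python sketch of how the mount sources in /proc/mounts could be mapped back to the devices they point to. It is not how CWAgent or Telegraf resolves devices; the function name resolve_mount_sources and its resolve parameter are hypothetical, and the sketch assumes the xvd* entry under /dev is a symlink that os.path.realpath can follow.

```python
import os


def resolve_mount_sources(mounts_text, resolve=os.path.realpath):
    """Map each /dev/* mount source in /proc/mounts-style text to its resolved path.

    `resolve` is injectable (hypothetical parameter) so the logic can be
    exercised without real device nodes; by default it follows symlinks.
    """
    resolved = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts format: source, mount point, fstype, options, dump, pass
        if len(fields) < 2 or not fields[0].startswith("/dev/"):
            continue
        resolved[fields[0]] = resolve(fields[0])
    return resolved


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        for source, target in resolve_mount_sources(f.read()).items():
            note = "" if source == target else "  (symlink)"
            print(f"{source} -> {target}{note}")
```

Run on the affected instance, a source that differs from its resolved path would indicate a mount made through an alias such as /dev/xvdp.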
To verify what Telegraf itself reports, I adapted the config that CWAgent generates and ran the latest Telegraf release directly:
cat <<'EOF' > ~/telegraf-config.toml
[agent]
  collection_jitter = "0s"
  debug = false
  flush_interval = "1s"
  flush_jitter = "0s"
  hostname = ""
  interval = "60s"
  logtarget = "stderr"
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  omit_hostname = false
  precision = ""
  quiet = false
  round_interval = false
[inputs]
  [[inputs.disk]]
    fieldpass = ["used_percent"]
    ignore_fs = ["devtmpfs", "overlay", "shm", "sysfs", "tmpfs"]
    interval = "60s"
    tagexclude = ["mode"]
    [inputs.disk.tags]
[outputs]
  [[outputs.file]]
    files = ["stdout"]
EOF
curl -LO https://dl.influxdata.com/telegraf/releases/telegraf-1.34.4_linux_amd64.tar.gz
tar -xzf telegraf-1.34.4_linux_amd64.tar.gz
./telegraf-1.34.4/usr/bin/telegraf -config ~/telegraf-config.toml
Telegraf Results:
disk,device=nvme0n1p1,fstype=xfs,host=ip-10-0-91-150.us-east-2.compute.internal,label=/,path=/ used_percent=9.79178633890271363 1749864322000000000
disk,device=nvme0n1p1,fstype=xfs,host=ip-10-0-91-150.us-east-2.compute.internal,label=/,path=/var/lib/docker/plugins/cfbcd2009d193760d0b441f622a2385bde857b3f4e1b66c827467e6b47fae543/propagated-mount used_percent=9.79178633890271363 1749864322000000000
disk,device=nvme0n1p1,fstype=xfs,host=ip-10-0-91-150.us-east-2.compute.internal,label=/,path=/var/lib/docker/plugins/399504751ea4753b38a6931240b4f1ae63be57bf6edaa50bf3535e11aae9ee34/propagated-mount used_percent=9.79178633890271363 1749864322000000000
disk,device=xvdp,fstype=xfs,host=ip-10-0-91-150.us-east-2.compute.internal,path=/var/lib/docker/plugins/cfbcd2009d193760d0b441f622a2385bde857b3f4e1b66c827467e6b47fae543/propagated-mount/volumes/my-app-data used_percent=49.956185744611764 1749864322000000000
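To make the device tag in the results above easier to inspect, the following Python sketch pulls the device and path tags out of line-protocol records. The helper names parse_tags and devices_by_path are hypothetical, and the parser assumes tag values contain no escaped commas or spaces, which holds for the output above but not for line protocol in general. The sample records are shortened versions of the results, with the host tag and long plugin paths trimmed.

```python
def parse_tags(line):
    """Split one line-protocol record into (measurement, tags).

    Assumption: no escaped commas or spaces in tag keys or values.
    """
    head = line.split(" ", 1)[0]  # measurement and tags precede the first space
    measurement, *pairs = head.split(",")
    return measurement, dict(pair.split("=", 1) for pair in pairs)


def devices_by_path(lines):
    """Return {path: device} for every 'disk' record in the given lines."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        measurement, tags = parse_tags(line)
        if measurement == "disk":
            result[tags.get("path")] = tags.get("device")
    return result


if __name__ == "__main__":
    # Shortened sample records based on the Telegraf output above.
    sample = [
        "disk,device=nvme0n1p1,fstype=xfs,label=/,path=/ used_percent=9.79 1749864322000000000",
        "disk,device=xvdp,fstype=xfs,path=/mnt/my-app-data used_percent=49.95 1749864322000000000",
    ]
    for path, device in devices_by_path(sample).items():
        print(f"{path}: {device}")
```

The output shows Telegraf reporting the device under the same name that appears in /proc/mounts, so the xvdp mount carries device=xvdp rather than an nvme name.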