Skip to content

Новейшее Pcileech DMA CFW руководство которое тебе понадобится. Спонсировано DMAPolice.com

Notifications You must be signed in to change notification settings

erLCoder/PCILeech-DMA-Firmware-RU

 
 

Repository files navigation

Руководство по разработке пользовательской прошивки для полной эмуляции устройства


Работаю над организацией этого в виде wiki. Помощь приветствуется!


Примечание от автора и статус руководства:

Пишу это откровенно — в последнее время мне пришлось столкнуться с огромными трудностями. Помимо серьёзных финансовых потерь из-за мошеннического возврата платежа, я переживаю множество других проблем, связанных с условиями жизни и здоровьем, что сильно повлияло на мою возможность быть в сети и уделять время проектам. Честно говоря, продолжать создавать столь объёмные ресурсы, как это руководство, — настоящее испытание на фоне личных трудностей.

Ожидается, что это будет последняя крупная версия основного руководства. Для более опытных пользователей, уже знакомых с базовыми аппаратными концепциями (например, с назначением чипа FTDI), будет также доступна сокращённая, упрощённая версия.

Если вы находите эту работу полезной и можете чем-то помочь — любая поддержка будет глубоко оценена. Ваша щедрость позволяет мне продолжать вносить вклад в это сообщество, несмотря на все сложности. Искренне надеюсь, что это руководство уже оказалось и будет оставаться для вас ценным ресурсом.


Памяти и Посвящение

Ross

Это руководство с глубоким уважением посвящается памяти Росса Фримана (1947–1989)

Выдающегося инженера, новатора из Мичигана и сооснователя компании Xilinx. Росс Фриман широко признан как отец технологии программируемых пользователем вентильных матриц (FPGA), которая произвела революцию в мире вычислений.

В 1984 году, в эпоху, когда полупроводниковая промышленность была сосредоточена на микросхемах с фиксированной функцией, Фриман осмелился представить себе иную парадигму: аппаратное обеспечение, которое можно перепрограммировать даже после производства. Его революционный патент (#4,870,302) и неустанная пропаганда реконфигурируемых вычислений открыли технологическую эру, которая продолжает менять наш мир вот уже четыре десятилетия.

Его новаторская идея сделала возможным быструю разработку и внедрение специализированных решений на уровне кремния без колоссальных затрат, характерных для традиционного производства ASIC, демократизировала проектирование аппаратного обеспечения и ускорила технологический прогресс в бесчисленных областях.

Сегодня видение Фримана лежит в основе передовых достижений в области искусственного интеллекта, высокопроизводительных вычислений, телекоммуникаций, автомобильных систем, аэрокосмических технологий и многих других сфер, которые в годы его жизни казались лишь мечтой.

В 2009 году он был посмертно введён в Национальный зал славы изобретателей США. Его наследие живёт не только в кремнии, но и в духе технологической смелости, который побуждает нас бросать вызов ограничениям и представлять новые возможности.

"Конечной целью FPGA было создание программируемых логических устройств, способных заменить стандартные цифровые чипы." — Росс Фриман


Содержание

Часть 1: Базовые концепции

  1. Вступление
  2. Ключевые определения
  3. Совместимость устройств
  4. Требования
  5. Сбор информации о донорском устройстве
  6. Начальная кастомизация прошивки
  7. Настройка и модификация проекта Vivado

Часть 2: Промежуточные концепции и реализация

  1. Продвинутая кастомизация прошивки
  2. Эмуляция специфических возможностей устройства
  3. Эмуляция пакетов уровня транзакций(TLP)

Часть 3: Продвинутые техники и оптимизация

  1. Сборка, прошивка и тестирование
  2. Продвинутые методы отладки
  3. Устранение неполадок
  4. Точность эмуляции и оптимизации
  5. Лучшие практики разработки прошивки
  6. Дополнительные ресурсы
  7. Контактная информация
  8. Поддержка и вклад в проект

Часть 1: Базовые концепции


1. Вступление

1.1 Цель руководства

Основная цель этого руководства — дать вам знания и практические навыки для разработки кастомной прошивки Direct Memory Access (DMA) для FPGA-устройств. Такая специализированная прошивка позволяет вашей FPGA точно эмулировать идентичность и поведение других PCIe (Peripheral Component Interconnect Express) устройств. Эта техника эмуляции открывает мощные возможности и имеет серьёзное значение в нескольких продвинутых областях:

Исследования аппаратной безопасности::

  • Поиск уязвимостей: эмулируя устройство, можно создавать управляемую среду для отправки искажённых или неожиданных данных драйверам, систематически находя уязвимости (например, переполнение буфера, состояния гонки), которые могут быть эксплуатированы через периферийное устройство.
  • Анализ драйверов: наблюдение за взаимодействием ОС и драйверов с аппаратным обеспечением, включая эмуляцию нестандартных или недокументированных функций, помогает понять поведение драйвера и изучить закрытые протоколы.
  • Анализ побочных каналов: эмулированное устройство можно настроить для экспериментов, связанных с утечкой информации через временные задержки или потребление энергии.

Red Team и тестирование на проникновение:

  • Обход мер безопасности: эмуляция безобидного или доверенного устройства (например, сетевой карты или контроллера хранения) для получения прав DMA, что позволяет напрямую взаимодействовать с памятью системы и обходить системы защиты на программном уровне.
  • Скрытое присутствие: эмулированное вредоносное устройство может оставаться незаметным дольше, чем программные импланты.
  • Эксплуатация доверия: системы часто автоматически доверяют подключённому аппаратному обеспечению. Кастомная прошивка может использовать это, подделывая устройства с нужными разрешениями.

Отладка и диагностика систем:

  • Воспроизводимые тестовые среды: создание специфичных аппаратных сценариев для надёжного воспроизведения сложных багов.
  • Инъекция ошибок: намеренная эмуляция неправильного поведения устройства (например, сбоев в формировании TLP или задержек ответов) для проверки устойчивости системы и драйверов.

Тестирование и валидация аппаратуры:

  • Разработка драйверов: тестирование новых или изменённых драйверов с эмулируемым устройством до появления физических прототипов.
  • Предварительное соответствие стандартам: предварительная проверка аспектов протокола PCIe.

Поддержка устаревших систем и совместимость:

  • Эмуляция устаревших или редких PCIe-устройств для обеспечения работы старых систем или совместимости между разными поколениями аппаратуры.

Продвигаясь по этому руководству, вы научитесь:

  • Тщательно извлекать идентифицирующие атрибуты и конфигурационные данные с реального «донорского» PCIe-устройства.
  • Модифицировать и расширять существующие open-source FPGA прошивки (особенно PCILeech-FPGA) для подделки идентичности донорского устройства.
  • Использовать профессиональный инструментарий для FPGA-разработки на базе Xilinx Vivado, а также редакторы кода (например, Visual Studio Code).
  • Понимать архитектуру PCIe, принципы работы DMA и нюансы разработки прошивки, которая достоверно повторяет поведение аппаратуры.

1.2 Целевая аудитория

Это руководство предназначено для тех, кто уже имеет базовые или средние знания в области компьютерных систем, аппаратуры и программирования. Оно технически сложное и требует внимания к низкоуровневым деталям. В частности, оно адресовано:

  • Разработчикам прошивки: инженерам, создающим или адаптирующим прошивку для FPGA, особенно для задач высокоскоростной передачи данных (DMA) и прямого взаимодействия с аппаратурой по PCIe. Желателен опыт работы с Verilog/VHDL и инструментами FPGA.
  • Аппаратным инженерам: специалистам по проектированию и тестированию PCIe-устройств, которые хотят создавать сложные тестовые стенды или эмулировать компоненты. Требуется знание PCIe и цифрового проектирования.
  • Профессионалам в области кибербезопасности и исследованиям:
    • Исследователям уязвимостей и разработчикам эксплойтов: для изучения аппаратных векторов атак и создания доказательств концепции. Нужно знать устройство ОС, управление памятью, архитектуру драйверов.
    • Red Team: специалисты, ищущие расширенные методы проникновения и устойчивого доступа.
    • Цифровым криминалистам и инцидент-респонсерам: понимание этих техник поможет анализировать сложные аппаратные атаки.
  • FPGA-энтузиастам и продвинутым хоббистам: тем, кто уже работал с FPGA и хочет углубиться в PCIe и аппаратную эмуляцию. Важно желание разбираться в технических спецификациях и документации.

Рекомендуемые знания:

  • Основы цифровой логики и компьютерной архитектуры.
  • Знакомство с HDL (Verilog или VHDL).
  • Навыки работы в Linux и с командной строкой.
  • Опыт программирования на C/C++ или скриптовых языках (Python).
  • Понимание принципов работы ОС (управление памятью, драйверы, прерывания).
  • Терпение и системный подход к отладке.

Кривая обучения может быть крутой, особенно если PCIe или продвинутые концепции FPGA для вас новы. Тем не менее, руководство стремится разбить сложные темы на понятные и выполнимые шаги.

1.3 Как использовать руководство

Руководство разделено на три логически связанных части, которые постепенно наращивают ваши знания:

  • Часть 1: Базовые концепции: вводная часть, знакомит с терминологией, основами PCIe и DMA, необходимым аппаратным и программным обеспечением, настройкой инструментов Vivado и PCILeech-FPGA, а также начальным сбором информации с донорского устройства и базовыми изменениями прошивки. Рекомендуется проходить последовательно и вдумчиво.
  • Часть 2: Средний уровень и реализация: (в разработке) Рассмотрит продвинутые настройки прошивки, эмуляцию специфичных регистров и возможностей устройств, работу с пакетами уровня транзакций (TLP).
  • Часть 3: Продвинутые техники и оптимизация: (в разработке) Будет посвящена продвинутой отладке, оптимизации производительности и точности эмуляции, устранению ошибок и лучшим практикам с учётом безопасности.

Советы по работе с руководством:

  • Последовательное изучение: Особенно для частей 1 и 2 следуйте разделам по порядку, так как последующие концепции строятся на предыдущих.
  • Практические упражнения: Это практическое руководство. Активно выполняйте шаги настройки, вносите изменения в код и проводите эксперименты на своем оборудовании.
  • Адаптация к вашей среде: Пути к файлам, конкретные идентификаторы устройств и версии программного обеспечения могут отличаться. Понимайте суть инструкций, чтобы адаптировать их под вашу конфигурацию.
  • Обращение к внешним ресурсам: Спецификация PCIe и документация по FPGA — ваши основные источники информации. Это руководство упрощает и направляет, но для глубокого понимания иногда нужно обращаться к первоисточникам.
  • Итеративная разработка: Разработка прошивки редко бывает линейной. Ожидайте, что придется многократно повторять, отлаживать и улучшать свои проекты. Активно пользуйтесь разделами по устранению неполадок и техниками отладки.

Вы будете работать с языками описания аппаратуры (HDL, в частности SystemVerilog в PCILeech-FPGA), инструментами синтеза и реализации FPGA (Vivado), а также, возможно, с инструментами программирования хоста и анализа PCIe.


2. Ключевые определения

Для успешного освоения эмуляции PCIe-устройств и разработки кастомной прошивки важно чётко понимать следующие термины. Они будут часто использоваться в руководстве.

  • DMA (Direct Memory Access):

    • Определение: Функция современных компьютерных архитектур, позволяющая аппаратным устройствам (например, сетевым картам, графическим процессорам или эмулируемому FPGA-устройству) напрямую читать и записывать данные в основную системную память (RAM), минуя центральный процессор (CPU) при каждом передаваемом байте.
    • Значение: DMA критически важно для высокопроизводительных операций ввода-вывода. Освобождая CPU от задач передачи данных, оно позволяет процессору выполнять другие вычисления, значительно повышая общую пропускную способность и эффективность системы. В контексте этого руководства FPGA будет использовать DMA для взаимодействия с памятью хоста — мощная возможность, активно применяемая в исследованиях безопасности и red teaming.
  • PCIe (Peripheral Component Interconnect Express, периферийная компонентная шина Express):

    • Определение: Высокоскоростной последовательный стандарт расширения компьютера, предназначенный заменить более старые стандарты, такие как PCI, PCI-X и AGP. Использует точечно-точечную топологию, при которой каждое устройство соединяется отдельной последовательной связью с корневым комплексом (обычно часть чипсета или процессора). Передача данных происходит посредством пакетов.
    • Значение: доминирующий стандарт подключения высокопроизводительных периферийных устройств к материнским платам. Понимание его протокола, слоистой архитектуры (физический, канальный и транспортный уровни) и механизмов конфигурации является обязательным для эмуляции современных устройств.
  • TLP (Transaction Layer Packet):

    • Определение: Основная единица обмена данными на транспортном уровне PCIe. TLP предназначены для передачи запросов (например, чтение/запись памяти, чтение/запись ввода-вывода, чтение/запись конфигурации) и ответов (завершение запросов) между PCIe-устройствами. Каждый TLP состоит из заголовка, опциональных данных и, при необходимости, сквозной контрольной суммы (ECRC).
    • Значение: Для точной эмуляции устройство на FPGA должно уметь правильно формировать, отправлять, принимать и интерпретировать TLP, соответствующие поведению исходного (донорского) устройства. Понимание типов TLP, их форматов и управления потоком — ключ к продвинутой эмуляции.
  • BAR (Base Address Register):

    • Определение: : Расположены в пространстве конфигурации PCIe-устройства, BAR — специальные регистры, которые устройство использует для запроса ресурсов адресного пространства у хоста. У устройства может быть до шести 32-битных BAR (или меньше), а некоторые пары 32-битных BAR могут образовывать 64-битные BAR. Эти регистры определяют начальные адреса и размеры областей памяти с отображением ввода-вывода (MMIO) или портов ввода-вывода, через которые устройство предоставляет доступ к своим внутренним регистрами и памяти для процессора хоста.
    • Значение: При перечислении PCIe-устройства хост читает BAR, чтобы понять требования устройства к памяти и I/O, затем выделяет и программирует эти регистры с реальными базовыми адресами в физическом адресном пространстве системы. Для корректного взаимодействия с хостом и драйверами ваше эмулируемое устройство должно точно соответствовать BAR донорского устройства.
  • FPGA (Field-Programmable Gate Array):

    • Определение: Интегральная схема, которую можно конфигурировать после изготовления — именно поэтому она называется «программируемой в поле». FPGA содержит массив программируемых логических блоков и многоуровневую систему перенастраиваемых соединений, позволяющих «сшивать» блоки в пользовательские цифровые схемы.
    • Значение: это основное аппаратное средство в этом руководстве. Благодаря своей перенастраиваемости они идеально подходят для эмуляции других устройств, так как позволяют реализовать точную логику и интерфейсы, необходимые для имитации присутствия и поведения PCIe-устройства-донора.
  • MSI/MSI-X (Message Signaled Interrupts / Message Signaled Interrupts Extended):

    • Определение: Механизмы, позволяющие PCIe-устройству генерировать прерывания CPU путём записи специального сообщения (TLP — в частности Memory Write TLP) в системный адрес памяти, вместо использования выделенных физических линий прерываний, как в устаревших PCI. MSI-X — расширение MSI с поддержкой большего числа векторов прерываний и повышенной гибкости.
    • Значение: Большинство современных PCIe-устройств используют MSI или MSI-X для более эффективного и гибкого управления прерываниями. Точная эмуляция часто требует реализации выбранного механизма прерываний устройства-донора, включая настройку соответствующих структур возможностей MSI/MSI-X и корректную генерацию сообщений прерываний.
  • DSN (Device Serial Number):

    • Определение: 64-битный глобально уникальный идентификатор, который может быть опционально реализован в PCIe-устройстве. Если он присутствует, то, как правило, находится в структуре расширенных возможностей в пространстве конфигурации устройства.
    • Значение: Хотя не все устройства имеют DSN, некоторые драйверы или программное обеспечение могут использовать его для уникальной идентификации, лицензирования или отслеживания устройств. Корректная эмуляция DSN может быть важной для полной прозрачности и избежания обнаружения поддельного устройства.
  • PCIe Configuration Space:

    • Определение: Стандартизированная область адреса объёмом 256 байт (для устройств типа 0, т.е. endpoint) или 4 КБ, ассоциированная с каждой функцией PCIe-устройства (одно устройство может иметь несколько функций). Это пространство содержит важную информацию об устройстве: Vendor ID, Device ID, Class Code, Revision ID, BAR, указатели на возможности (capability pointers), а также различные управляющие и статусные регистры. Доступ к нему осуществляется хост-системой через специальные TLP пакеты конфигурации (Configuration Read/Write TLP).
    • Значение: Пространство конфигурации — это «паспорт» PCIe-устройства. Первый шаг в эмуляции устройства — точная репликация соответствующих полей пространства конфигурации устройства-донора в прошивке FPGA. Хост использует эту информацию для идентификации, настройки и распределения ресурсов для устройства.
  • Donor Device:

    • Определение: Физическое PCIe-устройство, чьё поведение и идентичность вы стремитесь воспроизвести на FPGA. Это устройство служит источником данных для извлечения конфигурационных параметров (Vendor ID, Device ID, BAR и др.) и моделей поведения.
    • Значение: Точность эмуляции напрямую зависит от того, насколько полно и правильно вы сможете собрать и воспроизвести характеристики устройства-донора.
  • Root Complex (RC):

    • Определение: Компонент в иерархии PCIe, соединяющий CPU и подсистему памяти с PCIe-инфраструктурой. Он инициирует PCIe-транзакции от имени CPU и обрабатывает транзакции, инициированные нижестоящими PCIe-устройствами. Root Complex также выполняет начальную инициализацию шины и конфигурацию устройств.
    • Значение: Ваше эмулируемое устройство будет главным образом взаимодействовать именно с Root Complex (или через PCIe-коммутаторы, если они есть) при передаче данных и управлении по PCIe.
  • Endpoint (EP):

    • Определение: Тип PCIe-устройства, находящийся на периферии PCIe-сети, выполняющий функции потребления или генерации данных. Примеры: сетевые карты, видеокарты, контроллеры хранения и программируемое вами устройство на базе FPGA. Endpoint-звенья запрашивают ресурсы у Root Complex и инициируют транзакции.
    • Значение: В этом руководстве ваша FPGA будет запрограммирована как устройство типа Endpoint, эмулирующее поведение и конфигурацию конкретного устройства-донора.
  • HDL (Hardware Description Language):

    • Определение: Специализированный язык программирования, используемый для описания структуры, архитектуры и функционирования электронных схем, в основном цифровых. Наиболее распространённые HDL — это Verilog и VHDL.
    • Значение: В проекте PCILeech-FPGA вы будете использовать SystemVerilog — расширение языка Verilog — для описания логики работы эмулируемого устройства.
  • Bitstream:

    • Определение: Финальный конфигурационный файл, загружаемый в FPGA. Он описывает, как настроить логические блоки и соединения внутри чипа для реализации вашей схемы. Bitstream создаётся в процессе компиляции проекта средствами разработки для FPGA, такими как Xilinx Vivado.
    • Значение: Генерация и прошивка корректного bitstream-файла — заключительный шаг в развертывании вашей прошивки на FPGA. От этого зависит, как именно устройство будет восприниматься и работать в PCIe-системе.

3. Совместимость устройств

Достижение успешной и точной эмуляции PCIe-устройства на базе FPGA требует полной совместимости выбранного оборудования и конфигурации хост-системы. В этом разделе рассматриваются поддерживаемые платформы FPGA, ключевые аппаратные особенности PCIe и системные требования, необходимые для настройки вашей среды разработки.

3.1 Поддерживаемое оборудование на базе FPGA

Хотя в данном руководстве изложена универсальная методология, применимая к различным FPGA-устройствам с поддержкой DMA, основное внимание уделяется FPGA Xilinx 7-серии, которые часто используются в open-source DMA-платах благодаря хорошему балансу между производительностью и доступностью. Особенно выделяется плата Squirrel DMA (35T), поскольку она популярна и хорошо документирована в рамках проекта PCILeech-FPGA.

Базовые принципы настройки PCIe IP-ядер и разработки логики на HDL подходят для следующих FPGA-семейств и плат:

  • Squirrel (Artix-7 35T)
    • Описание: Доступная и недорогая плата с FPGA Xilinx Artix-7 35T, идеально подходящая для задач извлечения памяти и базовой/средней эмуляции устройств. Отличный выбор для новичков.
    • Ключевые особенности: Хорошее соотношение цена/производительность, достаточные ресурсы логики для образовательных целей и исследований.
  • Enigma-X1 (Artix-7 75T)
    • Описание: Средний уровень – основана на FPGA Artix-7 75T. Предоставляет больше логических ресурсов и памяти, что делает её подходящей для более сложной эмуляции и расширенного взаимодействия с хостом.
    • Ключевые особенности: Увеличенное количество логических ячеек и блоков RAM (BRAM) для реализации более сложных дизайнов.
  • ZDMA (Artix-7 100T)
    • Описание: Плата с повышенной производительностью на базе Artix-7 100T, оптимизированная для интенсивной работы с памятью и высокоскоростного DMA. Подходит для масштабных проектов и нагрузок.
    • Ключевые особенности: Существенное увеличение ресурсов логики и памяти для сложных решений.
  • Kintex-7 (K325T, K410T, etc.)
    • Описание: Продвинутые платы на базе семейства Kintex-7, предназначенные для высокопроизводительных и сложных задач: эмуляция крупных устройств, работа по PCIe Gen3 x8/x16, и пр. Более дорогие, но значительно превосходят Artix-7 по ресурсам.
    • Ключевые особенности: Высокоскоростные трансиверы, огромное количество логики и встроенной памяти, поддержка расширенного PCIe.

Важное примечание о семействе FPGA: Хотя общие принципы настройки одинаковы, конкретные конфигурации IP-ядер и структура тактирования могут незначительно различаться между различными FPGA Xilinx 7-серии — такими как Artix-7, Kintex-7 и Zynq-7000 (PS/PL). Всегда обращайтесь к документации именно той платы, которую вы используете, а также к официальным руководствам пользователя на IP-ядро PCIe от Xilinx, соответствующим вашему семейству FPGA.

Проект PCILeech-FPGA часто предоставляет специализированные Tcl-скрипты и исходные файлы для конкретных плат, что упрощает настройку и уменьшает риск ошибок при сборке проекта.

3.2 Особенности оборудования с PCIe

Для корректной и неограниченной работы FPGA-устройства с поддержкой DMA, предназначенного для эмуляции PCIe-устройств, необходимо учитывать и при необходимости модифицировать ряд настроек как на уровне BIOS/UEFI, так и в операционной системе.

  • IOMMU / VT-d / AMD-Vi Settings
    • Рекомендация: На этапе первичной настройки и отладки настоятельно рекомендуется отключить IOMMU (технология Intel VT-d или AMD-Vi) в BIOS/UEFI.
    • Причина: IOMMU (Input-Output Memory Management Unit) ограничивает доступ DMA-устройств к памяти, используя механизм адресной трансляции и защиты, аналогичный MMU у CPU. Это повышает безопасность — но одновременно блокирует прямой доступ к памяти, необходимый для анализа ОЗУ и корректной эмуляции устройств.
    • Где искать: Обычно находится в BIOS/UEFI по пути: Advanced Settings → CPU Configuration, Virtualization, или I/O Virtualization. Отключите опции вроде VT-d, IOMMU, AMD-Vi.
  • Защита от DMA (Windows) / Уровень безопасности Thunderbolt (Linux)
    • Рекомендации для Windows: Отключите Kernel DMA Protection в современных версиях Windows. Это включает параметры, такие как Virtualization-Based Security (VBS) и Memory Integrity (HVCI). Эти функции используют IOMMU для предотвращения несанкционированных DMA-атак со стороны внешних устройств, подключённых через Thunderbolt или PCIe.
    • Шаги (Windows):
      • Откройте параметры безопасности Windows: Пуск > Параметры > Конфидециальность и безопасность > Безопасность Windows > Безопасность устройства.
      • В разделе "Core isolation(Изоляция ядра)," нажмите "Core isolation details.(Подробности изоляции ядра)"
      • Отключите "Memory integrity.(Целостность памяти)"
      • Возможно, потребуется отключить Secure Boot в BIOS/UEFI, так как VBS часто зависит от этой функции.
      • Внимание: Отключение этих функций значительно снижает уровень безопасности вашей системы, делая её уязвимой для различных атак, включая атаки с использованием вредоносных DMA-устройств. Это следует делать только на отдельной тестовой системе, а не на основном компьютере, и только в безопасной, изолированной среде при полном понимании возможных рисков.
    • Рекомендации (Linux/Thunderbolt): Если вы используете систему с портами Thunderbolt, изучите и при необходимости измените Thunderbolt Security Level(уровень безопасности Thunderbolt) в BIOS/UEFI. Более низкие уровни безопасности (например, "Безопасность отключена", "Авторизация пользователя") обычно требуются для того, чтобы произвольные устройства Thunderbolt/PCIe могли выполнять DMA без явного разрешения со стороны хоста.
  • Требования к PCIe-слоту
    • Рекомендации: Используйте совместимый PCIe-слот, который физически соответствует требованиям FPGA-устройства. Большинство карт DMA на базе Artix-7 работают с PCIe Gen2 x1 или x4.
    • Обоснование:
      • Физическая совместимость: Карта x1 может быть установлена в слот x1, x4, x8 или x16, но карта x4 требует как минимум слот x4.
      • Производительность: Хотя карта x4 может работать в слоте x1 (если разъём физически открыт или модифицирован), она будет функционировать на скорости x1, что значительно ограничит скорость передачи данных. Для оптимальной производительности и точной эмуляции возможностей устройства-донора убедитесь, что плата FPGA установлена в слот, обеспечивающий как минимум эмулируемую ширину канала и скорость (например, если вы эмулируете устройство Gen2 x4, используйте слот Gen2 x4 на хост-системе).
    • Настройки BIOS материнской платы: Некоторые материнские платы позволяют настраивать скорость работы PCIe-слотов (например, принудительно установить Gen1 или Gen2). Убедитесь, что эти настройки не противоречат желаемой скорости эмуляции.

3.3 Системные требования

Настройка надёжной среды разработки является ключом к эффективной разработке прошивок, синтезу и отладке.

  • Хост-система
    • Процессор: Современный многоядерный процессор необходим для запуска инструментов разработки FPGA, таких как Vivado, которые требуют значительных вычислительных ресурсов при синтезе и реализации. (например, Intel Core i5/i7/i9 или AMD Ryzen 5/7/9, 8-го поколения или новее рекомендуется).
    • Оперативная память (RAM): Минимум 16 ГБ настоятельно рекомендуется; 32 ГБ или больше — идеально для сложных FPGA-проектов, так как Vivado может потреблять большое количество памяти, особенно на этапе реализации.
    • Хранение данных: Твердотельный накопитель (SSD) с минимум 200 GB свободного места крайне важен. Установка инструментов FPGA (только Vivado может занимать более 50 ГБ), проектные файлы и выходные данные синтеза/реализации быстро заполняют диск. Скорость SSD значительно сокращает время сборки.
    • Операционная система:
      • Windows 10/11 (64-bit Professional or Enterprise edition): Широко поддерживается Xilinx Vivado и многими аппаратными отладочными инструментами. Не забудьте о мерах предосторожности, связанных с защитой от DMA.
      • Совместимый дистрибутив Linux (64-бит): Часто используются и хорошо поддерживаются релизы Ubuntu LTS (например, 20.04, 22.04) для работы с Vivado. Linux часто предоставляет более гибкую среду для написания скриптов и взаимодействия с PCIe на низком уровне.
  • Периферийные устройства
    • Программатор JTAG: Абсолютно необходим для прошивки скомпилированного битстрима в вашу карту DMA на основе FPGA. Примеры: Xilinx Platform Cable USB II, Digilent JTAG-HS3 или встроенные JTAG-программаторы, присутствующие на некоторых отладочных платах. Убедитесь, что он совместим с вашей FPGA-платой и Vivado.
    • Слот PCIe: Как обсуждалось в разделе 3.2, убедитесь, что в вашей хост-системе есть доступный и совместимый слот PCIe для вашей карты DMA.
    • Порт USB: Для подключения JTAG-программатора и, возможно, UART/последовательной консоли к вашей плате FPGA для вывода отладочной информации.

4. Требования

Этот раздел описывает основные аппаратные и программные компоненты, а также рекомендуемую настройку среды, необходимые для начала разработки пользовательской прошивки для эмуляции PCIe-устройств. Наличие этих предварительных условий до начала работы значительно упростит процесс разработки.

4.1 Аппаратное обеспечение

  • Донор-устройство PCIe
    • Назначение: Это физическое аппаратное устройство, чью конфигурацию и поведение вы намерены эмулировать на FPGA. Оно служит авторитетным источником для критически важных идентификаторов, значений регистров и рабочих характеристик.
    • Примеры: Распространённые примеры включают стандартную сетевую карту (NIC), контроллеры хранения SATA или NVMe, USB-контроллер или любую другую PCIe-карту расширения, которую можно безопасно извлечь из системы для анализа. Настоятельно рекомендуется использовать устройство, не являющееся критически важным для функционирования системы, поскольку вы будете исследовать его на низком уровне.
  • DMA FPGA Card
    • Описание: Отладочная плата на базе FPGA, специально разработанная или адаптированная для выполнения операций прямого доступа к памяти (DMA) через интерфейс PCIe. Именно на эту платформу будет загружена ваша пользовательская прошивка.
    • Примеры: Как описано в разделе 3.1, совместимые платы включают Squirrel (Artix-7 35T), Enigma-X1 (Artix-7 75T), ZDMA (Artix-7 100T) или различные решения на базе Kintex-7. Убедитесь, что выбранная плата оснащена разъёмом PCIe.
  • JTAG Programmer
    • Назначение: Этот важный инструмент обеспечивает связь между вашим ПК и FPGA на плате DMA. Используется для прошивки (записи) скомпилированного bitstream-файла в FPGA, а также для интерактивной отладки с помощью таких инструментов, как Vivado Hardware Manager и Integrated Logic Analyzer (ILA).
    • Примеры:
      • Xilinx Platform Cable USB II: Традиционный и широко совместимый программатор для FPGA от Xilinx. Убедитесь, что установлены необходимые драйверы.
      • Digilent JTAG-HS3 / JTAG-HS2: Популярные и надёжные программаторы, известные хорошей интеграцией с Vivado. HS3 обеспечивает более высокую скорость прошивки.
      • Integrated JTAG: Некоторые платы FPGA могут иметь встроенный USB-to-JTAG мост (например, на базе чипа FTDI), что исключает необходимость в отдельном программаторе. Обратитесь к документации вашей платы.

4.2 Программное обеспечение

  • Xilinx Vivado Design Suite
    • Описание: Официальная, комплексная среда разработки FPGA от Xilinx (ныне AMD). Vivado необходим для синтеза HDL-кода, реализации проекта на целевой FPGA, генерации итогового bitstream и отладки аппаратуры. В комплект входят необходимые IP-ядер, компиляторы и утилиты.
    • Скачать: Посетите официальную страницу загрузок Xilinx (AMD):: https://www.xilinx.com/support/download.html.
    • Примечание по версии: Хотя в некоторых старых руководствах может упоминаться Vivado 2020.1, настоятельно рекомендуется скачать актуальную стабильную версию (например, Vivado 2023.x или новее), совместимую с вашей целевой FPGA-семейством (Artix-7, Kintex-7). Проект PCILeech-FPGA обычно поддерживает новые версии Vivado.
  • Visual Studio Code
    • Описание: Высоко настраиваемый и функционально насыщенный редактор кода от Microsoft. Отличный выбор для написания и редактирования HDL-кода на Verilog/SystemVerilog благодаря обширной экосистеме расширений с поддержкой подсветки синтаксиса, линтинга, автозаполнения и интеграции с системами контроля версий.
    • Скачать: https://code.visualstudio.com/
  • PCILeech-FPGA
    • Описание: Открытая платформа и репозиторий исходного кода для разработки DMA на FPGA. Предоставляет готовые к использованию PCIe IP-ядер и хорошо структурированный проект, который служит отличной отправной точкой для создания пользовательской прошивки. Это руководство во многом опирается на архитектуру PCILeech-FPGA.
    • Репозиторий: https://github.com/ufrisk/pcileech-fpga
  • Arbor (MindShare)
    • Описание: Мощный и удобный программный инструмент для глубокой диагностики и анализа PCIe-устройств. Предоставляет детальные сведения о конфигурационном пространстве, возможностях и регистрах подключенного PCIe-оборудования, что делает его незаменимым для сбора информации о донорском устройстве.
    • Скачать: Доступен на сайте MindShare: https://www.mindshare.com/ (вероятно, нужно перейти в раздел программного обеспечения).
    • Примечание: Обычно требуется регистрация аккаунта и возможно предоставляется ограниченный по времени пробный период.
  • Альтернативные инструменты анализа PCIe-устройств
    • Telescan PE (Teledyne LeCroy):
      • Описание: Бесплатная утилита от Teledyne LeCroy для анализа трафика PCIe и перечисления устройств. Хотя это в первую очередь программный продукт для работы с их аппаратными анализаторами протокола, он также может предоставить базовые сведения о конфигурационном пространстве без специализированного оборудования.
      • Скачать: https://www.teledynelecroy.com/protocolanalyzer/pci-express/telescan-pe-software/resources/analysis-software
      • Примечание: Требуется ручная регистрация и одобрение для загрузки.
    • Встроенные средства ОС (для базовой проверки):
      • Диспетчер устройств Windows: Предоставляет базовые сведения о Vendor ID, Device ID, Subsystem ID и Class Code во вкладке «Сведения» в свойствах устройства.
      • **Linux lspci утилита **: Мощный командный инструмент для просмотра PCIe-устройств. Используйте lspci -nn для отображения Vendor/Device ID, lspci -vvv для подробной информации, включая BAR и возможности, и lspci -s <BUS:DEV.FUN> -xxxx для дампа сырого конфигурационного пространства.

4.3 Environment Setup

A clean and correctly configured development environment is crucial to avoid common pitfalls and ensure a smooth workflow.

4.3.1 Install Xilinx Vivado Design Suite

Steps:

  1. Visit the Xilinx (AMD) Vivado Download Page: https://www.xilinx.com/support/download.html.
  2. Download the Appropriate Version: Select the latest stable version of Vivado that is compatible with your operating system and, importantly, with your specific FPGA device (e.g., Artix-7, Kintex-7). Check the Vivado release notes for device support.
  3. Run the Installer: Execute the downloaded installer and follow the on-screen instructions carefully.
  4. Select Necessary Components: During installation, you will be prompted to select which device families to install. Crucially, select the device family corresponding to your FPGA board (e.g., "7 Series" for Artix-7/Kintex-7). This saves significant disk space compared to installing all families. Ensure you select the "Design Tools" (Synthesis, Implementation) and "Programming & Debugging" components.
  5. Launch Vivado: After installation, launch Vivado to confirm it opens without errors and that licenses (if applicable) are correctly configured.

4.3.2 Install Visual Studio Code

Steps:

  1. Visit the Visual Studio Code Download Page: https://code.visualstudio.com/.
  2. Download and Install: Download the installer for your operating system and follow the standard installation prompts.
  3. Install Extensions for HDL Support: Once VS Code is installed, open it and navigate to the Extensions view (Ctrl+Shift+X or Cmd+Shift+X). Search for and install relevant extensions for Verilog/SystemVerilog, such as:
    • Verilog-HDL/SystemVerilog (by mshr-h)
    • VHDL (if you also work with VHDL) These extensions provide syntax highlighting, linting, and other helpful features.

4.3.3 Clone the PCILeech-FPGA Repository

This repository contains the base firmware structure and scripts you'll be modifying.

Steps:

  1. Open a Terminal or Command Prompt: (e.g., Git Bash on Windows, Terminal on Linux).
  2. Navigate to Your Desired Directory: Choose a location where you want to store your projects.
    cd ~/Projects/ # On Linux/macOS
    cd C:\Users\YourUsername\Documents\Projects\ # On Windows
  3. Clone the Repository:
    git clone https://github.com/ufrisk/pcileech-fpga.git
  4. Navigate to the Cloned Directory:
    cd pcileech-fpga
    This will be your main project directory. The PCILeech-FPGA project often contains subdirectories for different board variants (e.g., pcileech-artix-7-50t, pcileech-squirrel-35t). You will navigate into the relevant board-specific directory for your particular hardware.

4.3.4 Set Up a Clean Development Environment

Recommendation: Always work in an isolated or dedicated environment, especially when dealing with low-level hardware and potential security implications.

Steps:

  1. Use a Dedicated Development Machine or Virtual Machine:
    • Physical Machine: If possible, use a separate physical computer for your FPGA development and testing. This prevents accidental system instability or security risks on your primary machine.
    • Virtual Machine (VM): A VM can be a good option for isolating the development environment. However, direct PCIe passthrough (PCIe Hotplug or VT-d passthrough) to the VM is generally required for the FPGA card to be detected and operate correctly, which can be complex to configure and may still expose the host if not done carefully. For initial tool installation and code editing, a VM is perfectly fine.
  2. Minimize Background Applications: Ensure no other resource-intensive applications are running that might interfere with Vivado's performance during synthesis and implementation.
  3. Disable Conflicting Software: Temporarily disable any anti-virus, firewall, or security software that might interfere with low-level hardware access or JTAG communication during development and testing. Remember to re-enable them when you're done with your work.

5. Gathering Donor Device Information

Accurate device emulation hinges on meticulously extracting and replicating critical information from the donor device. This comprehensive data collection enables your FPGA to faithfully mimic the target hardware's PCIe configuration and behavior, ensuring compatibility and functionality when interfacing with the host system.

5.1 Using Arbor for PCIe Device Scanning

Arbor is a robust and user-friendly tool designed for in-depth scanning of PCIe devices. It provides detailed insights into the configuration space of connected hardware, making it an invaluable resource for extracting the necessary information for device emulation.

5.1.1 Install Arbor

To begin utilizing Arbor for device scanning, you must first install the software on your system.

Steps:

  1. Visit the Arbor Download Page:
    • Navigate to the official MindShare website (https://www.mindshare.com/) using your preferred web browser. You'll need to find their "Software" or "Downloads" section to locate Arbor.
    • Ensure you are accessing the site directly to avoid any malicious redirects.
  2. Create an Account (if required):
    • Arbor may require you to create a user account to access the download links.
    • Provide the necessary information, such as your name, email address, and organization.
    • Verify your email if prompted, to activate your account.
  3. Download Arbor:
    • Once logged in, locate the download section for Arbor.
    • Select the version compatible with your operating system (e.g., Windows 10/11 64-bit).
    • Click the Download button and save the installer to a known location on your computer.
  4. Install Arbor:
    • Locate the downloaded installer file (e.g., ArborSetup.exe).
    • Right-click the installer and select Run as administrator to ensure it has the necessary permissions.
    • Follow the on-screen instructions to complete the installation process.
      • Accept the license agreement.
      • Choose the installation directory.
      • Opt to create desktop shortcuts if desired.
  5. Verify Installation:
    • Upon completion, ensure that Arbor is listed in your Start Menu or on your desktop.
    • Launch Arbor to confirm it opens without errors.

5.1.2 Scan PCIe Devices

With Arbor installed, you can proceed to scan your system for connected PCIe devices.

Steps:

  1. Launch Arbor:
    • Double-click the Arbor icon on your desktop or find it via the Start Menu.
    • If prompted by User Account Control (UAC), allow the application to make changes to your device.
  2. Navigate to the Local System Tab:
    • In the Arbor interface, locate the navigation pane or tabs.
    • Click on Local System to access tools for scanning the local machine.
  3. Scan for PCIe Devices:
    • Look for a Scan or Rescan button, typically located at the top or bottom of the interface.
    • Click Scan/Rescan to initiate the detection process.
    • Wait for the scanning process to complete; this may take a few moments depending on the number of devices connected.
  4. Review Detected Devices:
    • Once the scan is complete, Arbor will display a list of all detected PCIe devices.
    • The devices are usually listed with their names, device IDs, and other identifying information.

5.1.3 Identify the Donor Device

Identifying the correct donor device is crucial for accurate emulation.

Steps:

  1. Locate Your Donor Device in the List:
    • Scroll through the list of devices detected by Arbor.
    • Look for the device matching the make and model of your donor hardware.
    • Devices may be listed by their vendor names, device types, or function.
  2. Verify Device Details:
    • Click on the device to select it.
    • Confirm that the Device ID and Vendor ID match those of your donor device.
      • Tip: These IDs are typically found in the device's documentation or on the manufacturer's website. For common devices, a quick web search for "[Device Name] Vendor ID Device ID" often yields results.
  3. View Detailed Configuration:
    • With the device selected, find and click on an option like View Details or Properties.
    • This will open a detailed view showing the device's configuration space and capabilities.
  4. Cross-Reference with Physical Hardware:
    • If multiple similar devices are listed, cross-reference the Slot Number or Bus Address with the physical slot where the donor device is installed. This helps confirm you're analyzing the correct hardware.

5.1.4 Capture Device Data

Extracting detailed information from the donor device is essential for accurate emulation.

Information to Extract:

  • Device ID (0xXXXX): A 16-bit identifier unique to the device model.
  • Vendor ID (0xYYYY): A 16-bit identifier assigned to the manufacturer.
  • Subsystem ID (0xZZZZ): Identifies the specific subsystem or variant (e.g., a specific model within a product line).
  • Subsystem Vendor ID (0xWWWW): Identifies the vendor of the subsystem (often the same as the main Vendor ID, but can differ for OEM versions).
  • Revision ID (0xRR): Indicates the hardware revision level of the device.
  • Class Code (0xCCCCCC): A 24-bit code that defines the primary function/type of device (e.g., 0x020000 for Ethernet Controller, 0x010802 for NVMe controller). This helps the OS load generic drivers.
  • Base Address Registers (BARs):
    • Registers defining the memory or I/O address regions the device uses.
    • Includes BAR0 through BAR5, each potentially 32 or 64 bits. For each BAR, note its Type (Memory or I/O), Bit Width (32-bit or 64-bit), Size (e.g., 256 MB, 4KB), and Prefetchable Status (Yes/No). This is crucial for memory mapping.
  • Capabilities: Lists supported features and their configurations, often found in a linked list structure within the configuration space. Examples include:
    • PCIe Capability Structure: PCIe Link Speed (e.g., Gen2, Gen3), Link Width (e.g., x1, x4), Max Payload Size, Max Read Request Size.
    • MSI/MSI-X Capability Structure: Information on Message Signaled Interrupts, including the number of vectors supported.
    • Power Management Capability Structure: Supported power states (D0, D1, D2, D3hot, D3cold).
  • Device Serial Number (DSN): A 64-bit unique identifier, if the device supports it (found in the "Device Serial Number" Extended Capability). Not all devices implement this.

Steps:

  1. Navigate to the PCI Config Tab:
    • Within the device's detailed view, find and select the PCI Config or Configuration Space tab. This typically presents a decoded view of the raw configuration space registers.
  2. Record Relevant Details:
    • Carefully document each of the required fields listed above.
    • Use screenshots or copy the values into a text file, a dedicated spreadsheet, or a structured documentation format for accuracy.
    • Ensure hexadecimal values are noted correctly, including the 0x prefix if used.
  3. Expand Capability Lists:
    • Look for sections labeled Capabilities or Advanced Features. These are often clickable or expandable to reveal sub-sections.
    • Document each capability present and its relevant parameters (e.g., MSI message control, power state flags, current/max PCIe link settings).
  4. Examine BARs in Detail:
    • Within the Configuration Space, locate the entries for BAR0 through BAR5.
    • For each active BAR, note its allocated size, whether it's Memory-mapped or I/O, its bit width (32-bit or 64-bit), and if it's prefetchable. This information is often presented clearly in Arbor's GUI.
  5. Save the Data for Reference:
    • Compile all the extracted information into a well-organized document (e.g., a Markdown file, a .txt file, or an Excel spreadsheet).
    • Label each section clearly for easy reference during firmware customization.

5.2 Extracting and Recording Device Attributes

After capturing the data, it's crucial to understand the significance of each attribute and ensure they've been accurately documented for successful emulation.

Ensure You Have Accurately Recorded the Following:

  1. Device ID:
    • Purpose: Uniquely identifies the specific model of the PCIe device.
    • Usage in Emulation: Essential for the host Operating System (OS) to correctly identify the emulated device and, crucially, to attempt to load the appropriate device driver.
  2. Vendor ID:
    • Purpose: Identifies the manufacturer of the PCIe device.
    • Usage in Emulation: Used in conjunction with the Device ID to form a unique identifier (VendorID:DeviceID) that the OS uses to match the device to its corresponding driver.
  3. Subsystem ID and Subsystem Vendor ID:
    • Purpose: These optional IDs allow for differentiation between variants of a device from the same vendor, or for OEM-specific versions where the main Vendor/Device ID might be generic.
    • Usage in Emulation: Important for emulating devices with multiple configurations or those supplied by an OEM, as the driver might specifically look for these values.
  4. Revision ID:
    • Purpose: Indicates the hardware revision level of the device.
    • Usage in Emulation: Helps in identifying specific hardware versions that might require different drivers, firmware, or possess subtle behavioral differences.
  5. Class Code:
    • Purpose: A 24-bit code that categorizes the device's general function (e.g., 0x020000 for an Ethernet controller, 0x010802 for an NVMe controller, 0x0C0300 for a USB host controller). It's composed of a Base Class, Sub-Class, and Programming Interface.
    • Usage in Emulation: Allows the OS to understand the device's general function and load a generic class driver if a specific vendor driver isn't found. This is crucial for initial device recognition.
  6. Base Address Registers (BARs):
    • Purpose: Define the memory-mapped or I/O port address regions that the device uses for registers, internal buffers, or configuration space extensions. The host OS allocates physical addresses to these BARs during enumeration.
    • Usage in Emulation: Critical for mapping the emulated device's internal memory and registers into the host system's address space. The size, type (Memory/I/O, 32/64-bit), and prefetchability status of each BAR must be precisely matched to the donor device.
  7. Capabilities:
    • Purpose: Lists the advanced features the device supports, such as advanced error reporting, power management, MSI/MSI-X, PCIe Advanced Capabilities (e.g., AER, VC/PF), etc. Each capability is defined by a structure with its own registers.
    • Usage in Emulation: Essential for accurately replicating how the donor device advertises its features and how the host system interacts with these features (e.g., interrupt delivery mechanisms, power state transitions, error reporting).
  8. Device Serial Number (DSN):
    • Purpose: A unique 64-bit identifier for the device, typically an optional extended capability.
    • Usage in Emulation: While optional, some drivers or management applications might specifically query and rely on the DSN for identification, licensing, or security checks. Emulating this accurately can prevent detection of your device as a generic or modified peripheral.

Best Practices for Data Collection:

  • Organize the Data: Create a structured document or spreadsheet. Use clear headings and subheadings for each attribute. A template can be beneficial.
  • Include Units and Formats: Always indicate units for sizes (e.g., MB, KB) and use consistent formatting for hexadecimal values (e.g., 0x1234, 16'h1234).
  • Cross-Reference with Specifications (If Possible): If available, consult the donor device's datasheet or publicly available specifications to verify values. This can help identify any discrepancies or unusual configurations not immediately obvious from a raw scan.
  • Secure the Data: Store the collected information securely. Be mindful that this data might contain proprietary or sensitive information.
  • Understand "What's Missing": Specialized tools like Arbor are excellent, but they may not capture every single nuance of complex, highly proprietary devices (e.g., specific vendor-defined registers outside standard configuration space). For advanced emulation, you might need to combine this information with reverse engineering of the donor device's drivers.

6. Initial Firmware Customization

With the donor device's information meticulously documented, the next critical phase involves customizing your FPGA's firmware to emulate the donor device accurately. This process begins by modifying key identification registers within the PCIe configuration space and ensuring that specific identifiers, like the Device Serial Number, are correctly integrated.

6.1 Modifying Configuration Space

The PCIe Configuration Space is a fundamental component that defines how the device is recognized and interacts with the host system during enumeration. Customizing this space to precisely match the donor device's profile is absolutely essential for successful emulation, allowing the host OS to load the correct driver and interact as expected.

6.1.1 Navigate to the Configuration File

The PCIe configuration space parameters are typically defined within a specific SystemVerilog (.sv) file in the PCILeech-FPGA project. This file synthesizes into the logic that configures the PCIe IP core and exposes the device's identity to the host.

Common Path for PCILeech-FPGA (Artix-7 based boards like Squirrel): Locate the file responsible for configuring the PCIe parameters for your specific board. For many Artix-7 PCILeech variants, this will be:

pcileech-fpga/<your_board_variant>/src/pcileech_pcie_cfg_a7.sv
  • Example (for Squirrel 35T):
    pcileech-fpga/pcileech-squirrel-35t/src/pcileech_pcie_cfg_a7.sv
    
    Note: The actual folder name like pcileech-squirrel-35t might vary slightly based on the specific PCILeech-FPGA version or fork you cloned. Always navigate to the relevant board-specific subdirectory after cloning the main repository.

6.1.2 Open the File in Visual Studio Code

Editing the configuration file requires a suitable code editor that supports syntax highlighting for SystemVerilog (or Verilog), making the code easier to read and modify.

Steps:

  1. Launch Visual Studio Code:
    • Click on the VS Code icon or find it via the Start Menu.
  2. Open the File:
    • Use File > Open File or press Ctrl + O (or Cmd + O on macOS).
    • Navigate to the configuration file path identified in Section 6.1.1 (e.g., pcileech-fpga/pcileech-squirrel-35t/src/pcileech_pcie_cfg_a7.sv).
    • Select the file and click Open.
  3. Verify Syntax Highlighting:
    • Ensure that the editor recognizes the .sv file extension and applies proper SystemVerilog syntax highlighting. If it doesn't, revisit Section 4.3.2 to ensure you've installed the recommended Verilog/SystemVerilog extensions for VS Code.
  4. Familiarize Yourself with the File Structure:
    • Scroll through the file. You will typically find parameters defined using localparam or reg assignments, often with comments explaining their purpose. Look for sections where standard PCIe configuration registers (Vendor ID, Device ID, etc.) are defined and assigned.

6.1.3 Modify Device ID and Vendor ID

Updating these fundamental identifiers is the most crucial step for the host system to correctly recognize the emulated device as your donor. The Operating System relies heavily on the Vendor ID and Device ID pair to identify connected hardware and load the appropriate device driver.

Steps:

  1. Search for cfg_deviceid:
    • Use the search functionality (Ctrl + F or Cmd + F) within VS Code.
    • Locate the line defining cfg_deviceid. It will typically look something like:
      reg [15:0] cfg_deviceid = 16'hAAAA; // Default or placeholder Device ID
  2. Update Device ID:
    • Replace AAAA with the 16-bit hexadecimal Device ID you extracted from the donor device using Arbor (e.g., 0x1234).
    • Example: If the donor's Device ID is 0x1234, update the line as:
      reg [15:0] cfg_deviceid = 16'h1234; // Updated with donor's Device ID (e.g., from network card)
  3. Search for cfg_vendorid:
    • Locate the line defining cfg_vendorid. It will be similar in format to cfg_deviceid:
      reg [15:0] cfg_vendorid = 16'hBBBB; // Default or placeholder Vendor ID
  4. Update Vendor ID:
    • Replace BBBB with the 16-bit hexadecimal Vendor ID extracted from the donor device (e.g., 0xABCD).
    • Example: If the donor's Vendor ID is 0xABCD, update the line as:
      reg [15:0] cfg_vendorid = 16'hABCD; // Updated with donor's Vendor ID (e.g., Intel Corporation)
  5. Ensure Correct Formatting:
    • Verify that hexadecimal values are correctly prefixed with 16'h (indicating a 16-bit hexadecimal number).
    • Maintain consistent indentation and commenting style for readability.

6.1.4 Modify Subsystem ID and Revision ID

These identifiers provide additional details about the device variant, specific product models, or hardware revisions. While often optional, matching them enhances the authenticity of the emulation and can be critical for drivers that perform granular checks.

Steps:

  1. Search for cfg_subsysid:
    • Locate the line defining cfg_subsysid.
    reg [15:0] cfg_subsysid = 16'hCCCC; // Placeholder Subsystem ID
  2. Update Subsystem ID:
    • Replace CCCC with the 16-bit hexadecimal Subsystem ID from your donor device (e.g., 0x5678).
    • Example:
      reg [15:0] cfg_subsysid = 16'h5678; // Set to donor's Subsystem ID
  3. Search for cfg_subsysvendorid:
    • Locate the line defining cfg_subsysvendorid.
    reg [15:0] cfg_subsysvendorid = 16'hDDDD; // Placeholder Subsystem Vendor ID
  4. Update Subsystem Vendor ID (if applicable):
    • Replace DDDD with the 16-bit hexadecimal Subsystem Vendor ID from your donor device (e.g., 0x9ABC). If your donor device does not have a unique Subsystem Vendor ID (i.e., it's the same as the main Vendor ID), you should still set it to that value.
    • Example:
      reg [15:0] cfg_subsysvendorid = 16'h9ABC; // Set to donor's Subsystem Vendor ID
  5. Search for cfg_revisionid:
    • Locate the line defining cfg_revisionid.
    reg [7:0] cfg_revisionid = 8'hEE; // Placeholder Revision ID
  6. Update Revision ID:
    • Replace EE with the 8-bit hexadecimal Revision ID from your donor device (e.g., 0x01).
    • Example:
      reg [7:0] cfg_revisionid = 8'h01; // Set to donor's Revision ID

6.1.5 Update Class Code

The Class Code informs the host Operating System about the device's general type and function (e.g., network controller, storage device). This is vital for the OS to load generic class drivers, even if a specific vendor driver isn't installed.

Steps:

  1. Search for cfg_classcode:
    • Locate the line defining cfg_classcode.
    reg [23:0] cfg_classcode = 24'hFFFFFF; // Default or placeholder Class Code
  2. Update Class Code:
    • Replace FFFFFF with the 24-bit hexadecimal Class Code you extracted from the donor device (e.g., 0x020000 for an Ethernet Controller). Remember the format: Base Class, Sub-Class, Programming Interface.
    • Example: If the donor's Class Code is 0x020000 (meaning Base Class: 0x02 - Network Controller, Sub-Class: 0x00 - Ethernet Controller, Prog IF: 0x00), update as:
      reg [23:0] cfg_classcode = 24'h020000; // Set to donor's Class Code (e.g., Ethernet Controller)
  3. Verify Correct Bit Width:
    • Ensure that the Class Code is correctly represented as a 24-bit value using the 24'h prefix for hexadecimal.

6.1.6 Save Changes

After making all modifications to the configuration parameters, it's critically important to save and review the changes.

Steps:

  1. Save the File:
    • Click File > Save in VS Code, or press Ctrl + S (or Cmd + S).
  2. Review Changes:
    • Before closing, quickly re-read the modified lines to confirm their accuracy against your documented donor device information.
    • Check for any obvious syntax errors or typos (VS Code's extensions may highlight these).
  3. Optional - Use Version Control:
    • If you are using Git (highly recommended for any code project, especially firmware), commit your changes with a clear and meaningful message. This creates a historical record of your modifications.
    • Example Git Commands:
      git add pcileech_pcie_cfg_a7.sv
      git commit -m "Updated PCIe configuration registers (VID, DID, SubIDs, Revision, Class Code) to match donor device: [Donor Device Name]"

6.2 Inserting the Device Serial Number (DSN)

The Device Serial Number (DSN) is a unique 64-bit identifier that some PCIe devices (especially those with advanced features or specific drivers) may utilize. Including it enhances the authenticity of your emulation and can help bypass checks in drivers that explicitly query for this value.

6.2.1 Locate the DSN Field

The DSN, if implemented by the donor device, is part of the PCIe Extended Capabilities. In the PCILeech-FPGA framework, the DSN field is often exposed as a configurable parameter within the same configuration file you've been editing.

Steps:

  1. Search for cfg_dsn:
    • In pcileech_pcie_cfg_a7.sv (or your board's equivalent configuration file), use the search function (Ctrl + F or Cmd + F) to find cfg_dsn.
  2. Understand the Existing Assignment:
    • The DSN may be set to a default value (often all zeros) or commented out. It will typically look like:
      reg [63:0] cfg_dsn = 64'h0000000000000000; // Default DSN (usually 0 if not used)

6.2.2 Insert the DSN

Updating the DSN involves setting it to the exact 64-bit hexadecimal value captured from your donor device.

Steps:

  1. Update cfg_dsn:
    • Replace the existing hexadecimal value with the 64-bit DSN you extracted from the donor device using Arbor.
    • Example: If the donor's DSN is 0x0011223344556677, update as:
      reg [63:0] cfg_dsn = 64'h0011223344556677; // Donor Device Serial Number
  2. Handle DSN Unavailability or Irrelevance:
    • If your donor device does not have a DSN, or if you've determined it's not a required parameter for the driver you're targeting, you can simply leave it as zeros:
      reg [63:0] cfg_dsn = 64'h0000000000000000; // No specific DSN from donor, leaving as default 0
    • Caution: For critical emulations, if the donor device has a DSN, it's best to emulate it accurately.
  3. Ensure Correct Formatting:
    • The DSN is a 64-bit value; ensure it's properly formatted with the 64'h prefix for hexadecimal values.

6.2.3 Save Changes

Finalize the DSN modification by saving and reviewing the file.

Steps:

  1. Save the File:
    • Click File > Save in VS Code, or press Ctrl + S.
  2. Verify the Syntax:
    • Check for any red underlines or error indications from VS Code's syntax checker. Correct any issues immediately.
  3. Document the Changes:
    • If using version control, commit the updates with an appropriate message.
    • Example Git Commands:
      git commit -am "Inserted donor Device Serial Number (DSN) into PCIe configuration"

7. Vivado Project Setup and Customization

With the firmware files updated to reflect the donor device's critical identification and configuration data, the next crucial step is to integrate these changes into the Vivado project. This involves generating the project files for your specific FPGA board, customizing the embedded PCIe IP core, and preparing the entire design for the synthesis and implementation phases.

7.1 Generating Vivado Project Files

Vivado, the Xilinx (AMD) development suite, uses Tcl (Tool Command Language) scripts to automate project creation, add source files, and configure project settings. By running these scripts provided with the PCILeech-FPGA framework, you ensure that your Vivado project is correctly set up for your target FPGA board.

7.1.1 Open Vivado

Starting with a fresh session of Vivado ensures that no lingering settings or open projects from previous sessions interfere with your current work.

Steps:

  1. Launch Vivado:
    • Find the Vivado application icon in your Start Menu (Windows) or Applications folder (Linux/macOS).
    • Click to open it.
  2. Select the Correct Version:
    • If you have multiple Vivado versions installed, ensure you are launching the one compatible with your FPGA board and the PCILeech-FPGA project (as noted in Section 4.3.1, a recent stable version like Vivado 2023.x is recommended).
  3. Wait for the Startup Screen:
    • Allow Vivado to fully initialize and present its welcome screen or project dashboard before proceeding.

7.1.2 Access the Tcl Console

The Tcl Console within Vivado is your primary interface for executing scripts and direct commands. This is where you will run the project generation scripts.

Steps:

  1. Open the Tcl Console:
    • In the Vivado interface, navigate to the menu bar.
    • Click on Window > Tcl Console.
    • The Tcl Console pane will typically appear at the bottom of the Vivado window.
  2. Adjust Console Size (Optional):
    • You can drag the console's top border to resize it, making it taller for better visibility of commands and output.
  3. Clear Previous Commands (Optional but Recommended):
    • If there are any previous commands or messages, you can right-click within the console and select "Clear Console" for a clean start.

7.1.3 Navigate to the Project Directory

Before running the Tcl script, you must ensure that the Tcl Console's current working directory is set to the correct location where your board-specific PCILeech-FPGA project scripts reside.

For Squirrel DMA (Artix-7 35T) or similar boards:

Typical Path (after cloning pcileech-fpga and navigating to your board variant):

C:/Users/YourUsername/Documents/pcileech-fpga/pcileech-squirrel-35t/  # On Windows
~/Projects/pcileech-fpga/pcileech-squirrel-35t/  # On Linux/macOS

Note: Replace <your_board_variant> with the actual name of the subdirectory for your board (e.g., pcileech-squirrel-35t, pcileech-artix-7-50t).

Steps:

  1. Set the Working Directory in Tcl Console:
    • In the Vivado Tcl Console, enter the cd command followed by the full path to your board's project directory.
    • Example (Windows):
      cd C:/Users/YourUsername/Documents/pcileech-fpga/pcileech-squirrel-35t/
    • Example (Linux/macOS):
      cd ~/Projects/pcileech-fpga/pcileech-squirrel-35t/
    • Self-correction tip: Use forward slashes (/) even on Windows for Tcl paths.
  2. Verify the Directory Change:
    • To confirm you are in the correct directory, enter pwd (print working directory) in the Tcl Console.
    • The console should display the full path you just set, confirming the change.

7.1.4 Generate the Vivado Project

Running the appropriate Tcl script specific to your FPGA board will automate the entire project setup process within Vivado. This includes creating the project, adding all necessary source files (HDL, constraints), and configuring core project settings.

Steps:

  1. Run the Tcl Script:
    • Enter the source command followed by the name of the project generation script for your board. The PCILeech-FPGA project typically provides these scripts in the main board directory.
    • For Squirrel (Artix-7 35T) (and similar Artix-7 boards):
      source vivado_generate_project_squirrel.tcl -notrace
    • For Enigma-X1 (Artix-7 75T):
      source vivado_generate_project_enigma_x1.tcl -notrace
    • For ZDMA (Artix-7 100T):
      source vivado_generate_project_100t.tcl -notrace
    • The -notrace option prevents verbose output for every single Tcl command, keeping the console cleaner.
  2. Wait for Script Completion:
    • The script will execute many commands in sequence. This process can take several minutes, depending on your system's performance and the complexity of the project.
    • Monitor the Tcl Console for progress messages. The script will:
      • Create a new Vivado project (.xpr file) in your current directory.
      • Add all SystemVerilog/Verilog source files (.sv, .v).
      • Add Xilinx IP core configurations (.xci).
      • Add XDC (Xilinx Design Constraints) files.
      • Potentially configure various project settings.
    • Address Any Errors: If any errors occur (e.g., "file not found," "invalid command"), the script will usually stop. Review the error message, correct the underlying issue (e.g., incorrect path, missing file), and re-run the script.
  3. Confirm Project Generation:
    • Upon successful completion, the Tcl Console will typically indicate that the project has been created, and you should see the new project files (e.g., pcileech_squirrel_top.xpr) and associated directories (e.g., pcileech_squirrel_top.runs, pcileech_squirrel_top.ip) generated in your project directory.

7.1.5 Open the Generated Project

Now that the Vivado project files have been successfully generated by the Tcl script, you can open the project within the Vivado GUI for further inspection and customization.

Steps:

  1. Open the Project:
    • In Vivado, click File > Open Project.
    • Navigate to your project directory (the same one you set in the Tcl Console in Section 7.1.3).
  2. Select the Project File:
    • Locate and select the Vivado project file (.xpr extension) that corresponds to your board.
    • For Squirrel: The file will typically be named pcileech_squirrel_top.xpr.
    • Click on the .xpr file to select it.
  3. Click Open:
    • Vivado will load the project, displaying the design hierarchy, source files, IP Integrator block design (if used), and various design views. This may take a moment.
  4. Verify Project Contents:
    • In the Project Manager window (typically on the left), expand the Sources pane.
    • Ensure that all expected source files (Verilog/SystemVerilog, XDC, IP cores) are listed and that the design hierarchy looks correct.
    • Check the Messages pane (bottom) for any warnings or critical warnings that appear upon opening the project, as these might indicate potential issues.

7.2 Modifying IP Blocks

The PCIe IP core is the heart of your device's PCIe interface. It's a pre-verified, configurable block from Xilinx that handles the complex PCIe protocol layers. While some configuration space values are handled in the SystemVerilog file (Section 6.1), other core PCIe parameters, especially those related to link capabilities and BAR structures, are configured directly within the PCIe IP core's customization settings in Vivado. Customizing the IP core ensures that your FPGA behaves identically to the donor hardware at the PCIe protocol level.

7.2.1 Access the PCIe IP Core

The PCIe IP core is instantiated as an IP block within your Vivado project. You'll need to open its customization GUI to modify its parameters.

Steps:

  1. Locate the PCIe IP Core:
    • In the Sources pane (within the Project Manager window), ensure the Hierarchy tab is selected.
    • Expand the design hierarchy until you find the instance of the PCIe IP core.
    • It is commonly named pcie_7x_0.xci (for 7-Series FPGAs) or similar, typically found within the ip subdirectory of your project sources.
  2. Open the IP Customization Window:
    • Right-click on the pcie_7x_0.xci file.
    • From the context menu, select Customize IP.
    • The IP Configuration window (or similar name, like "Customize IP" or "Re-customize IP") will open, presenting a graphical interface with various tabs and options for configuring the PCIe core.
  3. Wait for IP Settings to Load:
    • The IP customization interface may take a few moments to initialize and populate all its settings. Ensure that all options and tabs are fully loaded and responsive before you begin making changes.

7.2.2 Customize Device IDs and BARs within the IP Core

While some device identifiers are set in pcileech_pcie_cfg_a7.sv, the PCIe IP core itself also contains internal parameters for Device ID, Vendor ID, and crucially, the definitions for the Base Address Registers (BARs). You must ensure these align. Some values set in the .sv file might override or be fed into the IP core, but it's good practice to ensure consistency here too. The BAR settings in the IP core are particularly important as they dictate the hardware implementation of memory mapping.

Steps:

  1. Navigate to Basic/Identification Parameters:
    • In the IP customization window, look for a tab or section related to Basic, Device and Vendor Identifiers, General, or PCIe Capabilities. This is where fundamental IDs and initial link settings are defined.
  2. Verify/Enter the Device ID, Vendor ID, Subsystem IDs, Revision ID, Class Code:
    • Crucially, confirm these match the values you put in pcileech_pcie_cfg_a7.sv and from your donor device.
    • Find fields like:
      • Device ID: Enter 0xXXXX (e.g., 0x1234).
      • Vendor ID: Enter 0xYYYY (e.g., 0xABCD).
      • Subsystem ID: Enter 0xZZZZ (e.g., 0x5678).
      • Subsystem Vendor ID: Enter 0xWWWW (e.g., 0x9ABC).
      • Revision ID: Enter 0xRR (e.g., 0x01).
      • Class Code: Enter 0xCCCCCC (e.g., 0x020000).
    • Important: Some IP core versions or specific configurations might pull these directly from the user logic (like pcileech_pcie_cfg_a7.sv) or might allow setting them directly here. The most reliable approach is to set them consistently in both places if the option is available in the IP GUI.
  3. Navigate to Base Address Registers (BARs) Tab:
    • Within the IP customization window, locate and select the BARs tab or section. This is where you define the memory regions the PCIe device exposes.
  4. Configure Each BAR:
    • For each BAR (BAR0 through BAR5) that your donor device uses, meticulously configure the following parameters based on the information extracted using Arbor:
      • Enable BAR: Check this box only if the donor device uses this specific BAR. Disable (uncheck) any BARs the donor device does not use.
      • BAR Size: Select the exact size from the dropdown list (e.g., 256 MB, 64 KB, 4 KB). This is critical for the host OS to allocate the correct amount of memory.
      • BAR Type: Choose the appropriate type:
        • Memory (32-bit Addressing): For memory-mapped regions accessible with 32-bit addresses.
        • Memory (64-bit Addressing): For memory-mapped regions that can reside anywhere in the 64-bit address space (required for large memory regions or if the donor device uses it).
        • I/O: For legacy I/O port regions (less common in modern PCIe but still possible).
      • Prefetchable: Check this box if the donor's BAR is marked as prefetchable. This property allows the host system to cache or prefetch data from this region for performance.
    • Example Configuration (based on your donor device):
      • BAR0:
        • Enabled: Yes
        • Size: 256 MB
        • Type: Memory (64-bit Addressing)
        • Prefetchable: Yes
      • BAR1:
        • Enabled: No (if the donor device does not use BAR1)
      • Continue for BAR2-BAR5, mirroring the donor device's configuration.
  5. Ensure Alignment and Non-Overlapping Spaces:
    • The Vivado IP core typically handles the alignment automatically based on the size you select. However, be aware that PCIe specification requires BAR sizes to be a power of two, and BARs must be aligned to their size.
    • Ensure that the total memory mapped by your BARs does not exceed the FPGA's available Block RAM (BRAM) or external memory capabilities.

7.2.3 Finalize IP Customization

After configuring all necessary settings within the IP core's customization window, you must apply these changes for them to take effect in your Vivado project.

Steps:

  1. Review All Settings:
    • Before applying, take a moment to quickly review each tab in the IP customization window one last time.
    • Confirm that all entries precisely match the documented specifications of your donor device. A small error here can lead to device detection or functionality issues.
  2. Apply Changes:
    • Click the OK or Generate button (the label may vary) at the bottom of the IP customization window.
    • If Vivado prompts you to confirm that you wish to proceed with the changes and regenerate the IP output products, confirm by clicking Yes.
  3. Regenerate IP Core:
    • Vivado will now regenerate the IP core's output products (e.g., netlists, simulation models, new .xci configuration files) to reflect your new configurations.
    • Monitor the Messages pane (bottom of Vivado window) for any errors, warnings, or critical warnings that may arise during this regeneration process. Address any critical warnings immediately.
  4. Update IP in Project:
    • Vivado might automatically update or prompt you to update any IP dependencies in your project after a core is regenerated. Allow it to do so to ensure the latest configuration is used throughout your design.

7.2.4 Lock the IP Core

Locking the IP core is a recommended best practice in Vivado to prevent unintended modifications or re-customizations during subsequent synthesis and implementation runs, which could potentially revert your carefully configured settings.

Purpose of Locking:

  • Prevent Overwrites: Ensures that your manual configurations within the IP core's GUI are preserved and not accidentally overwritten by Vivado's automation or if the IP is detected as "out-of-date" due to subtle project changes.
  • Maintain Consistency: Keeps the IP core in a known, stable state throughout the build process, which is especially important for critical components like the PCIe interface.

Steps:

  1. Open the Tcl Console:
    • In Vivado, if the Tcl Console is not already open, go to Window > Tcl Console.
  2. Execute the Lock Command:
    • In the Tcl Console, enter the following command precisely. This command sets the IP_LOCKED property to true for your PCIe IP core instance (pcie_7x_0).
    set_property -name {IP_LOCKED} -value true -objects [get_ips pcie_7x_0]
    • Press Enter to execute the command.
  3. Verify the Lock:
    • Check the Messages pane. You should see a confirmation message indicating that the property was set.
    • You can also right-click on pcie_7x_0.xci in the Sources pane, select "IP Properties," and verify that IP_LOCKED is set to true. You might also notice that the "Customize IP" option is now grayed out or only allows "Re-customize IP" which then warns you about the lock.
  4. Unlocking (if necessary):
    • If you need to make further modifications to the PCIe IP core's settings in the future, you must first unlock it. Use the following Tcl command:
    set_property -name {IP_LOCKED} -value false -objects [get_ips pcie_7x_0]
    • Remember to re-lock it after making and applying your changes.
  5. Document the Action:
    • It's a good practice to note in your project documentation (e.g., a README file, project notes) that the PCIe IP core has been locked. This helps anyone else working on the project understand its configuration state and avoid confusion.

Part 2: Intermediate Concepts and Implementation


8. Advanced Firmware Customization

To achieve a precise and convincing emulation of the donor device, further in-depth customization of the FPGA firmware is necessary beyond just basic identification. This involves aligning the PCIe parameters (such as link speed and transaction sizes), meticulously adjusting Base Address Registers (BARs) and their associated memory mappings, and accurately emulating power management and interrupt mechanisms. These steps ensure that the emulated device not only looks like the original hardware to the host system but also behaves identically at the protocol and functional levels.

8.1 Configuring PCIe Parameters for Emulation

Accurate emulation requires that the PCIe operational parameters of your FPGA device are meticulously configured to match those of the donor device. This includes settings such as the PCIe link speed, link width, capability pointers, and maximum payload sizes. Proper configuration ensures compatibility with the host system, the correct operation of drivers and applications that interact with the device, and optimal performance for data transfer.

8.1.1 Matching PCIe Link Speed and Width

The PCIe link speed (e.g., Gen1, Gen2, Gen3) and link width (e.g., x1, x4, x8) are critical parameters that determine the maximum theoretical data throughput and performance of the device. Matching these settings with the donor device is essential for accurate emulation, as drivers or system components might expect specific link capabilities.

Steps:

  1. Access PCIe IP Core Settings:

    • Open Your Vivado Project: Launch Vivado and open the project you previously created or modified (e.g., pcileech_squirrel_top.xpr). Ensure that all source files are correctly added to the project.
    • Locate the PCIe IP Core: In the Sources pane (typically on the left), expand the design hierarchy to find the PCIe IP core instance. For Xilinx 7-series designs (like Artix-7 used in Squirrel), this is usually named pcie_7x_0.xci.
    • Customize the IP Core: Right-click on pcie_7x_0.xci and select Customize IP. The IP customization window will open, displaying various configuration options across multiple tabs.
  2. Set Maximum Link Speed:

    • Navigate to Link Parameters: In the IP customization window, click on the PCIe Capabilities tab (or sometimes "PCIe Configuration" or "General"). Within this, look for a section related to Link Parameters or Link Capability Register.
    • Configure Maximum Link Speed: Find the option labeled Maximum Link Speed (or "Target Link Speed").
    • Set it to match the maximum link speed supported and advertised by your donor device.
      • Example:
        • If the donor device operates at PCIe Gen2 (5.0 GT/s), select 5.0 GT/s.
        • If it operates at PCIe Gen1 (2.5 GT/s) or PCIe Gen3 (8.0 GT/s), select the corresponding option.
    • Note: Ensure that your FPGA's transceivers and the physical hardware (motherboard PCIe slot) support the selected link speed. The FPGA will only negotiate up to its configured maximum.
  3. Set Link Width:

    • Configure Link Width: In the same Link Parameters section, locate the Link Width setting (or "PCIe Link Width," "Target Link Width").
    • Set it to match the maximum link width advertised by your donor device.
      • Example:
        • If the donor device uses a x4 link, set the Link Width to 4.
        • Options typically include 1, 2, 4, 8, 16 lanes.
    • Note: The physical PCIe slot and the FPGA's package must support the selected link width. Attempting to configure a width greater than the physical connection will result in link negotiation issues.
  4. Save and Regenerate:

    • Apply Changes: After configuring the link speed and width, click OK to apply the changes within the IP customization window.
    • Regenerate IP Output Products: Vivado will likely prompt you to regenerate the IP core's output products due to the changes made. Confirm and allow the regeneration process to complete. This can take some time.
    • Verify Settings: Once regeneration is complete, you can optionally revisit the IP core settings to ensure the configurations are correctly applied. Check for any warnings or errors in the Messages window in Vivado.

8.1.2 Setting Capability Pointers

Capability pointers within the PCIe configuration space are 8-bit registers that form a linked list, pointing to various capability structures (e.g., Power Management, MSI/MSI-X, PCIe Express Capability). Correctly setting these pointers ensures that the host system can traverse the capability list and locate and utilize the device's advertised features.

Steps:

  1. Locate Capability Pointer in Firmware:

    • Open Configuration File: In Visual Studio Code, open the primary configuration file for your board, typically pcileech_pcie_cfg_a7.sv, located at pcileech-fpga/<your_board_variant>/src/pcileech_pcie_cfg_a7.sv.
    • Understand the Capability Pointer: The capability pointer (cfg_cap_pointer) in this file points to the first capability structure in the PCIe configuration space, usually starting after the standard 64-byte configuration header. Subsequent capabilities are chained by their "Next Capability Pointer" fields.
  2. Set Capability Pointer Value:

    • Find the Assignment for cfg_cap_pointer: Search for the line in the code where cfg_cap_pointer is assigned.
      reg [7:0] cfg_cap_pointer = 8'hXX; // Current value (e.g., 8'h40 for default)
    • Update the Capability Pointer: Replace XX with the 8-bit hexadecimal capability pointer value observed from your donor device using Arbor. This value typically points to the offset of the first capability structure after the Device-Specific Configuration Space (which usually ends at offset 0x3F). A common starting point for capabilities is 0x40 or 0x60.
      • Example:
        • If the donor device's first capability pointer is 0x60 (meaning its first capability structure starts at offset 0x60 in the configuration space), update the line to:
          reg [7:0] cfg_cap_pointer = 8'h60; // Updated to match donor device's first capability offset
    • Ensure Correct Alignment: Capability structures must be aligned on a 4-byte boundary. The capability pointer should always point to a valid 4-byte aligned offset within the configuration space.
  3. Save Changes:

    • Save the Configuration File: After making the changes, save the file by clicking File > Save or pressing Ctrl + S.
    • Verify Syntax: Ensure there are no syntax errors introduced by the changes (VS Code will usually highlight these).
    • Comment for Clarity: Add a comment explaining the change for future reference and maintainability.
      reg [7:0] cfg_cap_pointer = 8'h60; // Set to donor's capability pointer (e.g., PCIe Cap at 0x60)

8.1.3 Adjusting Maximum Payload and Read Request Sizes

These parameters define the maximum amount of data that can be transferred in a single PCIe Transaction Layer Packet (TLP) and the maximum size of a non-posted Memory Read Request TLP. Matching these settings with the donor device ensures compatibility and optimal performance for data transfer operations. Mismatches can lead to reduced throughput or communication errors.

Steps:

  1. Set Maximum Payload Size Supported (IP Core):

    • Access Device Capabilities: In the PCIe IP core customization window (pcie_7x_0.xci in Vivado), navigate to the PCIe Capabilities tab.
    • Configure Max Payload Size Supported: Find the setting labeled Max Payload Size Supported (or similar).
    • Set it to the value supported and advertised by your donor device (e.g., 128 bytes, 256 bytes, 512 bytes, 1024 bytes, 2048 bytes, 4096 bytes).
      • Example: If the donor device supports a maximum payload size of 256 bytes, select 256 bytes from the dropdown.
  2. Set Maximum Read Request Size Supported (IP Core):

    • Configure Max Read Request Size Supported: In the same tab, find the Max Read Request Size Supported setting.
    • Set it to match the donor device's capability. This specifies the maximum amount of data a device can request in a single read transaction.
      • Example: If the donor supports a maximum read request size of 512 bytes, select 512 bytes.
  3. Adjust Firmware Parameters (Matching IP Core):

    • Open pcileech_pcie_cfg_a7.sv: Ensure that the configuration file is open in Visual Studio Code.
    • Update Firmware Constants: Locate the lines where max_payload_size_supported and max_read_request_size_supported are defined. These are typically bit-encoded values that correspond to the byte sizes you selected in the IP core.
      reg [2:0] max_payload_size_supported = 3'bZZZ;   // Current value
      reg [2:0] max_read_request_size_supported = 3'bWWW; // Current value
    • Set the Appropriate Values: Replace ZZZ and WWW with the 3-bit binary representations corresponding to the selected sizes.
      • Mapping (as per PCIe spec):
        • 128 bytes: 3'b000
        • 256 bytes: 3'b001
        • 512 bytes: 3'b010
        • 1024 bytes: 3'b011
        • 2048 bytes: 3'b100
        • 4096 bytes: 3'b101
      • Example:
        • For 256 bytes payload size:
          reg [2:0] max_payload_size_supported = 3'b001; // Supports up to 256 bytes (0x100)
        • For 512 bytes read request size:
          reg [2:0] max_read_request_size_supported = 3'b010; // Supports up to 512 bytes (0x200)
    • Reasoning: These firmware parameters often dictate the behavior of your user logic that interfaces with the PCIe core, ensuring that your logic respects the configured maximums.
  4. Save Changes:

    • Save the File: After updating the values in pcileech_pcie_cfg_a7.sv, save the file.
    • Verify Consistency: It's crucial that the values configured in the Vivado PCIe IP core's GUI match the values set in your HDL configuration file. Any mismatch can lead to unexpected behavior or link training issues.
    • Add Comments: Document these changes clearly in your code for future reference.

8.2 Adjusting BARs and Memory Mapping

Base Address Registers (BARs) are fundamental to how a PCIe device exposes its internal memory and registers to the host system. Correctly configuring the BARs and defining their memory mapping within your FPGA's BRAMs (Block RAMs) and logic is crucial for accurate emulation and the proper operation of host-side device drivers.

8.2.1 Setting BAR Sizes and Types (IP Core & BRAMs)

Configuring the BAR sizes and types ensures that your emulated device requests the correct amount of address space from the host during enumeration and that the host allocates and maps these regions appropriately. This also involves associating these address regions with physical memory blocks within your FPGA.

Steps:

  1. Access BAR Configuration (PCIe IP Core):

    • Customize PCIe IP Core: In Vivado, right-click on pcie_7x_0.xci and select Customize IP to open its configuration GUI.
    • Navigate to BARs Tab: In the IP customization window, click on the Base Address Registers (BARs) tab.
  2. Configure Each BAR (IP Core):

    • Match Donor Device's BARs: For each BAR (BAR0 to BAR5), set the size, type, and prefetchable status to precisely match what you extracted from the donor device using Arbor.
    • Enable/Disable BARs: Ensure that only the BARs actually used by the donor device are enabled. Disable (uncheck) any unused BARs.
    • Set BAR Sizes: Select the appropriate size from the dropdown for each enabled BAR. This will be a power-of-two size (e.g., 4KB, 8KB, 64KB, 1MB, 256MB, 1GB).
      • Example:
        • If BAR0 is 64 KB, set BAR0 Size to 64 KB.
        • If BAR1 is 128 MB, set BAR1 Size to 128 MB.
    • Set BAR Types:
      • Choose between Memory (32-bit Addressing) or Memory (64-bit Addressing) if the BAR is memory-mapped. Select 64-bit Addressing if the donor device's BAR is 64-bit or if you require access to addresses above 4GB.
      • Choose I/O if the BAR is for I/O port space (less common for modern PCIe devices).
    • Set Prefetchable Status: Check the "Prefetchable" box if the donor's BAR was identified as prefetchable. This bit allows the host to prefetch data from that region, potentially improving performance.
  3. Update BRAM Configurations (if applicable):

    • Many PCILeech-FPGA projects use Xilinx Block RAM (BRAM) IP cores to represent the memory regions exposed by the BARs. These BRAMs provide the physical storage for your emulated device's memory.
    • Locate BRAM IP Cores: In your Vivado project's Sources pane, within the ip subdirectory (or similar), you might find .xci files for BRAMs, potentially named like:
      pcileech-fpga/<your_board_variant>/ip/bram_bar_zero4k.xci
      pcileech-fpga/<your_board_variant>/ip/bram_pcie_cfgspace.xci
      # And potentially others for BAR1, BAR2 etc.
      
    • Modify BRAM Sizes: For each BRAM IP core associated with an enabled BAR, you may need to Customize IP (right-click on the .xci file) and adjust its memory size configuration to exactly match the corresponding BAR size you set in the PCIe IP core.
      • Example: If BAR0 is 256MB, ensure the BRAM connected to BAR0 has a size of 256MB.
      • Caution: Ensure that the total memory required by all active BARs does not exceed the physical BRAM capacity of your FPGA device. Exceeding capacity will lead to implementation failures.
  4. Save and Regenerate:

    • Apply Changes (IP Core): After configuring the BARs in the PCIe IP core, click OK in the IP customization window.
    • Regenerate IP Cores: Vivado will prompt you to regenerate the PCIe IP core and any associated BRAM IP cores due to the size changes. Allow the regeneration to complete. This ensures the hardware netlist reflects your new BAR definitions.
    • Check for Errors: Review the Messages window for any warnings or errors related to BAR configurations or BRAM instantiation.

8.2.2 Defining BAR Address Spaces in Firmware

While the PCIe IP core configures the hardware aspects of the BARs, your custom firmware (SystemVerilog code) needs to define the logic for how your emulated device responds when the host CPU performs read or write operations to addresses within these BAR regions. This involves address decoding and implementing register/memory access logic.

Steps:

  1. Open the BAR Controller File:

    • In Visual Studio Code, open the SystemVerilog file responsible for handling BAR accesses. For PCILeech-FPGA, this is often:
      pcileech-fpga/<your_board_variant>/src/pcileech_tlps128_bar_controller.sv
      
      This module typically receives PCIe Memory Read/Write TLPs and decodes the address to determine which BAR (and which offset within that BAR) is being accessed.
  2. Implement Address Decoding Logic:

    • Within the pcileech_tlps128_bar_controller.sv module, you'll find logic that determines which BAR an incoming transaction targets. This often involves checking the address bits against the configured BAR sizes.
    • You'll need to define how the incoming address req_addr (from the TLP) maps to an offset within your specific BAR.
    • Conceptual Example:
      // Example: Logic for BAR0 (assuming it's a 256MB 64-bit Memory BAR, for registers/data)
      // 'bar_hit[0]' would be an input signal indicating a hit on BAR0, typically from the PCIe core.
      // 'req_addr' is the incoming PCIe address.
      // 'req_be' is the byte enable from the TLP.
      // 'req_data' is the incoming write data.
      // 'rsp_data' is the outgoing read data.
      
      // Assuming BAR0 is 256MB (2^28 bytes), address bits [27:0] are within the BAR.
      localparam BAR0_SIZE_BITS = 28; // 2^28 = 256MB
      
      reg [31:0] internal_register_0; // Example register within BAR0
      reg [31:0] internal_register_1; // Another example register
      
      assign bar0_offset = req_addr[BAR0_SIZE_BITS-1:0]; // Extract offset within BAR0
      
      always_comb begin
          // Default response
          rsp_data = 32'hFFFFFFFF; // Default to all F's or similar for unmapped regions
      
          if (bar_hit[0]) begin // If the transaction targets BAR0
              if (req_write) begin // This is a write operation
                  case (bar0_offset)
                      // Example: Map offset 0x0 to internal_register_0
                      32'h0000_0000: begin
                          if (req_be[3]) internal_register_0[31:24] = req_data[31:24];
                          if (req_be[2]) internal_register_0[23:16] = req_data[23:16];
                          if (req_be[1]) internal_register_0[15:8]  = req_data[15:8];
                          if (req_be[0]) internal_register_0[7:0]   = req_data[7:0];
                      end
                      // Example: Map offset 0x4 to internal_register_1
                      32'h0000_0004: begin
                          if (req_be[3]) internal_register_1[31:24] = req_data[31:24];
                          if (req_be[2]) internal_register_1[23:16] = req_data[23:16];
                          if (req_be[1]) internal_register_1[15:8]  = req_data[15:8];
                          if (req_be[0]) internal_register_1[7:0]   = req_data[7:0];
                      end
                      // Add more register mappings or memory accesses (e.g., BRAM access)
                      default: begin
                          // Handle unmapped writes within BAR0, e.g., ignore or log
                      end
                  endcase
              end else if (req_read) begin // This is a read operation
                  case (bar0_offset)
                      // Example: Read from internal_register_0
                      32'h0000_0000: rsp_data = internal_register_0;
                      // Example: Read from internal_register_1
                      32'h0000_0004: rsp_data = internal_register_1;
                      // Add more register mappings or memory accesses (e.g., BRAM access)
                      default: begin
                          rsp_data = 32'h0; // Return 0 for unmapped reads or specific error value
                      end
                  endcase
              end
          end
      end
    • Handle Data Transfers: The always_comb block (or always_ff for sequential logic) should define how rsp_data is generated for reads and how internal registers/memories are updated for writes, based on the bar0_offset and byte enables (req_be).
  3. Implement BRAM Accesses (if BAR maps to BRAM):

    • If a BAR maps to a large block of memory (e.g., 256MB), you'll typically instantiate a BRAM IP core (as discussed in 8.2.1) and interface your bar_controller logic to it. The bar_controller would provide the address and control signals to the BRAM.
    • Conceptual BRAM Integration (simplified):
      // Inside pcileech_tlps128_bar_controller.sv or a sub-module
      // Interface to BRAM
      wire [BAR0_SIZE_BITS-1:0] bram_addr;
      wire [31:0] bram_wr_data;
      wire [3:0] bram_wr_en; // Byte enables for BRAM
      wire bram_wr_ce;
      wire bram_rd_ce;
      wire [31:0] bram_rd_data;
      
      // Map TLP signals to BRAM interface
      assign bram_addr = bar0_offset;
      assign bram_wr_data = req_data;
      assign bram_wr_en = req_be;
      assign bram_wr_ce = bar_hit[0] && req_write;
      assign bram_rd_ce = bar_hit[0] && req_read;
      
      // Instantiate the BRAM IP core
      bram_bar_zero bram_inst ( // Assuming 'bram_bar_zero' is your BRAM IP module
          .clka(clk),
          .ena(1'b1),
          .wea(bram_wr_en),
          .addra(bram_addr),
          .dina(bram_wr_data),
          .douta(bram_rd_data)
      );
      
      // For reads from BAR0, output data from BRAM
      if (bar_hit[0] && req_read) begin
          rsp_data = bram_rd_data;
      end
  4. Save Changes:

    • After implementing the logic for each BAR, save the pcileech_tlps128_bar_controller.sv file.
    • Verify Functionality: This logic is complex. Thorough simulation (using a test bench) and later hardware testing will be crucial to ensure correct behavior.

8.2.3 Handling Multiple BARs

Properly managing multiple BARs is essential for devices that expose several distinct memory or I/O regions. The bar_controller module typically handles all BARs.

Steps:

  1. Implement Logic for Each BAR:

    • Within pcileech_tlps128_bar_controller.sv, extend the logic to handle all enabled BARs (BAR0, BAR1, BAR2, etc.) that your donor device uses.
    • Separate Logic Blocks: For clarity and maintainability, create distinct if/else if blocks or case statements that activate based on which bar_hit signal is asserted.
      // BAR0 Handling
      if (bar_hit[0]) begin
          // BAR0 specific read/write logic for its registers/memory
      end else if (bar_hit[1]) begin
          // BAR1 specific read/write logic for its registers/memory
      end else if (bar_hit[2]) begin
          // BAR2 specific logic
      end
      // ... continue for other BARs
    • Define Registers and Memories: Allocate separate sets of registers or connect distinct BRAM instances for each BAR as needed.
  2. Ensure Non-Overlapping Address Spaces:

    • While the PCIe IP core handles the negotiation of distinct address spaces for each BAR with the host OS, your internal firmware logic must assume that these spaces are separate and non-overlapping.
    • Validate Address Ranges: Double-check your BAR size configurations in the PCIe IP core to ensure they are distinct and correctly aligned to power-of-two boundaries as per PCIe specifications.
    • Update Address Decoding: Your bar_controller logic relies on the bar_hit signals generated by the PCIe IP core. Ensure these are correctly interpreted and lead to the unique handling logic for each BAR.
  3. Test BAR Accesses:

    • Simulation Testing: Before hardware deployment, use simulation tools (e.g., Vivado Simulator) with a comprehensive test bench to verify all read and write operations to each BAR.
      • Send Memory Write TLPs to specific offsets within each BAR.
      • Send Memory Read TLPs to specific offsets within each BAR and verify the returned data.
    • Hardware Testing: After programming the FPGA, use host-side software tools (like PCILeech client software, or custom C/Python scripts) to access and verify each BAR.
      • Linux: Use lspci -vvv to inspect BAR mappings (Memory at XXXX (64-bit, prefetchable) [size=YYYY]). You can then use devmem2 or a custom kernel module to read/write to these mapped addresses.
      • Windows: Use tools like "RW-Everything" or custom user-mode applications to inspect and interact with the mapped memory regions.
      • Perform various read/write patterns to ensure data integrity and correct addressing across all BARs.

8.3 Emulating Device Power Management and Interrupts

Emulating power management features and implementing interrupts are critical for devices that need to interact closely and efficiently with the host operating system's power management and interrupt handling mechanisms. Without these, the emulated device may not appear fully functional, or performance could be suboptimal.

8.3.1 Power Management Configuration

Implementing power management allows the emulated device to support various power states (e.g., D0, D3hot), contributing to system-wide power efficiency and compliance with operating system expectations. The host OS will query the device's capabilities and send commands to transition between these states.

Steps:

  1. Enable Power Management in PCIe IP Core:

    • Access Capabilities: In the PCIe IP core customization window (pcie_7x_0.xci), navigate to the PCIe Capabilities tab.
    • Enable Power Management: Look for a section or option related to Power Management Capability. Ensure this is checked or enabled to include the Power Management (PM) capability structure in the device's configuration space.
  2. Set Power States Supported:

    • Configure Supported States: Within the Power Management Capability section of the IP core, specify which power states the device supports. These are typically checkboxes or dropdowns. Match these settings to the donor device's capabilities as observed by Arbor.
      • D0 (Fully On/Operational): Always supported.
      • D1, D2 (Intermediate States): Optional, for low-power idle states.
      • D3hot (Power Down, Auxiliary Power Present): Device logic is off, but can respond to PM events.
      • D3cold (Power Off): No power to the device.
    • Example: If the donor device only supports D0 and D3hot, enable only those.
  3. Implement Power State Logic in Firmware:

    • Open pcileech_pcie_cfg_a7.sv (or relevant control module): You'll typically need to modify the firmware to reflect and potentially react to power state transitions commanded by the host. The PCIe core itself handles much of the protocol, but your user logic needs to know the state.
    • Handle Power Management Control and Status Register (PMCSR) Writes: The host OS changes the device's power state by writing to specific bits in the PMCSR, which is part of the PM Capability structure. Your firmware should ideally have logic to read these bits and adjust device behavior (e.g., pausing/resuming operations, enabling/disabling clocks).
      // Example: Part of pcileech_pcie_cfg_a7.sv or a dedicated PM module
      // Assume 'cfg_write' is asserted for config writes, 'cfg_address' is the offset, 'cfg_writedata' is the data.
      // The D-state bits are at offset 0x04 within the PM capability structure, bits [1:0].
      
      // PMCSR Register (Internal representation)
      reg [15:0] pmcsr_reg = 16'h0000; // Initialize to D0
      
      // User logic signal indicating current power state
      reg [1:0] current_d_state = 2'b00; // 00 = D0, 01 = D1, 10 = D2, 11 = D3hot
      
      always @(posedge clk) begin
          if (reset) begin
              pmcsr_reg <= 16'h0000;
              current_d_state <= 2'b00; // Reset to D0
          end else begin
              // Example: Capture writes to PMCSR (if directly handled in user logic)
              // Note: The PCIe IP core manages much of this, but your user logic might need to read values from it.
              // Assuming the PCIe core provides an output reflecting the current D-state:
              // assign current_d_state = pcie_core_d_state_output;
      
              // If user logic *needs* to write to PMCSR (less common, usually read-only status)
              // Or if it needs to process the command
              // if (cfg_write && (cfg_address == PM_CAP_OFFSET + 2'h04)) begin // PMCSR is at +0x04 from PM Cap base
              //     pmcsr_reg[1:0] <= cfg_writedata[1:0]; // Capture New D-state
              //     // current_d_state <= cfg_writedata[1:0]; // Update internal state
              // end
      
              // In PCILeech, the PCIe core manages the PMCSR. You would likely read a signal from the core.
              // For demonstration, let's assume 'pcie_d_state' is an input from the IP core.
              current_d_state <= pcie_d_state; // Update based on PCIe core's status
          end
      end
      
      // Example: Logic reacting to D-state changes
      always @(*) begin
          if (current_d_state == 2'b11) begin // D3hot state
              // Disable power to non-essential blocks, pause operations,
              // assert a signal to main DMA logic to stop activity.
              // For instance: dma_engine_enable = 1'b0;
          end else if (current_d_state == 2'b00) begin // D0 state
              // Enable full functionality
              // For instance: dma_engine_enable = 1'b1;
          end
      end
    • Manage Power State Effects: Implement logic to alter the device's internal behavior (e.g., enable/disable clocks, put sub-modules into low-power modes) based on the current_d_state. This is crucial for accurate power consumption emulation and to ensure the device responds correctly to OS commands.
  4. Save Changes:

    • Save any modified firmware files.
    • Thoroughly test power management features through simulation or hardware testing (e.g., Windows "Sleep" or "Hibernate" functions, or Linux poweroff commands to see if the device transitions correctly).

8.3.2 MSI/MSI-X Configuration

Implementing Message Signaled Interrupts (MSI) or its extended version (MSI-X) allows the emulated device to use message-based interrupts. These are significantly more efficient and scalable than traditional pin-based interrupts (INTx) and are the preferred method for modern PCIe devices. MSI/MSI-X allow the device to notify the CPU by writing a special TLP to a specific memory address.

Steps:

  1. Enable MSI/MSI-X in PCIe IP Core:

    • Access Interrupts Configuration: In the PCIe IP core customization window (pcie_7x_0.xci), navigate to the Interrupts tab or a section specifically labeled MSI/MSI-X Capabilities.
    • Select Interrupt Type: Choose between MSI or MSI-X based on the donor device's capabilities. MSI-X is generally preferred for its flexibility (more vectors, per-vector masking).
    • Configure Number of Supported Vectors: Set the number of interrupt vectors (messages) that the device will support. Match this to the donor device.
      • MSI supports up to 32 vectors (typically 1, 2, 4, 8, 16, or 32).
      • MSI-X supports up to 2048 vectors, allowing for more granular interrupt sources.
    • Enable Capabilities: Ensure that the MSI or MSI-X capability structure is explicitly enabled to be included in the device's configuration space. This is how the host OS discovers the device's interrupt capabilities.
  2. Implement Interrupt Logic in Firmware:

    • Open pcileech_pcie_tlp_a7.sv (or user logic module): This file is generally responsible for user-defined TLP generation and might be the appropriate place to initiate MSI/MSI-X messages. However, the trigger for the interrupt will come from your custom logic.
    • Define Interrupt Signals: Declare internal signals that will indicate when an interrupt needs to be generated.
      // In a custom module (e.g., 'my_device_logic.sv') that interfaces with TLP generation logic
      reg msi_trigger_signal; // Assert this when an interrupt condition occurs
    • Implement Interrupt Generation Logic: Define the conditions under which an interrupt should be triggered. This typically involves event detection within your emulated device's logic.
      // Inside 'my_device_logic.sv'
      input wire clk;
      input wire reset;
      input wire event_data_ready; // Example: Input from your logic when data is ready
      
      always @(posedge clk or posedge reset) begin
          if (reset) begin
              msi_trigger_signal <= 1'b0;
          end else if (event_data_ready) begin // When a specific event happens
              msi_trigger_signal <= 1'b1; // Trigger an MSI
          end else begin
              msi_trigger_signal <= 1'b0; // Clear after one cycle or when acknowledged
          end
      end
    • Connect to PCIe Core's MSI Interface: The msi_trigger_signal (or similar output from your custom logic) needs to be connected to the appropriate input of the PCIe IP core (e.g., s_axis_tdata_tready, s_axis_tdata_tvalid, s_axis_tdata_tlast if using an AXI-Stream interface for MSI TLPs, or dedicated MSI request ports provided by the IP core). The PCIe core then forms and sends the actual MSI/MSI-X TLP. Consult the Xilinx PCIe IP core documentation for precise interface details.
  3. Save Changes:

    • Save any modified firmware files after implementing the interrupt logic.
    • Check for Timing Constraints: New logic, especially interrupt paths, can be timing-critical. Ensure that the synthesis and implementation tools do not report any timing violations related to your interrupt generation logic.

8.3.3 Implementing Interrupt Handling Logic (Device Side)

Beyond just enabling the capability, defining when and how interrupts are generated by your emulated device is crucial for its proper interaction with the host's interrupt handling mechanisms and driver behavior. This involves creating the internal logic that asserts the interrupt request.

Steps:

  1. Define Interrupt Conditions:

    • Identify Trigger Events: Based on your donor device's behavior, determine the specific internal events that should cause your emulated device to generate an interrupt.
      • Examples: Data transfer completion, new data ready in a receive buffer, an internal error condition, completion of a specific command, link status change.
    • Implement Condition Logic: Use combinational or sequential logic within your custom SystemVerilog modules to precisely detect these events and generate a short-duration pulse or level-based signal that indicates an interrupt request.
  2. Create Interrupt Generation Module (Modular Design):

    • It's a good practice to encapsulate your interrupt generation logic into a separate, dedicated module for clarity, reusability, and easier debugging. This module would take internal events as inputs and produce an msi_req (or similar) output that connects to the PCIe core.
      // File: interrupt_generator.sv
      module interrupt_generator (
          input wire clk,
          input wire reset,
          input wire event_trigger,        // Input signal from your custom logic (e.g., data_ready, error_flag)
          output reg msi_req_o            // Output: Assert to request an MSI/MSI-X
      );
      
      // Simple pulse generator for MSI (one-shot interrupt)
      reg event_trigger_d1;
      
      always @(posedge clk or posedge reset) begin
          if (reset) begin
              msi_req_o <= 1'b0;
              event_trigger_d1 <= 1'b0;
          end else begin
              event_trigger_d1 <= event_trigger;
              // Generate a single-cycle pulse when event_trigger goes high
              if (event_trigger && !event_trigger_d1) begin
                  msi_req_o <= 1'b1; // Assert MSI request
              end else begin
                  msi_req_o <= 1'b0; // De-assert after one cycle
              end
          end
      end
      
      endmodule // interrupt_generator
    • Integrate with Main Firmware: Instantiate this interrupt_generator module in your top-level user logic (e.g., within pcileech_squirrel_top.sv or a module it instantiates) and connect its msi_req_o output to the PCIe IP core's MSI input.
  3. Ensure Proper Timing and Sequencing:

    • Adhere to PCIe Specifications: MSI/MSI-X messages are TLPs. Ensure that the generation of these messages adheres to PCIe TLP formatting, flow control, and timing requirements. The PCIe IP core handles much of this, but your input signals to it must be stable and correctly timed.
    • Manage Interrupt Latency: Optimize your logic to minimize any unnecessary delay between the internal event occurrence and the assertion of the msi_req_o signal.
  4. Test Interrupt Delivery:

    • Simulation: Use a comprehensive test bench to simulate scenarios where interrupts should be generated. Verify that your msi_req_o signal behaves as expected and that the PCIe core generates the correct MSI/MSI-X TLPs.
    • Hardware Testing:
      • Program the FPGA with the updated firmware.
      • Use host-side software to trigger the events that should cause an interrupt (e.g., initiating a DMA transfer that completes).
      • Confirm that the interrupts are received by the host operating system. On Linux, dmesg can show interrupt messages. On Windows, you might use specific driver debug tools or Event Viewer.
      • Debugging Tools: Utilize Vivado's Integrated Logic Analyzer (ILA) cores (as discussed later in Section 12) to monitor event_trigger, msi_req_o, and the PCIe core's TLP output signals in real time to verify correct interrupt generation.
  5. Save Changes:

    • Finalize all code modifications and save the relevant firmware files.
    • Review and refine your interrupt logic based on testing results to ensure reliability.

9. Emulating Device-Specific Capabilities

Beyond the standard PCIe configuration space and generic DMA functionality, many donor devices possess unique capabilities, custom registers, or vendor-specific features that are critical for their full functionality or for interacting with their proprietary drivers. Accurate emulation requires understanding and replicating these nuances. This section delves into implementing such advanced capabilities, enabling a more faithful and fully functional emulation.

9.1 Implementing Advanced PCIe Capabilities

The PCIe specification includes various extended capabilities beyond the basic configuration space. These capabilities provide features like advanced error reporting, power management, virtualization, and more. Implementing these helps your emulated device appear more legitimate and interact correctly with modern host systems.

Steps:

  1. Identify Required Extended Capabilities:

    • When gathering donor device information with tools like Arbor, carefully look for and document any Extended Capabilities present in the donor's configuration space. These are usually found beyond the initial 256 bytes of standard configuration space.
    • Common Examples:
      • Advanced Error Reporting (AER): Provides robust error detection, logging, and reporting mechanisms for PCIe links.
      • Device Serial Number (DSN): (Already covered in Section 6.2).
      • Power Management (PM): (Already covered in Section 8.3.1).
      • PCI Express (PCIe) Capability Structure: (Already covered in Section 8.1 for Link Speed/Width, Max Payload/Read Request, but also includes other fields like Device Control/Status, Link Control/Status).
      • Virtual Channel (VC)/Multi-Function Virtual Channel (MFVC): For Quality of Service (QoS) and traffic management.
      • Precision Time Measurement (PTM): For synchronizing time across devices.
      • Latency Tolerance Reporting (LTR): For power management based on latency requirements.
      • Resettable FPC (Function-Level Reset): For more granular resets.
  2. Enable Capabilities in Vivado PCIe IP Core:

    • Access the PCIe IP core customization window (pcie_7x_0.xci) in Vivado.
    • Navigate through the various tabs (e.g., "PCIe Capabilities," "Extended Capabilities," "Advanced Options").
    • Look for checkboxes or dropdowns to enable and configure the specific extended capabilities identified from your donor device.
    • Example (AER): You'll find a section for "Advanced Error Reporting" where you can enable it and configure its registers (e.g., severity masks).
    • Note: The Xilinx PCIe IP core provides a high degree of configurability for many standard and extended capabilities. It's often just a matter of enabling the correct options in the GUI.
  3. Implement Firmware Logic for Capability Registers (if necessary):

    • While the PCIe IP core handles the presence and much of the protocol for these capabilities, some capabilities expose registers that your custom firmware might need to read or write, or whose values your firmware needs to react to.
    • Example (AER): If your emulated device detects an internal error that should be reported via AER, your firmware needs to write to specific AER error status registers (which might be exposed as part of the BARs or handled internally by the PCIe core and then reflected to user logic). Your user logic would then assert an error input to the PCIe core.
    • Example (Power Management): As discussed in 8.3.1, your firmware needs to react to the D-state changes signaled by the PCIe core.
    • Process:
      • Identify the specific registers within each enabled capability structure that your donor device's driver interacts with.
      • Locate the corresponding signals or logic within the PCILeech-FPGA framework that interface with these registers (often within pcileech_pcie_cfg_a7.sv or the bar_controller).
      • Implement read/write logic for these registers, ensuring your emulated device's internal state accurately reflects the values the driver expects.

9.2 Emulating Vendor-Specific Features

This is where true "full device emulation" becomes highly specialized. Many real-world devices have unique registers, undocumented commands, custom data formats, or proprietary control flows that differentiate them. Replicating these requires deeper analysis and custom HDL development.

Steps:

  1. Reverse Engineer Vendor-Specific Behaviors:

    • This is often the most challenging part.
    • Static Analysis (Driver/Firmware): Disassemble the donor device's official driver (Windows .sys, Linux .ko) or the device's original firmware (if available). Look for unique I/O or MMIO access patterns, magic values, or sequences of register writes. Tools like Ghidra, IDA Pro, or objdump can be invaluable.
    • Dynamic Analysis (Driver Execution): Run the donor device with its driver and monitor PCIe traffic using a PCIe Protocol Analyzer (e.g., Teledyne LeCroy, Keysight, as discussed in Section 12.2). This is the gold standard for understanding actual TLP exchanges, including vendor-defined messages and sequences of register accesses. Pay attention to:
      • Specific memory addresses accessed within BARs.
      • Read/write patterns to these addresses.
      • The values written to or read from specific registers.
      • Timing relationships between commands and responses.
    • System Calls/API Monitoring: On the host, use tools like Procmon (Windows) or strace (Linux) to see how the driver interacts with the OS and what specific device I/O control (IOCTL) codes it uses, which might correspond to specific hardware operations.
    • Hardware Sniffing: If possible, use a hardware sniffer (like a Saleae Logic Analyzer) to capture signals on the device's internal buses (e.g., SPI, I2C) if it has external flash or components.
  2. Implement Custom Registers and Logic within BARs:

    • Once you've identified vendor-specific registers or command protocols, you'll need to define these within your FPGA firmware, typically as memory-mapped registers accessible via one of your BARs.
    • Create Internal Registers: Declare reg variables in your SystemVerilog code to represent these custom registers.
      // In your pcileech_tlps128_bar_controller.sv or a sub-module
      reg [31:0] custom_control_reg;
      reg [31:0] custom_status_reg;
      reg [31:0] custom_data_reg;
      
      // Example: Map these to specific offsets within BAR0 (assuming BAR0 is large enough)
      // Adjust the 'bar0_offset' case statement (from Section 8.2.2)
      // ...
      if (bar_hit[0]) begin
          if (req_write) begin
              case (bar0_offset)
                  32'h0000_1000: custom_control_reg <= req_data; // Custom control register
                  32'h0000_1004: custom_data_reg <= req_data;    // Custom data write register
                  // ... other mappings
              endcase
          end else if (req_read) begin
              case (bar0_offset)
                  32'h0000_1000: rsp_data = custom_control_reg; // Read control register
                  32'h0000_1008: rsp_data = custom_status_reg;  // Custom status register
                  // ... other mappings
              endcase
          end
      end
      // ...
    • Implement Behavioral Logic: Create SystemVerilog logic (state machines, combinational logic) that:
      • Responds to writes to your custom_control_reg. For example, a specific bit in this register might trigger a DMA transfer, clear a status flag, or initiate an internal operation.
      • Updates your custom_status_reg based on the internal state of your emulated device (e.g., "operation complete," "error occurred," "data available").
      • Processes data written to custom_data_reg or provides data from it on reads, mimicking the donor device's data paths.
  3. Emulate Vendor-Specific Messages (if applicable):

    • Some complex devices might use "Vendor Defined Messages" (VDMs) over PCIe for specific control or communication. If your analysis reveals such messages, you would need to:
      • Enable VDM support in the PCIe IP core (if available).
      • Implement TLP generation logic (as will be discussed in Section 10) to craft and send these VDMs.
      • Implement TLP reception and parsing logic to interpret incoming VDMs from the host.
  4. Validate Emulated Behavior:

    • Iterative Testing: This is a highly iterative process. Make small changes, compile, flash, and test.
    • Driver Loading: Does the donor device's driver load correctly without errors?
    • Functional Tests: Can the driver initiate basic operations? Does it get the expected responses from your emulated registers?
    • Application Tests: Can applications that rely on the donor device function correctly with your emulated version?
    • Debugging: Use ILA and PCIe protocol analyzers extensively to compare your emulated device's behavior against the captured behavior of the real donor device. Look for discrepancies in TLP timing, register values, and overall protocol flow.

10. Transaction Layer Packet (TLP) Emulation

Transaction Layer Packets (TLPs) are the fundamental units of communication in the PCIe architecture. Every interaction between a host system and a PCIe device, from configuration reads to data transfers, is encapsulated within one or more TLPs. Accurate TLP emulation is not just important; it is crucial for your emulated device to interact properly with the host system, ensuring that drivers function correctly and that data moves as expected.

10.1 Understanding and Capturing TLPs

Before you can craft custom TLPs, you must deeply understand their structure and common types. Capturing real-world TLPs from your donor device provides the most accurate blueprint.

10.1.1 Learning the TLP Structure

A TLP is generally composed of a header, an optional data payload, and an optional End-to-End CRC (ECRC). The header is paramount, defining the TLP's type, transaction specifics, and routing information.

  • Components of a TLP:

    • Header: The most important part, typically 3 or 4 Double Words (Dwords = 4 bytes). It contains critical fields that define the TLP's purpose and how it should be handled:
      • Fmt (Format) and Type: Defines the TLP's format (3DW/4DW, with/without data) and its specific purpose (e.g., Memory Read Request, Memory Write, Completion, Configuration Read/Write).
      • Length: Specifies the size of the data payload in Dwords.
      • Requester ID (Bus, Device, Function): Identifies the PCIe function that initiated the request. Essential for routing completions back to the correct source.
      • Tag: A unique identifier assigned by the Requester to a transaction, allowing a Completer to match a Completion TLP to its original Request TLP.
      • Address: For Memory/IO transactions, this is the target memory or I/O address.
      • First DW Byte Enable (FBE) and Last DW Byte Enable (LBE): Specify which bytes within the first and last Dwords of the payload are valid for write operations, or which bytes are being requested for read completions.
      • Traffic Class (TC) and Transaction ID (TID): For QoS and ordering rules.
    • Data Payload (Optional): Present in TLPs like Memory Write, Configuration Write, and Read Completions. It contains the actual data being transferred.
    • End-to-End CRC (ECRC) (Optional): A 32-bit CRC that covers the entire TLP, ensuring data integrity from source to destination, typically generated/checked by software.
  • Understanding Common TLP Types: Your firmware will primarily deal with these:

    • Memory Read Request (MRd): A TLP sent by a Requester (e.g., host CPU, or your FPGA as a DMA master) to read data from a specific memory address.
    • Memory Read Completion (CplD): A TLP sent by a Completer (e.g., your FPGA responding to a host MRd) carrying the requested data.
    • Memory Write (MWr): A TLP sent by a Requester (e.g., host CPU, or your FPGA) to write data to a specific memory address.
    • Completion Without Data (Cpl): A TLP sent by a Completer to acknowledge a request that does not return data (e.g., a successful MWr).
    • Configuration Read Request (CfgRd): A TLP from the host to read registers in the device's Configuration Space.
    • Configuration Read Completion (CplD): A TLP from the device returning data for a CfgRd.
    • Configuration Write Request (CfgWr): A TLP from the host to write to registers in the device's Configuration Space.
    • Vendor-Defined Messages (VDM): Custom TLPs used by specific vendors for proprietary communication.

10.1.2 Capturing TLPs from the Donor Device

Capturing real PCIe traffic from your donor device is invaluable. It provides concrete examples of TLP structures, sequences, and timing, allowing you to replicate them precisely.

  • Steps:
    1. Set Up a PCIe Protocol Analyzer:
      • The most effective method involves using dedicated hardware tools, often referred to as "PCIe Protocol Analyzers." These devices sit in-line between the host and the donor PCIe card, passively capturing all traffic.
      • Examples:
        • Teledyne LeCroy PCIe Analyzers: Industry standard, highly capable, but significant investment.
        • Keysight PCIe Analyzers: Another leading vendor.
        • (For basic debugging, some higher-end logic analyzers with PCIe decoders might offer limited TLP viewing, but a true protocol analyzer is superior).
    2. Capture Transactions:
      • Install the donor device in a test system with the protocol analyzer connected.
      • Run the donor device's driver and any associated applications.
      • Monitor and record the PCIe transactions during normal operation, and especially during critical phases like:
        • Device enumeration (when the OS first detects it).
        • Driver loading and initialization.
        • Typical data transfer operations (e.g., large file copies for a storage device, network traffic for a NIC).
        • Device-specific commands or diagnostics.
    3. Analyze Captured TLPs:
      • Use the protocol analyzer's sophisticated software to dissect the captured TLPs. The software will decode fields, provide chronological views, and allow filtering and searching.
      • Pay close attention to:
        • The exact Fmt and Type fields.
        • Requester ID and Tag values (especially for completions).
        • The Address and Length for memory transactions.
        • The contents of Data Payload for writes and read completions.
        • Any vendor-specific fields or custom TLPs.

10.1.3 Documenting Key TLP Transactions

Structured documentation of captured TLPs creates a blueprint for your emulation.

  • Steps:
    1. Identify Critical Transactions:
      • Focus on TLPs that are essential for the device's core functionality. These include:
        • Initialization Sequence: The series of configuration reads/writes the OS performs during enumeration.
        • Driver Initialization: The commands and data exchanged when the driver starts up.
        • Primary Data Transfers: How MWr and MRd TLPs are structured and completed for the device's main function.
        • Error Handling: How the device reports errors (e.g., Completion with Completer Abort (CA), Unsupported Request (UR)).
        • Power Management Transitions: TLPs related to D-state changes.
        • Interrupt Generation: How MSI/MSI-X messages are sent.
    2. Create Detailed Documentation:
      • For each key TLP sequence, record:
        • The TLP Type (e.g., MWr, MRd, CplD).
        • Its header fields (Fmt, Type, Requester ID, Tag, Length, Address, Byte Enables).
        • The data payload (if applicable).
        • The sequence number or order within a transaction.
        • The conditions under which it is sent (e.g., "sent by host when driver initializes," "sent by device on DMA completion").
        • Any expected responses or follow-up TLPs.
      • Screenshots from the protocol analyzer can be very helpful here.
    3. Understand Timing and Sequencing:
      • Beyond just the TLP content, the timing and sequence of TLPs are vital. PCIe has strict ordering rules and flow control mechanisms. Pay attention to:
        • Delay between Request and Completion: How quickly the real device responds.
        • Flow Control Credits: How the device manages its buffer space for incoming/outgoing TLPs. While the Xilinx PCIe IP core handles basic flow control, for advanced emulation, knowing the donor's typical credit usage can be helpful.
        • Transaction Layer Packet Ordering: Understand how posted (writes) and non-posted (reads, completions) transactions are ordered.

10.2 Crafting Custom TLPs for Specific Operations

Once you understand the blueprint, you can translate this knowledge into your FPGA firmware (SystemVerilog) to actively generate and respond to TLPs. The PCILeech-FPGA framework provides abstraction layers, but for deep emulation, you might need to interact directly with the TLP generation/parsing logic.

10.2.1 Implementing TLP Handling in Firmware

Your firmware will need logic to both send and receive TLPs. The PCIe IP core handles the physical and data link layers, exposing a Transaction Layer interface (often AXI-Stream) to your user logic.

  • Files to Modify (Primary):

    • pcileech-fpga/<your_board_variant>/src/pcileech_pcie_tlp_a7.sv (or similar, depending on board variant)
      • This file often contains the core logic for translating user requests into outbound TLPs and parsing incoming TLPs into signals for your user logic.
    • pcileech-fpga/<your_board_variant>/src/pcileech_tlps128_bar_controller.sv
      • This module specifically handles parsing incoming Memory Read/Write TLPs that target your device's BARs and generating corresponding Completion TLPs.
  • Steps:

    1. Understand the PCIe IP Core Interface:

      • Before writing TLP logic, thoroughly read the Xilinx PCIe IP Core User Guide (specifically the sections on the User Application Interface or AXI4-Stream Interface). This defines how your SystemVerilog logic connects to the PCIe core to send and receive TLPs. You'll typically interact with s_axis_rx_tdata (received TLP data), s_axis_rx_tvalid (valid TLP received), m_axis_tx_tdata (outgoing TLP data), m_axis_tx_tready (core ready to accept TLP), etc.
    2. Create TLP Generation Functions (for outbound TLPs):

      • In pcileech_pcie_tlp_a7.sv (or a module that interfaces with m_axis_tx_*), you will write logic to assemble TLPs with the required headers and payloads. This often involves combining various fields into a [127:0] (for 128-bit interface) or [63:0] (for 64-bit interface) bus that feeds the PCIe core.
      • Example (Conceptual, Simplified Function for a 3DW TLP Header):
        // This is a conceptual helper function. In reality, you'd build a state machine
        // to send TLPs over an AXI-Stream interface, possibly using a FIFO.
        function automatic [95:0] create_3dw_tlp_header; // Assuming 3 Dwords = 96 bits
            input logic [7:0] tlp_type_fmt;   // Format and Type fields
            input logic [15:0] requester_id;  // BDF
            input logic [7:0] tag;
            input logic [7:0] lower_address_bits; // Or more complex address
            input logic [7:0] byte_enables;   // First DW Byte Enable
        
            begin
                create_3dw_tlp_header = {
                    tlp_type_fmt,                     // Fmt[6:4], Type[3:0]
                    8'b0,                             // Reserved
                    4'b0,                             // TC[3:0] (Traffic Class)
                    3'b0,                             // Attr[2:0]
                    1'b0,                             // TH (TLP Hint)
                    2'b0,                             // D(igest) (ECRC presence)
                    1'b0,                             // EP (Poisoned)
                    1'b0,                             // TD (Type Dependent)
                    // DW0: Fmt, Type, TC, Attr, TH, D, EP, TD, Length (not in 3DW, 4DW has it)
        
                    requester_id,                     // Requester ID (Bus[7:0], Device[4:0], Function[2:0])
                    tag,                              // Tag
                    lower_address_bits,               // Example: lower bits of address or a portion of data
                    byte_enables,                     // First DW Byte Enable
                    4'b0,                             // Reserved
                    4'b0                              // Last DW Byte Enable (for MWr usually)
                    // DW1, DW2... fields
                };
            end
        endfunction
        
        // Example: Generating a Completion with Data (CplD) in a state machine
        // This is just a snippet, not a full implementation
        localparam  CPLD_3DW_FMT = 8'h4A; // Fmt=100 (4DW, with data), Type=1010 (Cpl)
        localparam  CPL_D_FMT_TYPE_LEN = 8'h4A; // Adjusted based on PCIe spec. (4DW header with data)
        
        // ... state machine to send TLP
        // In a state where you are ready to send CplD
        if (tx_ready_from_pcie_core) begin
            // Build the header and payload
            // For a CplD, you need Complier ID, Status, Byte Count, Requester ID, Tag, Completion ID, Lower Address
            // And then the actual read data payload
            m_axis_tx_tdata_reg = {
                CPL_D_FMT_TYPE_LEN,         // Byte 0: Fmt/Type
                tlp_length_dw_minus_one,    // Byte 1: TLP Length (in Dwords) - 1
                status_completion_bits,     // Byte 2: Cpl Status, BCM, Rsvd
                byte_count_dws_upper,       // Byte 3: Byte Count (upper bits)
                requester_id,               // Byte 4-5: Requester ID (from original MRd)
                tag,                        // Byte 6: Tag (from original MRd)
                byte_count_dws_lower,       // Byte 7: Byte Count (lower bits)
                completion_id,              // Byte 8-9: Completion ID (your BDF)
                lower_address_from_request  // Byte 10-11: Lower Address from Request
                // ...followed by actual data payload
            };
            m_axis_tx_tvalid_reg = 1'b1;
            m_axis_tx_tlast_reg = 1'b1; // Last TLP segment
            // ... state transition to wait for tready
        end
      • Note: The actual implementation involves state machines, FIFOs, and adherence to the AXI-Stream protocol for the PCIe IP core. The PCILeech-FPGA framework already provides a good base for this, but you might extend or modify it for very specific TLP behaviors.
    3. Handle TLP Reception (for inbound TLPs):

      • Implement logic to parse incoming TLPs from the PCIe core's receive interface (e.g., s_axis_rx_tdata, s_axis_rx_tvalid).
      • This parsing involves:
        • Checking s_axis_rx_tvalid to know a TLP is present.
        • Reading the Fmt and Type fields from the TLP header to determine its purpose.
        • Extracting relevant fields like Requester ID, Tag, Address, Length, and Data Payload.
      • Use case statements or if/else if blocks based on the TLP type to route the information to the appropriate internal logic (e.g., bar_controller for Memory Writes, a configuration module for Configuration Writes).
      • Example (Conceptual, Simplified Parsing):
        // In pcileech_pcie_tlp_a7.sv or a TLP parser module
        input wire [127:0] s_axis_rx_tdata;
        input wire s_axis_rx_tvalid;
        output wire s_axis_rx_tready; // Needs to be asserted to accept more data
        
        reg [7:0] received_tlp_fmt_type;
        reg [15:0] received_requester_id;
        // ... declare other parsed fields
        
        assign s_axis_rx_tready = 1'b1; // Always ready to receive for simplicity, manage backpressure in real design
        
        always @(posedge clk) begin
            if (s_axis_rx_tvalid) begin
                received_tlp_fmt_type = s_axis_rx_tdata[127:120]; // Assuming highest bits
                received_requester_id = s_axis_rx_tdata[111:96]; // Example offset
        
                // Decode based on TLP type
                case (received_tlp_fmt_type[3:0]) // Only TLP Type bits
                    4'h0: // Memory Write (3DW or 4DW depending on Fmt)
                        // Extract address, length, payload and pass to BAR controller
                        begin
                            // Pass to BAR controller for writing to emulated memory
                            // bar_write_enable = 1'b1;
                            // bar_write_address = s_axis_rx_tdata[...];
                            // bar_write_data = s_axis_rx_tdata[...];
                        end
                    4'h1: // Memory Read
                        // Extract address, length, and pass to BAR controller for reading
                        begin
                            // bar_read_enable = 1'b1;
                            // bar_read_address = s_axis_rx_tdata[...];
                            // (Completion will be generated by BAR controller)
                        end
                    // ... other TLP types
                    default: begin
                        // Handle unsupported or reserved TLP types (e.g., log, error)
                    end
                endcase
            end
        end
    4. Ensure Compliance:

      • Strictly verify that both your generated and parsed TLPs conform to the PCIe specification regarding format, field definitions, and timing. Deviations will lead to communication failures.
    5. Implement Completion Handling:

      • For Memory Read Requests (MRd) and Configuration Read Requests (CfgRd) received from the host, your device is obligated to send back appropriate Completion TLPs (CplD for data, Cpl for no data) within a specific timeframe. The bar_controller module (Section 8.2.2) is where this logic for BAR reads resides.
    6. Save Changes:

      • Save the files (pcileech_pcie_tlp_a7.sv, pcileech_tlps128_bar_controller.sv, or any custom modules) after implementing the changes.

10.2.2 Handling Different TLP Types

Each TLP type has a specific header format and behavior. Your firmware must be adept at handling the ones relevant to your donor device.

  • Memory Read Requests (MRd):

    • Implementation:
      • When an MRd TLP is received (parsed by pcileech_pcie_tlp_a7.sv and routed to the bar_controller), the bar_controller needs to:
        • Parse the requested address and length.
        • Fetch the data from the appropriate internal memory location (e.g., BRAM connected to a BAR) or internal register.
        • Assemble a Completion with Data (CplD) TLP. Crucially, this TLP must include the original Requester ID, Tag, and Completion ID (your device's BDF) from the MRd request, along with the fetched data payload.
        • Send the CplD TLP back to the host via the PCIe IP core's transmit interface.
  • Memory Write Requests (MWr):

    • Implementation:
      • When an MWr TLP is received, the bar_controller needs to:
        • Parse the target address, length, and Byte Enables (FBE/LBE).
        • Extract the data payload.
        • Write the data to the specified memory location within your emulated device (e.g., a BRAM or internal registers), respecting the byte enables.
      • Memory Writes are "posted transactions," meaning they don't require a completion TLP for acknowledgment unless an error occurs.
  • Configuration Read/Write Requests (CfgRd/CfgWr):

    • Implementation:
      • These TLPs target the device's Configuration Space (Vendor ID, Device ID, BARs, Capabilities, etc.). The Xilinx PCIe IP core handles the majority of standard Configuration Space accesses automatically based on its configuration.
      • However, if you have custom registers or extended capabilities within the Configuration Space that are not standard, you might need specific logic to:
        • For CfgRd: Return the requested data from your internal cfg_ registers.
        • For CfgWr: Update your internal cfg_ registers or trigger actions based on the written data.
      • Configuration Reads require a Completion with Data (CplD), while Configuration Writes require a Completion without Data (Cpl).
  • Vendor-Defined Messages (VDM):

    • Implementation:
      • If your donor device uses VDMs, this requires specialized parsing and response logic.
      • Parsing Incoming VDMs: Identify VDMs based on their Fmt and Type fields. Extract the vendor-specific data and interpret it according to your reverse-engineering findings.
      • Crafting Outbound VDMs: Create logic to assemble VDMs with the precise vendor-specific header and payload formats when your emulated device needs to send them.

10.2.3 Validating TLP Timing and Sequence

Even if TLPs are perfectly formatted, incorrect timing or sequencing will lead to device malfunction or detection as non-compliant.

  • Steps:

    1. Use Simulation Tools:

      • Test Benches: Develop comprehensive SystemVerilog test benches for your TLP generation and parsing modules.
      • Simulate various scenarios (e.g., host sending MRd, your device sending CplD; host sending MWr; host enumerating device) to validate that TLPs are correctly formed, transmitted, received, and processed.
      • Verify the sequence of TLPs and that completions are sent within reasonable timeframes.
    2. Monitor with ILA (Integrated Logic Analyzer):

      • As detailed in Section 12.1, insert an ILA core into your Vivado design.
      • Connect the ILA probes to the AXI-Stream interfaces of the PCIe IP core (e.g., s_axis_rx_tdata, s_axis_rx_tvalid, m_axis_tx_tdata, m_axis_tx_tready).
      • Set triggers to capture specific TLPs (e.g., on m_axis_tx_tvalid for a certain TLP type).
      • This allows you to see the actual TLP bits on the FPGA in real time during hardware operation, verifying if your firmware is sending/receiving the correct data and control signals to/from the PCIe IP core.
    3. Check Timing Constraints:

      • The PCIe IP core has strict timing requirements for its AXI-Stream interfaces. Ensure that your user logic providing data to m_axis_tx_tdata and handling s_axis_rx_tdata meets these timing constraints.
      • Vivado's timing analysis reports (after synthesis and implementation) will flag any violations. Address these by optimizing your logic or adjusting clocking where possible.
    4. Compliance Testing (Advanced):

      • For high-fidelity emulation, consider using a dedicated PCIe compliance test suite (often integrated with high-end protocol analyzers). These tests systematically check adherence to the PCIe specification, uncovering subtle protocol violations.
    5. Save Changes:

      • Save all modified files after thorough testing and validation. Iteration is key in TLP-level debugging.

Part 3: Advanced Techniques and Optimization


11. Building, Flashing, and Testing

After all your customizations are complete, the moment of truth arrives: building the firmware, programming it onto your FPGA, and rigorously testing its functionality to ensure it behaves exactly like the donor device. This phase transitions your design from code to a working hardware emulation.

11.1 Synthesis and Implementation

These are the core steps in the FPGA design flow, where your high-level SystemVerilog code is transformed into a low-level hardware configuration that can be loaded onto the FPGA.

11.1.1 Running Synthesis

Synthesis is the process by which Vivado translates your HDL code into a gate-level netlist (a description of logical gates and their interconnections). It also performs preliminary timing analysis and resource estimation.

  • Steps:
    1. Start Synthesis:
      • In the Vivado GUI, in the Flow Navigator pane (typically on the left), under "Synthesis," click Run Synthesis.
    2. Monitor Progress:
      • Vivado will open a "Launch Runs" dialog. You can typically just click "OK."
      • Monitor the Messages tab at the bottom of the Vivado window. It will show the progress of the synthesis run.
      • Common Warnings/Errors to Watch For:
        • [Synth 8-327] Unconnected Ports / Unused Inputs: These indicate that a signal or port in your design is not connected to anything. While sometimes intentional (e.g., unused pins on the FPGA), they can also point to typos in port names or forgotten connections. Investigate each to ensure it's not a functional issue.
        • [Synth 8-256] Registers/Wires Not Optimized: This might indicate that logic is being inferred incorrectly or that you have redundant logic that could be optimized.
        • Syntax Errors: If there are fatal syntax errors in your SystemVerilog code, synthesis will fail immediately. Fix these in Visual Studio Code.
    3. Review Synthesis Report:
      • Upon successful completion, Vivado will ask you what to do next. Select Open Synthesized Design or Open Report.
      • Crucially, review the Utilization Summary in the synthesis report. This shows how much of the FPGA's resources (LUTs, Flip-Flops, BRAMs, DSPs) your design consumes. Ensure that the design fits within your target FPGA's capacity (e.g., for a Artix-7 35T, you should be well within its limits).

11.1.2 Running Implementation

Implementation is the most time-consuming step. It takes the synthesized netlist and physically maps it onto the FPGA's resources (placing logic blocks, routing connections) and then performs a detailed timing analysis to ensure the design can operate at the specified clock frequencies.

  • Steps:
    1. Start Implementation:
      • After successful synthesis, in the Flow Navigator, under "Implementation," click Run Implementation.
      • Confirm the "Launch Runs" dialog.
    2. Monitor Progress:
      • Implementation consists of several phases: Opt Design, Power Opt Design, Place Design, Post-Placement Phys Opt Design, Route Design, Post-Route Phys Opt Design. Each phase can take a significant amount of time.
      • Monitor the Messages tab for progress and potential issues.
    3. Analyze Timing Reports:
      • This is the most critical step after implementation. Upon completion, Vivado will again ask what to do next. Select Open Implemented Design or, more importantly, Open Report and then select Report Timing Summary.
      • Ensure that all timing constraints are met. Look for "WNS (Worst Negative Slack)" values.
        • Positive WNS: Indicates that all timing paths meet their requirements (slack means you have extra time). This is what you want.
        • Negative WNS: Indicates timing violations, meaning your design cannot operate at the desired clock frequency or data might not be stable. This is a critical issue that must be addressed.
      • Address Violations:
        • If you have negative slack, investigate the specific paths that are failing. Vivado's timing report will show you the source, destination, and components of the failing paths.
        • Solutions can include:
          • Optimizing your HDL code to reduce logic depth or critical path delays.
          • Adding pipeline stages (registers) to break up long combinational paths.
          • Refining your XDC (constraints) file, ensuring all clocks are correctly defined and propagated.
          • Adjusting clock frequencies (if the application allows).
          • Using faster timing-closure strategies in Vivado.
          • Ensuring your custom logic interfaces correctly with the PCIe core's AXI-Stream interface timing requirements.
    4. Verify Placement (Optional):
      • In the implemented design, you can open the "Device" view to see how your logic has been placed on the FPGA. This is generally for advanced users to confirm that critical components are placed optimally (e.g., close to PCIe transceivers).

11.1.3 Generating Bitstream

The bitstream is the final, binary configuration file (.bit extension) that will be loaded onto your FPGA. It's the culmination of synthesis and implementation.

  • Steps:
    1. Generate Bitstream:
      • After successful implementation (with no critical timing violations), in the Flow Navigator, under "Program and Debug," click Generate Bitstream.
    2. Wait for Completion:
      • This process generally takes less time than implementation but can still vary based on design complexity.
    3. Review Bitstream Generation Log:
      • Upon completion, Vivado will indicate success. Review the log for any warnings, though typically if implementation passed cleanly, bitstream generation will too.
      • The .bit file will be generated in your project's pcileech_squirrel_top.runs/impl_1/ directory (or similar path for your board).

11.2 Flashing the Bitstream

Programming (flashing) the bitstream loads your compiled design onto the FPGA, making your emulated device active.

11.2.1 Connecting the FPGA Device

  • Steps:
    1. Prepare Hardware:
      • Ensure your FPGA-based DMA board is correctly inserted into a compatible PCIe slot on your host system.
      • Connect the JTAG programmer (e.g., Digilent HS3, Xilinx Platform Cable) to the JTAG header on your FPGA board and to a USB port on your development PC.
      • Power on the host system.
      • Refer to your specific FPGA board's manual for precise power, JTAG, and PCIe connection instructions.
    2. Open Hardware Manager:
      • In Vivado, navigate to Flow Navigator > Program and Debug > Open Hardware Manager.
      • If Vivado is not running, you can launch Hardware Manager as a standalone application.

11.2.2 Programming the FPGA

  • Steps:
    1. Connect to the Target:
      • In the Hardware Manager window, click Open Target (often a large button or link) and select Auto Connect.
      • Vivado should automatically detect your JTAG programmer and then the connected FPGA device(s) on the JTAG chain. If detection fails, check JTAG cable connections, power to the board, and JTAG drivers on your PC.
    2. Program Device:
      • Once your FPGA device is detected and displayed in the Hardware window, right-click on your FPGA device (e.g., xc7a35t_0) and select Program Device.
      • A dialog will appear. Click the "..." button next to the "Bitstream file" field and navigate to your generated bitstream file (e.g., pcileech_squirrel_top.runs/impl_1/pcileech_squirrel_top.bit).
      • Click Program to begin flashing the firmware onto the FPGA.
      • Wait for the programming process to complete. You'll see a progress bar.

11.2.3 Verifying Programming

  • Steps:
    1. Check Status:
      • Ensure the programming completes without errors in Vivado's Hardware Manager. Vivado will display a "Program Device" success message upon completion.
    2. Observe LEDs or Indicators:
      • Many FPGA boards have status LEDs. A successful programming operation often causes a specific LED to illuminate or change state (e.g., a "DONE" LED). This is a quick visual confirmation.
    3. Host System Reboot (Sometimes Required):
      • For the host operating system to correctly recognize the newly programmed PCIe device, a system reboot is often necessary, especially on Windows, to trigger the full PCIe enumeration process.

11.3 Testing and Validation

After programming, the crucial step is to verify that your emulated device is detected correctly by the host and that it functions as expected, mimicking the donor device.

11.3.1 Verifying Device Enumeration

This confirms that the host OS sees your FPGA as the donor device based on the IDs you programmed.

  • Windows:
    • Steps:
      1. Open Device Manager: Press Win + X and select Device Manager from the Quick Link menu.
      2. Check Device Properties:
        • Look under the appropriate device category (e.g., Network Adapters, Storage Controllers, System devices).
        • Find your emulated device. It should now appear with the name of the donor device (e.g., "Intel(R) Ethernet Connection...")
        • Right-click on the device, select Properties, and go to the Details tab.
        • In the "Property" dropdown, select "Hardware Ids." Confirm that the Device ID (DID) and Vendor ID (VID) (e.g., PCI\VEN_ABCD&DEV_1234) match those you programmed into your firmware.
        • Also check "Class Code" and "Subsystem ID" for further verification.
  • Linux:
    • Steps:
      1. Use lspci: Open a terminal and use the lspci command.
        lspci -nn # Shows VendorID:DeviceID
        lspci -vvv # Shows verbose details including BARs, capabilities, and more
      2. Verify Device Listing:
        • Check that the emulated device appears in the lspci output with the correct Vendor ID, Device ID, and Class Code.
        • Example Output (emulating an Intel NIC):
          03:00.0 Network controller [0280]: Intel Corporation Ethernet Connection I219-V [8086:1570] (rev 21)
          
          (8086 is Intel's Vendor ID, 1570 is the Device ID for I219-V, 0280 is the Network Controller Class Code).
        • Use lspci -vvv to confirm that the BARs are enumerated with the correct sizes and types, matching your donor device's configuration.

11.3.2 Testing Device Functionality

Once the device is enumerated, the ultimate test is whether it functions like the original.

  • Steps:
    1. Install Necessary Drivers:
      • If the host OS doesn't automatically load a suitable driver, you will need to manually install the official drivers for your donor device. Download them from the manufacturer's website.
      • Install them as per the manufacturer's instructions. If the emulation is successful, the driver should install and recognize your FPGA as the real hardware.
    2. Perform Functional Tests:
      • Run applications or utilities that would typically interact with the donor device.
      • Examples:
        • Network Card: Perform ping tests, browse the web, or initiate large file transfers to test throughput.
        • Storage Controller: Attempt to format a simulated drive (if your emulation includes storage features), perform read/write operations, or run disk benchmarks.
        • USB Controller: Connect USB devices (if your emulation includes USB host functionality) and test their detection and operation.
      • Monitor the host system for expected behavior and performance characteristics.
    3. Monitor System Behavior:
      • Check for system stability (no BSODs on Windows, kernel panics on Linux).
      • Look for device-specific errors in system logs (Event Viewer on Windows, dmesg or journalctl on Linux).
      • Ensure that the emulated device behaves as expected under various workloads, including heavy data transfers or stress tests.

11.3.3 Monitoring for Errors

Proactive error monitoring is crucial for identifying subtle emulation issues that might not cause immediate crashes.

  • Windows:
    • Steps:
      1. Check Event Viewer: Press Win + X and select Event Viewer.
      2. Look for PCIe-Related Errors: Navigate to Windows Logs > System. Filter or search for warnings, errors, or critical events related to "PCIe," "PCI Express," or events originating from the specific device driver (look for source names matching your emulated device's driver).
        • Common errors include resource conflicts, driver initialization failures, or unexpected device responses.
  • Linux:
    • Steps:
      1. Check dmesg Logs: Open a terminal and type:
        dmesg | grep -i pci # Case-insensitive grep for pci messages
        dmesg | grep -i <VendorID> # Filter for your device's Vendor ID
      2. Identify Issues: Look for messages indicating problems with PCIe link training, device initialization, memory allocation failures, driver probing errors, or unexpected DMA activity. The Linux kernel's PCIe subsystem is quite verbose.
    • Systemd Journal (Modern Linux):
      journalctl -b | grep -i pci # Current boot log

12. Advanced Debugging Techniques

When issues arise, especially in complex PCIe device emulation, basic troubleshooting might not suffice. Advanced debugging tools and techniques provide deep visibility into the FPGA's internal logic and the PCIe bus, allowing you to identify and resolve problems efficiently.

12.1 Using Vivado's Integrated Logic Analyzer (ILA)

The Integrated Logic Analyzer (ILA) is a powerful, configurable debug IP core provided by Xilinx that you can embed directly into your FPGA design. It allows you to monitor the real-time behavior of internal FPGA signals (wires and registers) without needing external probing hardware, functioning like a powerful, internal oscilloscope or logic analyzer.

12.1.1 Inserting ILA Cores

  • Steps:
    1. Plan Your Probes: Identify the key signals you need to observe. For PCIe emulation, these often include:
      • The AXI-Stream interfaces of the PCIe IP core (e.g., s_axis_rx_tdata, s_axis_rx_tvalid, m_axis_tx_tdata, m_axis_tx_tready).
      • Internal state machine signals (current_state, next_state).
      • BAR address decoding outputs (bar_hit[0], bar_hit[1]).
      • Custom register values (custom_control_reg, custom_status_reg).
      • Interrupt request signals (msi_trigger_signal).
    2. Add ILA IP Core:
      • In Vivado, open the IP Catalog (usually in the Flow Navigator pane).
      • Search for "ILA" (Integrated Logic Analyzer).
      • Double-click on the "Debug Bridge" (for basic ILA) or "Integrated Logic Analyzer (ILA)" to open its customization GUI.
      • Configure the ILA:
        • Set the Number of Capture Data Ports (probes) you need.
        • Set the Width of each probe to match the signals you plan to connect.
        • Configure the Sample Depth (how many samples to store before/after a trigger). Larger depths consume more BRAM.
        • Click "OK" and let Vivado generate the IP.
    3. Instantiate and Connect Signals:
      • Vivado will generate an .xci file for the ILA. You can instantiate it directly in your top-level SystemVerilog file (e.g., pcileech_squirrel_top.sv) or within a module where the signals of interest are available.
      • Example (in pcileech_squirrel_top.sv or a sub-module):
        // Assuming you've generated ila_0 from IP Catalog
        // Connect to your design's clock and signals of interest
        ila_0 your_ila_instance (
            .clk(clk_125mhz), // Connect to a stable clock in your design, typically the PCIe user clock
            .probe0(pcie_s_axis_rx_tdata),    // Example: PCIe inbound TLP data
            .probe1(pcie_s_axis_rx_tvalid),   // Example: PCIe inbound TLP valid
            .probe2(pcie_m_axis_tx_tdata),    // Example: PCIe outbound TLP data
            .probe3(my_bar_controller_state), // Example: State of your BAR logic
            .probe4(my_custom_register),      // Example: Value of a custom register
            // Add more probes as needed
            .probeN(signal_to_monitor_N)
        );
      • Alternative (Marking for Debugging): For simpler signals, you can sometimes mark them directly in your HDL code for debugging. Use (* mark_debug = "true" *) wire my_signal; or (* mark_debug = "true" *) reg my_register;. Vivado will then automatically suggest adding these to an ILA.

12.1.2 Configuring Trigger Conditions

ILA is most powerful when you configure intelligent trigger conditions to capture data precisely when an event of interest occurs (e.g., an error, a specific TLP type, a state transition).

  • Steps:
    1. Generate Bitstream with ILA: After inserting and connecting the ILA, you must run synthesis, implementation, and generate a new bitstream. The ILA core consumes FPGA resources and will be embedded in your design.
    2. Open Hardware Manager: Program your FPGA with the ILA-enabled bitstream (Section 11.2). Then, in Vivado, open the Hardware Manager and connect to your target.
    3. Access the ILA Dashboard: In the Hardware Manager, select your ILA instance (e.g., hw_ila_1). This will open the ILA dashboard.
    4. Define Triggers:
      • Select the probes you want to use as trigger inputs.
      • Set specific trigger patterns (e.g., 0x4A for pcie_s_axis_rx_tdata to trigger on a Completion TLP).
      • Configure trigger conditions (e.g., "equal to," "not equal to," "rising edge," "falling edge").
      • Set trigger positions (how many samples to capture before the trigger event, for pre-trigger visibility).
      • You can set up multiple trigger sequences for complex event detection.
      • Example Scenarios for Triggers:
        • Trigger on a specific Fmt/Type in a received TLP to analyze incoming commands.
        • Trigger when a specific register (my_custom_register) reaches a certain value.
        • Trigger on a pcie_m_axis_tx_tvalid AND pcie_m_axis_tx_tdata[3:0] == 4'hC (for a Memory Write TLP) to analyze outbound writes.
        • Trigger on the assertion of an error signal.

12.1.3 Capturing and Analyzing Data

  • Steps:
    1. Run the Design: Allow your host system to interact with the programmed FPGA, causing the events you want to debug.
    2. Arm the ILA: In the ILA dashboard, click the Run Trigger button (often a green "Play" icon). The ILA will wait for the defined trigger condition.
    3. Capture Data: Once the trigger condition is met, the ILA will capture a snapshot of the signals into its internal memory buffer.
    4. Analyze Waveforms:
      • The captured data will appear in the waveform viewer.
      • Inspect the signal behavior over time. Zoom in, add cursors, and decode values.
      • Look for:
        • Unexpected transitions: Signals changing at the wrong time.
        • Incorrect values: Registers holding incorrect data.
        • Protocol violations: Your logic sending incorrect data on PCIe interfaces.
        • Timing issues: If signals are not stable when expected (though full timing analysis is done in implementation, ILA shows runtime behavior).
      • Compare the captured behavior against your expected design logic and the observed behavior of the donor device (if you captured it with a protocol analyzer).

12.2 PCIe Traffic Analysis Tools

While ILA gives you internal FPGA visibility, external PCIe traffic analysis tools provide an unrivaled view of the actual communication on the PCIe bus between your emulated device and the host. This is crucial for verifying protocol compliance and debugging link-level issues.

12.2.1 PCIe Protocol Analyzers (Hardware)

  • Examples:
    • Teledyne LeCroy PCIe Analyzers: Gold standard for deep analysis, full protocol decoding, advanced triggering, and error injection capabilities.
    • Keysight PCIe Analyzers: Another leading vendor with similar high-end features.
  • Steps:
    1. Set Up Analyzer: Connect the hardware analyzer in-line between the host system's PCIe slot and your FPGA-based DMA device. This typically involves a special interposer card.
    2. Configure Capture Settings: Use the analyzer's software to define what traffic to capture. You can filter by TLP type, address, Requester ID, error conditions, etc., to focus on relevant events.
    3. Capture Traffic: Run your emulated device on the host. The analyzer will passively record all PCIe transactions.
    4. Analyze Results:
      • Use the analyzer's powerful software to view decoded TLPs, transaction lists, and waveform views.
      • Examine TLPs for compliance and correctness: Are all fields correct? Is the sequence proper?
      • Identify any protocol violations or unexpected behaviors: This is where you find why the driver might be failing (e.g., your device sends a Completion with data when the spec requires a Completion without data, or it responds too slowly).
      • Compare with Donor Device Captures: Directly compare the captured traffic from your emulated device to the captures you made from the real donor device. This is the ultimate test of emulation accuracy.

12.2.2 Software-Based Tools

For basic PCIe bus inspection, or if a dedicated hardware analyzer is unavailable, some software tools can provide limited insights.

  • Examples:
    • Wireshark with PCIe Plugins: While Wireshark is primarily for network traffic, with specialized hardware (e.g., network cards that expose PCIe traces to the OS, or specific capture hardware/drivers), it can sometimes capture and decode PCIe packets. This is highly dependent on the system.
    • ChipScope Pro (Legacy Xilinx, now part of Vivado): Integrated Logic Analyzer (ILA) is the modern equivalent, but ChipScope was the standalone tool.
    • lspci (Linux): As mentioned in Section 11.3.1, lspci -vvv provides extensive static configuration space information. You can combine it with watch or scripting to monitor changes over time.
    • pcileech client (from the PCILeech framework): The pcileech client software itself can perform read/write operations to memory and configuration space via your FPGA, and can be used to test basic DMA functionality. While not a "traffic analyzer," it's essential for testing the functional interface.
  • Steps:
    1. Install Necessary Tools/Plugins: Ensure the tool is installed and any required drivers or plugins are configured.
    2. Monitor PCIe Bus: Run the software tool to capture and display PCIe-related information.
    3. Analyze Communications:
      • Look for discrepancies in device configuration.
      • If the tool supports it, analyze the structure of captured packets for anomalies or errors.
      • Verify that your emulated device is responding to configuration requests correctly.

13. Troubleshooting

This section provides solutions to common problems you may encounter during custom firmware development, bitstream programming, and hardware testing of your PCIe device emulation. Firmware debugging can be challenging, so a methodical approach is key.

13.1 Device Detection Issues

Problem: Your FPGA-based DMA device, after programming, is not recognized by the host system, or it shows up with incorrect IDs (e.g., "Unknown device") or an error symbol in Device Manager/lspci.

Possible Causes and Solutions:

  1. Incorrect Device IDs, Vendor IDs, Subsystem IDs, or Class Code:

    • Cause: The most common reason. There is a mismatch between the identification values programmed into your FPGA firmware and what the host operating system expects, or what you intend to emulate.
    • Solution:
      • Verify: Double-check all cfg_deviceid, cfg_vendorid, cfg_subsysid, cfg_subsysvendorid, cfg_revisionid, and cfg_classcode parameters in pcileech_pcie_cfg_a7.sv (or equivalent file) against your meticulously recorded donor device information (from Section 5).
      • Consistency: Ensure these values are also consistently set in the Vivado PCIe IP Core customization GUI (Section 7.2.2).
      • Rebuild & Re-flash: After any changes, always re-synthesize, re-implement, generate a new bitstream, and re-flash the FPGA (Section 11.1, 11.2).
      • Reboot Host: Always reboot the host system after flashing, as Windows often needs a full restart to re-enumerate PCIe devices correctly.
  2. PCIe Link Training Failure:

    • Cause: The fundamental PCIe link between the host's root complex and your FPGA card fails to establish. This happens before any configuration space reads. Symptoms include the device not appearing at all (lspci shows nothing at that bus/slot, or Device Manager shows a "PCI Express Root Port" error).
    • Solution:
      • Physical Connections: Ensure the FPGA board is seated firmly in the PCIe slot and all power connections are secure. Try a different PCIe slot if available.
      • Power: Verify that the FPGA board is receiving adequate power. Some boards require auxiliary PCIe power connectors.
      • Link Speed/Width:
        • Check Max Link Speed and Link Width settings in your Vivado PCIe IP Core (Section 8.1.1).
        • Try setting the link speed to a lower generation (e.g., Gen1 / 2.5 GT/s) and width to x1, even if your board supports higher. Sometimes, compatibility issues arise with specific motherboards at higher speeds.
        • Check motherboard BIOS settings for PCIe slot speed options.
      • Reset: Ensure the FPGA's reset logic is correctly implemented (e.g., synchronized to the PCIe reference clock) and asserted/de-asserted correctly upon power-up/reboot.
      • PCIe IP Core: Ensure the PCIe IP core is correctly instantiated and its clocks and resets are properly connected in your top-level design.
  3. Power Issues (Insufficient or Unstable Power):

    • Cause: The FPGA board is not receiving enough stable power, or the power supply is noisy, leading to unreliable operation.
    • Solution:
      • Verify Connections: Double-check all power cables (main PCIe slot power, auxiliary PCIe power, external DC jack if used).
      • Power Supply: Ensure your host system's power supply (PSU) has sufficient wattage and stable 12V rails. For high-power FPGAs, a weak PSU can cause issues.
      • External Power: If the board has an external power jack, ensure it's used with the correct voltage and current rating.
  4. Firmware Errors (Early Stage):

    • Cause: Logic errors in your SystemVerilog code, particularly in the top-level module or the PCIe core's wrapper, that prevent the PCIe core from initializing or presenting itself correctly.
    • Solution:
      • Vivado Messages: Scrutinize Vivado's synthesis and implementation logs for Critical Warnings or Errors related to the PCIe IP core. These are often indicators of misconfigurations or improper connections.
      • ILA Debugging: If the link attempts to train but fails, use an ILA (Section 12.1) connected to the PCIe IP core's status signals (e.g., link_up, link_speed, link_width) and AXI-Stream interfaces to see at what point the link negotiation fails or if the core is generating unexpected traffic.

13.2 Memory Mapping and BAR Configuration Errors

Problem: The emulated device is detected, but when the host OS or a driver tries to access its memory-mapped registers or buffers (via BARs), it crashes, freezes, or reports errors.

Possible Causes and Solutions:

  1. Incorrect BAR Sizes or Types (IP Core & Firmware):

    • Cause: The BAR sizes or types (32-bit/64-bit, Memory/I/O, Prefetchable/Non-prefetchable) configured in your Vivado PCIe IP Core (Section 7.2.2) and/or handled in your pcileech_tlps128_bar_controller.sv do not match what the donor device actually provides. This can cause the host to allocate an incorrect address space or attempt unsupported accesses.
    • Solution:
      • Cross-Verify: Go back to your Arbor/protocol analyzer data (Section 5) and re-verify every single BAR configuration (size, type, prefetchable).
      • Consistency: Ensure these match perfectly in the PCIe IP Core customization and that your bar_controller logic correctly handles the size (address decoding range) and type of each BAR.
      • BRAM Sizing: If your BARs map to BRAMs, confirm that the BRAM IP core sizes (Section 8.2.1) exactly match the BAR sizes.
  2. Address Decoding Errors in Firmware:

    • Cause: Your pcileech_tlps128_bar_controller.sv (or custom BAR logic) is misinterpreting the incoming PCIe addresses, leading to accesses to incorrect internal registers or memory locations.
    • Solution:
      • Review Logic: Meticulously review the case statements and address calculations in your bar_controller.
      • Simulation: Develop specific test cases in your SystemVerilog test bench to simulate host read/write accesses to various offsets within each BAR. Verify that the internal bar_hit signals are correct and that the data is routed to/from the correct internal registers/BRAMs.
      • ILA Debugging: Place ILA probes on the req_addr, req_write, req_read, req_data, rsp_data, and the internal signals related to your address decoding and register access within the bar_controller. Observe how the address is decoded and what data is being read/written in real time.
  3. Overlapping Address Spaces (Internal):

    • Cause: While the PCIe standard ensures that BARs from different devices don't overlap in the host's memory map, internally within your FPGA, you might accidentally map different logical components to the same physical address space within a single BAR.
    • Solution:
      • Map Carefully: When defining internal registers and memory blocks within a BAR, explicitly assign unique, non-overlapping offsets to each. Use localparam for these offsets to prevent errors.
      • Design Review: A thorough design review of your bar_controller is necessary to ensure every address range is uniquely handled.
  4. BRAM Access Issues:

    • Cause: Problems with interfacing your logic to the BRAM IP cores (e.g., incorrect BRAM clocking, asynchronous resets, wrong byte enables, or incorrect write enable logic).
    • Solution:
      • BRAM Documentation: Consult the Xilinx BRAM IP core documentation for correct instantiation and interface signals.
      • ILA: Place ILA probes on the BRAM interface signals (address, write enable, data in, data out) to verify that your logic is sending the correct control signals to the BRAM.

13.3 DMA Performance and TLP Errors

Problem: The device is detected and functionally appears to work, but data transfer rates are slow, or the system experiences intermittent crashes, hangs, or errors during large DMA operations. PCIe protocol analyzers report malformed TLPs or flow control issues.

Possible Causes and Solutions:

  1. Malformed TLPs (Header/Payload):

    • Cause: Your firmware is generating TLPs (especially Completions or outbound Memory Writes if your FPGA is acting as a DMA master) with incorrect headers, lengths, byte enables, or payloads. The host system's PCIe core or driver detects these as violations.
    • Solution:
      • PCIe Protocol Analyzer: This is the best tool here (Section 12.2.1). Capture traffic and meticulously compare your generated TLPs against the PCIe specification and, more importantly, against captures from your real donor device.
      • TLP Generation Logic: Review your TLP assembly code (pcileech_pcie_tlp_a7.sv and related modules). Ensure all fields (Fmt, Type, Requester ID, Tag, Completion ID, Length, Byte Enables, Address) are correctly derived and packed into the TLP structure.
      • Error Checking: Implement basic error checking in your firmware (e.g., checking for unexpected req_valid without req_ready or vice-versa).
  2. Flow Control Issues:

    • Cause: PCIe uses a credit-based flow control mechanism. If your firmware (or the PCIe IP core's interaction with it) incorrectly manages credits, it can lead to deadlocks, timeouts, or dropped packets. Symptoms include a "stalled" PCIe link, timeouts, or low throughput.
    • Solution:
      • PCIe IP Core Configuration: Ensure the flow control settings within the Vivado PCIe IP Core customization are appropriate for your expected traffic patterns. The default settings are usually robust.
      • User Logic Backpressure: Your user logic that sends TLPs to the PCIe IP core (m_axis_tx_* interface) must respect the m_axis_tx_tready signal from the IP core. If tready is de-asserted, you must pause sending data. Failing to do so will overflow the core's buffers.
      • ILA Debugging: Connect ILA probes to the flow control interface signals of the PCIe IP core and your user logic to observe if tvalid/tready handshake is working correctly.
  3. Inefficient DMA Logic / Buffering Issues:

    • Cause: Your DMA engine implementation within the FPGA (the part that reads/writes data to/from host memory) is not optimized, causing bottlenecks. This can involve:
      • Lack of pipelining.
      • Inefficient use of BRAMs.
      • Stalls due to external memory access latency.
      • Small burst sizes.
    • Solution:
      • Pipelining: Break down long combinational paths into smaller, sequential stages using registers. This allows higher clock frequencies and better throughput.
      • Buffering: Use FIFOs (First-In, First-Out buffers) to decouple sender and receiver logic, smoothing out data flow and preventing stalls.
      • Burst Transfers: Utilize PCIe's ability to perform burst reads/writes for efficiency. Ensure your DMA logic requests and handles data in appropriate burst sizes.
      • Memory Bandwidth: Ensure your BRAMs or external DDR memory interfaces are capable of supplying/consuming data fast enough for your desired DMA rates.
      • ILA: Monitor your DMA engine's internal state, read/write pointers, and data path signals to identify bottlenecks.
  4. Completion Timeout / Unsupported Request:

    • Cause: The host sends a request (e.g., MRd, CfgRd), but your FPGA device does not respond with a Completion TLP within the allowed timeout period, or it responds with an error status (e.g., Completion with Unsupported Request (UR) or Completer Abort (CA)).
    • Solution:
      • Response Logic: Verify that your bar_controller (for MRds) and pcileech_pcie_cfg_a7.sv (for CfgRds to custom config space) correctly identify the request and generate the appropriate Completion.
      • Timeout Value: Review your donor device's expected completion latency. While PCIe defines default timeouts, some drivers might be sensitive.
      • ILA/Protocol Analyzer: Crucial for pinpointing why a completion isn't sent or why it's malformed. Is the request TLP even reaching your user logic? Is your logic generating a response? Is the PCIe core successfully sending the response?

14. Emulation Accuracy and Optimizations

Achieving truly convincing emulation means making your FPGA-based device indistinguishable from the donor, not just in its ID, but in its behavior. This requires meticulous attention to timing, responsiveness, and subtle operational details.

14.1 Techniques for Accurate Timing Emulation

Precise timing is paramount in hardware, especially for high-speed interfaces like PCIe. Mismatches can lead to driver timeouts, incorrect data interpretation, or system instability.

  • Implement Timing Constraints (XDC Files):

    • Purpose: Timing constraints are instructions to Vivado's synthesis and implementation tools, telling them how fast your design needs to operate. They define clock periods, input/output delays, and path delays.
    • Usage: The PCILeech-FPGA project includes XDC files (e.g., pcileech_squirrel_top.xdc) that define the main clocks (e.g., create_clock -name sys_clk_p -period 8.0 [get_ports sys_clk_p]).
    • Refinement: If your emulation requires very specific internal timing or reacts to time-sensitive commands, you might need to add further constraints to critical paths (set_max_delay, set_input_delay, set_output_delay) within your custom logic.
    • Goal: Ensure Vivado reports positive WNS (Worst Negative Slack) for all paths after implementation, indicating the design meets its timing requirements.
  • Use Clock Domain Crossing (CDC) Techniques:

    • Purpose: PCIe designs often involve multiple clock domains (e.g., a 125MHz PCIe user clock, a separate clock for your custom logic). Moving signals between these domains asynchronously (without proper synchronization) can lead to metastability, causing unreliable behavior.
    • Implementation: Always use proper CDC circuits for signals crossing clock domains:
      • Two-Flip-Flop Synchronizers: For single bit control signals.
      • Asynchronous FIFOs (First-In, First-Out): For multi-bit data paths, providing buffering and flow control between clock domains.
      • Gray Code Encoders/Decoders: For counters or addresses crossing domains to ensure only one bit changes at a time.
    • Vivado Tools: Vivado includes CDC analysis tools (e.g., report_cdc) that can identify potential metastability issues.
  • Simulate Device Behavior with Time-Accurate Models:

    • Advanced Test Benches: Use SystemVerilog test benches that incorporate realistic timing delays or even provide time-accurate PCIe bus functional models (BFMs).
    • Verification: This allows you to observe how your emulated device's internal state and external TLP generation/response timings behave under various conditions, ensuring they match your captured donor device behavior.

14.2 Dynamic Response to System Calls

A truly accurate emulation doesn't just present the correct IDs; it also reacts intelligently and dynamically to the host system's commands and queries, mimicking the behavior of a real, active device.

  • Implement State Machines for Device Control:

    • Purpose: Design robust SystemVerilog state machines that manage the device's operational modes, command processing, and data flow.
    • Responsiveness: Ensure the state machine transitions logically and quickly in response to incoming commands (e.g., writes to control registers in a BAR, specific TLPs).
    • Graceful Handling: The state machine should be able to handle unexpected or out-of-order requests gracefully, perhaps returning an error TLP or simply ignoring invalid commands, rather than crashing or freezing.
  • Monitor and Respond to Host Commands (Beyond Simple Reads/Writes):

    • Configuration Writes: Beyond initial enumeration, drivers often write to configuration space registers to enable features, set thresholds, or clear status bits. Your firmware must process these writes and update internal state accordingly.
    • Vendor-Specific Commands: As discussed in Section 9.2, if the donor device has proprietary commands (accessed via custom registers or Vendor Defined Messages), your firmware must parse these commands and trigger the appropriate emulated behavior.
    • Power Management Commands: React to host-initiated power state transitions (D0, D1, D3hot, etc.) by enabling/disabling internal logic and acknowledging the state change.
    • Interrupt Acknowledgment: If the host driver acknowledges interrupts by writing to a specific register, ensure your firmware can detect this and clear the internal interrupt request.
  • Optimize Firmware Logic for Responsiveness:

    • Reduce Latency: Critical data paths and control paths should be optimized to minimize combinational logic depth and pipeline stalls.
    • Parallelism: Leverage the FPGA's inherent parallelism to perform multiple operations concurrently, improving throughput and response times.
    • Efficient Memory Access: Optimize access to internal BRAMs or external DDR memory to ensure data is available when needed for DMA transfers or register reads.
    • Hardware Acceleration: For complex computations or data manipulations that the donor device performs, consider implementing dedicated hardware accelerators on the FPGA rather than trying to perform them in a slow, software-like fashion.

15. Best Practices for Firmware Development

Adhering to best practices in custom firmware development is crucial for maintaining code quality, facilitating collaboration (if working in a team), simplifying debugging, and ensuring the long-term maintainability and reliability of your project. This is especially true for security-sensitive applications.

15.1 Continuous Testing and Documentation

  • Regular, Incremental Testing:

    • Unit Testing: Test small, individual modules (e.g., a TLP parser, a register block) in isolation using dedicated test benches.
    • Integration Testing: Verify that different modules work together correctly.
    • System Testing: After flashing, perform end-to-end tests with the host system to ensure overall functionality.
    • Test Early, Test Often: Test the firmware after each significant change, no matter how small, to catch issues early when they are easier to debug.
  • Automated Testing (Advanced):

    • For complex projects, implement automated test scripts (e.g., using Python with a hardware abstraction layer) on the host side to repeatedly verify functionality and performance.
    • Consider integrating with Continuous Integration (CI) tools (e.g., Jenkins, GitLab CI) in a team environment to automate builds, tests, and static analysis on every code commit.
  • Maintain Comprehensive Documentation:

    • Design Documents: Create and update documents that describe your firmware's architecture, including:
      • Block Diagrams: Illustrating the major modules and their interconnections.
      • State Machine Diagrams: For all stateful logic.
      • Interface Specifications: Detailing input/output signals, timing, and protocols between modules.
      • Memory Maps: For all BARs, defining register addresses, bit fields, and their functionality.
    • Code Comments: Use clear, concise comments within your SystemVerilog code to explain complex logic, purpose of signals, and any non-obvious design choices.
    • Change Log/Commit Messages: Maintain a change log or use detailed Git commit messages to track all modifications, bug fixes, and feature additions, explaining why changes were made.
    • User Guide: For your custom firmware, a simple user guide explaining how to build, flash, and interact with the emulated device from the host side is invaluable.

15.2 Managing Firmware Versioning

Proper version control is essential for tracking changes, collaborating effectively, and managing releases.

  • Use Version Control Systems (VCS):

    • Git: Strongly recommended. Use Git to manage your HDL source code, constraints files, and project scripts.
    • Organize Repository: Maintain a clear directory structure (e.g., separate folders for src, xdc, ip, scripts, doc).
    • Branches: Use feature branches for developing new capabilities or large changes. Merge back to a main or develop branch after thorough testing.
    • Regular Commits: Commit frequently with atomic, meaningful commit messages.
  • Tag Releases and Milestones:

    • Stable Versions: Use Git tags (e.g., v1.0.0, v1.0.1_bugfix) to mark stable, tested versions of your firmware. This makes it easy to revert or deploy a known good state.
    • Milestones: Tag significant development milestones (e.g., "Basic Enumeration Working," "DMA Read/Write Functional").
  • Backup and Recovery Strategy:

    • Cloud-Based Repositories: Host your Git repository on platforms like GitHub, GitLab, or Bitbucket. This provides off-site backups and facilitates collaboration.
    • Local Backups: Even with cloud repositories, maintain regular local backups of your entire Vivado project directory (which can be very large due to generated files).

15.3 Security Considerations

Developing custom firmware for PCIe device emulation, especially one capable of Direct Memory Access, carries significant security implications. This technology is inherently a "dual-use" capability, meaning it can be used for both legitimate (e.g., hardware testing, security research) and malicious purposes (e.g., DMA attacks, security bypasses). It is paramount to understand and responsibly manage these risks.

  • Dual-Use Nature and Ethical Implications:

    • Ethical Hacking vs. Malicious Use: Clearly distinguish between using this knowledge for authorized security testing (red teaming, penetration testing) and unauthorized, illegal activities.
    • Responsible Disclosure: If you discover vulnerabilities using these techniques, follow responsible disclosure guidelines.
    • Legal and Licensing Compliance: Be aware of and comply with all relevant laws, regulations, and licensing agreements (e.g., PCIe-SIG specifications, Xilinx EULAs) concerning hardware reverse engineering and device modification.
    • "Weaponization": Recognize that the ability to accurately emulate trusted hardware can be weaponized for advanced persistent threats (APTs) or sophisticated malware.
  • Understanding Attack Vectors (Offensive Perspective):

    • Memory Exfiltration: A malicious emulated device can perform DMA reads to access any physical memory address, including sensitive data in the kernel, user processes, cryptographic keys, or network buffers.
    • Memory Injection/Modification: A malicious emulated device can perform DMA writes to arbitrarily modify memory, enabling:
      • Privilege Escalation: Modifying kernel data structures (e.g., process tokens, SIDs) to gain administrator or system privileges.
      • Code Injection: Injecting malicious code into running processes or the kernel, then triggering its execution.
      • Security Software Bypass: Disabling or subverting endpoint detection and response (EDR), antivirus, or firewall software by modifying their memory directly.
    • Fuzzing and Crashing: Sending malformed or out-of-spec TLPs/commands to trigger driver vulnerabilities, leading to system crashes (BSODs) or potentially exploitable memory corruption.
    • Firmware/BIOS Manipulation: In some advanced scenarios, a DMA device might be able to interact with the host's SPI flash memory containing the BIOS/UEFI, potentially for persistent modification.
  • Defensive Measures and Mitigation Strategies (Defensive Perspective):

    • IOMMU/VT-d/AMD-Vi: As noted in Section 3.2, these technologies are designed to mitigate DMA attacks by providing memory protection for peripherals. For legitimate testing, you disable them, but in production systems, they should always be enabled. They prevent unauthorized memory access by peripherals.
    • Kernel DMA Protection (Windows) / Thunderbolt Security (Linux): Modern OS features specifically address "cold boot" DMA attacks (where an attacker connects a malicious device while the system is off or locked). Keep these enabled on production systems.
    • Secure Boot: While not directly a DMA protection, Secure Boot helps ensure that only trusted bootloaders and kernel modules are loaded, reducing the chance of an attacker injecting malicious kernel components to bypass DMA protections.
    • Physical Security: The most basic but critical defense. If an attacker has physical access to a PCIe slot or Thunderbolt port, they can bypass many software protections. Secure physical access to critical systems.
    • Driver Hardening: Drivers should be written defensively, validating all inputs from hardware and operating within strict memory boundaries.
    • Memory Hardening: OS-level memory protections (e.g., KASLR, DEP, SMAP/SMEP) help reduce the impact of memory corruption, but a direct DMA attack bypasses these.
    • Monitoring and Logging: While difficult at the hardware level, unusual DMA activity or enumeration of unknown PCIe devices should trigger alerts in security monitoring systems.
  • Secure Coding Practices for Firmware:

    • Input Validation: If your firmware accepts any inputs (e.g., via a UART debug interface, or internal registers written by the host), validate them rigorously to prevent buffer overflows, integer overflows, or unexpected behavior.
    • Least Privilege: Design your firmware logic to only perform the operations absolutely necessary for its function. Avoid granting unnecessary capabilities.
    • State Management: Implement robust state machines to prevent unintended behavior due to invalid state transitions.
    • No Hardcoded Secrets: Avoid embedding sensitive information (e.g., cryptographic keys, hardcoded credentials) directly in your firmware if it could be easily extracted.
    • Tamper Detection: For production firmware, consider implementing mechanisms to detect if the firmware itself has been tampered with or if non-authorized configurations are loaded.

16. Additional Resources

To deepen your understanding and stay updated in the dynamic fields of FPGA development, PCIe, and hardware security, consult the following resources:

  • Xilinx (AMD) Documentation: Your primary source for all things Vivado and Xilinx FPGAs.

    • Main Documentation Portal: https://docs.amd.com/ (formerly Xilinx.com/support/documentation).
    • Vivado Design Suite User Guides:
      • UG900 - Getting Started: Essential for new Vivado users.
      • UG901 - Logic Synthesis: Deep dive into synthesis.
      • UG904 - Implementation: Detailed guide on placement and routing.
      • UG912 - Tcl Command Reference Guide: Invaluable for scripting.
      • UG939 - Debugging: Comprehensive guide to ILA and other debug features.
    • PCI Express IP Core User Guide: Critically important for understanding the Xilinx PCIe IP (e.g., PG054 for 7 Series Integrated Block for PCI Express). Search for "PCI Express" on the documentation portal. This details the core's configuration, interfaces, and limitations.
  • PCI-SIG Specifications: The definitive source for the PCIe standard.

    • PCI Express Base Specification: The foundational document. While not publicly free, summaries and educational materials based on it are widely available. You can usually find information on their website: https://pcisig.com/specifications (Note: Full specifications typically require PCI-SIG membership).
  • FPGA Tutorials and Learning Platforms:

  • PCIe Protocol Analysis Tools:

  • PCILeech Community & Resources:

    • The ufrisk/pcileech GitHub repository is the core. Actively follow its updates and issues.
    • Look for community forums or Discord servers dedicated to PCILeech or similar open-source DMA projects.
  • Hardware Security & Reverse Engineering:

    • Books on hardware hacking, reverse engineering, and low-level system exploitation.
    • Conferences like Black Hat, DEF CON, Recon, and Troopers often feature talks on PCIe and DMA attacks.
    • Blogs and research papers from security researchers focusing on hardware.

17. Contact Information

If you need assistance, have questions, or wish to collaborate on topics related to this guide, firmware development, or hardware security, please feel free to reach out. I'm available to provide guidance, troubleshoot complex problems, or discuss ideas in detail.

Discord:


18. Support and Contributions

Your support helps maintain and improve this guide and related projects. Creating and updating comprehensive technical documentation and open-source hardware projects requires significant time and effort.

Donations

If you found this guide helpful and want to support ongoing work, consider contributing. Every donation, no matter how small, helps in continuing to create, share, and support the community through further research, development, and documentation efforts.

  • Crypto Donations (LTC - Litecoin):
    • Address: MPMyQD5zgy2b2CpDn1C1KZ31KmHpT7AwRi

Special Bonus: If you donate, please feel free to reach out on Discord (VCPU) to receive a personal thank you and possibly additional resources, early access to new content, or personalized assistance with your project.

Note: If you need me to review specific sections of your implementation, troubleshoot issues, or provide detailed feedback on your code, please mark the relevant sections within your code with //VCPU-REVIEW// comments and provide detailed explanations of the problems or questions you're encountering. This helps me focus my efforts and provide the most effective support.

May God bless your soul.


End of Guide

About

Новейшее Pcileech DMA CFW руководство которое тебе понадобится. Спонсировано DMAPolice.com

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published