diff --git a/README.md b/README.md index 20ee451..4514013 100644 --- a/README.md +++ b/README.md @@ -3,10 +3,78 @@ Vulkan Grass Rendering **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5** -* (TODO) YOUR NAME HERE -* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Wayne Wu + * [LinkedIn](https://www.linkedin.com/in/wayne-wu/), [Personal Website](https://www.wuwayne.com/) +* Tested on: Windows 10, AMD Ryzen 5 5600X @ 3.70GHz 32GB, RTX 3070 8GB (personal) -### (TODO: Your README) +## Background -*DO NOT* leave the README to the last minute! It is a crucial part of the -project, and we will not be able to grade you without a good README. +This project implements the simulation and rendering of grass in real time using Vulkan, based on the paper [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf). + +https://user-images.githubusercontent.com/77313916/139769346-15b2f672-7666-4ccd-a473-662d80589295.mp4 + + +The three main components of the project are the following: +1. **Simulation**: Apply forces to the grass to provide dynamics. +2. **Culling**: Remove visually unimportant grass blades to improve performance. +3. **Rendering**: Tessellate and shade the grass. + +## Simulation +Three types of natural forces are simulated per blade as part of the physical model. +The simulation is performed in the compute shader. + +### Gravity +Gravity is the first type of force applied. In addition to the normal downward gravity force, an artificial gravity force is added that pushes the blade towards its face-forward direction to produce the bending effect. However, with gravity as the only force applied, all the blades simply fall to the ground, as shown below. 
+ +![](img/gravity.gif) + +### Recovery +Once the recovery force is applied, which in essence models the grass blades as mass-spring systems, the blades are kept from falling to the ground. This produces a more realistic look, as shown below: + +![](img/recovery.gif) + +### Wind +The wind force can be arbitrarily generated to add liveliness to the grass. The image below is based on the following function: + +```wind = windStrength * sin(totalTime) * vec3(1,0,1) * noise(v0)``` + +Adding all the forces together, we get the final result below: + +![](img/wind.gif) + +## Culling +To reduce the number of blades that need to be rendered, three culling operations are performed in the compute shader. + +### Orientation +Orientation culling removes blades whose width direction is parallel to the view direction. In the image below, we can see that as the camera rotates, some blades disappear at a specific viewing angle. (Try focusing on one specific blade to see the effect.) + +![](img/orientation.gif) + +### View-Frustum +View-frustum culling removes blades that are outside of the view frustum. Unlike the paper, which adds a small tolerance value, this implementation uses a weight factor to shrink or expand the clip space. For example, the image below has a weight factor of `0.8`. Additionally, no bounds test is performed for the z component in clip space. + +![](img/frustum.gif) + +### Distance +Finally, the distance culling operation removes blades based on the distance of the blade from the camera (projected onto the ground plane). The blade distance is discretized into distance-level buckets, and a certain percentage of the blades in each bucket is culled. The farther the distance level is from the camera, the more blades are culled. + +![](img/distance.gif) + +## Rendering +The grass is rendered using its own graphics pipeline in Vulkan, on top of the pipeline that renders the ground with texture. 
Each blade's data is processed as a vertex by the Vertex Shader, and the data (e.g. v0, v1, etc.) is passed directly from the Tessellation Control Shader to the Tessellation Evaluation Shader, in which the actual curve interpolation is performed. The tessellation level is fixed at 10, an arbitrary value that gives sufficient visual quality. Finally, a simple shading model is applied in the Fragment Shader for ambient and diffuse lighting effects. + +## Performance Analysis + +### Number of Blades +The implementation was tested with different numbers of blades, with and without culling operations. As expected, as the number of blades increases (by a factor of 2 at each step), the FPS decreases accordingly. +The maximum number of blades tested was 2^24 with culling operations and 2^20 without culling operations. Any number above those produces unstable results. + +![](img/bladesperformance.png) + +### Effects of Culling +To understand the effect of each culling operation, each operation is enabled on its own and its performance is measured. +It is important to note that the result depends on the camera view and the parameters applied. For example, a camera view that puts everything inside the view frustum will not benefit from view-frustum culling. In this particular scene, distance culling gives the greatest performance boost. 
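The view-frustum test described in the Culling section can be sketched on the CPU as follows. This is a minimal C++ illustration under stated assumptions, not the shader itself: the `Clip` struct and the `frustumCulled` helper are hypothetical names introduced here, and the inputs are assumed to be points already transformed to clip space (`proj * view * vec4(p, 1)`). It shows why a camera view that contains every blade gains nothing from this test: no sample point fails the bounds check, so no blade is rejected.

```cpp
// Hypothetical clip-space point (x, y, z, w); assumed to be proj * view * vec4(p, 1).
struct Clip { float x, y, z, w; };

// True when value lies in [-bounds, bounds].
bool inBounds(float value, float bounds) {
    return value >= -bounds && value <= bounds;
}

// Bounds test on x and y only; z is deliberately not tested.
// weight > 1 expands the tested region, weight < 1 shrinks it.
bool inFrustum(const Clip& p, float weight) {
    float h = p.w * weight;
    return inBounds(p.x, h) && inBounds(p.y, h);
}

// A blade is rejected only when all three sample points
// (base v0, midpoint m, tip v2) fall outside the tested region.
bool frustumCulled(const Clip& v0, const Clip& m, const Clip& v2, float weight) {
    return !inFrustum(v0, weight) && !inFrustum(m, weight) && !inFrustum(v2, weight);
}
```

With a weight of `0.8` the tested region shrinks, so a point at `x = 0.9 * w` is rejected even though it is still on screen; with a weight of `1.1` the region expands and the same point is kept.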
+ +Test Scene | Result +:-------------------------:|:-------------------------: +![](img/cullingtestscene.png) | ![](img/cullingperformance.png) diff --git a/img/bladesperformance.png b/img/bladesperformance.png new file mode 100644 index 0000000..11ab5e0 Binary files /dev/null and b/img/bladesperformance.png differ diff --git a/img/cullingperformance.png b/img/cullingperformance.png new file mode 100644 index 0000000..2bd9b74 Binary files /dev/null and b/img/cullingperformance.png differ diff --git a/img/cullingtestscene.png b/img/cullingtestscene.png new file mode 100644 index 0000000..269b9cd Binary files /dev/null and b/img/cullingtestscene.png differ diff --git a/img/distance.gif b/img/distance.gif new file mode 100644 index 0000000..3761aca Binary files /dev/null and b/img/distance.gif differ diff --git a/img/frustum.gif b/img/frustum.gif new file mode 100644 index 0000000..131dd74 Binary files /dev/null and b/img/frustum.gif differ diff --git a/img/gravity.gif b/img/gravity.gif new file mode 100644 index 0000000..11664c7 Binary files /dev/null and b/img/gravity.gif differ diff --git a/img/orientation.gif b/img/orientation.gif new file mode 100644 index 0000000..6669f3d Binary files /dev/null and b/img/orientation.gif differ diff --git a/img/recording.mp4 b/img/recording.mp4 new file mode 100644 index 0000000..bc513c9 Binary files /dev/null and b/img/recording.mp4 differ diff --git a/img/recovery.gif b/img/recovery.gif new file mode 100644 index 0000000..ded892f Binary files /dev/null and b/img/recovery.gif differ diff --git a/img/wind.gif b/img/wind.gif new file mode 100644 index 0000000..2911d3a Binary files /dev/null and b/img/wind.gif differ diff --git a/src/Blades.cpp b/src/Blades.cpp index 80e3d76..fb9edda 100644 --- a/src/Blades.cpp +++ b/src/Blades.cpp @@ -44,8 +44,11 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode indirectDraw.firstVertex = 0; indirectDraw.firstInstance = 0; - 
BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory); - BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); + // NOTE: VK_BUFFER_USAGE_STORAGE_BUFFER_BIT specifies that the buffer can be used in a VkDescriptorBufferInfo suitable + // for occupying a VkDescriptorSet slot either of type VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC. + + BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, bladesBuffer, bladesBufferMemory); + BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory); } diff --git a/src/Blades.h b/src/Blades.h index 9bd1eed..fc23ba7 100644 --- a/src/Blades.h +++ b/src/Blades.h @@ -4,7 +4,7 @@ #include #include "Model.h" -constexpr static unsigned int NUM_BLADES = 1 << 13; +constexpr static unsigned int NUM_BLADES = 1 << 17; constexpr static float MIN_HEIGHT = 1.3f; constexpr static float MAX_HEIGHT = 2.5f; constexpr static float MIN_WIDTH = 0.1f; diff --git a/src/Camera.cpp b/src/Camera.cpp index 3afb5b8..d9821fb 100644 --- a/src/Camera.cpp +++ b/src/Camera.cpp @@ -12,7 +12,7 @@ Camera::Camera(Device* device, float aspectRatio) : device(device) { r = 10.0f; theta = 0.0f; phi = 0.0f; - cameraBufferObject.viewMatrix = glm::lookAt(glm::vec3(0.0f, 1.0f, 10.0f), glm::vec3(0.0f, 1.0f, 0.0f), glm::vec3(0.0f, 
1.0f, 0.0f)); + cameraBufferObject.viewMatrix = glm::lookAt(glm::vec3(0.0f, 5.0f, 10.0f), glm::vec3(0.0f, 1.0f, 0.0f), glm::vec3(0.0f, 1.0f, 0.0f)); cameraBufferObject.projectionMatrix = glm::perspective(glm::radians(45.0f), aspectRatio, 0.1f, 100.0f); cameraBufferObject.projectionMatrix[1][1] *= -1; // y-coordinate is flipped diff --git a/src/Instance.cpp b/src/Instance.cpp index 7f6b01c..de1f03d 100644 --- a/src/Instance.cpp +++ b/src/Instance.cpp @@ -350,6 +350,7 @@ Device* Instance::CreateDevice(QueueFlagBits requiredQueues, VkPhysicalDeviceFea throw std::runtime_error("Failed to create logical device"); } + // NOTE: Retrieving the queue handles for each queue family. Will use the queues to submit graphics commands Device::Queues queues; for (unsigned int i = 0; i < requiredQueues.size(); ++i) { if (requiredQueues[i]) { diff --git a/src/Renderer.cpp b/src/Renderer.cpp index b445d04..f7f3129 100644 --- a/src/Renderer.cpp +++ b/src/Renderer.cpp @@ -31,8 +31,9 @@ Renderer::Renderer(Device* device, SwapChain* swapChain, Scene* scene, Camera* c CreateGraphicsPipeline(); CreateGrassPipeline(); CreateComputePipeline(); - RecordCommandBuffers(); RecordComputeCommandBuffer(); + RecordCommandBuffers(); + } void Renderer::CreateCommandPools() { @@ -172,6 +173,27 @@ void Renderer::CreateModelDescriptorSetLayout() { } } +void Renderer::CreateGrassDescriptorSetLayout() { + VkDescriptorSetLayoutBinding uboLayoutBinding = {}; + uboLayoutBinding.binding = 0; + uboLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; + uboLayoutBinding.descriptorCount = 1; + uboLayoutBinding.stageFlags = VK_SHADER_STAGE_VERTEX_BIT; + uboLayoutBinding.pImmutableSamplers = nullptr; + + std::vector<VkDescriptorSetLayoutBinding> bindings = { uboLayoutBinding }; + + // Create the descriptor set layout + VkDescriptorSetLayoutCreateInfo layoutInfo = {}; + layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size()); + layoutInfo.pBindings = 
bindings.data(); + + if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &grassDescriptorSetLayout) != VK_SUCCESS) { + throw std::runtime_error("Failed to create descriptor set layout"); + } +} + void Renderer::CreateTimeDescriptorSetLayout() { // Describe the binding of the descriptor set layout VkDescriptorSetLayoutBinding uboLayoutBinding = {}; @@ -198,24 +220,59 @@ void Renderer::CreateComputeDescriptorSetLayout() { // TODO: Create the descriptor set layout for the compute pipeline // Remember this is like a class definition stating why types of information // will be stored at each binding + VkDescriptorSetLayoutBinding bladesLayoutBinding = {}; + bladesLayoutBinding.binding = 0; + bladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + bladesLayoutBinding.descriptorCount = 1; + bladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + bladesLayoutBinding.pImmutableSamplers = nullptr; + + VkDescriptorSetLayoutBinding culledBladesLayoutBinding = {}; + culledBladesLayoutBinding.binding = 1; + culledBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + culledBladesLayoutBinding.descriptorCount = 1; + culledBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + culledBladesLayoutBinding.pImmutableSamplers = nullptr; + + VkDescriptorSetLayoutBinding numBladesLayoutBinding = {}; + numBladesLayoutBinding.binding = 2; + numBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + numBladesLayoutBinding.descriptorCount = 1; + numBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + numBladesLayoutBinding.pImmutableSamplers = nullptr; + + std::vector<VkDescriptorSetLayoutBinding> bindings = { bladesLayoutBinding, culledBladesLayoutBinding, numBladesLayoutBinding }; + + // Create the descriptor set layout + VkDescriptorSetLayoutCreateInfo layoutInfo = {}; + layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size()); + 
layoutInfo.pBindings = bindings.data(); + + if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) { + throw std::runtime_error("Failed to create descriptor set layout"); + } } void Renderer::CreateDescriptorPool() { // Describe which descriptor types that the descriptor sets will contain - std::vector<VkDescriptorPoolSize> poolSizes = { - // Camera - { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1}, + std::vector<VkDescriptorPoolSize> poolSizes = { + // Camera + { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1}, - // Models + Blades - { VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER , static_cast<uint32_t>(scene->GetModels().size() + scene->GetBlades().size()) }, + // Models + Blades + { VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER , static_cast<uint32_t>(scene->GetModels().size() + scene->GetBlades().size()) }, - // Models + Blades - { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , static_cast<uint32_t>(scene->GetModels().size() + scene->GetBlades().size()) }, + // Models + Blades + { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , static_cast<uint32_t>(scene->GetModels().size() + scene->GetBlades().size()) }, - // Time (compute) - { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 }, + // Time (compute) + { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 }, - // TODO: Add any additional types and counts of descriptors you will need to allocate + // TODO: Add any additional types and counts of descriptors you will need to allocate + + // NOTE: Each compute descriptor set has three storage buffers + { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER , static_cast<uint32_t>(scene->GetBlades().size() * 3) }, }; VkDescriptorPoolCreateInfo poolInfo = {}; @@ -318,8 +375,45 @@ void Renderer::CreateModelDescriptorSets() { } void Renderer::CreateGrassDescriptorSets() { - // TODO: Create Descriptor sets for the grass. - // This should involve creating descriptor sets which point to the model matrix of each group of grass blades + // TODO: Create Descriptor sets for the grass. 
+ // This should involve creating descriptor sets which point to the model matrix of each group of grass blades + + grassDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { modelDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size()); + std::vector<VkDescriptorBufferInfo> modelBufferInfos(grassDescriptorSets.size()); // NOTE: must outlive the loop, vkUpdateDescriptorSets reads them through pBufferInfo + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo& modelBufferInfo = modelBufferInfos[i]; + modelBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer(); + modelBufferInfo.offset = 0; + modelBufferInfo.range = sizeof(ModelBufferObject); + + descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[i].dstSet = grassDescriptorSets[i]; + descriptorWrites[i].dstBinding = 0; + descriptorWrites[i].dstArrayElement = 0; + descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; + descriptorWrites[i].descriptorCount = 1; + descriptorWrites[i].pBufferInfo = &modelBufferInfo; + descriptorWrites[i].pImageInfo = nullptr; + descriptorWrites[i].pTexelBufferView = nullptr; + } + + // Update descriptor sets + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateTimeDescriptorSet() { @@ -360,6 +454,70 @@ void Renderer::CreateTimeDescriptorSet() { void Renderer::CreateComputeDescriptorSets() { // TODO: Create Descriptor sets for the compute pipeline // The descriptors should point to Storage buffers which will hold the grass blades, the culled 
grass blades, and the output number of grass blades + computeDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor set + VkDescriptorSetLayout layouts[] = { computeDescriptorSetLayout }; + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size()); + allocInfo.pSetLayouts = layouts; + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size()); + std::vector<VkDescriptorBufferInfo> bufferInfos(3 * computeDescriptorSets.size()); // NOTE: must outlive the loop, vkUpdateDescriptorSets reads them through pBufferInfo + + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo& bladesBufferInfo = bufferInfos[3 * i + 0]; + bladesBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer(); + bladesBufferInfo.offset = 0; + bladesBufferInfo.range = sizeof(Blade)*NUM_BLADES; + + VkDescriptorBufferInfo& culledBladesBufferInfo = bufferInfos[3 * i + 1]; + culledBladesBufferInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer(); + culledBladesBufferInfo.offset = 0; + culledBladesBufferInfo.range = sizeof(Blade)*NUM_BLADES; + + VkDescriptorBufferInfo& numBladesBufferInfo = bufferInfos[3 * i + 2]; + numBladesBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer(); + numBladesBufferInfo.offset = 0; + numBladesBufferInfo.range = sizeof(BladeDrawIndirect); + + descriptorWrites[3 * i + 0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 0].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 0].dstBinding = 0; + descriptorWrites[3 * i + 0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 0].descriptorCount = 1; + descriptorWrites[3 * i + 0].pBufferInfo = &bladesBufferInfo; + descriptorWrites[3 * i + 0].pImageInfo = nullptr; + descriptorWrites[3 * i + 0].pTexelBufferView = 
nullptr; + + descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 1].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 1].dstBinding = 1; + descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 1].descriptorCount = 1; + descriptorWrites[3 * i + 1].pBufferInfo = &culledBladesBufferInfo; + descriptorWrites[3 * i + 1].pImageInfo = nullptr; + descriptorWrites[3 * i + 1].pTexelBufferView = nullptr; + + descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 2].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 2].dstBinding = 2; + descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 2].descriptorCount = 1; + descriptorWrites[3 * i + 2].pBufferInfo = &numBladesBufferInfo; + descriptorWrites[3 * i + 2].pImageInfo = nullptr; + descriptorWrites[3 * i + 2].pTexelBufferView = nullptr; + } + + // Update descriptor sets + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateGraphicsPipeline() { @@ -717,7 +875,7 @@ void Renderer::CreateComputePipeline() { computeShaderStageInfo.pName = "main"; // TODO: Add the compute dsecriptor set layout you create to this list - std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout }; + std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout }; // Create pipeline layout VkPipelineLayoutCreateInfo pipelineLayoutInfo = {}; @@ -884,6 +1042,10 @@ void Renderer::RecordComputeCommandBuffer() { vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr); // TODO: For each group of blades bind its descriptor set and dispatch + for (uint32_t j = 0; j < 
scene->GetBlades().size(); ++j) { + vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[j], 0, nullptr); + vkCmdDispatch(computeCommandBuffer, NUM_BLADES/32, 1, 1); + } // ~ End recording ~ if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) { @@ -973,16 +1135,18 @@ void Renderer::RecordCommandBuffers() { vkCmdBindPipeline(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipeline); for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) { + // NOTE: Blades class does not create vertex buffer (since no vertices are created as part of Model class) VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() }; VkDeviceSize offsets[] = { 0 }; // TODO: Uncomment this when the buffers are populated - // vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); + vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); // TODO: Bind the descriptor set for each grass blades model + vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr); // Draw // TODO: Uncomment this when the buffers are populated - // vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); + vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); } // End render pass @@ -1057,6 +1221,7 @@ Renderer::~Renderer() { vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr); + vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr); vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr); diff --git a/src/Renderer.h b/src/Renderer.h index 95e025f..7b23d2b 
100644 --- a/src/Renderer.h +++ b/src/Renderer.h @@ -19,6 +19,7 @@ class Renderer { void CreateModelDescriptorSetLayout(); void CreateTimeDescriptorSetLayout(); void CreateComputeDescriptorSetLayout(); + void CreateGrassDescriptorSetLayout(); void CreateDescriptorPool(); @@ -56,12 +57,16 @@ class Renderer { VkDescriptorSetLayout cameraDescriptorSetLayout; VkDescriptorSetLayout modelDescriptorSetLayout; VkDescriptorSetLayout timeDescriptorSetLayout; + VkDescriptorSetLayout grassDescriptorSetLayout; + VkDescriptorSetLayout computeDescriptorSetLayout; VkDescriptorPool descriptorPool; VkDescriptorSet cameraDescriptorSet; std::vector<VkDescriptorSet> modelDescriptorSets; VkDescriptorSet timeDescriptorSet; + std::vector<VkDescriptorSet> grassDescriptorSets; + std::vector<VkDescriptorSet> computeDescriptorSets; VkPipelineLayout graphicsPipelineLayout; VkPipelineLayout grassPipelineLayout; diff --git a/src/main.cpp b/src/main.cpp index 8bf822b..1e9c69c 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -1,4 +1,5 @@ #include +#include #include "Instance.h" #include "Window.h" #include "Renderer.h" @@ -67,18 +68,21 @@ namespace { int main() { static constexpr char* applicationName = "Vulkan Grass Rendering"; - InitializeWindow(640, 480, applicationName); + InitializeWindow(1280, 960, applicationName); unsigned int glfwExtensionCount = 0; const char** glfwExtensions = glfwGetRequiredInstanceExtensions(&glfwExtensionCount); - + + //NOTE: First thing is to create an instance -> connection between application and the Vulkan library Instance* instance = new Instance(applicationName, glfwExtensionCount, glfwExtensions); + //NOTE: Connection between Vulkan and the window system to present result to the screen VkSurfaceKHR surface; if (glfwCreateWindowSurface(instance->GetVkInstance(), GetGLFWWindow(), nullptr, &surface) != VK_SUCCESS) { throw std::runtime_error("Failed to create window surface"); } + //NOTE: Need to select a physical GPU device to use instance->PickPhysicalDevice({ VK_KHR_SWAPCHAIN_EXTENSION_NAME }, 
QueueFlagBit::GraphicsBit | QueueFlagBit::TransferBit | QueueFlagBit::ComputeBit | QueueFlagBit::PresentBit, surface); VkPhysicalDeviceFeatures deviceFeatures = {}; @@ -86,8 +90,10 @@ int main() { deviceFeatures.fillModeNonSolid = VK_TRUE; deviceFeatures.samplerAnisotropy = VK_TRUE; + //NOTE: Create the logical device here and connecting to the physical device device = instance->CreateDevice(QueueFlagBit::GraphicsBit | QueueFlagBit::TransferBit | QueueFlagBit::ComputeBit | QueueFlagBit::PresentBit, deviceFeatures); + //NOTE: Swap chain is essentially a queue of images that are waiting to be presented to the screen swapChain = device->CreateSwapChain(surface, 5); camera = new Camera(device, 640.f / 480.f); @@ -129,6 +135,7 @@ int main() { ); plane->SetTexture(grassImage); + // NOTE: Blades class contains ALL blades (i.e. vector is contained within the class) Blades* blades = new Blades(device, transferCommandPool, planeDim); vkDestroyCommandPool(device->GetVkDevice(), transferCommandPool, nullptr); @@ -138,15 +145,38 @@ int main() { scene->AddBlades(blades); renderer = new Renderer(device, swapChain, scene, camera); + + GLFWwindow* window = GetGLFWWindow(); + glfwSetWindowSizeCallback(window, resizeCallback); + glfwSetMouseButtonCallback(window, mouseDownCallback); + glfwSetCursorPosCallback(window, mouseMoveCallback); - glfwSetWindowSizeCallback(GetGLFWWindow(), resizeCallback); - glfwSetMouseButtonCallback(GetGLFWWindow(), mouseDownCallback); - glfwSetCursorPosCallback(GetGLFWWindow(), mouseMoveCallback); + double fps = 0; + double timebase = 0; + int frame = 0; while (!ShouldQuit()) { + glfwPollEvents(); + + frame++; + double time = glfwGetTime(); + + if (time - timebase > 1.0) { + fps = frame / (time - timebase); + timebase = time; + frame = 0; + } + scene->UpdateTime(); renderer->Frame(); + + std::ostringstream ss; + ss << "["; + ss.precision(1); + ss << std::fixed << fps; + ss << " fps] " << applicationName; + glfwSetWindowTitle(window, ss.str().c_str()); } 
vkDeviceWaitIdle(device->GetVkDevice()); diff --git a/src/shaders/compute.comp b/src/shaders/compute.comp index 0fd0224..deb013d 100644 --- a/src/shaders/compute.comp +++ b/src/shaders/compute.comp @@ -1,6 +1,19 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable +#define G 4.0 // Gravity +#define WS 3.0 // Wind Strength +#define WF 0.5 // Wind Frequency + +#define CULLING true +#define ORIENTATION_CULLING true +#define FRUSTUM_CULLING true +#define DIST_CULLING true +#define DIST_CULLING_MAX 15.0 +#define DIST_CULLING_N 10 +#define FRUSTUM_CULLING_THRESHOLD 1.1 +#define ORIENTATION_CULLING_THRESHOLD 0.97 + #define WORKGROUP_SIZE 32 layout(local_size_x = WORKGROUP_SIZE, local_size_y = 1, local_size_z = 1) in; @@ -21,36 +34,147 @@ struct Blade { vec4 up; }; -// TODO: Add bindings to: -// 1. Store the input blades -// 2. Write out the culled blades -// 3. Write the total number of blades remaining - -// The project is using vkCmdDrawIndirect to use a buffer as the arguments for a draw call -// This is sort of an advanced feature so we've showed you what this buffer should look like -// -// layout(set = ???, binding = ???) 
buffer NumBlades { -// uint vertexCount; // Write the number of blades remaining here -// uint instanceCount; // = 1 -// uint firstVertex; // = 0 -// uint firstInstance; // = 0 -// } numBlades; +layout(set = 2, binding = 0) buffer Blades { + Blade blades[]; +}; +layout(set = 2, binding = 1) buffer CulledBlades { + Blade culledBlades[]; +}; + +layout(set = 2, binding = 2) buffer NumBlades { + uint vertexCount; // Write the number of blades remaining here + uint instanceCount; // = 1 + uint firstVertex; // = 0 + uint firstInstance; // = 0 +} numBlades; bool inBounds(float value, float bounds) { return (value >= -bounds) && (value <= bounds); } +bool inFrustum(vec3 p) { + // NOTE: Slight modification from the paper: instead of adding a small tolerance value, + // I'm using a weight factor so that it's not dependent on the magnitude of w. + // Also not testing against pp.z since usually we'd want the whole depth. + + vec4 pp = camera.proj * camera.view * vec4(p, 1.0); + float h = pp.w * FRUSTUM_CULLING_THRESHOLD; + return inBounds(pp.x, h) && inBounds(pp.y, h); +} + +/* +* Noise function taken directly from: https://gist.github.com/patriciogonzalezvivo/670c22f3966e662d2f83 +*/ +float rand(vec2 n) { + return fract(sin(dot(n, vec2(12.9898, 4.1414))) * 43758.5453); +} + +float noise(vec2 p){ + vec2 ip = floor(p); + vec2 u = fract(p); + u = u*u*(3.0-2.0*u); + + float res = mix( + mix(rand(ip),rand(ip+vec2(1.0,0.0)),u.x), + mix(rand(ip+vec2(0.0,1.0)),rand(ip+vec2(1.0,1.0)),u.x),u.y); + return res*res; +} + +vec3 windFunc(float t, vec3 v0){ + float n = noise(v0.xz); + return WS*vec3(sin(t)+n, 0, sin(t)+n); +} + void main() { + + uint index = gl_GlobalInvocationID.x; + // Reset the number of blades to 0 if (gl_GlobalInvocationID.x == 0) { - // numBlades.vertexCount = 0; + numBlades.vertexCount = 0; } barrier(); // Wait till all threads reach this point - // TODO: Apply forces on every blade and update the vertices in the buffer + Blade b = blades[index]; + + // Extract 
Parameters + vec3 up = b.up.xyz; + vec3 v0 = b.v0.xyz; + vec3 v1 = b.v1.xyz; + vec3 v2 = b.v2.xyz; + float h = b.v1.w; + float k = b.up.w; + float theta = b.v0.w; + + // Gravity + vec4 D = vec4(0.0, -1.0, 0.0, G); + + vec3 t1 = vec3(-1.0, 0.0, 0.0); + t1 = cos(theta)*t1 + sin(theta)*cross(up, t1) + (1.0 - cos(theta))*dot(up, t1)*up; + vec3 t0 = normalize(v2 - v0); + t1 = normalize(t1); + vec3 f = normalize(cross(t0, t1)); + + vec3 ge = D.xyz * D.w; + vec3 gf = 0.25 * length(ge) * f; + vec3 g = ge + gf; + + // Recovery + vec3 iv2 = v0 + up * h; + vec3 r = (iv2 - v2) * k; + + // Wind + vec3 windDir = windFunc(totalTime, v0); + float fd = 1 - length(dot(normalize(windDir), normalize(v2-v0))); + float fr = dot((v2 - v0), up) / h; + vec3 w = windDir * (fd * fr); + + v2 += (r + g + w) * deltaTime; + + // State Validation + v2 = v2 - up * min(dot(up, v2 - v0), 0); + + float lproj = length(v2 - v0 - up*dot(v2-v0, up)); + v1 = v0 + h*up*max(1 - lproj/h, 0.05*max(lproj/h, 1)); + + float L0 = distance(v2, v0); + float L1 = distance(v1, v0) + distance(v2, v1); + float L = 0.25*(2*L0 + 2*L1); // cubic bezier curve + float ratio = h/L; + + v1 = v0 + ratio*(v1 - v0); + v2 = v1 + ratio*(v2 - v1); + + // Update buffer + blades[index].v1.xyz = v1.xyz; + blades[index].v2.xyz = v2.xyz; + + if (CULLING) { + + vec3 c = inverse(camera.view)[3].xyz; // camera's position in world space + + // NOTE: viewing direction is projected on the plane so that it's coplanar + // with the blade direction. 
Can be used for distance culling as well + vec3 viewVec = v0 - c - up*dot(v0 - c, up); + + // Orientation Culling + if (ORIENTATION_CULLING && ORIENTATION_CULLING_THRESHOLD < abs(dot(normalize(viewVec), t1))) + return; + + // View-Frustum Culling + vec3 m = 0.25*v0 + 0.5*v1 + 0.25*v2; + if (FRUSTUM_CULLING && !inFrustum(v0) && !inFrustum(v2) && !inFrustum(m)) + return; + + // Distance Culling + float dproj = length(viewVec); + int n = DIST_CULLING_N; + if (DIST_CULLING && index % n < int(floor(n*(1.0 - dproj/DIST_CULLING_MAX)))) + return; + } - // TODO: Cull blades that are too far away or not in the camera frustum and write them - // to the culled blades buffer - // Note: to do this, you will need to use an atomic operation to read and update numBlades.vertexCount - // You want to write the visible blades to the buffer without write conflicts between threads + // NOTE: atomicAdd returns the value of vertexCount right before the addition + // which guarantees that it would be the index for the culledBlades array + uint idx = atomicAdd(numBlades.vertexCount, 1); + culledBlades[idx] = blades[index]; } diff --git a/src/shaders/grass.frag b/src/shaders/grass.frag index c7df157..653cb76 100644 --- a/src/shaders/grass.frag +++ b/src/shaders/grass.frag @@ -6,12 +6,22 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare fragment shader inputs +layout(location = 0) in vec3 pos; +layout(location = 1) in vec3 n; layout(location = 0) out vec4 outColor; void main() { - // TODO: Compute fragment color + + // Lighting Parameters + vec3 albedo = vec3(0.1, 0.5, 0.0); + vec3 lightPos = vec3(0.0, 100.0, 0.0); - outColor = vec4(1.0); + // Diffuse Lighting + vec3 l = normalize(lightPos - pos); + float lambert = max(dot(l, normalize(n)), 0.0); + + vec3 result = albedo + lambert * vec3(0.3); + + outColor = vec4(result, 1.0); } diff --git a/src/shaders/grass.tesc b/src/shaders/grass.tesc index f9ffd07..1fdd88f 100644 --- 
a/src/shaders/grass.tesc +++ b/src/shaders/grass.tesc @@ -1,6 +1,14 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable +#define TESSLEVEL 10.0 + +// NOTE: The TCS takes an input patch and emits an output patch. +// A patch is a group of CPs (Control Points). +// The input patch holds the vertices coming from the vertex shader. +// The output patch holds the CPs handed to the TES (one CP per blade here). + +//layout(vertices = 3) out; // output patch control points layout(vertices = 1) out; layout(set = 0, binding = 0) uniform CameraBufferObject { @@ -8,19 +16,45 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare tessellation control shader inputs and outputs +layout(location = 0) in vec4 tcs_v1[]; +layout(location = 1) in vec4 tcs_v2[]; +layout(location = 2) in vec4 tcs_up[]; -void main() { - // Don't move the origin location of the patch - gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position; +layout(location = 0) out vec4 tes_v1[]; +layout(location = 1) out vec4 tes_v2[]; +layout(location = 2) out vec4 tes_up[]; + +in gl_PerVertex +{ + vec4 gl_Position; +} gl_in[gl_MaxPatchVertices]; - // TODO: Write any shader outputs - // TODO: Set level of tesselation - // gl_TessLevelInner[0] = ??? - // gl_TessLevelInner[1] = ??? - // gl_TessLevelOuter[0] = ??? - // gl_TessLevelOuter[1] = ??? - // gl_TessLevelOuter[2] = ??? - // gl_TessLevelOuter[3] = ??? +/* +float GetTessLevel(float d0, float d1){ + float avg = (d0 + d1)/2.0; + if (avg <= 2.0) + return 10.0; + else if (avg <= 5.0) + return 7.0; + else + return 3.0; +} +*/ + +// NOTE: This function is executed once per output CP and the builtin variable gl_InvocationID contains the index of the current invocation.
+void main() { + + gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position; + tes_v1[gl_InvocationID] = tcs_v1[gl_InvocationID]; + tes_v2[gl_InvocationID] = tcs_v2[gl_InvocationID]; + tes_up[gl_InvocationID] = tcs_up[gl_InvocationID]; + + // NOTE: The TLs determine the tessellation level of detail, i.e. how many triangles to generate for the patch + gl_TessLevelInner[0] = TESSLEVEL; + gl_TessLevelInner[1] = TESSLEVEL; + gl_TessLevelOuter[0] = TESSLEVEL; + gl_TessLevelOuter[1] = TESSLEVEL; + gl_TessLevelOuter[2] = TESSLEVEL; + gl_TessLevelOuter[3] = TESSLEVEL; } diff --git a/src/shaders/grass.tese b/src/shaders/grass.tese index 751fff6..db18fd4 100644 --- a/src/shaders/grass.tese +++ b/src/shaders/grass.tese @@ -1,6 +1,7 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable +// NOTE: this is the configuration for the TES layout(quads, equal_spacing, ccw) in; layout(set = 0, binding = 0) uniform CameraBufferObject { @@ -8,11 +9,49 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare tessellation evaluation shader inputs and outputs +layout(location = 0) in vec4 tes_v1[]; +layout(location = 1) in vec4 tes_v2[]; +layout(location = 2) in vec4 tes_up[]; + +layout(location = 0) out vec3 fs_pos; +layout(location = 1) out vec3 fs_n; + +vec3 lerp(vec3 v1, vec3 v2, float u){ + return (1 - u) * v1 + u * v2; +} void main() { + + vec3 v0 = gl_in[0].gl_Position.xyz; + float theta = gl_in[0].gl_Position.w; + vec3 v1 = tes_v1[0].xyz; + vec3 v2 = tes_v2[0].xyz; + vec3 up = tes_up[0].xyz; + float w = tes_v2[0].w; //width + float u = gl_TessCoord.x; float v = gl_TessCoord.y; - // TODO: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade + // De Casteljau's Algorithm + vec3 a = lerp(v0, v1, v); + vec3 b = lerp(v1, v2, v); + vec3 c = lerp(a, b, v); + + // Calculate bitangent vector along the width of the blade + vec3 t1 = vec3(-cos(theta), 0, sin(theta)); +
t1 = normalize(t1); + + vec3 t0 = normalize(b - a); + + fs_n = normalize(cross(t0, t1)); + + vec3 c0 = c - w * t1; + vec3 c1 = c + w * t1; + + float t = u + 0.5*v - u*v; + vec3 pos = lerp(c0, c1, t); + + fs_pos = pos; // vertex position in world space + + gl_Position = camera.proj * camera.view * vec4(pos, 1.0); // transform to clip space } diff --git a/src/shaders/grass.vert b/src/shaders/grass.vert index db9dfe9..d552761 100644 --- a/src/shaders/grass.vert +++ b/src/shaders/grass.vert @@ -2,16 +2,42 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable +// NOTE: The VS is executed once per CP (vertex) of a patch. +// Each grass blade is a single CP that carries v0, v1, v2 and up as vertex attributes. + layout(set = 1, binding = 0) uniform ModelBufferObject { mat4 model; }; // TODO: Declare vertex shader inputs and outputs +layout (location = 0) in vec4 in_v0; +layout (location = 1) in vec4 in_v1; +layout (location = 2) in vec4 in_v2; +layout (location = 3) in vec4 in_up; + +layout(location = 0) out vec4 tcs_v1; +layout(location = 1) out vec4 tcs_v2; +layout(location = 2) out vec4 tcs_up; out gl_PerVertex { vec4 gl_Position; }; +vec4 multiply(mat4 m, vec4 v) { + vec4 ans = m * vec4(v.xyz, 1.0); + ans.w = v.w; + return ans; +} + void main() { // TODO: Write gl_Position and any other shader outputs + + // Convert all vectors from local space to world space + + tcs_v1 = multiply(model, in_v1); + tcs_v2 = multiply(model, in_v2); + tcs_up = multiply(model, in_up); + + // Store v0 as gl_Position + gl_Position = multiply(model, in_v0); }
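The position math in `grass.tese` can be sanity-checked on the CPU. The following is a minimal Python sketch and is not part of the commit: `blade_vertex` is a hypothetical helper that mirrors the shader's De Casteljau evaluation and its `t = u + 0.5*v - u*v` width interpolation, assuming the same meaning for `v0`, `v1`, `v2`, `theta` and the blade width as in the shader.

```python
import math


def lerp(a, b, t):
    """Component-wise linear interpolation between two 3D points."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))


def blade_vertex(v0, v1, v2, theta, width, u, v):
    """Hypothetical CPU mirror of the world-space position computed in grass.tese."""
    # De Casteljau evaluation of the quadratic Bezier curve at height v
    a = lerp(v0, v1, v)
    b = lerp(v1, v2, v)
    c = lerp(a, b, v)

    # Unit vector along the blade width (same expression as the shader)
    t1 = (-math.cos(theta), 0.0, math.sin(theta))
    c0 = tuple(p - width * d for p, d in zip(c, t1))
    c1 = tuple(p + width * d for p, d in zip(c, t1))

    # t equals u at the base (v = 0) and 0.5 at the tip (v = 1)
    t = u + 0.5 * v - u * v
    return lerp(c0, c1, t)
```

At `v = 0` the two edge vertices sit `width` to either side of `v0`. At `v = 1` every value of `u` gives `t = 0.5`, so the quad collapses onto the tip `v2` and the blade tapers to a point.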