diff --git a/README.md b/README.md index 20ee451..64e6e7e 100644 --- a/README.md +++ b/README.md @@ -3,10 +3,103 @@ Vulkan Grass Rendering **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5** -* (TODO) YOUR NAME HERE -* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Aditya Hota + * [LinkedIn](https://www.linkedin.com/in/aditya-hota) +* Tested on: Windows 11, i7-8750H @ 2.20 GHz 20 GB, GTX 1070 8 GB -### (TODO: Your README) +# Overview +https://user-images.githubusercontent.com/12516225/139997820-71cbc941-aba7-4718-bdbc-8f8757408048.mp4 -*DO NOT* leave the README to the last minute! It is a crucial part of the -project, and we will not be able to grade you without a good README. + +This project involved implementing a grass renderer in Vulkan, based on the paper [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf) by Jahrmann and Wimmer. The project had three major components: rendering the grass, applying forces to control grass movement, and culling blades of grass to improve the rendering frame rate when possible. + +# Features +## Rendering and Compute Pipeline +To display blades of grass, the blades first had to be transferred to the GPU. Part of this project involved adding the blades of grass to the models buffer so that the blades can be displayed along with the static image for the base. Vulkan is a new graphics API, so it took some getting used to before I could understand exactly what was going on. I first created a descriptor set for the grass, which populates the blade information into UBOs--these are eventually used for rendering. + +Next, I had to make changes to the compute pipeline to pass the data to the compute shaders. 
Three buffers are used and need space allocated: the input blades, the retained blades (called culled in the code), and the number of blades. The compute shader takes in the input blades, computes any changes due to environmental forces, and performs culling to selectively render blades that will be perceived by the user. The selected blades are then passed on to the graphics pipeline, but all blades still have the impact of forces computed and their positions modified. + +## Grass Blade Structure +A model for the grass blades was presented in the paper, which uses Bezier curves to represent the shape of a blade. Bezier curves are also used to represent grass blades for rendering in this project. The model is replicated below: + +![](img/blade_model.jpg) + +Each Bezier curve has three control points: +* `v0`: the position of the grass blade on the geometry +* `v1`: a Bezier curve guide that is always "above" `v0` with respect to the grass blade's up vector +* `v2`: a physical guide on which we simulate forces + +Additional information is passed into the compute shader. +* `up`: the blade's up vector, which corresponds to the normal of the geometry that the grass blade resides on at `v0` +* Orientation: the orientation of the grass blade's face (as an angle offset from the x axis) +* Height: the height of the grass blade +* Width: the width of the grass blade's face +* Stiffness coefficient: the stiffness of our grass blade, which will affect the force computations on our blade + +This data is cleverly packed into four `vec4`s, such that `v0.w` holds orientation, `v1.w` holds height, `v2.w` holds width, and `up.w` holds the stiffness coefficient. + +# Modeling Forces +## Wind +A simple wind force was created, based on the cosine (along x) and sine (along z) of the current time. This makes the blades of grass move in a roughly circular fashion. + +## Gravity +Gravity drags down the tips of the blades, making them move towards the ground. 
Alone, gravity would cause all the grass to fall, so a counter force is needed to balance it. This is discussed next. + +## Restorative +The restoring force pushes the blade of grass back toward its upright rest position, allowing it to "bounce" back when the wind and gravity push the blade down. Depending on the stiffness of the blade, the recovery can be slow or fast. + +## Validation +Once all forces have been applied to the blade of grass, the end positions of the control points are validated. Specifically, we ensure that the wind and gravity do not push the blade of grass below the ground and that the blade of grass does not end up longer than its original height. + +# Optimization +Three optimizations were added to increase the frame rate. Some were more effective than others, as discussed in the performance analysis section. + +## Orientation Test +If the camera views a blade edge-on, so that only its thin edge faces the viewer, the blade is not rendered. It would not be visible to the user anyway, due to its thinness. + +https://user-images.githubusercontent.com/12516225/139998042-966158d1-739e-409d-adf4-dbe54419e8be.mp4 + +## View-Frustum Culling +If a blade is out of the camera view, it will not be rendered. The algorithm used makes it a bit difficult to see the effect (which is a good thing, because we don't want to cull blades that we can see). I didn't directly use the algorithm in the paper; I had to turn off the check for culling in the z direction since it was causing blades to be culled when moving away from the grass patch--this should be the function of distance culling instead. For demo purposes, I have made the culling more extreme, as seen below. + +https://user-images.githubusercontent.com/12516225/139998198-82aa14ca-c9c7-4f3f-9a75-83df591833f1.mp4 + +## Distance Culling +Distance culling only keeps blades that are within a certain distance of the camera. 
The maximum distance is divided into equally sized levels, where each level represents a radius away from the camera. Within each level only some of the blades are removed, with a larger fraction removed at farther levels, so that the effect looks more even. + +https://user-images.githubusercontent.com/12516225/139998364-d1364819-6588-4683-af34-0bb173f33b4f.mp4 + +# Performance Analysis +All tests below were performed with the camera situated above the plane, at an angle: `r = 20.0, theta = 45.f, phi = -45.0`. +## Impact of Number of Blades +As expected, increasing the number of blades decreases the frame rate. This is because there are more blades to compute physics for, and more blades to send through the rendering pipeline. With more than 65536 blades we can no longer make out individual blades anyway, so the figures are only helpful for seeing the FPS hit. The graph below shows how the FPS decreases exponentially as the number of blades is increased (a log2-log2 plot is used to more easily depict the changes in FPS, due to the large magnitude difference). Enabling all forms of culling gives us a nearly 4x boost in most cases! + +
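The diff prints an averaged FPS from `Scene::UpdateTime` once every `FPS_FRAME_PERIOD` (20) frames. A minimal C++ sketch of that kind of windowed counter follows; the `FpsCounter` class and its `addFrame` method are hypothetical names used only for illustration, and the sketch assumes frame times are supplied in milliseconds.

```cpp
#include <cassert>

// Hypothetical helper (not part of the project): accumulates frame times and
// reports the average FPS once per window of `framePeriod` frames.
class FpsCounter {
public:
    explicit FpsCounter(unsigned framePeriod = 20) : framePeriod_(framePeriod) {
        assert(framePeriod_ > 0);
    }

    // Adds one frame duration in milliseconds. Returns true and writes the
    // window's average FPS to `fpsOut` when the window is full; otherwise
    // returns false and leaves `fpsOut` untouched.
    bool addFrame(float frameMs, float& fpsOut) {
        ++frames_;
        elapsedMs_ += frameMs;
        if (frames_ < framePeriod_) {
            return false;
        }
        fpsOut = static_cast<float>(framePeriod_) * 1000.0f / elapsedMs_;
        frames_ = 0;
        elapsedMs_ = 0.0f;
        return true;
    }

private:
    unsigned framePeriod_;
    unsigned frames_ = 0;
    float elapsedMs_ = 0.0f;
};
```

Averaging over a window rather than reporting every frame smooths out per-frame jitter, which makes the numbers easier to compare across blade counts.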

+ +

+ +## Effect of Each Optimization +Distance culling had the greatest impact, while orientation and view-frustum culling had about the same, smaller impact. I suspect that this is because distance culling removes the largest number of blades (it has a hard cutoff at a distance). Furthermore, we actually saw a performance hit when either orientation or view-frustum culling was used alone, probably because additional computations need to be performed but only a few blades are culled. Also of note, at higher blade counts distance culling seems to have a disproportionately greater impact on the FPS. I believe this is because distance culling takes out the greatest number of blades, rather than a sparse sampling of them. + +Separate graphs are used because increasing the number of blades causes large differences in FPS magnitude, which are hard to see on a single graph. + +
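To see why distance culling removes so many blades, the distance test from `compute.comp` can be mirrored on the CPU. The sketch below is an illustrative C++ port, not code from the project: `distanceCull` and `culledCount` are hypothetical names, and the defaults (`dMax = 20`, `cullLevels = 6`) are taken from the shader in this diff.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Illustrative CPU mirror of the shader's distance test. Returns true when
// the blade with the given index is culled at projected distance `dProj`.
bool distanceCull(std::uint32_t bladeIndex, float dProj,
                  float dMax = 20.0f, int cullLevels = 6) {
    assert(dMax > 0.0f && cullLevels > 0);
    // Highest slot index that still survives at this distance; it shrinks
    // as the blade moves away and goes negative beyond dMax.
    float keepLevel = std::floor(static_cast<float>(cullLevels) * (1.0f - dProj / dMax));
    // Each blade falls into one of `cullLevels` slots based on its index.
    float slot = static_cast<float>(bladeIndex % static_cast<std::uint32_t>(cullLevels));
    return slot > keepLevel;
}

// Counts how many of the first `count` blade indices are culled at `dProj`.
int culledCount(std::uint32_t count, float dProj) {
    int culled = 0;
    for (std::uint32_t i = 0; i < count; ++i) {
        if (distanceCull(i, dProj)) {
            ++culled;
        }
    }
    return culled;
}
```

With these defaults, out of six consecutive blades none is culled at distance 0, two are culled at distance 10, five at distance 20, and all six beyond `dMax`, so the share of removed blades grows steadily with distance and becomes total past the cutoff.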

+ +

+ +

+ +

+ +

+ +

+ +

+ +

+ +# References +* Getting camera eye from view matrix + * Invert to find camera space in terms of world space and take displacement in fourth column of homogeneous matrix + * https://www.3dgep.com/understanding-the-view-matrix/ diff --git a/img/fps_1048576.png b/img/fps_1048576.png new file mode 100644 index 0000000..1dea551 Binary files /dev/null and b/img/fps_1048576.png differ diff --git a/img/fps_16777216.png b/img/fps_16777216.png new file mode 100644 index 0000000..2bb6275 Binary files /dev/null and b/img/fps_16777216.png differ diff --git a/img/fps_4096.png b/img/fps_4096.png new file mode 100644 index 0000000..bcf3f76 Binary files /dev/null and b/img/fps_4096.png differ diff --git a/img/fps_65536.png b/img/fps_65536.png new file mode 100644 index 0000000..89a6c5e Binary files /dev/null and b/img/fps_65536.png differ diff --git a/img/fps_blades.png b/img/fps_blades.png new file mode 100644 index 0000000..2e699d4 Binary files /dev/null and b/img/fps_blades.png differ diff --git a/img/videos/DistanceCulling.mp4 b/img/videos/DistanceCulling.mp4 new file mode 100644 index 0000000..6b9d236 Binary files /dev/null and b/img/videos/DistanceCulling.mp4 differ diff --git a/img/videos/OrientationCulling.mp4 b/img/videos/OrientationCulling.mp4 new file mode 100644 index 0000000..593201c Binary files /dev/null and b/img/videos/OrientationCulling.mp4 differ diff --git a/img/videos/Overview.mp4 b/img/videos/Overview.mp4 new file mode 100644 index 0000000..ce78f10 Binary files /dev/null and b/img/videos/Overview.mp4 differ diff --git a/img/videos/ViewFrustumCulling.mp4 b/img/videos/ViewFrustumCulling.mp4 new file mode 100644 index 0000000..bee4d94 Binary files /dev/null and b/img/videos/ViewFrustumCulling.mp4 differ diff --git a/src/Blades.cpp b/src/Blades.cpp index 80e3d76..9a8a3a8 100644 --- a/src/Blades.cpp +++ b/src/Blades.cpp @@ -45,7 +45,7 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode indirectDraw.firstInstance = 0; 
BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory); - BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); + BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory); BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory); } diff --git a/src/Renderer.cpp b/src/Renderer.cpp index b445d04..1efaf63 100644 --- a/src/Renderer.cpp +++ b/src/Renderer.cpp @@ -195,9 +195,43 @@ void Renderer::CreateTimeDescriptorSetLayout() { } void Renderer::CreateComputeDescriptorSetLayout() { - // TODO: Create the descriptor set layout for the compute pipeline + // DONE: Create the descriptor set layout for the compute pipeline // Remember this is like a class definition stating why types of information // will be stored at each binding + // Describe the binding of the descriptor set layout + + VkDescriptorSetLayoutBinding grassBladesLayoutBinding = {}; + grassBladesLayoutBinding.binding = 0; + grassBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + grassBladesLayoutBinding.descriptorCount = 1; + grassBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + grassBladesLayoutBinding.pImmutableSamplers = nullptr; + + VkDescriptorSetLayoutBinding culledBladesLayoutBinding = {}; + culledBladesLayoutBinding.binding = 1; + culledBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + culledBladesLayoutBinding.descriptorCount = 1; + culledBladesLayoutBinding.stageFlags = 
VK_SHADER_STAGE_COMPUTE_BIT; + culledBladesLayoutBinding.pImmutableSamplers = nullptr; + + VkDescriptorSetLayoutBinding numBladesLayoutBinding = {}; + numBladesLayoutBinding.binding = 2; + numBladesLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + numBladesLayoutBinding.descriptorCount = 1; + numBladesLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT; + numBladesLayoutBinding.pImmutableSamplers = nullptr; + + std::vector<VkDescriptorSetLayoutBinding> bindings = { grassBladesLayoutBinding, culledBladesLayoutBinding, numBladesLayoutBinding }; + + // Create the descriptor set layout + VkDescriptorSetLayoutCreateInfo layoutInfo = {}; + layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO; + layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size()); + layoutInfo.pBindings = bindings.data(); + + if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) { + throw std::runtime_error("Failed to create descriptor set layout"); + } } void Renderer::CreateDescriptorPool() { @@ -206,16 +240,18 @@ void Renderer::CreateDescriptorPool() { // Camera { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1}, - // Models + Blades + // Models + Blades samplers { VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER , static_cast<uint32_t>(scene->GetModels().size() + scene->GetBlades().size()) }, - // Models + Blades + // Models + Blades UBOs { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , static_cast<uint32_t>(scene->GetModels().size() + scene->GetBlades().size()) }, // Time (compute) { VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 }, - // TODO: Add any additional types and counts of descriptors you will need to allocate + // DONE: Add any additional types and counts of descriptors you will need to allocate + // Blades compute buffers + { VK_DESCRIPTOR_TYPE_STORAGE_BUFFER , static_cast<uint32_t>(3 * scene->GetBlades().size()) }, }; VkDescriptorPoolCreateInfo poolInfo = {}; @@ -318,8 +354,44 @@ void Renderer::CreateModelDescriptorSets() { } void Renderer::CreateGrassDescriptorSets() { - // TODO: 
Create Descriptor sets for the grass. + // DONE: Create Descriptor sets for the grass. // This should involve creating descriptor sets which point to the model matrix of each group of grass blades + grassDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor sets (one layout entry per set being allocated) + std::vector<VkDescriptorSetLayout> layouts(grassDescriptorSets.size(), modelDescriptorSetLayout); + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size()); + allocInfo.pSetLayouts = layouts.data(); + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size()); // Contains UBO for Grass + std::vector<VkDescriptorBufferInfo> grassBufferInfos(grassDescriptorSets.size()); // Must outlive the loop: vkUpdateDescriptorSets reads them afterwards + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo& grassBufferInfo = grassBufferInfos[i]; + grassBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer(); + grassBufferInfo.offset = 0; + grassBufferInfo.range = sizeof(ModelBufferObject); + + descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[i].dstSet = grassDescriptorSets[i]; + descriptorWrites[i].dstBinding = 0; + descriptorWrites[i].dstArrayElement = 0; + descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER; + descriptorWrites[i].descriptorCount = 1; + descriptorWrites[i].pBufferInfo = &grassBufferInfo; + descriptorWrites[i].pImageInfo = nullptr; + descriptorWrites[i].pTexelBufferView = nullptr; + } + + // Update descriptor sets + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateTimeDescriptorSet() { @@ -358,8 +430,74 @@ void Renderer::CreateTimeDescriptorSet() { } void Renderer::CreateComputeDescriptorSets() { - // TODO: Create Descriptor sets for the 
compute pipeline - // The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades + // DONE: Create Descriptor sets for the compute pipeline + // The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades + computeDescriptorSets.resize(scene->GetBlades().size()); + + // Describe the descriptor sets (one layout entry per set being allocated) + std::vector<VkDescriptorSetLayout> layouts(computeDescriptorSets.size(), computeDescriptorSetLayout); + VkDescriptorSetAllocateInfo allocInfo = {}; + allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO; + allocInfo.descriptorPool = descriptorPool; + allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size()); + allocInfo.pSetLayouts = layouts.data(); + + // Allocate descriptor sets + if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) { + throw std::runtime_error("Failed to allocate descriptor set"); + } + + std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size()); + std::vector<VkDescriptorBufferInfo> bufferInfos(3 * computeDescriptorSets.size()); // Must outlive the loop: vkUpdateDescriptorSets reads them afterwards + for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) { + VkDescriptorBufferInfo& grassBladesBufferInfo = bufferInfos[3 * i + 0]; + grassBladesBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer(); + grassBladesBufferInfo.offset = 0; + grassBladesBufferInfo.range = NUM_BLADES * sizeof(Blade); + + VkDescriptorBufferInfo& culledBladesBufferInfo = bufferInfos[3 * i + 1]; + culledBladesBufferInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer(); + culledBladesBufferInfo.offset = 0; + culledBladesBufferInfo.range = NUM_BLADES * sizeof(Blade); + + VkDescriptorBufferInfo& numBladesBufferInfo = bufferInfos[3 * i + 2]; + numBladesBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer(); + numBladesBufferInfo.offset = 0; + numBladesBufferInfo.range = sizeof(BladeDrawIndirect); + + descriptorWrites[3 * i + 0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 0].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 
0].dstBinding = 0; + descriptorWrites[3 * i + 0].dstArrayElement = 0; + descriptorWrites[3 * i + 0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 0].descriptorCount = 1; + descriptorWrites[3 * i + 0].pBufferInfo = &grassBladesBufferInfo; + descriptorWrites[3 * i + 0].pImageInfo = nullptr; + descriptorWrites[3 * i + 0].pTexelBufferView = nullptr; + + descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 1].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 1].dstBinding = 1; + descriptorWrites[3 * i + 1].dstArrayElement = 0; + descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 1].descriptorCount = 1; + descriptorWrites[3 * i + 1].pBufferInfo = &culledBladesBufferInfo; + descriptorWrites[3 * i + 1].pImageInfo = nullptr; + descriptorWrites[3 * i + 1].pTexelBufferView = nullptr; + + descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; + descriptorWrites[3 * i + 2].dstSet = computeDescriptorSets[i]; + descriptorWrites[3 * i + 2].dstBinding = 2; + descriptorWrites[3 * i + 2].dstArrayElement = 0; + descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER; + descriptorWrites[3 * i + 2].descriptorCount = 1; + descriptorWrites[3 * i + 2].pBufferInfo = &numBladesBufferInfo; + descriptorWrites[3 * i + 2].pImageInfo = nullptr; + descriptorWrites[3 * i + 2].pTexelBufferView = nullptr; + } + + // Update descriptor sets + vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr); } void Renderer::CreateGraphicsPipeline() { @@ -716,8 +854,8 @@ void Renderer::CreateComputePipeline() { computeShaderStageInfo.module = computeShaderModule; computeShaderStageInfo.pName = "main"; - // TODO: Add the compute dsecriptor set layout you create to this list - std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout }; 
+ // DONE: Add the compute descriptor set layout you create to this list + std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout }; // Create pipeline layout VkPipelineLayoutCreateInfo pipelineLayoutInfo = {}; @@ -795,11 +933,11 @@ void Renderer::CreateFrameResources() { ); depthImageView = Image::CreateView(device, depthImage, depthFormat, VK_IMAGE_ASPECT_DEPTH_BIT); - + // Transition the image for use as depth-stencil Image::TransitionLayout(device, graphicsCommandPool, depthImage, depthFormat, VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL); - + // CREATE FRAMEBUFFERS framebuffers.resize(swapChain->GetCount()); for (size_t i = 0; i < swapChain->GetCount(); i++) { @@ -883,7 +1021,13 @@ void Renderer::RecordComputeCommandBuffer() { // Bind descriptor set for time uniforms vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr); - // TODO: For each group of blades bind its descriptor set and dispatch + // DONE: For each group of blades bind its descriptor set and dispatch + for (size_t i = 0; i < computeDescriptorSets.size(); i++) + { + vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[i], 0, nullptr); + vkCmdDispatch(computeCommandBuffer, NUM_BLADES / WORKGROUP_SIZE, 1, 1); + } + // ~ End recording ~ if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) { @@ -975,14 +1119,15 @@ void Renderer::RecordCommandBuffers() { for (uint32_t j = 0; j < scene->GetBlades().size(); ++j) { VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() }; VkDeviceSize offsets[] = { 0 }; - // TODO: Uncomment this when the buffers are populated - // vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); + // DONE: Uncomment this when the buffers are populated + 
vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets); - // TODO: Bind the descriptor set for each grass blades model + // DONE: Bind the descriptor set for each grass blades model + vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr); // Draw - // TODO: Uncomment this when the buffers are populated - // vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); + // DONE: Uncomment this when the buffers are populated + vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect)); } // End render pass @@ -1041,11 +1186,11 @@ void Renderer::Frame() { Renderer::~Renderer() { vkDeviceWaitIdle(logicalDevice); - // TODO: destroy any resources you created + // DONE: destroy any resources you created vkFreeCommandBuffers(logicalDevice, graphicsCommandPool, static_cast(commandBuffers.size()), commandBuffers.data()); vkFreeCommandBuffers(logicalDevice, computeCommandPool, 1, &computeCommandBuffer); - + vkDestroyPipeline(logicalDevice, graphicsPipeline, nullptr); vkDestroyPipeline(logicalDevice, grassPipeline, nullptr); vkDestroyPipeline(logicalDevice, computePipeline, nullptr); @@ -1057,6 +1202,7 @@ Renderer::~Renderer() { vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr); vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr); + vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr); vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr); diff --git a/src/Renderer.h b/src/Renderer.h index 95e025f..f241ccd 100644 --- a/src/Renderer.h +++ b/src/Renderer.h @@ -17,6 +17,7 @@ class Renderer { void CreateCameraDescriptorSetLayout(); void CreateModelDescriptorSetLayout(); + void 
CreateGrassDescriptorSetLayout(); void CreateTimeDescriptorSetLayout(); void CreateComputeDescriptorSetLayout(); @@ -55,13 +56,17 @@ class Renderer { VkDescriptorSetLayout cameraDescriptorSetLayout; VkDescriptorSetLayout modelDescriptorSetLayout; + VkDescriptorSetLayout grassDescriptorSetLayout; VkDescriptorSetLayout timeDescriptorSetLayout; + VkDescriptorSetLayout computeDescriptorSetLayout; VkDescriptorPool descriptorPool; VkDescriptorSet cameraDescriptorSet; std::vector modelDescriptorSets; + std::vector grassDescriptorSets; VkDescriptorSet timeDescriptorSet; + std::vector computeDescriptorSets; VkPipelineLayout graphicsPipelineLayout; VkPipelineLayout grassPipelineLayout; diff --git a/src/Scene.cpp b/src/Scene.cpp index 86894f2..bdc2707 100644 --- a/src/Scene.cpp +++ b/src/Scene.cpp @@ -1,6 +1,8 @@ #include "Scene.h" #include "BufferUtils.h" +#define FPS_FRAME_PERIOD 20 + Scene::Scene(Device* device) : device(device) { BufferUtils::CreateBuffer(device, sizeof(Time), VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, timeBuffer, timeBufferMemory); vkMapMemory(device->GetVkDevice(), timeBufferMemory, 0, sizeof(Time), 0, &mappedData); @@ -24,10 +26,26 @@ void Scene::AddBlades(Blades* blades) { } void Scene::UpdateTime() { + static unsigned int frames_count = 0; + static float last_frames_time = 0.f; + high_resolution_clock::time_point currentTime = high_resolution_clock::now(); duration nextDeltaTime = duration_cast>(currentTime - startTime); startTime = currentTime; + frames_count++; + last_frames_time += duration_cast(nextDeltaTime).count(); + + if (frames_count == FPS_FRAME_PERIOD) + { + float fps = FPS_FRAME_PERIOD * 1000 / last_frames_time; + std::cout << fps << std::endl; + + frames_count = 0; + last_frames_time = 0.f; + } + + time.deltaTime = nextDeltaTime.count(); time.totalTime += time.deltaTime; diff --git a/src/Scene.h b/src/Scene.h index 7699d78..e838eff 100644 --- a/src/Scene.h +++ 
b/src/Scene.h @@ -2,6 +2,7 @@ #include #include +#include #include "Model.h" #include "Blades.h" diff --git a/src/shaders/compute.comp b/src/shaders/compute.comp index 0fd0224..f1ad5d6 100644 --- a/src/shaders/compute.comp +++ b/src/shaders/compute.comp @@ -2,6 +2,11 @@ #extension GL_ARB_separate_shader_objects : enable #define WORKGROUP_SIZE 32 + +#define ORIENTATION_CULL 1 +#define VIEW_FRUST_CULL 1 +#define DISTANCE_CULL 1 + layout(local_size_x = WORKGROUP_SIZE, local_size_y = 1, local_size_z = 1) in; layout(set = 0, binding = 0) uniform CameraBufferObject { @@ -21,20 +26,31 @@ struct Blade { vec4 up; }; -// TODO: Add bindings to: +// DONE: Add bindings to: // 1. Store the input blades // 2. Write out the culled blades // 3. Write the total number of blades remaining // The project is using vkCmdDrawIndirect to use a buffer as the arguments for a draw call // This is sort of an advanced feature so we've showed you what this buffer should look like -// -// layout(set = ???, binding = ???) 
buffer NumBlades { -// uint vertexCount; // Write the number of blades remaining here -// uint instanceCount; // = 1 -// uint firstVertex; // = 0 -// uint firstInstance; // = 0 -// } numBlades; + +// Store the input blades +layout(set = 2, binding = 0) buffer GrassBlades { + Blade grassBlades[]; +} grassBlades; + +// Write out the culled blades +layout(set = 2, binding = 1) buffer CulledBlades { + Blade culledBlades[]; +} culledBlades; + +// Write the total number of blades remaining +layout(set = 2, binding = 2) buffer NumBlades { + uint vertexCount; // Write the number of blades remaining here + uint instanceCount; // = 1 + uint firstVertex; // = 0 + uint firstInstance; // = 0 +} numBlades; bool inBounds(float value, float bounds) { return (value >= -bounds) && (value <= bounds); @@ -43,14 +59,117 @@ bool inBounds(float value, float bounds) { void main() { // Reset the number of blades to 0 if (gl_GlobalInvocationID.x == 0) { - // numBlades.vertexCount = 0; + numBlades.vertexCount = 0; } barrier(); // Wait till all threads reach this point - // TODO: Apply forces on every blade and update the vertices in the buffer + uint threadIdx = gl_GlobalInvocationID.x; + Blade b = grassBlades.grassBlades[threadIdx]; + + // DONE: Apply forces on every blade and update the vertices in the buffer + float orient = b.v0.w; + float height = b.v1.w; + float width = b.v2.w; + float stiff = b.up.w; + + vec3 v0_vec = b.v0.xyz; + vec3 v1_vec = b.v1.xyz; + vec3 v2_vec = b.v2.xyz; + vec3 up_vec = b.up.xyz; + + vec3 orient_vec = normalize(vec3(cos(orient), 0.f, sin(orient))); + vec3 frface_vec = normalize(cross(orient_vec, up_vec)); + + // Gravity force + vec4 direc_gE = vec4(0.f, -1.f, 0.f, 3.f); + vec3 force_gE = normalize(direc_gE.xyz) * direc_gE.w; + vec3 force_gF = 0.25 * length(force_gE) * frface_vec; + vec3 force_g = force_gE + force_gF; + + // Recovery force + vec3 v2_init = v0_vec + normalize(up_vec) * height; + vec3 force_r = (v2_init - v2_vec) * stiff; + + // Wind force + vec3 
wind_dir = vec3(cos(4 * totalTime), 0.f, sin(5 * totalTime)); + float fd = 1 - abs(dot(normalize(wind_dir), normalize(v2_vec - v0_vec))); + float fr = dot(v2_vec - v0_vec, up_vec) / height; + float wind_alignment = fd * fr; + vec3 force_w = wind_dir * wind_alignment; + + // Translate v2 as a result of forces + vec3 tv2_vec = (force_g + force_r + force_w) * deltaTime; + v2_vec += tv2_vec; + + // Ensure v2 does not get pushed into the ground + v2_vec -= up_vec * min(dot(v2_vec - v0_vec, up_vec), 0.f); - // TODO: Cull blades that are too far away or not in the camera frustum and write them + // Adjust v1 as result of new v2 position + float l_proj = length(v2_vec - v0_vec - up_vec * dot(v2_vec - v0_vec, up_vec)); + v1_vec = v0_vec + height * up_vec * max(1.f - l_proj / height, 0.05 * max(l_proj / height, 1.f)); + + // Validate that length of Bezier curve is not larger than height + float n = 2.f; // Degree of Bezier curve + float L0 = length(v2_vec - v0_vec); + float L1 = length(v2_vec - v1_vec) + length(v1_vec - v0_vec); + float L_blade = (2.f * L0 + (n - 1.f) * L1) / (n + 1.f); + float r = height / L_blade; + v2_vec = v0_vec + r * (v2_vec - v0_vec); // Same as v1_corr + r * (v2 - v1) using the uncorrected v1 + v1_vec = v0_vec + r * (v1_vec - v0_vec); + + // DONE: Cull blades that are too far away or not in the camera frustum and write them // to the culled blades buffer // Note: to do this, you will need to use an atomic operation to read and update numBlades.vertexCount // You want to write the visible blades to the buffer without write conflicts between threads + bool cull = false; + + // Compute camera origin + vec3 cam_eye = inverse(camera.view)[3].xyz; + +#if ORIENTATION_CULL + // Orientation test + float angle_camera_blade = dot(frface_vec, normalize(v0_vec - cam_eye)); + cull = cull || angle_camera_blade > 0.9; +#endif + +#if VIEW_FRUST_CULL + // View-frustum test + float tol = 20.f; + vec3 md = 0.25 * v0_vec + 0.5 * v1_vec + 0.25 * v2_vec; + mat4 VP = camera.proj * camera.view; + + vec4 v0_ndc = VP * vec4(v0_vec, 
1.f); + vec4 md_ndc = VP * vec4(md, 1.f); + vec4 v2_ndc = VP * vec4(v2_vec, 1.f); + + float v0_lim = tol; // v0_ndc.w + tol; (these did not seem to make a difference) + float md_lim = tol; // md_ndc.w + tol; + float v2_lim = tol; // v2_ndc.w + tol; + + bool keep_v0 = (inBounds(v0_ndc.x, v0_lim)) && + (inBounds(v0_ndc.y, v0_lim)); + bool keep_md = (inBounds(md_ndc.x, md_lim)) && + (inBounds(md_ndc.y, md_lim)); + bool keep_v2 = (inBounds(v2_ndc.x, v2_lim)) && + (inBounds(v2_ndc.y, v2_lim)); + + cull = cull || (!keep_v0 && !keep_md && !keep_v2); +#endif + +#if DISTANCE_CULL + // Distance test + float d_max = 20.f; + int cull_levels = 6; + float d_proj = length(v0_vec - cam_eye - up_vec * dot(v0_vec - cam_eye, up_vec)); + + cull = cull || (mod(threadIdx, cull_levels) > floor(cull_levels * (1.f - d_proj / d_max))); +#endif + + grassBlades.grassBlades[threadIdx].v1.xyz = v1_vec; + grassBlades.grassBlades[threadIdx].v2.xyz = v2_vec; + + if (!cull) + { + culledBlades.culledBlades[atomicAdd(numBlades.vertexCount, 1)] = grassBlades.grassBlades[threadIdx]; + } } diff --git a/src/shaders/grass.frag b/src/shaders/grass.frag index c7df157..18a6c1d 100644 --- a/src/shaders/grass.frag +++ b/src/shaders/grass.frag @@ -6,12 +6,17 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare fragment shader inputs +// DONE: Declare fragment shader inputs +layout(location = 0) in vec3 nor; +layout(location = 1) in float param_height; layout(location = 0) out vec4 outColor; void main() { - // TODO: Compute fragment color + // DONE: Compute fragment color + vec4 light_green = vec4(127.f, 212.f, 78.f, 255.f) / 255.f; + vec4 dark_green = vec4(26.f, 102.f, 26.f, 255.f) / 255.f; + vec4 color = mix(light_green, dark_green, 1 - param_height); - outColor = vec4(1.0); + outColor = color; } diff --git a/src/shaders/grass.tesc b/src/shaders/grass.tesc index f9ffd07..397f107 100644 --- a/src/shaders/grass.tesc +++ b/src/shaders/grass.tesc @@ -9,18 +9,31 @@ 
layout(set = 0, binding = 0) uniform CameraBufferObject { } camera; // TODO: Declare tessellation control shader inputs and outputs +layout(location = 0) in vec4 in_v0[]; +layout(location = 1) in vec4 in_v1[]; +layout(location = 2) in vec4 in_v2[]; +layout(location = 3) in vec4 in_up[]; + +layout(location = 0) out vec4 out_v0[]; +layout(location = 1) out vec4 out_v1[]; +layout(location = 2) out vec4 out_v2[]; +layout(location = 3) out vec4 out_up[]; void main() { // Don't move the origin location of the patch gl_out[gl_InvocationID].gl_Position = gl_in[gl_InvocationID].gl_Position; - // TODO: Write any shader outputs + // DONE: Write any shader outputs + out_v0[gl_InvocationID] = in_v0[gl_InvocationID]; + out_v1[gl_InvocationID] = in_v1[gl_InvocationID]; + out_v2[gl_InvocationID] = in_v2[gl_InvocationID]; + out_up[gl_InvocationID] = in_up[gl_InvocationID]; - // TODO: Set level of tesselation - // gl_TessLevelInner[0] = ??? - // gl_TessLevelInner[1] = ??? - // gl_TessLevelOuter[0] = ??? - // gl_TessLevelOuter[1] = ??? - // gl_TessLevelOuter[2] = ??? - // gl_TessLevelOuter[3] = ??? 
+ // DONE: Set level of tesselation + gl_TessLevelInner[0] = 7.0; + gl_TessLevelInner[1] = 7.0; + gl_TessLevelOuter[0] = 7.0; + gl_TessLevelOuter[1] = 7.0; + gl_TessLevelOuter[2] = 7.0; + gl_TessLevelOuter[3] = 7.0; } diff --git a/src/shaders/grass.tese b/src/shaders/grass.tese index 751fff6..0029c51 100644 --- a/src/shaders/grass.tese +++ b/src/shaders/grass.tese @@ -8,11 +8,40 @@ layout(set = 0, binding = 0) uniform CameraBufferObject { mat4 proj; } camera; -// TODO: Declare tessellation evaluation shader inputs and outputs +// DONE: Declare tessellation evaluation shader inputs and outputs +layout(location = 0) in vec4 in_v0[]; +layout(location = 1) in vec4 in_v1[]; +layout(location = 2) in vec4 in_v2[]; +layout(location = 3) in vec4 in_up[]; + +layout(location = 0) out vec3 nor; +layout(location = 1) out float param_height; void main() { float u = gl_TessCoord.x; float v = gl_TessCoord.y; - // TODO: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade + // DONE: Use u and v to parameterize along the grass blade and output positions for each vertex of the grass blade + vec3 v0 = in_v0[0].xyz; + vec3 v1 = in_v1[0].xyz; + vec3 v2 = in_v2[0].xyz; + float orient = in_v0[0].w; + float width = in_v2[0].w; + vec3 t1 = normalize(vec3(cos(orient), 0.f, sin(orient))); + + // De Casteljau's algorithm + vec3 a = v0 + v * (v1 - v0); + vec3 b = v1 + v * (v2 - v1); + vec3 c = a + v * (b - a); + vec3 c0 = c - width * t1; + vec3 c1 = c + width * t1; + vec3 t0 = normalize(b - a); + vec3 n = normalize(cross(t0, t1)); + + // Generate triangle shape + float t = u + 0.5 * v - u * v; + vec3 p = (1 - t) * c0 + t * c1; + gl_Position = camera.proj * camera.view * vec4(p, 1.f); + + param_height = v; } diff --git a/src/shaders/grass.vert b/src/shaders/grass.vert index db9dfe9..7fae0c7 100644 --- a/src/shaders/grass.vert +++ b/src/shaders/grass.vert @@ -1,4 +1,3 @@ - #version 450 #extension GL_ARB_separate_shader_objects : enable @@ -6,12 
+5,25 @@ layout(set = 1, binding = 0) uniform ModelBufferObject { mat4 model; }; -// TODO: Declare vertex shader inputs and outputs +// DONE: Declare vertex shader inputs and outputs +layout(location = 0) in vec4 in_v0; +layout(location = 1) in vec4 in_v1; +layout(location = 2) in vec4 in_v2; +layout(location = 3) in vec4 in_up; + +layout(location = 0) out vec4 out_v0; +layout(location = 1) out vec4 out_v1; +layout(location = 2) out vec4 out_v2; +layout(location = 3) out vec4 out_up; out gl_PerVertex { vec4 gl_Position; }; void main() { - // TODO: Write gl_Position and any other shader outputs + // DONE: Write gl_Position and any other shader outputs + out_v0 = in_v0; + out_v1 = in_v1; + out_v2 = in_v2; + out_up = in_up; }