107 changes: 102 additions & 5 deletions README.md
@@ -3,10 +3,107 @@ Vulkan Grass Rendering

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 5**

* (TODO) YOUR NAME HERE
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Christine Kneer
* https://www.linkedin.com/in/christine-kneer/
* https://www.christinekneer.com/
* Tested on: Windows 11, i7-13700HX @ 2.1GHz 32GB, RTX 4060 8GB (Personal Laptop)

### (TODO: Your README)
## Part 1: Introduction

*DO NOT* leave the README to the last minute! It is a crucial part of the
project, and we will not be able to grade you without a good README.
In this project, I used Vulkan to implement a grass simulator and renderer based on the paper [Responsive Real-Time Grass Rendering for General 3D Scenes](https://www.cg.tuwien.ac.at/research/publications/2017/JAHRMANN-2017-RRTG/JAHRMANN-2017-RRTG-draft.pdf). The paper leverages compute shaders and tessellation to simulate and render physically accurate grass in real time.

<p align="center">
<img width="600" src = "https://github.com/user-attachments/assets/7d5edb52-b6e3-462f-84ae-7de291d6aeb4">
</p>

### Part 1.1: The Grass Blade Model

Based on the paper, each grass blade is represented as a quadratic Bezier curve with three control points.

<p align="center">
<img width="350" alt="image" src="https://github.com/user-attachments/assets/af06d0dd-f42e-40b6-97c0-ce662c0169a1">
</p>

The three control points are:
* `v0`: the position of the grass blade on the geometry
* `v1`: a Bezier curve guide that is always "above" `v0` with respect to the grass blade's up vector (explained soon)
* `v2`: a physical guide on which we simulate forces

We also need to store per-blade characteristics that will help us simulate and tessellate our grass blades correctly.
* `up`: the blade's up vector, which corresponds to the normal of the geometry that the grass blade resides on at `v0`
* Orientation: the orientation of the grass blade's face
* Height: the height of the grass blade
* Width: the width of the grass blade's face
* Stiffness coefficient: the stiffness of our grass blade, which will affect the force computations on our blade


### Part 1.2: Simulating Forces

Forces (gravity, recovery, and wind) are applied to the grass blades through their Bezier curve representation, using the `v2` guide described above.

|![without](https://github.com/user-attachments/assets/11b952b6-9810-440f-8e23-23e25d61e814)|![with](https://github.com/user-attachments/assets/8cd90295-49af-443b-9b64-cb8183d82b0c)|
|:--:|:--:|
|*Without Physics*|*With Physics*|
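
The force step can be sketched on the CPU as follows. This is a simplified illustration based on my reading of the paper, not the compute shader used in this project: the function name `stepTip`, the straight-up rest pose, and passing wind in as a ready-made vector are all assumptions, and the paper's later corrections of `v1` and the blade length are left out.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical minimal 3D vector used only for this sketch.
struct Vec3 {
    float x, y, z;
};

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 a, float s) { return {a.x * s, a.y * s, a.z * s}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
float length(Vec3 a) { return std::sqrt(dot(a, a)); }

// Move the v2 guide by one time step under gravity, recovery, and wind.
// `up` and `front` are assumed to be unit vectors; `wind` is an already computed force.
Vec3 stepTip(Vec3 v0, Vec3 v2, Vec3 up, Vec3 front,
             float height, float stiffness, Vec3 wind, float dt) {
    // Gravity: an environmental term plus a smaller term along the blade's front direction.
    const Vec3 gravityEnv{0.0f, -9.8f, 0.0f};
    Vec3 gravityFront = front * (0.25f * length(gravityEnv));
    Vec3 gravity = gravityEnv + gravityFront;

    // Recovery: pulls v2 back toward its rest position (assumed here: straight up from v0).
    Vec3 restV2 = v0 + up * height;
    Vec3 recovery = (restV2 - v2) * stiffness;

    // Apply the total force as a translation of v2.
    Vec3 moved = v2 + (gravity + recovery + wind) * dt;

    // Keep v2 from sinking below the ground plane through v0.
    float below = std::min(dot(up, moved - v0), 0.0f);
    return moved - up * below;
}
```

A stiffer blade (larger `stiffness`) snaps back to its rest pose more strongly, which is why the stiffness coefficient is stored per blade.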

### Part 1.3: Culling

To further optimize the simulator for real time, we cull grass blades that do not need to be rendered, for the following reasons.
* **Orientation Culling**: Cull grass blades whose front face direction is perpendicular to the camera's view vector. Such a blade is seen edge-on, so it has effectively no visible width.
* **View-Frustum Culling**: Cull grass blades that are outside the view frustum and therefore cannot be seen by the camera.
* **Distance Culling**: Cull grass blades that are far enough from the camera that they would end up smaller than a pixel.

|![ori_cull](https://github.com/user-attachments/assets/3a91c1f4-1ee8-41e2-a8ba-362de6151707)|![view_cull](https://github.com/user-attachments/assets/379cff28-624f-42b5-973f-4bf8c623eb3e)|![dist_cull](https://github.com/user-attachments/assets/a6800bb9-f259-447c-b91f-505b2f5ec2bb)|
|:--:|:--:|:--:|
|*Orientation Culling*|*View-Frustum Culling*|*Distance Culling*|

**Note**: The above demos were produced with enhanced parameters to better showcase the features.
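
The three tests can be sketched in C++ as below. This is only an illustration of the ideas, not the compute shader from this project: the function names, the thresholds passed as parameters, and the bucket scheme for distance culling (which follows my reading of the paper) are assumptions.

```cpp
#include <cmath>

// Hypothetical minimal vector types used only for this sketch.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Orientation culling: the blade is seen edge-on when its front-face normal is
// (nearly) perpendicular to the view direction. Both vectors are assumed normalized.
bool cullByOrientation(Vec3 viewDir, Vec3 frontNormal, float threshold) {
    return std::fabs(dot(viewDir, frontNormal)) < threshold;
}

// View-frustum culling: test a clip-space point against the frustum, widened by a
// tolerance so blades near the border are kept. Using [-h, h] on every axis is a
// conservative simplification.
bool cullByFrustum(Vec4 clip, float tolerance) {
    float h = clip.w + tolerance;
    auto outside = [h](float v) { return v < -h || v > h; };
    return outside(clip.x) || outside(clip.y) || outside(clip.z);
}

// Distance culling: split blades into `buckets` groups by index and drop more
// groups the farther away the blade is. `buckets` must be greater than zero.
bool cullByDistance(unsigned bladeIndex, float distance, float maxDistance, unsigned buckets) {
    if (distance >= maxDistance) {
        return true;
    }
    float keepFraction = 1.0f - distance / maxDistance;
    unsigned kept = static_cast<unsigned>(std::floor(buckets * keepFraction));
    return (bladeIndex % buckets) >= kept;
}
```

A blade would be skipped if any of the three functions returns `true`; testing more than one point per blade in the frustum test would reduce popping at the screen edges.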

### Part 1.4: Tessellation

Finally, the Bezier curves need to be tessellated into polygons so they can be processed by the grass graphics pipeline. In this simulator, I chose to tessellate into triangles. The tessellation level is a function of how far the grass blade is from the camera, because farther objects need less detail to be represented accurately.

|![LOD](https://github.com/user-attachments/assets/1feb473a-29e5-44d5-bdd9-89951feea74a)|
|:--:|
|*Dynamic LOD*|

**Note**: The above demo was produced with enhanced parameters to better showcase the feature.

## Part 2: Performance Analysis

In this part, we discuss the performance of our simulator under different performance improvement techniques.

### Part 2.1: Culling vs # of Grass Blades

|![chart (4)](https://github.com/user-attachments/assets/3b0d9f37-a470-40d6-8a67-9e8e8d784420)|
|:--:|
|*Hardcoded Tessellation Level = 8*|

As the number of grass blades increases, the FPS drops significantly both with and without culling. This is expected, since more blades means more computational workload in the compute shader. However, the diagram above shows a consistent performance boost from culling, because culled blades never reach the graphics pipeline.

It is also interesting to note that the performance benefit of culling becomes more significant as the number of grass blades increases, although this may not be obvious from the graph itself.

With **2^10** blades, culling increases the FPS from 2300 to 2850. With **2^18** blades, culling increases the FPS from 26 to 48. At first glance, a 550 FPS increase looks more prominent than a 22 FPS increase. However, the gain is more meaningful in the low-FPS scenario, both relatively (about 1.85x versus about 1.24x) and in terms of frame time.

Here's the math to clarify:
* At 2300 FPS, the frame time is approximately 1000/2300 = 0.435 ms.
* At 2850 FPS, the frame time is approximately 1000/2850 = 0.351 ms.
* **The difference in frame time is 0.435 − 0.351 = 0.084 ms, which is very small.**

Now, consider the case of **lower FPS**:
* At 26 FPS, the frame time is approximately 1000/26 = 38.46 ms.
* At 48 FPS, the frame time is approximately 1000/48 = 20.83 ms.
* **The difference in frame time is a whopping 38.46 - 20.83 = 17.63 ms, which is MUCH larger.**

This means that the benefit of culling grows with the number of grass blades, which is also expected: with more blades in the scene, more blades are likely to be culled.

### Part 2.2: Culling Methods

As discussed in Part 2.1, culling matters more at high blade counts, so let us now compare the three culling methods at 2^18 grass blades.

|![chart (5)](https://github.com/user-attachments/assets/b6006729-b663-4c6e-823a-df1b60456c39)|
|:--:|
|*2^18 Grass Blades, Hardcoded Tessellation Level = 8*|

As seen from the graph above, all three culling methods improve performance, to different extents. View-frustum culling seems to have less of an impact than orientation and distance culling, and the three combined give the best performance. This is expected, since each method culls blades according to a different criterion, so combining them culls the most blades.

However, it should be noted that this comparison is not rigorous, since each culling method has tunable parameters. These parameters, along with the camera position and orientation, affect how much each method improves performance.
3 changes: 2 additions & 1 deletion src/Blades.cpp
@@ -45,7 +45,8 @@ Blades::Blades(Device* device, VkCommandPool commandPool, float planeDim) : Mode
indirectDraw.firstInstance = 0;

BufferUtils::CreateBufferFromData(device, commandPool, blades.data(), NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, bladesBuffer, bladesBufferMemory);
BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory);
//Used as Vertex buffer
BufferUtils::CreateBuffer(device, NUM_BLADES * sizeof(Blade), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, culledBladesBuffer, culledBladesBufferMemory);
BufferUtils::CreateBufferFromData(device, commandPool, &indirectDraw, sizeof(BladeDrawIndirect), VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, numBladesBuffer, numBladesBufferMemory);
}

148 changes: 143 additions & 5 deletions src/Renderer.cpp
@@ -198,6 +198,26 @@ void Renderer::CreateComputeDescriptorSetLayout() {
// TODO: Create the descriptor set layout for the compute pipeline
// Remember this is like a class definition stating what types of information
// will be stored at each binding
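// All three bindings are storage buffers used by the compute shader:
// binding 0 = input blades, binding 1 = culled blades, binding 2 = number of blades (indirect draw arguments)
std::vector<VkDescriptorSetLayoutBinding> bindings = {};
for (uint32_t i = 0; i < 3; ++i)
{
VkDescriptorSetLayoutBinding storageLayoutBinding = {};
storageLayoutBinding.binding = i;
storageLayoutBinding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
storageLayoutBinding.descriptorCount = 1;
storageLayoutBinding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
storageLayoutBinding.pImmutableSamplers = nullptr;
bindings.push_back(storageLayoutBinding);
}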

VkDescriptorSetLayoutCreateInfo layoutInfo = {};
layoutInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
layoutInfo.bindingCount = static_cast<uint32_t>(bindings.size());
layoutInfo.pBindings = bindings.data();

if (vkCreateDescriptorSetLayout(logicalDevice, &layoutInfo, nullptr, &computeDescriptorSetLayout) != VK_SUCCESS) {
throw std::runtime_error("Failed to create COMPUTE descriptor set layout");
}
}

void Renderer::CreateDescriptorPool() {
@@ -216,6 +236,7 @@ void Renderer::CreateDescriptorPool() {
{ VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER , 1 },

// TODO: Add any additional types and counts of descriptors you will need to allocate
{ VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, static_cast<uint32_t>(3 * scene->GetBlades().size())}
};

VkDescriptorPoolCreateInfo poolInfo = {};
@@ -320,6 +341,48 @@ void Renderer::CreateModelDescriptorSets() {
void Renderer::CreateGrassDescriptorSets() {
// TODO: Create Descriptor sets for the grass.
// This should involve creating descriptor sets which point to the model matrix of each group of grass blades
grassDescriptorSets.resize(scene->GetBlades().size());

// Describe the descriptor sets
// pSetLayouts must provide one layout per set being allocated
std::vector<VkDescriptorSetLayout> layouts(grassDescriptorSets.size(), modelDescriptorSetLayout);
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(grassDescriptorSets.size());
allocInfo.pSetLayouts = layouts.data();

// Allocate descriptor sets
if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, grassDescriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("Failed to allocate grass descriptor set");
}

std::vector<VkWriteDescriptorSet> descriptorWrites(grassDescriptorSets.size());
// The buffer infos are read by vkUpdateDescriptorSets after the loop, so they must outlive it
std::vector<VkDescriptorBufferInfo> modelBufferInfos(grassDescriptorSets.size());

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {
VkDescriptorBufferInfo& modelBufferInfo = modelBufferInfos[i];
modelBufferInfo.buffer = scene->GetBlades()[i]->GetModelBuffer();
modelBufferInfo.offset = 0;
modelBufferInfo.range = sizeof(ModelBufferObject);

descriptorWrites[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[i].dstSet = grassDescriptorSets[i];
descriptorWrites[i].dstBinding = 0;
descriptorWrites[i].dstArrayElement = 0;
descriptorWrites[i].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
descriptorWrites[i].descriptorCount = 1;
descriptorWrites[i].pBufferInfo = &modelBufferInfo;
descriptorWrites[i].pImageInfo = nullptr;
descriptorWrites[i].pTexelBufferView = nullptr;
}

// Update descriptor sets
vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}

void Renderer::CreateTimeDescriptorSet() {
@@ -360,6 +423,75 @@ void Renderer::CreateTimeDescriptorSet() {
void Renderer::CreateComputeDescriptorSets() {
// TODO: Create Descriptor sets for the compute pipeline
// The descriptors should point to Storage buffers which will hold the grass blades, the culled grass blades, and the output number of grass blades
computeDescriptorSets.resize(scene->GetBlades().size());

// Describe the descriptor sets
// pSetLayouts must provide one layout per set being allocated
std::vector<VkDescriptorSetLayout> layouts(computeDescriptorSets.size(), computeDescriptorSetLayout);
VkDescriptorSetAllocateInfo allocInfo = {};
allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
allocInfo.descriptorPool = descriptorPool;
allocInfo.descriptorSetCount = static_cast<uint32_t>(computeDescriptorSets.size());
allocInfo.pSetLayouts = layouts.data();

// Allocate descriptor sets
if (vkAllocateDescriptorSets(logicalDevice, &allocInfo, computeDescriptorSets.data()) != VK_SUCCESS) {
throw std::runtime_error("Failed to allocate compute descriptor set");
}

std::vector<VkWriteDescriptorSet> descriptorWrites(3 * computeDescriptorSets.size());
// The buffer infos are read by vkUpdateDescriptorSets after the loop, so they must outlive it
std::vector<VkDescriptorBufferInfo> bufferInfos(3 * computeDescriptorSets.size());

for (uint32_t i = 0; i < scene->GetBlades().size(); ++i) {
// input blade buffer
VkDescriptorBufferInfo& inputBladeBufferInfo = bufferInfos[3 * i + 0];
inputBladeBufferInfo.buffer = scene->GetBlades()[i]->GetBladesBuffer();
inputBladeBufferInfo.offset = 0;
inputBladeBufferInfo.range = NUM_BLADES * sizeof(Blade);

descriptorWrites[3 * i + 0].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 0].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 0].dstBinding = 0;
descriptorWrites[3 * i + 0].dstArrayElement = 0;
descriptorWrites[3 * i + 0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 0].descriptorCount = 1;
descriptorWrites[3 * i + 0].pBufferInfo = &inputBladeBufferInfo;
descriptorWrites[3 * i + 0].pImageInfo = nullptr;
descriptorWrites[3 * i + 0].pTexelBufferView = nullptr;

// culled blade buffer
VkDescriptorBufferInfo& culledBladeBufferInfo = bufferInfos[3 * i + 1];
culledBladeBufferInfo.buffer = scene->GetBlades()[i]->GetCulledBladesBuffer();
culledBladeBufferInfo.offset = 0;
culledBladeBufferInfo.range = NUM_BLADES * sizeof(Blade);

descriptorWrites[3 * i + 1].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 1].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 1].dstBinding = 1;
descriptorWrites[3 * i + 1].dstArrayElement = 0;
descriptorWrites[3 * i + 1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 1].descriptorCount = 1;
descriptorWrites[3 * i + 1].pBufferInfo = &culledBladeBufferInfo;
descriptorWrites[3 * i + 1].pImageInfo = nullptr;
descriptorWrites[3 * i + 1].pTexelBufferView = nullptr;

// num blades buffer
VkDescriptorBufferInfo& numBladeBufferInfo = bufferInfos[3 * i + 2];
numBladeBufferInfo.buffer = scene->GetBlades()[i]->GetNumBladesBuffer();
numBladeBufferInfo.offset = 0;
numBladeBufferInfo.range = sizeof(BladeDrawIndirect);

descriptorWrites[3 * i + 2].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
descriptorWrites[3 * i + 2].dstSet = computeDescriptorSets[i];
descriptorWrites[3 * i + 2].dstBinding = 2;
descriptorWrites[3 * i + 2].dstArrayElement = 0;
descriptorWrites[3 * i + 2].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
descriptorWrites[3 * i + 2].descriptorCount = 1;
descriptorWrites[3 * i + 2].pBufferInfo = &numBladeBufferInfo;
descriptorWrites[3 * i + 2].pImageInfo = nullptr;
descriptorWrites[3 * i + 2].pTexelBufferView = nullptr;
}

// Update descriptor sets
vkUpdateDescriptorSets(logicalDevice, static_cast<uint32_t>(descriptorWrites.size()), descriptorWrites.data(), 0, nullptr);
}

void Renderer::CreateGraphicsPipeline() {
@@ -600,7 +732,7 @@ void Renderer::CreateGrassPipeline() {
rasterizer.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;
rasterizer.depthClampEnable = VK_FALSE;
rasterizer.rasterizerDiscardEnable = VK_FALSE;
rasterizer.polygonMode = VK_POLYGON_MODE_FILL;
rasterizer.polygonMode = VK_POLYGON_MODE_FILL /*Wireframe: VK_POLYGON_MODE_LINE*/;
rasterizer.lineWidth = 1.0f;
rasterizer.cullMode = VK_CULL_MODE_NONE;
rasterizer.frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE;
@@ -717,7 +849,7 @@ void Renderer::CreateComputePipeline() {
computeShaderStageInfo.pName = "main";

// TODO: Add the compute descriptor set layout you create to this list
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout };
std::vector<VkDescriptorSetLayout> descriptorSetLayouts = { cameraDescriptorSetLayout, timeDescriptorSetLayout, computeDescriptorSetLayout };

// Create pipeline layout
VkPipelineLayoutCreateInfo pipelineLayoutInfo = {};
@@ -884,7 +1016,11 @@ void Renderer::RecordComputeCommandBuffer() {
vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 1, 1, &timeDescriptorSet, 0, nullptr);

// TODO: For each group of blades bind its descriptor set and dispatch

for (size_t i = 0; i < scene->GetBlades().size(); ++i)
{
vkCmdBindDescriptorSets(computeCommandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, computePipelineLayout, 2, 1, &computeDescriptorSets[i], 0, nullptr);
vkCmdDispatch(computeCommandBuffer, (NUM_BLADES + WORKGROUP_SIZE - 1) / WORKGROUP_SIZE, 1, 1);
}
// ~ End recording ~
if (vkEndCommandBuffer(computeCommandBuffer) != VK_SUCCESS) {
throw std::runtime_error("Failed to record compute command buffer");
@@ -976,13 +1112,14 @@ void Renderer::RecordCommandBuffers() {
VkBuffer vertexBuffers[] = { scene->GetBlades()[j]->GetCulledBladesBuffer() };
VkDeviceSize offsets[] = { 0 };
// TODO: Uncomment this when the buffers are populated
// vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);
vkCmdBindVertexBuffers(commandBuffers[i], 0, 1, vertexBuffers, offsets);

// TODO: Bind the descriptor set for each grass blades model
vkCmdBindDescriptorSets(commandBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, grassPipelineLayout, 1, 1, &grassDescriptorSets[j], 0, nullptr);

// Draw
// TODO: Uncomment this when the buffers are populated
// vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
vkCmdDrawIndirect(commandBuffers[i], scene->GetBlades()[j]->GetNumBladesBuffer(), 0, 1, sizeof(BladeDrawIndirect));
}

// End render pass
@@ -1057,6 +1194,7 @@ Renderer::~Renderer() {
vkDestroyDescriptorSetLayout(logicalDevice, cameraDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, modelDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, timeDescriptorSetLayout, nullptr);
vkDestroyDescriptorSetLayout(logicalDevice, computeDescriptorSetLayout, nullptr);

vkDestroyDescriptorPool(logicalDevice, descriptorPool, nullptr);

3 changes: 3 additions & 0 deletions src/Renderer.h
@@ -56,12 +56,15 @@ class Renderer {
VkDescriptorSetLayout cameraDescriptorSetLayout;
VkDescriptorSetLayout modelDescriptorSetLayout;
VkDescriptorSetLayout timeDescriptorSetLayout;
VkDescriptorSetLayout computeDescriptorSetLayout;

VkDescriptorPool descriptorPool;

VkDescriptorSet cameraDescriptorSet;
std::vector<VkDescriptorSet> modelDescriptorSets;
VkDescriptorSet timeDescriptorSet;
std::vector<VkDescriptorSet> grassDescriptorSets;
std::vector<VkDescriptorSet> computeDescriptorSets;

VkPipelineLayout graphicsPipelineLayout;
VkPipelineLayout grassPipelineLayout;