diff --git a/README.md b/README.md index bec6ca4..3256c70 100644 --- a/README.md +++ b/README.md @@ -3,13 +3,34 @@ Vulkan Flocking: compute and shading in one pipeline! **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 6** -* (TODO) YOUR NAME HERE - Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Jian Ru + Windows 10, i7-4850 @ 2.3GHz 16GB, GT 750M 16GB (Personal) - ### (TODO: Your README) +![](img/demo.gif) - Include screenshots, analysis, etc. (Remember, this is public, so don't put - anything here that you don't want to share with the world.) +### Q&A + +* **Why do you think Vulkan expects explicit descriptors for things like +generating pipelines and commands? HINT: this may relate to something in the +comments about some components using pre-allocated GPU memory.** + +To my understanding, you don't need a descriptor to generate commands. You can simply record commands using Vulkan API calls, where you specify the argument values and the command buffer you want to record into. For generating pipelines and some other objects (e.g. descriptor sets, descriptor set layouts), Vulkan expects explicit descriptors because it is a low-level graphics API: instead of implicitly creating default pipeline(s) and providing function calls to configure them, it makes programmers responsible for explicitly stating what they need and how they want things to behave (e.g. which shader stages are enabled, whether or not to enable the depth test, which depth test operator to use). + +* **Describe a situation besides flip-flop buffers in which you may need multiple +descriptor sets to fit one descriptor layout.** + +Descriptor set layouts specify in which shader stage, and at which binding point of which target, something (e.g. a buffer or texture) is bound. Descriptor sets then specify what is actually bound to each of these binding points (of various targets and shader stages). 
So whenever you need to bind different pieces of data to the same binding points, you can use multiple descriptor sets. For example, if I want to draw multiple models using the same set of shaders, I can create a descriptor set to hold the information of the vertex buffer of each model. Then I can just bind the corresponding descriptor set before I draw a different model rather than respecifying all the resource bindings and vertex data layouts again. It is kind of like a VAO in OpenGL, but now you actually hold a descriptor set structure that contains all the meta information rather than a simple GLuint VAO handle. + +* **What are some problems to keep in mind when using multiple Vulkan queues?** + * **take into consideration that different queues may be backed by different hardware** + * **take into consideration that the same buffer may be used across multiple queues** + + According to the hints, the caveats I can imagine are hardware compatibility and synchronization. Since different queues may be backed by different hardware, we need to ensure the commands we use are supported on every queue they will be executed in. We also need to explicitly specify the correct queue family indices when creating memory barriers between shaders on different queues. If a resource (e.g. a buffer or texture) is used across different queues, proper synchronization using memory barriers and fences is required. + +* **What is one advantage of using compute commands that can share data with a +rendering pipeline?** + +Avoid the super-duper expensive context switch!!! For example, switching between a CUDA context and an OpenGL context. A context is like a process on the CPU. 
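The point made in the second answer (one layout, several descriptor sets that differ only in what is bound) is also the mechanism behind the flip-flop compute pass in this project. The sketch below models it in plain C++ with no Vulkan calls. `Buffer`, `DescriptorSetModel`, `runPass`, and `simulate` are hypothetical names invented for illustration, and the "+ 1" update is a stand-in for the real compute shader.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Vulkan types.
struct Buffer { std::vector<float> data; };

// One shared "layout": binding 0 is read, binding 1 is written.
// Each set fills those two slots with different buffers.
struct DescriptorSetModel {
    const Buffer* binding0Read;
    Buffer* binding1Write;
};

// Stand-in for the compute pass: reads binding 0, writes binding 1.
void runPass(const DescriptorSetModel& set)
{
    for (std::size_t i = 0; i < set.binding0Read->data.size(); ++i)
        set.binding1Write->data[i] = set.binding0Read->data[i] + 1.0f;
}

// Runs `frames` passes, swapping the two sets after each pass, and returns
// the contents of whichever buffer was written last (buffer A if no pass ran).
std::vector<float> simulate(int frames)
{
    Buffer a{{0.0f, 10.0f}};
    Buffer b{{0.0f, 0.0f}};
    // Set 0 reads A and writes B; set 1 reads B and writes A.
    std::array<DescriptorSetModel, 2> sets{{ {&a, &b}, {&b, &a} }};
    const Buffer* lastWritten = &a;
    for (int f = 0; f < frames; ++f) {
        runPass(sets[0]);               // always "bind" whatever sits in slot 0
        lastWritten = sets[0].binding1Write;
        std::swap(sets[0], sets[1]);    // next frame uses the other set
    }
    return lastWritten->data;
}
```

The pass itself never changes. Only the set handed to it does, which is why the swap can be a single `std::swap` on the two set handles.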
### Credits diff --git a/base/vulkanexamplebase.cpp b/base/vulkanexamplebase.cpp index aa8a8a9..d402860 100644 --- a/base/vulkanexamplebase.cpp +++ b/base/vulkanexamplebase.cpp @@ -505,6 +505,7 @@ VulkanExampleBase::VulkanExampleBase(bool enableValidation, PFN_GetEnabledFeatur initxcbConnection(); #endif + enabledFeatures = {}; if (enabledFeaturesFn != nullptr) { this->enabledFeatures = enabledFeaturesFn(); diff --git a/base/vulkanexamplebase.h b/base/vulkanexamplebase.h index a30387e..fc59fa8 100644 --- a/base/vulkanexamplebase.h +++ b/base/vulkanexamplebase.h @@ -50,7 +50,7 @@ class VulkanExampleBase bool enableVSync = false; // Device features enabled by the example // If not set, no additional features are enabled (may result in validation layer errors) - VkPhysicalDeviceFeatures enabledFeatures = {}; + VkPhysicalDeviceFeatures enabledFeatures; // fps timer (one second interval) float fpsTimer = 0.0f; // Create application wide Vulkan instance diff --git a/data/shaders/computeparticles/particle.comp b/data/shaders/computeparticles/particle.comp index b7dc2f7..8e18fff 100644 --- a/data/shaders/computeparticles/particle.comp +++ b/data/shaders/computeparticles/particle.comp @@ -1,6 +1,7 @@ #version 450 #extension GL_ARB_separate_shader_objects : enable + #extension GL_ARB_shading_language_420pack : enable struct Particle @@ -24,7 +25,7 @@ layout(std140, binding = 1) buffer ParticlesB Particle particlesB[ ]; }; -layout (local_size_x = 16, local_size_y = 16) in; +layout (local_size_x = 256, local_size_y = 1) in; // LOOK: rule weights and distances, as well as particle count, based off uniforms. // The deltaT here has to be updated every frame to account for changes in @@ -43,10 +44,10 @@ layout (binding = 2) uniform UBO void main() { - // LOOK: This is very similar to a CUDA kernel. - // Right now, the compute shader only advects the particles with their - // velocity and handles wrap-around. - // TODO: implement flocking behavior. 
+ // LOOK: This is very similar to a CUDA kernel. + // Right now, the compute shader only advects the particles with their + // velocity and handles wrap-around. + // TODO: implement flocking behavior. // Current SSBO index uint index = gl_GlobalInvocationID.x; @@ -55,20 +56,69 @@ void main() return; // Read position and velocity - vec2 vPos = particlesA[index].pos.xy; + vec2 vPos = particlesA[index].pos.xy; vec2 vVel = particlesA[index].vel.xy; - // clamp velocity for a more pleasing simulation. - vVel = normalize(vVel) * clamp(length(vVel), 0.0, 0.1); + // Flocking rules + int rule1Count = 0; + vec2 center = vec2(0); + vec2 separationVel = vec2(0); + int rule3Count = 0; + vec2 alignmentVel = vec2(0); + + for (int i = 0; i < ubo.particleCount; ++i) + { + if (i == index) continue; + + vec2 nPos = particlesA[i].pos; + vec2 nVel = particlesA[i].vel; + + float dist = distance(nPos, vPos); + + if (dist < ubo.rule1Distance) + { + center += nPos; + ++rule1Count; + } + + if (dist < ubo.rule2Distance) + { + vec2 repel = vPos - nPos; + separationVel += normalize(repel) * (ubo.rule2Distance - length(repel)); + } + + if (dist < ubo.rule3Distance) + { + alignmentVel += nVel; + ++rule3Count; + } + } + + if (rule1Count > 0) + { + center /= float(rule1Count); + vVel += (center - vPos) * ubo.rule1Scale; + } + + vVel += separationVel * ubo.rule2Scale; + + if (rule3Count > 0) + { + alignmentVel /= float(rule3Count); + vVel += alignmentVel * ubo.rule3Scale; + } + + // clamp velocity for a more pleasing simulation. 
+ vVel = normalize(vVel) * clamp(length(vVel), 0.0, 0.1); - // kinematic update - vPos += vVel * ubo.deltaT; + // kinematic update + vPos += vVel * ubo.deltaT; // Wrap around boundary - if (vPos.x < -1.0) vPos.x = 1.0; - if (vPos.x > 1.0) vPos.x = -1.0; - if (vPos.y < -1.0) vPos.y = 1.0; - if (vPos.y > 1.0) vPos.y = -1.0; + if (vPos.x < -1.0) vPos.x = 1.0; + if (vPos.x > 1.0) vPos.x = -1.0; + if (vPos.y < -1.0) vPos.y = 1.0; + if (vPos.y > 1.0) vPos.y = -1.0; particlesB[index].pos.xy = vPos; diff --git a/data/shaders/computeparticles/particle.comp.spv b/data/shaders/computeparticles/particle.comp.spv index 059ab59..000223b 100644 Binary files a/data/shaders/computeparticles/particle.comp.spv and b/data/shaders/computeparticles/particle.comp.spv differ diff --git a/img/demo.gif b/img/demo.gif new file mode 100644 index 0000000..bc2ee56 Binary files /dev/null and b/img/demo.gif differ diff --git a/vulkanBoids/vulkanBoids.cpp b/vulkanBoids/vulkanBoids.cpp index 9b2f122..7b8fc00 100644 --- a/vulkanBoids/vulkanBoids.cpp +++ b/vulkanBoids/vulkanBoids.cpp @@ -158,6 +158,7 @@ class VulkanExample : public VulkanExampleBase { particle.pos = glm::vec2(rDistribution(rGenerator), rDistribution(rGenerator)); // TODO: add randomized velocities with a slight scale here, something like 0.1f. + particle.vel = 0.1f * glm::vec2(rDistribution(rGenerator), rDistribution(rGenerator)); } VkDeviceSize storageBufferSize = particleBuffer.size() * sizeof(Particle); @@ -244,7 +245,7 @@ class VulkanExample : public VulkanExampleBase VERTEX_BUFFER_BIND_ID, 1, VK_FORMAT_R32G32_SFLOAT, - offsetof(Particle, pos)); // TODO: change this so that we can color the particles based on velocity. + offsetof(Particle, vel)); // TODO: change this so that we can color the particles based on velocity. // vertices.inputState encapsulates everything we need for these particular buffers to // interface with the graphics pipeline. 
@@ -540,13 +541,34 @@ class VulkanExample : public VulkanExampleBase compute.descriptorSets[0], VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, 2, - &compute.uniformBuffer.descriptor) + &compute.uniformBuffer.descriptor), // TODO: write the second descriptorSet, using the top for reference. // We want the descriptorSets to be used for flip-flopping: // on one frame, we use one descriptorSet with the compute pass, // on the next frame, we use the other. // What has to be different about how the second descriptorSet is written here? + + // Binding 0 : Particle position storage buffer + vkTools::initializers::writeDescriptorSet( + compute.descriptorSets[1], // LOOK: which descriptor set to write to? + VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, + 0, // LOOK: which binding in the descriptor set Layout? + &compute.storageBufferB.descriptor), // LOOK: which SSBO? + + // Binding 1 : Particle position storage buffer + vkTools::initializers::writeDescriptorSet( + compute.descriptorSets[1], + VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, + 1, + &compute.storageBufferA.descriptor), + + // Binding 2 : Uniform buffer + vkTools::initializers::writeDescriptorSet( + compute.descriptorSets[1], + VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, + 2, + &compute.uniformBuffer.descriptor) }; vkUpdateDescriptorSets(device, static_cast<uint32_t>(computeWriteDescriptorSets.size()), computeWriteDescriptorSets.data(), 0, NULL); @@ -590,6 +612,7 @@ class VulkanExample : public VulkanExampleBase // We also want to flip what SSBO we draw with in the next // pass through the graphics pipeline. // Feel free to use std::swap here. You should need it twice. 
+ std::swap(compute.descriptorSets[0], compute.descriptorSets[1]); } // Record command buffers for drawing using the graphics pipeline @@ -694,7 +717,7 @@ class VulkanExample : public VulkanExampleBase vkCmdBindDescriptorSets(compute.commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, compute.pipelineLayout, 0, 1, compute.descriptorSets, 0, 0); // Record a dispatch of the compute job - vkCmdDispatch(compute.commandBuffer, PARTICLE_COUNT / 16, 1, 1); + vkCmdDispatch(compute.commandBuffer, PARTICLE_COUNT / 256, 1, 1); // Add memory barrier to ensure that compute shader has finished writing to the buffer // Without this the (rendering) vertex shader may display incomplete results (partial data from last frame)
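One thing worth checking in the dispatch change above: `PARTICLE_COUNT / 256` is an integer division, so it only covers every particle when `PARTICLE_COUNT` is a multiple of 256. A minimal sketch of a round-up group count follows. `groupCountFor` is a hypothetical helper, not something in this codebase, and using it assumes the shader's early `return` rejects indices at or beyond the particle count (that guard is not fully visible in the hunk shown).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the project): number of workgroups needed
// to cover `itemCount` invocations when each group runs `localSize` of them.
inline std::uint32_t groupCountFor(std::uint32_t itemCount, std::uint32_t localSize)
{
    assert(localSize > 0 && "workgroup size must be non-zero");
    // Round up so that a trailing partial group is still dispatched.
    // Written as quotient plus remainder test to avoid overflow on large counts.
    return itemCount / localSize + (itemCount % localSize != 0 ? 1u : 0u);
}
```

With a helper like this, the call would read `vkCmdDispatch(compute.commandBuffer, groupCountFor(PARTICLE_COUNT, 256), 1, 1);`. For example, 1000 particles need 4 groups, whereas plain division yields 3 and leaves the last 232 particles without an update.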