VTK OpenGL objects (3D texture) access from CUDA

Tag: opengl,cuda,vtk

Is there a proper way to access the low-level OpenGL objects of VTK in order to modify them from a CUDA/OpenCL kernel using the OpenGL-CUDA/OpenCL interoperability feature?

Specifically, I want to get the GLuint (or unsigned int) member of vtkOpenGLGPUVolumeRayCastMapper that identifies the OpenGL 3D texture object where the dataset is stored, so that I can bind it to a CUDA surface and read and modify its values from my own CUDA kernel.

For further information, the process that I need to follow is explained here: http://rauwendaal.net/2011/12/02/writing-to-3d-opengl-textures-in-cuda-4-1-with-3d-surface-writes/ where the texID object used there (in Steps 1 and 2) is equivalent to what I want to retrieve from VTK.

From a first look at the vtkOpenGLGPUVolumeRayCastMapper functions, I can't find an easy way to do this other than perhaps creating a vtkGPUVolumeRayCastMapper subclass. Even in that case I am not sure what exactly I should modify, since I guess that some other members depend on the 3D texture values and would also need to be updated after modifying it.

So, do you know of a way to do this?

Lots of thanks.

Best How To :

Subclassing might work, but you could probably avoid it if you wanted. The important thing is that you make the GL/CUDA API calls in the right order.

First, you have to register the texture with CUDA. This is done using:

cudaGraphicsGLRegisterImage(&cuda_graphics_resource, texture_handle,
GL_TEXTURE_3D, cudaGraphicsRegisterFlagsSurfaceLoadStore);

with the stipulation that texture_handle is a GLuint written to by a call to glGenTextures(...).

Once you have registered the texture with CUDA, you can create the surface, which can be read from or written to in your kernel.

The only thing you have to worry about from here is that VTK does not use the texture between a call to cudaGraphicsMapResources(...) and the matching call to cudaGraphicsUnmapResources(...). Everything else should just be standard CUDA.

Also, once you map the texture to CUDA and write to it within a kernel, there is no additional work besides unmapping the texture. GL will get the modified texture the next time it is used.
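The call order described above can be sketched roughly as follows. This is only an illustration under several assumptions, not VTK's or the linked article's code: it uses CUDA surface objects (cudaCreateSurfaceObject) rather than the surface references from the linked CUDA 4.1 article, it assumes the 3D texture has a single 8-bit channel (for example GL_R8), and the names fillVolume, registerVolumeTexture and writeVolumeFromCuda are hypothetical. How to obtain texture_handle from VTK is left open here; it is simply taken as a parameter.

```cuda
#include <cstdio>
#include <cstdlib>
#include <GL/gl.h>             // GLuint, GL_TEXTURE_3D (some platforms need glext.h or a loader such as GLEW)
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>   // cudaGraphicsGLRegisterImage

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,            \
                    cudaGetErrorString(err_));                            \
            exit(EXIT_FAILURE);                                           \
        }                                                                 \
    } while (0)

// Hypothetical kernel: writes one constant into every voxel.
// Assumes one unsigned char per voxel.
__global__ void fillVolume(cudaSurfaceObject_t surf,
                           int nx, int ny, int nz, unsigned char value)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int z = blockIdx.z * blockDim.z + threadIdx.z;
    if (x >= nx || y >= ny || z >= nz) return;
    // The x coordinate of a surface write is given in bytes.
    surf3Dwrite(value, surf, x * (int)sizeof(unsigned char), y, z);
}

// Register once, while the GL context that owns the texture is current.
cudaGraphicsResource_t registerVolumeTexture(GLuint texture_handle)
{
    cudaGraphicsResource_t resource = nullptr;
    CUDA_CHECK(cudaGraphicsGLRegisterImage(&resource, texture_handle,
                                           GL_TEXTURE_3D,
                                           cudaGraphicsRegisterFlagsSurfaceLoadStore));
    return resource;
}

// Map, write, unmap. GL (and therefore VTK) must not touch the texture
// between the map and unmap calls.
void writeVolumeFromCuda(cudaGraphicsResource_t resource,
                         int nx, int ny, int nz, unsigned char value)
{
    CUDA_CHECK(cudaGraphicsMapResources(1, &resource, 0));

    // Get the CUDA array that backs mip level 0 of the texture.
    cudaArray_t array = nullptr;
    CUDA_CHECK(cudaGraphicsSubResourceGetMappedArray(&array, resource, 0, 0));

    // Wrap the array in a surface object so the kernel can write to it.
    cudaResourceDesc desc = {};
    desc.resType = cudaResourceTypeArray;
    desc.res.array.array = array;
    cudaSurfaceObject_t surf = 0;
    CUDA_CHECK(cudaCreateSurfaceObject(&surf, &desc));

    dim3 block(8, 8, 8);
    dim3 grid((nx + block.x - 1) / block.x,
              (ny + block.y - 1) / block.y,
              (nz + block.z - 1) / block.z);
    fillVolume<<<grid, block>>>(surf, nx, ny, nz, value);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaDeviceSynchronize());

    CUDA_CHECK(cudaDestroySurfaceObject(surf));
    CUDA_CHECK(cudaGraphicsUnmapResources(1, &resource, 0));
}
```

In this sketch, registerVolumeTexture would be called once after the texture exists, writeVolumeFromCuda whenever the data should change (outside of VTK's rendering), and cudaGraphicsUnregisterResource(resource) once the texture is no longer needed by CUDA. If the actual texture uses a different internal format, the kernel's element type and the byte offset in x have to be adapted accordingly.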

How to create latitudinal (horizontal) contour lines in GLSL?

opengl,glsl,webgl

fwidth is not exactly generating the lines, mainly fract is. fwidth is only used to keep the line width constant in screen space. If you try to draw 1px lines only using fract, it will work. But if you want wide lines or antialiased lines, you'll need fwidth to make...

Tesla k20m interoperability with Direct3D 11

cuda,direct3d,tesla

No, this won't be possible. K20m can be used (with some effort) with OpenGL graphics on Linux, but at least up through windows 8.x, you won't be able to use K20m as a D3D device in Windows. The K20m does not publish a VGA classcode in PCI configuration space, which...

'an illegal memory access' when trying to write to a 2D array allocated using cudaMalloc3D

c,cuda

The reason the error doesn't occur on this line: REAL tmp = unew_row[j]; // no error on this line is because the compiler is optimizing that line out. It doesn't do anything useful, and so the compiler completely eliminates it. The compiler warning: xxx.cu(87): warning: variable "tmp" was declared but...

Is it possible to build a heatmap from point data at 60 times per second?

c++,opengl,visualization,simulation,heatmap

It is definitely feasible, probably even if the calculations are done by the CPU. Ideally you should be using the GPU. The APIs needed are either OpenCL or, since you are rendering the results, Compute Shaders. Both techniques allow you to write a...

OpenGL: Defining variables in shaders

opengl,glsl

1. Question: Why is gl_Position a variable that has already been defined? This is because OpenGL/the rendering pipeline has to know which data should be used as the basis for rasterization and interpolation. Since there is always exactly one such variable, OpenGL has the predefined variable gl_Position for this. There are...

Why does Hyper-Q selectively overlap async HtoD and DtoH transfer on my cc5.2 hardware?

cuda

What you are observing is probably an artifact of running the code on a Windows WDDM platform. The WDDM subsystem has a lot of latency which other platforms are not hampered by, so to improve overall performance, the CUDA WDDM driver performs command batching. This can interfere with the expect...

OpenGL: GL_FRAMEBUFFER_UNSUPPORTED on specific combinations of framebuffer attachments

c++,opengl,textures,nvidia,framebuffer

I'm really out of ideas what I can do to make it work... Your OpenGL implementation tells you that the configuration you chose is not supported. You have to accept that. The OpenGL specification does not require that particular combination of formats to be supported by all implementations, so...

Can an unsigned long long int be used to store the output from clock64()?

cuda

There are various atomic functions which support atomic operations on unsigned long long int (ie. a 64-bit unsigned integer), such as atomicCAS, atomicExch and atomicAdd. And if you have a cc3.5 or higher GPU you have even more options. Referring to the documentation on clock64(): long long int clock64(); when...

OpenGL, Access violation

c++,opengl,access-violation

You did not initialize GLEW. Without doing that all the entry points provided by GLEW (which is everything beyond OpenGL-1.1) are left uninitialized and calling them crashes your program. Add if( GLEW_OK != glewInit() ) { return 1; } while( GL_NO_ERROR != glGetError() ); /* glewInit may cause some OpenGL...

GLFW3 create window returns null

c,opengl,struct,code-separation

The reason for not getting the window opened is that one has to specify GLFW_CONTEXT_VERSION_MINOR in addition to the other window hints. This could be done, e.g., with: glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3); ...

How to load data in global memory into shared memory SAFELY in CUDA?

c++,cuda,shared-memory

Consider one warp of the thread block finishing the first iteration and starting the next one, while other warps are still working on the first iteration. If you don't have __syncthreads at label sync2, you will end up with this warp writing to shared memory while others are reading from...

How to update the “forward” movement in OpenGL

c++,opengl,camera,fps

You should create a direction vector. Every time you move your mouse, you should re-evaluate it according to the angle. The direction vector should be a vector on the unit circle. Suppose you have the direction vector direction = (0.4X, 0.6Z) (the numbers may be unrealistic, but take them as an example), then for moving forward...

Understanding Memory Replays and In-Flight Requests

caching,cuda

Effective load throughput is not the only metric that determines the performance of your kernel! A kernel with perfectly coalesced loads will always have a lower effective load throughput than the equivalent, non coalesced kernel, but that alone says nothing about its execution time: in the end, the one metric...

How can I pass a struct to a kernel in JCuda

java,struct,cuda,jni,jcuda

(The author of JCuda here (not "JCUDA", please)) As mentioned in the forum post linked from the comment: It is not impossible to use structs in CUDA kernels and fill them from JCuda side. It is just very complicated, and rarely beneficial. For the reason of why it is rarely...

Qt OpenGL transform feedback buffer functions missing

c++,qt,opengl,transform-feedback

It turned out that I never actually extended my class to use QOpenGLFunctions_4_3_Core, and it was instead just QOpenGLFunctions. Changing it to the former solved the problem.

Opengl - Is glDrawBuffers modification stored in a FBO? No?

opengl,state-machines,render-to-texture

Yes, the draw buffers setting is part of the framebuffer state. If you look at for example the OpenGL 3.3 spec document, it is listed in table 6.23 on page 299, titled "Framebuffer (state per framebuffer object)". The default value for FBOs is a single draw buffer, which is GL_COLOR_ATTACHMENT0....

Using a data pointer with CUDA (and integrated memory)

c++,memory-management,cuda

The pointer has to be created (i.e. allocated) with cudaHostAlloc, even on integrated systems like Jetson. The reason for this is that the GPU requires (zero-copy) memory to be pinned, i.e. removed from the host demand-paging system. Ordinary allocations are subject to demand-paging, and may not be used as zero-copy...

OpenGL stops rendering, possibly after an update

c++,opengl,glfw

Turns out I was using the Core OpenGL profile, which requires you to use Vertex Array Objects, which I didn't. Up until ~February, the graphics didn't mind, but after a certain driver update, it refused to render the object (Which I believe is the correct behaviour).

Is prefix scan CUDA sample code in gpugems3 correct?

cuda,gpu,nvidia,prefix-sum

It seems that you've made at least 1 error in transcribing the code from the GPU Gems 3 chapter into your kernel. This line is incorrect: temp[bi] += g_idata[ai]; it should be: temp[bi] += temp[ai]; When I make that one change to the code you have now posted, it seems...

How can I render an infinite 2D grid in GLSL?

opengl,glsl

A simple case of aliasing. Just like with polygon rendering, your fragment shader is run once per pixel. Colour is computed for a single central coordinate only and is not representative of the true colour. You could create a multisample FBO and enable super-sampling. But that's expensive. You could mathematically...

cudaMalloc vs cudaMalloc3D performance for a 2D array

c,cuda

The performance difference you observe is mostly due to the increased instruction overhead in the pitched memory indexing scheme. Because your array size is a large power of two in the major direction, it is very likely that the pitched array allocated with cudaMalloc3D is the same size as the...

Spotlight with shadows becomes square-like

c++,opengl,graphics,shader,shadow

This is intended. Since your shadowmap covers a pyramid-like region in space, your spotlight's cone can be occluded by it. This is happening because when you render something that is outside of the shadow camera's view, it will be considered unlit. Therefore the shadow camera's view pyramid will be visible....

Unable to render a texture on a quad

c++,opengl,textures,texturing

I don't think you can just input a raw file to glTexImage2D, except if you store texture files in that format (which you probably don't). glTexImage2D expects a huge array of bytes (representing texel colors), but file formats typically don't store images like that. Even bmp has some header information in...

Creating GUI with OpenGL

java,opengl

You can check out TWL's wiki here: "http://wiki.l33tlabs.org/bin/view/TWL/" it has some basic tutorials on how to use it, and here's a "Getting Started" page for niftyGUI: https://github.com/void256/nifty-gui/wiki/Getting-Started

Effect of rendering calls on performance

c++,opengl

In the way you describe that (checking if each polygon is within FOV), it will almost always be slower - GPU can do it faster. But this idea can be improved by organizing the polygons in some clever data structure, which can quickly cut out large numbers of polygons that...

OpenGL glTexImage2D memory issue

c,opengl

Which man page are you quoting? There are multiple man pages available, not all mapping to the same OpenGL version. Anyways, the idea behind the + 2 (border) is to have 2 multiplied by the value of border, which is in your case 0. So your code is just fine....

OpenGL - translation stretches and distorts sprite

c++,opengl,transform,translation,distortion

Matching OpenGL, GLM stores matrices in column major order. The constructors also expect elements to be specified in the same order. However, your translation matrix is specified in row major order: glm::mat4 trans = glm::mat4( 1.0f, 0.0f, 0.0f, translation.x, 0.0f, 1.0f, 0.0f, translation.y, 0.0f, 0.0f, 1.0f, translation.z, 0.0f, 0.0f, 0.0f,...

What is the difference between glUseProgram() and glUseShaderProgram()?

c++,c,opengl,shader

glUseShaderProgramEXT() is part of the EXT_separate_shader_objects extension. This extension was changed significantly in the version that gained ARB status as ARB_separate_shader_objects. The idea is still the same, but the API looks quite different. The extension spec comments on that: This extension builds on the proof-of-concept provided by EXT_separate_shader_objects which demonstrated...

Can the OpenGL 'deprecated' functions possibly be unsupported?

c++,c,opengl,graphics,deprecated

I'd like to extend on the article about deprecation in the OpenGL wiki which was given in the comments already. The current situation is that we can discern 3 "flavours" of OpenGL contexts on desktop platforms: "Legacy" GL. This means the GL from the old days, before there was any...

R32I sampler2D returns always 0

c++,opengl

Since you are using an integer format, you will have to use an isampler2D instead of your sampler. So for samplers, floating-point samplers begin with "sampler". Signed integer samplers begin with "isampler", and unsigned integer samplers begin with "usampler". If you attempt to read from a sampler where the texture's...

Need help finding out why the texture doesn't load

java,opengl

Enable textures with glEnable(GL_TEXTURE_2D); ...

CUDA cuBlasGetmatrix / cublasSetMatrix fails | Explanation of arguments

cuda,gpgpu,gpu-programming,cublas

The only actual problem in your code is here: cudaMalloc( &d_x,sizeof(d_x) ); sizeof(d_x) is just the size of a pointer. You can fix it like this: cudaMalloc( &d_x,sizeof(x) ); If you want to find out if a CUBLAS API call is failing, then you should check the return code of...

GLSL - program link error: Slot 0 unavailable from layout location request

c++,opengl,glsl

Well, the situation is very clear. You already gave the answer yourself. Shouldn't the location of the second vector be (location = 1)? Yes. Or less specific: it should be something else than 0. Attribute locations must be unique in a single program, for obvious reasons. The code you copied...

scale texture opengl 2

opengl,textures,scale

Actually stretching a texture over a rectangle works with the texture coordinates. But if you want to repeat it you have to set: glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT); glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT); ...

glclearcolor best color?

opengl,optimization

No (technically there could be a difference, since OpenGL does not impose any performance requirements on the function call). By the way, you should set the clear color before calling clear....

No OpenGL context is current in the current thread

java,opengl,lwjgl

You need to call glfwMakeContextCurrent to bind the OpenGL context to your thread. There's a working example on the LWJGL website as well.

how to generalize square matrix multiplication to handle arbitrary dimensions

c,cuda,parallel-processing,matrix-multiplication

This code will work for very specific dimensions but not for others. It will work for square matrix multiplication when width is exactly equal to the product of your block dimension (number of threads - 20 in the code you have shown) and your grid dimension (number of blocks -...

Update a D3D9 texture from CUDA

c#,cuda,sharpdx,direct3d9,managed-cuda

As hinted by the commenter, I’ve tried creating a single instance of CudaDirectXInteropResource along with the D3D texture. It worked. It’s counter-intuitive and undocumented, but it looks like cuGraphicsUnregisterResource destroys the newly written data. At least on my machine with GeForce GTX 960, Cuda 7.0 and Windows 8.1 x64. So,...

Ambient and Specular lighting not working correctly in GLSL

opengl,glsl,lighting,phong

In your applyPointLight function, you're not using the diff and spec variables, which are presumably the light-dependent changes to diffuse and specular. See if the following works: vec3 diffuse = light.diffuse * surfaceColor * light.color * diff; vec3 specular = light.specular * surfaceSpecular * light.color * spec; ...

What is version of cuda for nvidia 304.125

ubuntu,cuda,ubuntu-14.04,nvidia

304.xx is a driver that will support CUDA 5 and previous (does not support newer CUDA versions.) If you want to reinstall ubuntu to create a clean setup, the linux getting started guide has all the instructions needed to set up CUDA 7 if that is your intent. I believe...

Passing an int to a function, then using that int to create an array

c++,arrays,opengl

Your second option is close; you can get at the underlying array of the vector by calling .data(): myConstructor(int num) { std::vector<GLuint> textures(num); glGenTextures(num, textures.data()); } Assuming glGenTextures has a signature like void glGenTextures(int, GLuint*). I don't know much about this function, but be careful who owns that array....

The order of the linked libraries C++ linker

c++,opengl,opencl

The OpenCL interface library installed on your system may pull in a different libGL.so than the libGL.so that shall eventually be loaded by your program. For example if you've got installed the Mesa OpenCL implementation but are using the NVidia driver, then linking against Mesa's OpenCL may pull in Mesa's...

Java: FloatBuffer to OpenGL - wrap() vs. allocate() vs. BufferUtils.createBuffer()

java,opengl,buffer

BufferUtils returns a direct buffer whereas the others might not. You can check the directness of the wrap and allocate methods using isDirect() method. Wrappers are non direct....

Need Minimum Textures required for OpenGL

opengl

OpenGL 1.x and 2.x require at least 2 texture units. OpenGL 3.x and 4.x require at least 16. Most current GPUs have 32. You can find those values fairly easily in the OpenGL specification itself, in the "Implementation Dependent Values" table. This specific value is called MAX_TEXTURE_UNITS in 1.x and...

How do you build the example CUDA Thrust device sort?

c++,visual-studio-2010,sorting,cuda,thrust

As @JaredHoberock pointed out, probably the key issue is that you are trying to compile a .cpp file. You need to rename that file to .cu and also make sure it is being compiled by nvcc. After you fix that, you will probably run into another issue. This is not...

C++ Opengl - How to load tgas and pngs in modern OpenGL? [on hold]

image,opengl,c++11,png,tga

Writing a loader for TGA is relatively straightforward, so for an exercise: go for it. PNG on the other hand is a different kind of beast. It has a gazillion features, supports multiple compression schemes and encodings, all of which you have to support to load PNG files generated by...

Multitexturing theory with texture objects and samplers

opengl,textures

Much of this must have been explained before, but let me try and give an overview that will hopefully make it clearer how all the different pieces fit together. I'll start by explaining each piece separately, and then explain how they are connected. Texture Target This refers to the different...

OpenGL: Strange behaviour of VBO deletion?

c++,opengl,vbo,opengl-3,vao

Reto Koradi already mentioned copy semantics. Another thing to keep in mind is that OpenGL allows context sharing, i.e. some objects are shared between the OpenGL contexts and deleting in one context deletes it from all contexts. Objects transcending shared contexts are textures buffer objects that are bound using glBindBuffer...

Why normal mapping doesn't appear correctly?

opengl,jogl

Your texture is a height map. Not a normal map. You have two options: Replace the height map with a normal map (they are usually blueish). Or calculate the normal from the height map. This can be done like this: float heightLeft = textureOffset(tex_normal, fs_in.UV, ivec2(-1, 0)).r; float heightRight =...

Why are shaders and programs stored as integers in OpenGL?

c++,opengl,opengl-es,integer,shader

These integers are handles. This is a common idiom used by many APIs, used to hide resource access through an opaque level of indirection. OpenGL is effectively preventing you from accessing what lies behind the handle without using the API calls. From Wikipedia: In computer programming, a handle is an abstract...