GLES 3.0, instancing, and other changes to shaders in Codea 2.3.2

edited December 2015 in General Posts: 2,020

Until now, Codea's shaders have used GLES 2.0.

Starting with 2.3.2 beta build 56, optional support for GLES 3.0 is added.

This is only available on devices with an A7 chip or newer (iPad Air onwards), and is currently only available to Codea beta testers.

GLES 3.0 is optional and is enabled on a shader-by-shader basis (i.e. you can mix 2.0 and 3.0 shaders in the same program) by placing #version 300 es on the first line of both the vertex and the fragment shader strings. This means you can adopt it gradually, wherever your code needs it.

This Apple page is a handy cheat sheet for the changes you need to make to the shader language: https://developer.apple.com/library/ios/documentation/3DDrawing/Conceptual/OpenGLES_ProgrammingGuide/AdoptingOpenGLES3/AdoptingOpenGLES3.html#//apple_ref/doc/uid/TP40008793-CH504-SW18

And regardless of which GLES version you use, you can never read this page enough times:

https://developer.apple.com/library/ios/documentation/3DDrawing/Conceptual/OpenGLES_ProgrammingGuide/BestPracticesforShaders/BestPracticesforShaders.html

The most useful resource is probably the quick reference card (pages 4 to 6 are the relevant section):

https://www.khronos.org/files/opengles3-quick-reference-card.pdf

And this is the bible:

https://www.khronos.org/registry/gles/specs/3.0/GLSL_ES_Specification_3.00.4.pdf

Here's some very simple sample code:

-- test of GLES 300 Matrix Mesh Buffer Test
--

function setup()
    m = mesh()
    m:addRect(0,0,200,200)
    m:setColors(color(255,200,0))
    m.shader = shader(vert, frag)

    dummy = m:buffer("dummy") --test vec4 attribute
    transform = m:buffer("transform") --test mat4 attribute
    local mat = matrix():translate(0,100,0) --should move rectangle up
    for i=1,m.size do
        dummy[i] = vec4(60,0,0,0) --works, moves rectangle to right
        transform[i] = mat
    end

end

function draw()
    background(40, 40, 50)
    perspective()
    camera(0,0,400, 0,0,0)

    m:draw()
end

vert=[[
#version 300 es

uniform mat4 modelViewProjection;

in mat4 transform;
in vec4 position;
in vec4 color;
in vec4 dummy;

out lowp vec4 vColor;

void main()
{
    vColor = color;
    gl_Position = modelViewProjection * (transform * position + dummy);
}
]]

frag=[[
#version 300 es

in lowp vec4 vColor; //NB varying becomes "in" in fragment shader
out lowp vec4 fragColor;

void main()
{
    fragColor = vColor;
}
]]
Comments

  • edited October 2015 Posts: 2,020

    One issue to be aware of:

    In 3.0, the old per-type lookup functions such as texture2D and textureCube have been replaced with a single overloaded texture function. The problem is that this clashes with the uniform called texture that the Codea mesh API passes to the shader.

    The workaround is to define your own custom texture variable, e.g.:

    frag = [[
    #version 300 es
    
    uniform lowp sampler2D textureMap; //custom texture variable
    in highp vec2 vTexCoord;
    out lowp vec4 fragColor;
    
    void main()
    {
      fragColor = texture(textureMap, vTexCoord); //the new texture command
    }
    ]]
    

    Then, in the Codea code, replace

    myMesh.texture = "Dropbox:grass"

    with:

    myMesh.shader.textureMap = "Dropbox:grass"

    (note the .shader).

  • edited October 2015 Posts: 2,020

    How to convert a 2.0 shader to 3.0:

    1. In the vertex shader:

      • add #version 300 es as the first line
      • change attribute to in
      • change varying to out
    2. In the fragment shader:

      • add #version 300 es as the first line
      • change varying to in (NB this differs from the vertex shader, so you can't just do a global find/replace on varying)
      • there is no built-in gl_FragColor variable any more. Instead, you declare your own out variable, e.g. out lowp vec4 fragColor; or out lowp vec4 fragmentColor;
      • the texture2D and textureCube functions (and their variants) have been replaced with a single texture function. Note that this clashes with the texture uniform passed by the Codea mesh API, so you will have to use a custom variable, e.g. uniform lowp sampler2D textureMap;, which you access from Codea with myMesh.shader.textureMap instead of myMesh.texture (see the post immediately above this one).
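The conversion steps above are mechanical enough to sketch as a plain-Lua string rewrite. The helper below, toGLES3, is hypothetical (it is not part of Codea). It assumes the shader source contains no #extension lines and that the output variable can be called fragColor. It deliberately does not solve the texture uniform name clash, which still needs a manual rename to something like textureMap.

```lua
-- Hypothetical helper: a naive text conversion of a GLES 2.0 shader string
-- to GLES 3.0 syntax. Not part of the Codea API.

-- Replace whole identifiers only. %f[%w_] and %f[^%w_] are Lua frontier
-- patterns that match the start and end of an identifier.
local function replaceWord(src, word, repl)
    return (src:gsub("%f[%w_]" .. word .. "%f[^%w_]", repl))
end

-- stage is "vertex" or "fragment"
local function toGLES3(src, stage)
    src = replaceWord(src, "attribute", "in")
    -- varying becomes out in the vertex shader but in in the fragment shader
    src = replaceWord(src, "varying", stage == "vertex" and "out" or "in")
    src = replaceWord(src, "texture2D", "texture")
    if stage == "fragment" then
        -- gl_FragColor no longer exists, so declare our own output variable
        src = replaceWord(src, "gl_FragColor", "fragColor")
        src = "out lowp vec4 fragColor;\n" .. src
    end
    -- the version directive has to be the first line of the shader
    return "#version 300 es\n" .. src
end

print(toGLES3("varying lowp vec4 vColor;\nvoid main() { gl_FragColor = vColor; }\n", "fragment"))
```

Inside Codea the intended use would be something like shader(toGLES3(vert, "vertex"), toGLES3(frag, "fragment")), though any shader that samples a uniform called texture would still need editing by hand.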
  • edited October 2015 Posts: 2,020

    Also, even if you don't use GLES 3.0, the GLES 2.0 shader specifications are now more fully supported by the Codea 2.3.2 mesh API, particularly with regard to matrices:

    • You can now have an array of uniform mat4s, e.g. uniform mat4 modelMatrices[20];
    • Attributes/buffers can now be matrices, e.g. attribute mat4 modelMatrix;
  • edited October 2015 Posts: 2,020

    Added a link to the quick reference card to the first post

    (love the Khronos quick reference cards. Everything should have a quick reference card):

    https://www.khronos.org/files/opengles3-quick-reference-card.pdf

    Start your engines! B-)

  • edited October 2015 Posts: 2,020

    Initial thoughts:

    There's obviously an enormous amount to test here, and many features may require changes to the Codea API before they become accessible.

    Things that GLES 3.0 supports that I'm going to try when I have a spare minute:

    Textures in the vertex shader!

    This could be a great way of creating vertex displacement effects: displacement mapping, height mapping with a texture, very cheap noise effects (compared to the relatively expensive Ashima Arts pnoise/snoise functions), providing data for instanced drawing, building a voxel engine, etc.

  • IgnatzIgnatz Mod
    Posts: 5,396

    looking forward to trying this soon, thanks for your efforts, much appreciated :bz

  • edited October 2015 Posts: 2,020

    Ok, first test of a unique-to-GLES 3.0 feature: textures on the vertex shader! (Here being used to displace vertices).

    It works!

    But, when you animate it, it's jerky. Haven't worked out quite why that is yet. I thought it was to do with the resolution of the texture, but when I increased it, it stayed jittery. Anyone have any ideas how to make the animation smoother?

    (Take two texture samples and interpolate between them somehow?)

    -- test of GLES 300 
    -- textures on the vertex shader for vertex displacement
    --
    
    function setup()
        spriteMode(CORNER)
    
        -- create a mesh of 10x10 quads. nb using addRect for a "Z-Up" world
        m = mesh()
        local w,h = 20,20
        for x=1,10 do
            for y=1,10 do
                local r = m:addRect((x-5.5)*w,(y-5.5)*h,w,h)
                m:setRectTex(r, (x-1)*0.1,(y-1)*0.1,0.1,0.1)
                m:setRectColor(r, math.random(255), math.random(255), math.random(255))
            end
        end
    
        -- create a noise image that will displace these vertices
        img = image(256,256)
        local fine = 10
        local seed = math.random() * 5000
        local seed2 = math.random() * 3000
        for x=1,img.width do
            for y = 1,img.height do
                local n1 = noise(x/fine, y/fine, seed)+1
                local n2 = noise(x/fine, y/fine, seed2)+1
                img:set(x,y, math.ceil(n1 * 20),math.ceil(n2*60),math.ceil( n1*128) )
                --Bigger value in blue because we're displacing more along Z axis (ie up)
            end
        end
    
        --set up the shader
        m.shader = shader(vert, frag)
        m.shader.heightMap = img
        offset = vec2(0,0)
        rot = 0
    end
    
    function draw()
        background(40, 40, 50)
        sprite(img, 0,0) --show a preview of the displacement map in the corner
        perspective()
        camera(0,-200,180, 0,0,30, 0,0,1) --Z up
    
        rot = (rot + 0.2)%360
        rotate(rot)
    
        offset = offset + vec2(0.0002, -0.0002) --why does this produce jerky motion?
        m.shader.offset = offset
        m:draw()
    end
    
    vert=[[
    #version 300 es
    
    uniform mat4 modelViewProjection;
    uniform sampler2D heightMap; //texture in vert shader! 3.0 only
    uniform highp vec2 offset;
    
    in highp vec2 texCoord;
    in highp vec4 position;
    in lowp vec4 color;
    
    out lowp vec4 vColor;
    
    void main()
    {
        vColor = color;
        vec3 displace = texture(heightMap, fract(texCoord + offset )).rgb;
        gl_Position = modelViewProjection * (position + vec4(displace * 100., 0.));
    }
    ]]
    
    frag=[[
    #version 300 es
    
    in lowp vec4 vColor; //NB varying becomes "in" in fragment shader
    out lowp vec4 fragmentColor; 
    
    void main()
    {
        fragmentColor = vColor;
    }
    ]]
    
    
  • Posts: 2,020

    Instancing is coming! While we wait, here's an interesting article with Android code examples and fps comparisons with and w/o instancing:

    https://software.intel.com/en-us/articles/opengl-es-30-instanced-rendering

  • edited October 2015 Posts: 2,020

    With regard to the vertex displacement shader (two posts above this one): the animation is even jerkier if you set noSmooth(), so it is something to do with the way the texture is read. The spec says "level of detail [for texture lookup] is not implicitly computed for vertex shaders" (p93). Hmmm. I think someone needs to invest in a "GLES 3.0 programming recipes" book :-??

  • edited October 2015 Posts: 2,020

    Here's a very quick (and not terribly exciting) adaptation of @Simeon 's 2D instanced example, using 3D cubes instead of rects (GLES 2.0).

    function setup()
        m = mesh()
    
        m.vertices, m.texCoords = addCube()
        m:setColors(color(255,200,0))
        m.shader = shader(vert, frag)
        m.texture = readImage("Planet Cute:Wall Block")
        numInstances = 100
    
        dummy = m:buffer("dummy") --test vec4 attribute
        transform = m:buffer("transform") --test mat4 attribute
    
        -- Here we resize the transform buffer to
        -- the number of instances, rather than vertices
        -- note that the buffer can be any size:
        --  if you render N instances with a buffer sized to M
        --  where N > M then the instances will use a divided
        --  version of the buffer
        --
        -- E.g. I render 10 instances with 5 transforms then
        --  instance 1,2 use transform 1
        --  instance 3,4 use transform 2
        --  instance 5,6 use transform 3
        --  ... etc
        transform:resize(numInstances)
    
        -- We have to tell Codea to treat this buffer as an
        -- instanced buffer, that is, it is not per-vertex
        transform.instanced = true
    
        for i=1,m.size do
            -- Set per vertex
            dummy[i] = vec4(0,0,0,0)
        end
    
        for i=1,numInstances do
            -- Sets one transform *per instance*
            local x = ((i-1)%10) - 5.5
            local y = math.floor((i-1)/10) - 5.5
            transform[i] = matrix():translate(x * 2, y * 2, 0)
        end
        rot = 0
        pos = vec3(0,0,0)
    end
    
    function draw()
        background(40, 40, 50)
    
        perspective()
        camera(30,40,20, 0,0,0, 0,0,1)
        -- mesh:draw can now take a number of instances
        -- it draws this many, instanced buffers can be
        -- used to differentiate each instance within a
        -- shader
        rot = (rot + 0.2) % 360
        pos = pos + vec3(-0.02,0,0)
        translate(pos:unpack())
        rotate(rot)
        m:draw(numInstances)
    end
    
    vert=[[
    uniform mat4 modelViewProjection;
    
    attribute mat4 transform;
    attribute vec4 position;
    attribute vec4 color;
    attribute vec4 dummy;
    attribute mediump vec2 texCoord;
    
    varying lowp vec4 vColor;
    varying mediump vec2 vTexCoord;
    
    void main()
    {
        vColor = color;
        vTexCoord = texCoord;
        gl_Position = modelViewProjection * (transform * position + dummy);
    }
    ]]
    
    frag=[[
    uniform sampler2D texture;
    varying lowp vec4 vColor;
    varying mediump vec2 vTexCoord;
    void main()
    {
        gl_FragColor = texture2D(texture, vTexCoord) * vColor;
    }
    ]]
    
    function addCube()
        local vertices = {
          vec3(-0.5, -0.5,  0.5), -- Left  bottom front
          vec3( 0.5, -0.5,  0.5), -- Right bottom front
          vec3( 0.5,  0.5,  0.5), -- Right top    front
          vec3(-0.5,  0.5,  0.5), -- Left  top    front
          vec3(-0.5, -0.5, -0.5), -- Left  bottom back
          vec3( 0.5, -0.5, -0.5), -- Right bottom back
          vec3( 0.5,  0.5, -0.5), -- Right top    back
          vec3(-0.5,  0.5, -0.5), -- Left  top    back
        }
    
    
        -- now construct a cube out of the vertices above
        local cubeverts = {
          -- Front
          vertices[1], vertices[2], vertices[3],
          vertices[1], vertices[3], vertices[4],
          -- Right
          vertices[2], vertices[6], vertices[7],
          vertices[2], vertices[7], vertices[3],
          -- Back
          vertices[6], vertices[5], vertices[8],
          vertices[6], vertices[8], vertices[7],
          -- Left
          vertices[5], vertices[1], vertices[4],
          vertices[5], vertices[4], vertices[8],
          -- Top
          vertices[4], vertices[3], vertices[7],
          vertices[4], vertices[7], vertices[8],
          -- Bottom
          vertices[5], vertices[6], vertices[2],
          vertices[5], vertices[2], vertices[1],
        }
    
        -- all the unique texture positions needed
        local texvertices = { vec2(0.03,0.24),
                              vec2(0.97,0.24),
                              vec2(0.03,0.69),
                              vec2(0.97,0.69) }
    
        -- apply the texture coordinates to each triangle
        local cubetexCoords = {
          -- Front
          texvertices[1], texvertices[2], texvertices[4],
          texvertices[1], texvertices[4], texvertices[3],
          -- Right
          texvertices[1], texvertices[2], texvertices[4],
          texvertices[1], texvertices[4], texvertices[3],
          -- Back
          texvertices[1], texvertices[2], texvertices[4],
          texvertices[1], texvertices[4], texvertices[3],
          -- Left
          texvertices[1], texvertices[2], texvertices[4],
          texvertices[1], texvertices[4], texvertices[3],
          -- Top
          texvertices[1], texvertices[2], texvertices[4],
          texvertices[1], texvertices[4], texvertices[3],
          -- Bottom
          texvertices[1], texvertices[2], texvertices[4],
          texvertices[1], texvertices[4], texvertices[3],
        }
        return cubeverts, cubetexCoords
    end
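The buffer-sharing rule described in the setup() comments above (N instances drawn from a buffer of M entries) can be illustrated in plain Lua. This is only a sketch of the behaviour those comments describe, assuming N is a whole multiple of M. transformIndexFor is a hypothetical name, and Codea's actual handling of uneven sizes is not covered here.

```lua
-- Hypothetical helper: which buffer entry does a given instance read?
-- instance and the returned index are both 1-based, as in Codea's Lua API.
-- Assumes numInstances is a whole multiple of bufferSize.
local function transformIndexFor(instance, numInstances, bufferSize)
    -- how many consecutive instances share one buffer entry
    local perEntry = math.floor(numInstances / bufferSize)
    return math.floor((instance - 1) / perEntry) + 1
end

-- 10 instances sharing 5 transforms, as in the comment's example
for instance = 1, 10 do
    print(instance, transformIndexFor(instance, 10, 5))
end
```

When the buffer is resized to numInstances, as in the example above, every instance simply gets its own entry.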
    
  • Posts: 2,020

    The GitHub repo for that book looks good:

    https://github.com/danginsburg/opengles3-book/

  • Jmv38Jmv38 Mod
    Posts: 3,295

    In the book I read:

    Flat/smooth interpolators—In OpenGL ES 2.0, all interpolators were implicitly linearly interpolated across the primitive. In ESSL 3.00, interpolators (vertex shader outputs/fragment shader inputs) can be explicitly declared to have either smooth or flat shading.

    Maybe this is related to the jerkiness?

  • Posts: 2,020

    Yeah, I think so. I think this corresponds to smooth()/noSmooth() in the Codea API. For example, a low-res texture in the fragment shader displays pixelated with noSmooth() and blurred/smoothed out with smooth(), and in this example, if you set noSmooth(), you can clearly see that the animation has "steps" as it goes from pixel to pixel. The problem is that it also has a jittery quality in smooth mode (as if it's going forwards and backwards over each part of the animation instead of just moving smoothly forwards). There are loads of new texture sampling functions in GLES 3.0, so maybe I need to try one of those instead of texture. I'll ask on Stack Exchange, as there are people there with lots of 3.0 experience.

  • edited October 2015 Posts: 1,976

    I tried playing around with cloth/hair simulation. Here's what I came up with:

    -- Cloth Sim
    
    -- Use this function to perform your initial setup
    function setup()
        displayMode(FULLSCREEN)
        numStrands = 300
        steps = 20
        vS = [[
    //
    // A basic vertex shader
    //
    
    //This is the current model * view * projection matrix
    // Codea sets it automatically
    uniform mat4 modelViewProjection;
    
    //This is the current mesh vertex position, color and tex coord
    // Set automatically
    attribute vec4 position;
    attribute vec4 color;
    attribute vec2 texCoord;
    attribute mat4 model;
    attribute mat4 bend;
    
    //This is an output variable that will be passed to the fragment shader
    varying lowp vec4 vColor;
    varying highp vec2 vTexCoord;
    
    void main()
    {
        //Pass the mesh color to the fragment shader
        vColor = color;
        vTexCoord = texCoord;
    
        //Multiply the vertex position by our combined transform
        mat4 mvp = modelViewProjection * model;
        for (float i = 0.0; i < abs(position.y - 0.5); i += 0.5) {
            mvp = mvp * bend;
        }
        gl_Position = mvp * position;
    }
    ]]
        fS = [[
    //
    // A basic fragment shader
    //
    
    //Default precision qualifier
    precision highp float;
    
    //This represents the current texture on the mesh
    uniform lowp sampler2D texture;
    
    //The interpolated vertex color for this fragment
    varying lowp vec4 vColor;
    
    //The interpolated texture coordinate for this fragment
    varying highp vec2 vTexCoord;
    
    void main()
    {
        //Sample the texture at the interpolated coordinate
        //lowp vec4 col = texture2D( texture, vTexCoord ) * vColor;
    
        //Set the output color to the texture color
        gl_FragColor = vColor;
    }
    ]]
        strand = mesh()
        strand.shader = shader(vS, fS)
        local vertices = {}
        local cols = {}
        local len = 0.5
        local wid = 0.05
        for i = 0, steps do
            table.insert(vertices, vec3(-wid, (-0.5 - i) * len, 0.0))
            table.insert(vertices, vec3(wid, (-0.5 - i) * len, 0.0))
            table.insert(vertices, vec3(wid, (0.5 - i) * len, 0.0))
            table.insert(vertices, vec3(-wid, (-0.5 - i) * len, 0.0))
            table.insert(vertices, vec3(wid, (0.5 - i) * len, 0.0))
            table.insert(vertices, vec3(-wid, (0.5 - i) * len, 0.0))
            local mult = math.random() * 0.2 + 0.9
            local col = color(84 * mult, 71 * mult, 58 * mult, 255)
            for j = 1, 6 do
                table.insert(cols, col)
            end
        end
        strand.colors = cols
        strand.vertices = vertices
        model = strand:buffer("model")
        model.instanced = true
        model:resize(numStrands)
        for i = 1, numStrands do
            model[i] = matrix():translate(i * 0.05, 0, 0)
        end
        bends = {}
        bend = strand:buffer("bend")
        bend.instanced = true
        bend:resize(numStrands)
        for i = 1, numStrands do
            bend[i] = matrix()
        end
        cam = vec2(-10, -10)
        timeSpeed = 1
    end
    
    -- This function gets called once every frame
    function draw()
        -- This sets a light blue background color
        background(191, 240, 255)
    
        -- This sets the line thickness
        strokeWidth(5)
    
        -- Do your drawing here
        cam = cam:rotate(math.rad(DeltaTime * timeSpeed * 22.5))
        camera(cam.x + numStrands * 0.025, 0, cam.y, numStrands * 0.025, -3.75, 0, 0, 1, 0)
        perspective(70)
        for i = 1, numStrands do
            bend[i] = matrix():rotate(
                (noise(ElapsedTime * timeSpeed, i * 0.05, 0) * 0.25
                + noise(ElapsedTime * timeSpeed, i * 0.025, 0) * 0.5
                + noise(ElapsedTime * timeSpeed, i * 0.0125, 0)) / 1.75
            * 2, 1, 0, 0):rotate(
                (noise(ElapsedTime * timeSpeed, i * 0.05, 16) * 0.25
                + noise(ElapsedTime * timeSpeed, i * 0.025, 16) * 0.5
                + noise(ElapsedTime * timeSpeed, i * 0.0125, 16)) / 1.75
            * 1, 0, 0, 1)
        end
        strand:draw(numStrands)
    end
    
    

    It can render 300 strands and keep a pretty stable 60 FPS. Instances seem pretty cool, and very powerful :D
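To see why the strands curl progressively, it helps to count how many times the vertex shader's loop multiplies in the bend matrix for a vertex at a given height. The sketch below mirrors that loop in plain Lua. bendCount is a hypothetical helper and is not part of the program above.

```lua
-- Hypothetical helper mirroring the vertex shader loop:
--   for (float i = 0.0; i < abs(position.y - 0.5); i += 0.5) mvp = mvp * bend;
-- Returns how many times bend is applied to a vertex at height y.
local function bendCount(y)
    local limit = math.abs(y - 0.5)
    local count = 0
    local i = 0.0
    while i < limit do
        count = count + 1
        i = i + 0.5
    end
    return count
end

-- Segment k of a strand (k = 0 .. steps) has its top edge at
-- y = (0.5 - k) * len and its bottom edge at y = (-0.5 - k) * len
local len = 0.5
for k = 0, 3 do
    local top, bottom = (0.5 - k) * len, (-0.5 - k) * len
    print(k, bendCount(top), bendCount(bottom))
end
```

Each edge further down the strand receives one more application of bend than the edge above it, so a small per-instance rotation accumulates into a smooth curve along the strand.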

  • SimeonSimeon Admin Mod
    Posts: 4,892

    @SkyTheCoder impressive! Looks great.

  • Just one note (I don't have a recent enough iPad for GLES 3): textures in vertex shaders came along, I think, in iOS 7/8. I did some stuff with them reasonably successfully a long time ago. They are very handy, but certainly not an ES3-only capability.

  • Posts: 2,020

    Gosh, @spacemonkey is right. I hadn't realised that had been enabled. Here's the above vertex displacement code for GLES 2.0. @spacemonkey, do you know why the animation is jittery?

    -- GLES 2.0 version of the test above
    -- textures on the vertex shader for vertex displacement
    --
    
    function setup()
        spriteMode(CORNER)
      --  noSmooth()
        m = mesh()
        local w,h = 20,20
        for x=1,10 do
            for y=1,10 do
                local r = m:addRect((x-5.5)*w,(y-5.5)*h,w,h)
                m:setRectTex(r, (x-1)*0.1,(y-1)*0.1,0.1,0.1)
                m:setRectColor(r, math.random(255), math.random(255), math.random(255))
            end
        end
        img = image(256,256)
        local fine = 10
        local seed = math.random() * 5000
        local seed2 = math.random() * 3000
        for x=1,img.width do
            for y = 1,img.height do
                local n1 = noise(x/fine, y/fine, seed)+1
                local n2 = noise(x/fine, y/fine, seed2)+1
                img:set(x,y, math.ceil(n1 * 20),math.ceil(n2*60),math.ceil( n1*128) )
            end
        end
        m.shader = shader(vert, frag)
        m.shader.heightMap = img
        offset = vec2(0,0)
        rot = 0
    end
    
    function draw()
        background(40, 40, 50)
        sprite(img, 0,0)
        perspective()
        camera(0,-200,180, 0,0,30, 0,0,1)
    
        rot = (rot + 0.2)%360
        rotate(rot)
    
        offset = offset + vec2(0.0002, -0.0002) --why does this produce jerky motion?
        m.shader.offset = offset
        m:draw()
    end
    
    vert=[[
    
    uniform mat4 modelViewProjection;
    uniform sampler2D heightMap;
    uniform highp vec2 offset;
    
    attribute highp vec2 texCoord;
    attribute highp vec4 position;
    attribute lowp vec4 color;
    
    varying lowp vec4 vColor;
    
    void main()
    {
        vColor = color;
        vec3 displace = texture2D(heightMap, fract(texCoord + offset )).rgb; //textureOffset(heightMap, texCoord, offset).rgb
        gl_Position = modelViewProjection * (position + vec4(displace * 100., 0.));
    }
    ]]
    
    frag=[[
    
    varying lowp vec4 vColor; 
    
    void main()
    {
        gl_FragColor = vColor;
    }
    ]]
    
  • Posts: 2,020

    @SkyTheCoder good work!

  • Posts: 2,020

    I have some disappointing news about instancing. I tried it with a simple OBJ model to see what the performance was like, and instancing was actually significantly slower than just drawing the mesh at multiple locations. The test used a 4572-vertex model with 60 instances (so 274,320 vertices altogether, which is not all that high). Drawing the mesh at 60 locations, the framerate stays at 60; drawing the mesh once with 60 instances, the framerate drops to around 45-47. =((

    I'll post some code if I can clean it up a little.

    I guess more testing (and reading) is necessary to work out which situations actually benefit from instancing. Maybe it is better suited to particle-type systems, with hundreds of instances of a relatively small amount of geometry, than to smallish batches of larger models, as here.

  • SimeonSimeon Admin Mod
    edited October 2015 Posts: 4,892

    @yojimbo2000 interesting, I wonder if it varies by hardware.

    Edit: if you have a sample where instancing is slower, share it here and I'll put it through the profiler to see what's going on.

  • Posts: 2,020

    @Simeon here is the code:

    https://gist.github.com/Utsira/5dcfd0ad57ceed56d8d2

    It will download the model you choose from GitHub. 60 copies of model number 2 (the default) is what I've used for testing. If you press "Load Normal", it will load and display the number of instances by drawing repeatedly. If you press "LoadInstanced" it will display the copies with instancing. I even swapped out the specular highlight shader for just diffuse shading in the instanced version to try to get it to run faster.

    I'm on an iPad Air 1.

  • Posts: 2,020

    Ok, this is cool. It's the same example that @Simeon first posted, but instead of supplying an array of transformations, it calculates the positions from the gl_InstanceIDEXT variable (the instance number).

    Interestingly, I could only access this by setting instanced drawing on with #extension GL_EXT_draw_instanced: enable. @Simeon I'm a bit confused, I'd assumed that the Codea mesh API must have been adding this line to the shaders automatically for instanced drawing to be accessible in GLES 2.0? How are you getting the instanced drawing in GLES 2 if you're not using this extension?

    function setup()
        m = mesh()
    
        m:addRect(0,0,30,30)
        m:setColors(color(255,200,0))
        m.shader = shader(vert, frag)
    
        numInstances = 100
    
    end
    
    function draw()
        background(40, 40, 50)
    
        translate(WIDTH/2 - 200, HEIGHT/2 - 200)
    
        -- mesh:draw can now take a number of instances
        -- it draws this many, instanced buffers can be
        -- used to differentiate each instance within a
        -- shader
        m:draw(numInstances)
    end
    
    vert=[[
    #extension GL_EXT_draw_instanced: enable 
    
    uniform mat4 modelViewProjection;
    
    attribute vec4 position;
    attribute vec4 color;
    
    varying lowp vec4 vColor;
    
    void main()
    {
        vColor = color;
        float xOffset = mod(float(gl_InstanceIDEXT), 10.) * 40. - 2.5;
        float yOffset = float(gl_InstanceIDEXT / 10) * 40. - 2.5;
        vec4 offset = vec4(xOffset, yOffset, 0, 0);
        gl_Position = modelViewProjection * (position + offset);
    }
    ]]
    
    frag=[[
    
    varying lowp vec4 vColor;
    
    void main()
    {
        gl_FragColor = vColor;
    }
    ]]
    
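The offset arithmetic in the vertex shader above can be checked on the CPU with a few lines of plain Lua. gridOffset is a hypothetical helper that mirrors the shader's mod and integer-division steps. Instance IDs are zero-based, as gl_InstanceIDEXT is.

```lua
-- Hypothetical helper mirroring the vertex shader:
--   xOffset = mod(float(id), 10.) * 40. - 2.5
--   yOffset = float(id / 10) * 40. - 2.5   (integer division)
local function gridOffset(instanceID, columns, spacing)
    local x = (instanceID % columns) * spacing - 2.5
    local y = math.floor(instanceID / columns) * spacing - 2.5
    return x, y
end

-- print the positions of the first dozen instances of a 10-column grid
for id = 0, 11 do
    print(id, gridOffset(id, 10, 40))
end
```

The column index wraps every 10 instances and the row index increases by one each time it does, which is what lays the 100 instances out as a 10 x 10 grid without any per-instance buffer.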
  • dave1707dave1707 Mod
    edited October 2015 Posts: 7,533

    @yojimbo2000 I took your code and added more instances so I could check the average frames per second. Both programs draw 16240 rects. The first program uses instancing and the second just creates a mesh with that many rects. The instanced version runs at an average of 54.5 FPS, while the non-instanced version runs at an average of 59.6 FPS. I think I did everything right.

    displayMode(FULLSCREEN)
    -- Program 1 (instanced): average 54.5 FPS
    
    function setup()
        m = mesh()
        m:addRect(0,0,4,4)
        m:setColors(color(255,200,0))
        m.shader = shader(vert, frag)
        numInstances = 140*116
        tot,cnt=0,0
    end
    
    function draw()
        background(40, 40, 50)
        fill(255)
        tot=tot+DeltaTime
        cnt=cnt+1
        text("Avg FPS  "..string.format("%.2f",cnt/tot),WIDTH/2,HEIGHT-50)
        text("numInstances  "..numInstances,WIDTH/2,HEIGHT-80)
        translate(30,50)
        m:draw(numInstances)
    end
    
    vert=[[
        #extension GL_EXT_draw_instanced: enable     
        uniform mat4 modelViewProjection;
        attribute vec4 position;
        attribute vec4 color;    
        varying lowp vec4 vColor;
        void main()
        {   vColor = color;
            float xOffset = mod(float(gl_InstanceIDEXT), 140.) * 5.;
            float yOffset = float(gl_InstanceIDEXT / 140) * 5.;
            vec4 offset = vec4(xOffset, yOffset, 0, 0);
            gl_Position = modelViewProjection * (position + offset);
        }
        ]]
    
    frag=[[    
        varying lowp vec4 vColor;
        void main()
        {   gl_FragColor = vColor;
        }
        ]]
    
    displayMode(FULLSCREEN)
    -- Program 2 (non-instanced, one big static mesh): average 59.6 FPS
    
    function setup()
        m = mesh()
        xs,ys=140,116
        for x=1,xs do
            for y=1,ys do
                m:addRect(x*5,y*5,4,4)
            end
        end
        m:setColors(color(255,200,0))
        numInstances = xs*ys
        tot,cnt=0,0
    end
    
    function draw()
        background(40, 40, 50)
        fill(255)
        tot=tot+DeltaTime
        cnt=cnt+1
        text("Avg FPS  "..string.format("%.2f",cnt/tot),WIDTH/2,HEIGHT-50)
        text("numInstances  "..numInstances,WIDTH/2,HEIGHT-80)
        translate(30,50)
        m:draw()
    end
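Both programs report a running average: total frames divided by total elapsed time since launch. A minimal plain-Lua sketch of that calculation follows. newFpsAverager is a hypothetical name, and in Codea the argument passed on each frame would be DeltaTime.

```lua
-- Hypothetical helper: returns a function that accumulates frame times and
-- reports the average frames per second over the whole run so far.
local function newFpsAverager()
    local total, frames = 0, 0
    return function(deltaTime)
        total = total + deltaTime   -- seconds elapsed so far
        frames = frames + 1         -- frames drawn so far
        return frames / total
    end
end

local averageFps = newFpsAverager()
-- three made-up frame times, in seconds
for _, dt in ipairs({0.5, 0.5, 1.0}) do
    print(averageFps(dt))
end
```

Because the average covers the whole run, slow frames during start-up keep pulling the figure down for a while, which is worth bearing in mind when comparing numbers that are close to 60.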
    
  • SimeonSimeon Admin Mod
    Posts: 4,892

    @yojimbo2000 I imagine the gl_InstanceID variable might only be available in #version 300 es shaders.

  • Posts: 2,020

    One nice thing about instancing is that if you're doing some kind of procedural animation, such as a disintegrating explosion shader, you'd normally have to set up some attribute that indicates which face the vertex belongs to, and where the centre of that face is, whereas that is handled automatically with instancing.

  • dave1707dave1707 Mod
    Posts: 7,533

    One thing I didn't take into account with my programs above is that the non-instanced mesh is static, so the same mesh is drawn every frame, which makes it faster. I tried two other programs, based on the two above, where I moved all 16240 rects around on every draw cycle. The non-instanced program went from 59 FPS to 4 FPS. The instanced program went from 54 FPS to 47 FPS. So if you want to move a lot of rects around, it looks like instancing works well.

  • IgnatzIgnatz Mod
    Posts: 5,396

    Here are my results on an Air 2

    Dave's code with 120,000 [non moving] rects (2x2 to fit them on the screen)
    Non instancing = close to 60
    Instancing = 16

    Yojimbo's models - instancing is about 2/3 of the speed of non instancing

    I've only just started playing around, will do some more and report

  • SimeonSimeon Admin Mod
    Posts: 4,892

    @Ignatz thank you for the Air 2 results!

    My suspicion is that the multi-threaded renderer allows the non-instanced rendering to feed more geometry to the GPU. That is, non-instanced is able to utilise more of the CPU to do geometry uploads to the GPU.

    The Air 2 results show a bigger difference because there are 3 CPU cores. The multi-threaded Codea renderer can keep queuing up non-instanced mesh calls.

  • IgnatzIgnatz Mod
    edited November 2015 Posts: 5,396

    @Simeon - Thanks for the explanation

    Let me know if you want any more tests

  • edited November 2015 Posts: 2,020

    Here is an adaptation of @LoopSpace's explosion/disintegration shader. It blows an image up into lots of little fragments. Everything is calculated from the instance ID. It fakes some "noise" by sampling the image texture (so you can see brighter parts of the image fly further when it fragments). You'd get a better result if you uploaded an additional noise texture to the vertex shader, but I wanted to keep things simple. My suspicion is that @LoopSpace's original will perform better, because in that version all of the trajectory calculations are performed in advance and preloaded into buffers. But I do like the simplicity of the Codea side of this version: you only have to define one rect, with no buffers, and instancing does the rest.

    EDIT: all calculations now done as vec2s

    function setup()
        m = mesh()
        local rows = 100 --number of rows and columns
        numInstances = math.tointeger( rows ^ 2) --instanced drawing does not work with, eg 400.0, must be a true integer
        m.texture = "Cargo Bot:Codea Icon"
        local quadSize = 4 --size in pixels of each rect
        m:addRect(0,0,quadSize, quadSize)
        m:setRectTex(1,0,0,1/rows,1/rows)
        m.shader = shader(vert, frag)
        m.shader.rows = rows
        m.shader.quadSize = quadSize
        explode = {animate = 0}
        exploded = 1 --a flag to toggle the explosion
        print("total verts:", numInstances * 6)
        print("tap to explode/ unexplode")
    end
    
    function draw()
        background(40, 40, 50)
    
        translate(WIDTH/2, HEIGHT/2)
    
        -- mesh:draw can now take a number of instances
        -- it draws this many, instanced buffers can be
        -- used to differentiate each instance within a
        -- shader
        m.shader.time = explode.animate
        m:draw(numInstances)
    end
    
    function touched(t)
        if t.state == BEGAN then
            tween.stopAll()
            local target = exploded * 8
            local time = math.abs(target - explode.animate) * 0.5
            tween(time, explode, {animate=target})
            exploded = 1 - exploded
        end
    end
    
    vert=[[
    #extension GL_EXT_draw_instanced: enable 
    
    uniform mat4 modelViewProjection;
    uniform sampler2D texture;
    uniform float time;
    uniform float quadSize;
    uniform float rows;
    float texel = 1./rows;
    float halfRows = (rows - 1.) * .5;
    
    attribute vec4 position;
    attribute vec4 color;
    attribute vec2 texCoord;
    
    varying lowp vec4 vColor;
    varying mediump vec2 vTexCoord;
    
    const vec2 gravity = vec2(0.,-400. ); //down on the y axis
    const float friction = 1. ;
    
    void main()
    {
        vColor = color;
    
        float xOffset = mod(float(gl_InstanceIDEXT), rows) ; //calculate offset based on instance number
        float yOffset = floor((float(gl_InstanceIDEXT) + 0.5) / rows); //whole row index; without floor each row slants upwards by one cell
        mediump vec2 texOffset = vec2(xOffset, yOffset) * texel; 
    
        vTexCoord = texCoord + texOffset; //apply offset to texCoord
        xOffset -= halfRows; //make origin the centre
        yOffset -= halfRows;
        vec2 offset = vec2(xOffset , yOffset) * quadSize; //apply offset to position
    
        vec4 noise = texture2D(texture, texOffset) -vec4(0.5); //sample the texture to add some "noise"
        vec4 noise2 =texture2D(texture, vec2(1.)-texOffset.yx) -vec4(0.5);
    
        vec2 velocity = (normalize(offset) + ((noise.gr * noise2.rb) * 3. )) * 300.; 
        lowp float angle = time * (noise.b * noise2.g) * 45.;
    
        highp vec2 A = gravity/(friction*friction) - velocity/friction;
        highp vec2 B = offset - A; 
    
        float angCos = cos(angle);
        float angSin = sin(angle);
        lowp mat2 rot = mat2(angCos, angSin, -angSin, angCos);
    
        vec2 pos = rot * position.xy;
        pos += exp(-time*friction)*A + B + time * gravity/friction; 
    
        gl_Position = modelViewProjection * vec4(pos, 0., 1.); 
    }
    ]]
    
    frag=[[
    #extension GL_EXT_draw_instanced: enable
    uniform sampler2D texture;
    
    varying lowp vec4 vColor;
    varying mediump vec2 vTexCoord;
    
    void main()
    {
        gl_FragColor = texture2D(texture, vTexCoord) * vColor;
    }
    ]]
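
    To make the "everything is calculated from the instance ID" idea easier to follow, here is a CPU-side sketch in plain Lua of how an instance ID can be turned into a grid cell, a texture offset and a centred position. It is an illustration only. gridCell is a hypothetical helper, not part of Codea, and it uses a floored row index so that every instance lands on a whole cell.

    ```lua
    -- Hypothetical helper: maps an instance ID (0-based, like gl_InstanceIDEXT)
    -- to a cell in a rows x rows grid.
    local function gridCell(instanceID, rows, quadSize)
        local col = instanceID % rows              -- column: remainder, like mod() in the shader
        local row = math.floor(instanceID / rows)  -- row: whole number of completed rows
        local texel = 1 / rows                     -- size of one cell in texture space
        local halfRows = (rows - 1) * 0.5          -- shifts the origin to the grid centre
        return {
            col = col,
            row = row,
            texU = col * texel,                    -- texture offset of this cell
            texV = row * texel,
            x = (col - halfRows) * quadSize,       -- position offset in pixels
            y = (row - halfRows) * quadSize,
        }
    end

    -- Example with the values used in setup(): 100 rows, 4 pixel quads.
    local cell = gridCell(150, 100, 4)
    print(cell.col, cell.row, cell.x, cell.y)
    ```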
    
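
    The vertex shader places each fragment with a closed-form expression (exp(-time*friction)*A + B + time*gravity/friction) rather than stepping a simulation. One reading of it, which is my assumption and not something stated in the post, is that it is the exact solution for motion under constant gravity with drag proportional to velocity. The plain-Lua sketch below checks that reading on one axis by comparing the formula with a brute-force integration. The function names and test values are illustrative.

    ```lua
    -- Closed form taken from the shader, for a single axis:
    --   A = g/f^2 - v0/f,  B = x0 - A,  x(t) = exp(-t*f)*A + B + t*g/f
    local function closedForm(x0, v0, g, f, t)
        local A = g / (f * f) - v0 / f
        local B = x0 - A
        return math.exp(-t * f) * A + B + t * g / f
    end

    -- Assumed model: dv/dt = g - f*v and dx/dt = v, integrated with small Euler steps.
    local function integrate(x0, v0, g, f, t, steps)
        local dt = t / steps
        local x, v = x0, v0
        for _ = 1, steps do
            x = x + v * dt
            v = v + (g - f * v) * dt
        end
        return x
    end

    -- Start at x0 = 10 with velocity 300, gravity -400 and friction 1 (the shader's constants), at t = 2.
    local exact = closedForm(10, 300, -400, 1, 2)
    local approx = integrate(10, 300, -400, 1, 2, 200000)
    print(exact, approx)
    ```

    The two values agree closely, which is consistent with the drag interpretation. At time 0 the formula reduces to A + B, which is the starting offset.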
  • Ignatz Mod
    Posts: 5,396

    Here is a comment I found on a forum that may help explain why we aren't seeing better performance from instancing.

    "Instancing of this form (that is, sending the same mesh data with different instance data) is generally only useful performance-wise if all of the following are true:

    1) The mesh you want to render instanced is relatively small, in terms of number of vertices, but not too small (at least ~100 vertices, up to around ~5000 or so)

    2) The number of instances of this specific mesh being rendered is large (>1000)"

    The OpenGL wiki seems to support this:

    "It is often useful to be able to render multiple copies of the same mesh in different locations. If you're doing this with small numbers, like 5-20 or so, multiple draw commands with shader uniform changes between them (to tell which is in which location) is reasonably fast in performance. However, if you're doing this with large numbers of meshes, like 5,000+ or so, then it can be a performance problem, and instancing can help."
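
    As a rough illustration, the two conditions in the first quote can be written as a small check in plain Lua. This is only a sketch of that rule of thumb with the thresholds copied from the quote. instancingLikelyHelps is a hypothetical name, and the figure of 6 vertices per rect is taken from the "total verts" line in the explosion example above.

    ```lua
    -- Rule of thumb from the quoted forum comment (not a measured result):
    -- instancing tends to pay off for mid-sized meshes drawn in large numbers.
    local function instancingLikelyHelps(vertsPerMesh, numInstances)
        local meshSizeOk = vertsPerMesh >= 100 and vertsPerMesh <= 5000
        local countOk = numInstances > 1000
        return meshSizeOk and countOk
    end

    -- A single rect is 6 vertices, so the 120,000 rect test fails the mesh-size condition.
    print(instancingLikelyHelps(6, 120000))   -- false
    -- A 500 vertex model drawn 5,000 times meets both conditions.
    print(instancingLikelyHelps(500, 5000))   -- true
    ```

    By that measure, the tiny rects tested in this thread are too small a mesh for instancing to show a gain, which fits the results reported above.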

  • Ignatz Mod
    Posts: 5,396

    @yojimbo2000 - re your jittery example above, I just reduced the offset size until it became smooth. For me that was 0.00005.

  • Posts: 2,020

    Now that 2.3.2 is out, I have removed the beta tag from this thread.
