What is the most efficient way to draw lots of meshes?

edited December 2014 in Questions Posts: 11

I am currently working on an RTS game and I want to draw at least a few hundred figures on the screen: trees, terrain and units. For the moment I am using "for" loops, but when I draw a hundred trees the frame rate slows down to 30fps. Does anyone know a better way to draw a bunch of meshes at once?

Comments

  • Posts: 730

    Keep the number of meshes down to a minimum by using a sprite sheet approach. Here's a demo which draws 2000 objects as a single mesh. Each object is one of four possible images (the four corners of the Codea icon)

    -- spritemesh
    
    -- Use this function to perform your initial setup
    function setup()
        displayMode(FULLSCREEN)
        m=mesh()
        img=readImage("Cargo Bot:Codea Icon")
        m.texture=img
        obj={}
        for i=1,2000 do
            table.insert(obj,{x=math.random(WIDTH),y=math.random(HEIGHT),a=math.random(360),size=10+math.random(30),spin=-5+math.random(100)/10,xspd=-3+math.random(7),yspd=-3+math.random(7),xcoord=(math.random(2)-1)/2,ycoord=(math.random(2)-1)/2})
        end
    end
    
    -- This function gets called once every frame
    function draw()
        m:clear()
        -- This sets a dark background color 
        background(40, 40, 50)
        for i,s in pairs(obj) do                 
            local id=m:addRect(s.x,s.y,s.size,s.size,math.rad(s.a))
            m:setRectTex(id,s.xcoord,s.ycoord,0.5,0.5)
            s.a = s.a + s.spin
            s.x = s.x + s.xspd
            s.y = s.y + s.yspd
            if s.x>WIDTH then s.x=0 end
            if s.x<0 then s.x=WIDTH end
            if s.y>HEIGHT then s.y=0 end
            if s.y<0 then s.y=HEIGHT end
        end
    
        m:draw()
    
    end
    
    

    I don't know what the FPS is, but I believe there is a bug in the current release of Codea which may slow things down.

  • Posts: 688

    Sprite sheets (up to 2048 pixels square) are definitely the way to go - texture changes are about the slowest thing you can do in OpenGL.
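
    As a sketch of how a sprite sheet maps onto setRectTex, the helper below turns a column and row of a regular grid into the normalised x, y, width and height that the demo above passes by hand. The name cellTexCoords is hypothetical (it is not part of Codea), and it assumes equal-sized cells and texture coordinates running from 0 to 1 with the origin at the bottom left.

    ```lua
    -- Hypothetical helper: texture rectangle for one cell of a regular sprite sheet.
    -- col and row are 1-based; cols and rows give the size of the grid.
    local function cellTexCoords(col, row, cols, rows)
        local w, h = 1 / cols, 1 / rows
        return (col - 1) * w, (row - 1) * h, w, h
    end

    -- Inside Codea this would be used along these lines (not run here):
    --   m:setRectTex(id, cellTexCoords(2, 1, 2, 2))
    ```

    With a 2x2 sheet this reproduces the 0 or 0.5 offsets and the 0.5 cell size used in the demo.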

  • dave1707 Mod
    Posts: 7,605

    @West I added frame rate code to your code and on my iPad Air, 2,000 meshes ran at 28 FPS. I upped it to 5,000 and that ran at 11 FPS. I upped it again to 10,000 and it ran at 5 FPS. I'm curious to see how it runs when we get the next version that fixes the speed drop.

  • Ignatz Mod
    Posts: 5,396

    @dave1707 - that speed is linear: it says Codea can draw this sprite on your iPad about 55,000 times a second.

    On my iPad3, I get an FPS of 17 for 2000 meshes (NB I am using a beta with the intended speed fix, which may give me an advantage).

  • Posts: 730

    @dave1707, @Ignatz thanks for the info. You both talk about 2000 meshes - I thought it was 2000 objects/Rects (4000 triangles) in a single mesh but maybe my terminology is off.

  • Ignatz Mod
    Posts: 5,396

    @West - no, you're right, it's 2000 objects

  • Thank you everybody. This will help my game along quite a bit. :)>-

  • @West Is it possible to move a single figure using the translate function, without moving all of them?

  • Posts: 730

    The easiest way to do it with the above code would be something like:


    obj[17].x=obj[17].x+1

    which would move the 17th object to the right.

    Do this outside the loop. You'll probably also want to remove the other movement by deleting the following lines:

            s.a = s.a + s.spin
            s.x = s.x + s.xspd
            s.y = s.y + s.yspd
    

    I haven't got my iPad at the moment so I can't check.

  • Posts: 688

    @West - sorry to come late to the party... but... looking at your demo above, I think it would be a lot faster if you created the mesh once in the setup function instead of recreating it every frame, and stored the id returned from mesh:addRect in a table so that you can update the individual rectangles each frame. (In fact you probably don't even need to store the ids if all you're doing is adding rectangles - just use the required index in the mesh:setRect() call.)

    Also replace the

    for i,s in pairs(obj) do
    

    with

    local s
    for i=1,2000 do
       s = obj[i]
       ...
       ...
       m:setRect(i,...)
    end
    

    Otherwise you'll have the overhead of 2000 function calls, as Lua calls the pairs iterator once per element, which is also quite slow.
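
    One way to check how much the loop style matters on a given device is to time both forms. The following is a rough micro-benchmark in plain Lua, not a measurement from this thread; the helper names (sumPairs, sumIndexed, timeIt) are made up for the example, and the result will depend on the Lua version and hardware.

    ```lua
    -- Build a test array the same size as the demo.
    local N = 2000
    local items = {}
    for i = 1, N do items[i] = { x = i } end

    -- Sum using the generic for with pairs.
    local function sumPairs(t)
        local total = 0
        for _, s in pairs(t) do total = total + s.x end
        return total
    end

    -- Sum using a numeric for with direct indexing.
    local function sumIndexed(t, n)
        local total = 0
        for i = 1, n do total = total + t[i].x end
        return total
    end

    -- Run f(...) reps times and return the CPU time used, in seconds.
    local function timeIt(f, reps, ...)
        local start = os.clock()
        for _ = 1, reps do f(...) end
        return os.clock() - start
    end

    print("pairs:  ", timeIt(sumPairs, 1000, items))
    print("indexed:", timeIt(sumIndexed, 1000, items, N))
    ```

    Both loops do the same work, so any difference in the two printed times comes from the iteration itself.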

    If you're worried about adding and deleting objects on the fly during your game, just add enough rects to the mesh at the start for your worst case scenario and then use the rects as a pool and just set unused ones to 1x1 pixels and move them off screen.

    That way your frame rate should be consistent regardless of how many objects you have moving around.
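
    The pool idea could be sketched roughly as below. RectPool and its methods are hypothetical names, and the only assumption about the mesh is that it offers addRect(x, y, w, h) returning an index and setRect(index, x, y, w, h), as used elsewhere in this thread.

    ```lua
    -- Hypothetical pool of pre-allocated rects on a mesh-like object.
    local RectPool = {}
    RectPool.__index = RectPool

    function RectPool.new(m, capacity)
        local pool = setmetatable({ mesh = m, free = {} }, RectPool)
        for _ = 1, capacity do
            -- Park every rect off screen at 1x1 pixels until it is needed.
            local id = m:addRect(-10, -10, 1, 1)
            pool.free[#pool.free + 1] = id
        end
        return pool
    end

    -- Take an unused rect; returns nil when the pool is exhausted.
    function RectPool:acquire()
        local id = table.remove(self.free)
        return id
    end

    -- Hide a rect again and make it available for reuse.
    function RectPool:release(id)
        self.mesh:setRect(id, -10, -10, 1, 1)
        self.free[#self.free + 1] = id
    end
    ```

    The mesh always holds the worst-case number of rects, so the per-frame cost should not change as objects come and go.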

  • Posts: 730

    Hi @TechDojo - thanks for the pointers - will try it later. The example was butchered from a previous test of sprites vs meshes I had kicking about

  • @TechDojo Could you please post your code, where you can specify what models are moving?

  • Posts: 688

    @Holger_gott - I'll see if I can dig some out later, but in the meantime I'll make the changes to @West's code above, although I'm not able to test it... :)

    Here goes...

    -- spritemesh
    
    -- Use this function to perform your initial setup
    local numObjs = 2000
    local obj
    local objIDs = {}
    
    function setup()
        displayMode(FULLSCREEN)
        m=mesh()
        img=readImage("Cargo Bot:Codea Icon")
        m.texture=img
        obj={}
        local id
        for i=1,numObjs do
            obj[i] = {x=math.random(WIDTH),y=math.random(HEIGHT),a=math.random(360),size=10+math.random(30),spin=-5+math.random(100)/10,xspd=-3+math.random(7),yspd=-3+math.random(7),xcoord=(math.random(2)-1)/2,ycoord=(math.random(2)-1)/2})
    
            id=m:addRect(s.x,s.y,s.size,s.size,math.rad(s.a))
            m:setRectTex(id,s.xcoord,s.ycoord,0.5,0.5)
            objIDs[i] = id   -- not sure if this is required
        end
    end
    
    -- This function gets called once every frame
    function draw()
        -- This sets a dark background color 
        background(40, 40, 50)
        local s
        for i=1,numObjs do           
            s = obj[i]
    
            s.a = s.a + s.spin
            s.x = s.x + s.xspd
            s.y = s.y + s.yspd
            if s.x>WIDTH then s.x=0 end
            if s.x<0 then s.x=WIDTH end
            if s.y>HEIGHT then s.y=0 end
            if s.y<0 then s.y=HEIGHT end
    
            m:setRect(objIDs[i],s.x,s.y,s.size,s.size)
            -- m:setRect(i,s.x,s.y,s.size,s.size)    -- this *may* also work, not sure ????
        end
    
        m:draw()
    
    end
    

    This is the basic idea. Be interesting to see what kind of speed difference this makes especially as the number of objects ramps up. @dave1707 any chance of you putting your FPS code in this and posting some stats?

  • dave1707 Mod
    Posts: 7,605

    @TechDojo Your version needs some work to get it to run.

  • Posts: 730

    Here's a working version

    -- spritemesh
    
    -- Use this function to perform your initial setup
    local numObjs = 2000
    local obj
    local objIDs = {}
    
    function setup()
        displayMode(FULLSCREEN)
        m=mesh()
        img=readImage("Cargo Bot:Codea Icon")
        m.texture=img
        obj={}
        local id
        for i=1,numObjs do
            obj[i] = {x=math.random(WIDTH),y=math.random(HEIGHT),a=math.random(360),size=10+math.random(30),spin=-5+math.random(100)/10,xspd=-3+math.random(7),yspd=-3+math.random(7),xcoord=(math.random(2)-1)/2,ycoord=(math.random(2)-1)/2}
    
            id=m:addRect(obj[i].x,obj[i].y,obj[i].size,obj[i].size,math.rad(obj[i].a))
            m:setRectTex(id,obj[i].xcoord,obj[i].ycoord,0.5,0.5)
            objIDs[i] = id   -- not sure if this is required
        end
    end
    
    -- This function gets called once every frame
    function draw()
        -- This sets a dark background color 
        background(40, 40, 50)
    
        for i=1,numObjs do
            local s = obj[i]
    
            s.a = s.a + s.spin
            s.x = s.x + s.xspd
            s.y = s.y + s.yspd
            if s.x>WIDTH then s.x=0 end
            if s.x<0 then s.x=WIDTH end
            if s.y>HEIGHT then s.y=0 end
            if s.y<0 then s.y=HEIGHT end
    
            m:setRect(objIDs[i],s.x,s.y,s.size,s.size,math.rad(s.a))
        end
    
        m:draw()
    
    end
    
    
  • dave1707 Mod
    Posts: 7,605

    I added my frame rate code and got the same values. 2,000 was 28, 5,000 was 11, 10,000 was 5. I tried 55,000 as @Ignatz suggested above and the frame rate was 1.

  • Posts: 688

    @dave1707, @West - thanks for fixing the code, I wasn't near my iPad so I was coding blind. Although personally I'd still move the 'local s' outside of the for loop.

    To be honest, I'm surprised at the speed timings; I'm assuming you're using the new fixed beta. So recreating a mesh every frame takes the same time as updating each rect?? When I get five minutes I'm going to try and create a kind of profiling framework so we can time these functions to get a better understanding of what's happening.

  • edited December 2014 Posts: 152

    Hi @TechDojo,

    We wrote a profiler for Codea. I can't get on to GitHub at the moment to create a gist, so the code is included below...

    It works on an 'object' level, e.g. a table that contains functions, or a class instance...

    Basically it swaps out every function it finds into a wrapper that does timing and counts of calls and calculates averages etc. It's a little clunky as it allows a maximum of 10 parameters per function, could probably do some cleverer arg unpacking...

    Usage is as follows, when game is running (from the console), or in code if you like:

    startProfiling(obj, delay)
    stopProfiling()
    

    If you add a delay time in seconds then profiling will automatically stop and halt the game and report to the console...

    Here are the helper functions, class definition is below:

    -- ------------------
    -- Profiler Functions
    -- ------------------
    local profiler = nil
    function startProfiling(obj, delay)
        if (profiler) then
            profiler:stop()
        end
        profiler = Profiler4Codea(obj)
        profiler:start()
        if(delay) then
            tween.delay(delay, function()
                stopProfiling()
                error("Stopping game for profiler results")
            end)
        end
    end
    
    function stopProfiling()
        if (profiler) then
            profiler:stop()
            local report
            print("TOTAL\r\n")
            report = profiler:report(Profiler4Codea.TotalTime)
            print(report)
            print("AVERAGE\r\n")
            report = profiler:report(Profiler4Codea.AvgTime)
            print(report)
            print("#INVOKED\r\n")
            report = profiler:report(Profiler4Codea.TimesInvoked)
            print(report)
        end
    end
    

    Class definition:

    Profiler4Codea = class()
    
    Profiler4Codea.TimesInvoked = "timesInvoked"
    Profiler4Codea.TotalTime = "totalTime"
    Profiler4Codea.AvgTime = "avgTime"
    
    Profiler4Codea.Descending = "descending"
    Profiler4Codea.Ascending = "ascending"
    
    local table_insert = table.insert
    
    local globalMetaData = {}
    function Profiler4Codea:init(obj, name)
    
        assert(obj, "No object supplied")
    
        self.obj = obj
        if (type(obj) ~= "table") then
            error("Profiler4Codea:init: obj must be table or class: " .. tostring(self.obj))
        end
        local metaTable = getmetatable(self.obj)
        if (metaTable) then
            self.obj = metaTable
        end
        self.metaData = {}
        if (name) then
            if (not globalMetaData[name]) then
                globalMetaData[name] = {}
            end
            table_insert(globalMetaData[name], self)
        end
    end
    
    function Profiler4Codea:start()
    
        self.clockTime = os.clock()
    
        for name, member in pairs(self.obj) do
            local mType = type(member)
            if (mType == "function" and name ~= "init" and name ~= "draw") then
                if (not self.metaData[name]) then
                    self.metaData[name] = {
                        totalTime = 0,
                        timesInvoked = 0,
                        func = name
                    }
                end
                self.metaData[name].origFunction = member
                self.obj[name] = function(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10)
                    local elapsedTime = os.clock()
                    local r1, r2, r3, r4, r5, r6, r7, r8, r9, r10 =
                        self.metaData[name].origFunction(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10)
                    self.metaData[name].totalTime = self.metaData[name].totalTime +
                        (os.clock() - elapsedTime)
                    self.metaData[name].timesInvoked = self.metaData[name].timesInvoked + 1
                    return r1, r2, r3, r4, r5, r6, r7, r8, r9, r10
                end
            end
        end
    end
    
    function Profiler4Codea:stop()
        -- Restore function pointers
        for name, meta in pairs(self.metaData) do
            self.obj[name] = meta.origFunction
        end
        return self:report()
    end
    
    function Profiler4Codea:report(sortKey, ascDesc, stringify)
    
        if (not self.clockTime) then
            return "No data"
        end
    
        sortKey = sortKey or Profiler4Codea.TimesInvoked
        ascDesc = ascDesc or Profiler4Codea.Descending
        if (stringify == nil) then stringify = true end
    
        local data = {}
        for name, meta in pairs(self.metaData) do
            -- Calculate average time so we can sort by if required
            if (meta.timesInvoked > 0) then
                meta.avgTime = meta.totalTime / meta.timesInvoked
            else
                meta.avgTime = 0
            end
            table_insert(data, meta)
        end
    
        if (ascDesc == Profiler4Codea.Descending) then
            table.sort(data, function(a, b) return b[sortKey] < a[sortKey] end)
        elseif (ascDesc == Profiler4Codea.Ascending) then
            table.sort(data, function(a, b) return a[sortKey] < b[sortKey] end)
        else
            error("Unknown sort order")
        end
    
        if (stringify) then
            local sb = {}
            table_insert(sb, "Profiler4Codea: Sample time: ")
            table_insert(sb, os.clock() - self.clockTime)
            table_insert(sb, "\r\n")
    
            table_insert(sb, "Sort key: ")
            table_insert(sb, sortKey)
            table_insert(sb, "\r\n")
    
            table_insert(sb, "Asc/Desc: ")
            table_insert(sb, ascDesc)
            table_insert(sb, "\r\n")
    
            for k = 1, #data do
                local meta = data[k]
                if (meta.timesInvoked > 0) then
                    table_insert(sb, "[")
                    table_insert(sb, meta.func)
                    table_insert(sb, ",")
                    table_insert(sb, tostring(meta.totalTime))
                    table_insert(sb, ",")
                    table_insert(sb, string.format("%d", meta.timesInvoked))
                    table_insert(sb, ",")
                    table_insert(sb, string.format("%.6f", meta.avgTime))
                    table_insert(sb, "]\r\n")
                end
            end
    
            return table.concat(sb)
        end
        return data
    end
    
    function Profiler4Codea.globalStop()
    
        for name, profilerList in pairs(globalMetaData) do
            for k = 1, #profilerList do
                local profiler = profilerList[k]
                profiler:stop()
            end
        end
    end
    
    function Profiler4Codea.globalReport(sortKey, ascDesc)
        -- Iterate over all object types
        local result = {}
        local instances = {}
        for name, profilerList in pairs(globalMetaData) do
            -- Iterate over each instance
            instances[name] = #profilerList
            for k = 1, #profilerList do
                local profiler = profilerList[k]
                -- Get data for instance
                local data = profiler:report(sortKey, ascDesc, false)
                -- Iterate over instance functions
                for k = 1, #data do
                    local meta = data[k]
    
                    if (not result[name .. "." .. meta.func]) then
                        result[name .. "." .. meta.func] = {
                            totalTime = 0,
                            timesInvoked = 0,
                            avgTime = 0
                        }
                    end
                    result[name .. "." .. meta.func].totalTime =
                    result[name .. "." .. meta.func].totalTime + meta.totalTime
                    result[name .. "." .. meta.func].timesInvoked =
                    result[name .. "." .. meta.func].timesInvoked + meta.timesInvoked
                end
            end
        end
        -- Calculate average
        local final = {}
        for k, data in pairs(result) do
            if (data.timesInvoked > 0) then
                data.avgTime = data.totalTime / data.timesInvoked
                data.id = k
                table_insert(final, data)
            end
        end
    
        sortKey = sortKey or Profiler4Codea.TotalTime
        ascDesc = ascDesc or Profiler4Codea.Descending
    
        if (ascDesc == Profiler4Codea.Descending) then
            table.sort(final, function(a, b) return b[sortKey] < a[sortKey] end)
        elseif (ascDesc == Profiler4Codea.Ascending) then
            table.sort(final, function(a, b) return a[sortKey] < b[sortKey] end)
        else
            error("Unknown sort order")
        end
    
        local sb = {}
        table_insert(sb, "Profiler4Codea: ")
        table_insert(sb, "\r\n")
    
        for name, count in pairs(instances) do
            table_insert(sb, "Class: ")
            table_insert(sb, tostring(count))
            table_insert(sb, "\r\n")
        end
    
        table_insert(sb, "Sort key: ")
        table_insert(sb, sortKey)
        table_insert(sb, "\r\n")
    
        table_insert(sb, "Asc/Desc: ")
        table_insert(sb, ascDesc)
        table_insert(sb, "\r\n")
    
        for k = 1, #final do
            local meta = final[k]
            if (meta.timesInvoked > 0) then
                table_insert(sb, meta.id)
                table_insert(sb, ",")
                table_insert(sb, tostring(meta.totalTime))
                table_insert(sb, ",")
                table_insert(sb, string.format("%d", meta.timesInvoked))
                table_insert(sb, ",")
                table_insert(sb, string.format("%.6f", meta.avgTime))
                table_insert(sb, "\r\n")
            end
        end
        return table.concat(sb)
    end
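
    The ten-parameter limit mentioned above could probably be lifted with Lua varargs. The following is only a sketch of that idea, not part of Profiler4Codea: wrapTimed and pack are hypothetical helpers, and stats is assumed to be a table with totalTime and timesInvoked fields like the metaData entries in the class.

    ```lua
    -- Lua 5.1 has a global unpack; 5.2 and later moved it to table.unpack.
    local unpack = table.unpack or unpack

    -- Collect any number of values, remembering the count so that
    -- nil values are not lost.
    local function pack(...)
        return { n = select("#", ...), ... }
    end

    -- Wrap fn so that each call adds its CPU time and a call count to stats.
    -- Arguments and return values are forwarded with ... so there is no
    -- fixed limit on either.
    local function wrapTimed(fn, stats)
        return function(...)
            local start = os.clock()
            local results = pack(fn(...))
            stats.totalTime = stats.totalTime + (os.clock() - start)
            stats.timesInvoked = stats.timesInvoked + 1
            return unpack(results, 1, results.n)
        end
    end
    ```

    Passing the explicit count to unpack keeps trailing or embedded nil return values intact.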
    
  • Posts: 688

    @Brookesi and this is why I love this forum! Thanks :)

  • I got the same averages as @dave1707 on my iPad 4 using the latest beta.

    However, I decided to look a little closer at the numbers being produced and discovered that the variance in the framerate is quite large. The framerate ranges from about half the average to about twice. This results in extremely choppy animation.

    If your mesh rectangles are following a deterministic path (as they are in @West's code), it is far, far more efficient to use a shader to update their trajectory. Then you need to pass in a load of initial data but at each draw cycle you only pass in the elapsed time. The shader then computes the updated position. Moreover, as this is happening on the GPU, it can be done in parallel rather than in a single thread on the CPU. Not only is this far, far faster it also results in much smoother animation.

    For example, using my explosion shader (which does the same: moves and rotates rectangles), I get a framerate of 30 with 27,000 rectangles. At 55,000 rectangles, my framerate is 20. At 110,000 the framerate is 11. As with the code here, once it starts going down then it is inversely proportional to the number of rectangles but the number of rectangles needed before it starts going down is far higher.

    (Incidentally, @Ignatz's terminology is incorrect. A linear relationship is described by an equation of the form y = m x + c and this is not. Rather, it is inversely proportional in that the number of rectangles times the framerate is roughly constant. You could say that the relationship between the number of rectangles and the time taken for each frame to render is linear but "time taken" is the reciprocal of the framerate which was the quantity being discussed.)
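
    To make the distinction concrete, here is a small check in plain Lua using the frame rates reported earlier in this thread for the iPad Air. The product of rectangles and framerate stays roughly constant, while the time per frame grows with the number of rectangles.

    ```lua
    -- Frame rates reported earlier in the thread (iPad Air).
    local samples = {
        { n = 2000,  fps = 28 },
        { n = 5000,  fps = 11 },
        { n = 10000, fps = 5 },
    }

    for _, s in ipairs(samples) do
        local rectsPerSecond = s.n * s.fps   -- roughly constant
        local msPerFrame = 1000 / s.fps      -- grows with n
        print(s.n, rectsPerSecond, string.format("%.1f ms", msPerFrame))
    end
    ```

    The products come out at 56,000, 55,000 and 50,000 rectangles per second.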

    So if you can, shift the mesh's movement into a shader. An explanation of my explosion shader can be found at http://loopspace.mathforge.org/HowDidIDoThat/Codea/Shaders/.
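
    The key point is that the state at any time can be computed directly from the initial data. The sketch below shows that formula in plain Lua for illustration only; it is not the explosion shader from the linked page, and the field names (x0, y0, a0, vx, vy, spin) are made up. Speeds are assumed to be per second, and the modulo wrap is a simplification of the edge checks in @West's demo.

    ```lua
    -- State at time t computed directly from the initial state. In a shader
    -- the same formula would be evaluated on the GPU, with only t changing
    -- from frame to frame.
    local function stateAt(o, t, width, height)
        local x = (o.x0 + o.vx * t) % width
        local y = (o.y0 + o.vy * t) % height
        local angle = (o.a0 + o.spin * t) % 360
        return x, y, angle
    end
    ```

    Because nothing is accumulated frame by frame, no per-object data has to be updated on the CPU during drawing.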

  • Posts: 152

    Hi @LoopSpace,

    Regarding:

    ...it is far, far more efficient to use a shader to update their trajectory. Then you need to pass in a load of initial data but at each draw cycle you only pass in the elapsed time. The shader then computes the updated position.

    How do you pass arbitrary data in? Is it just a set of numeric variables, or do you 'spoof' up a texture and read results out of e.g. rgba values?

    Just curious...or does the shader now support table/array data?

    @Brookesi

  • @Brookesi Take a look at the link I posted. That contains the details of how to pass this information through.

  • dave1707 Mod
    Posts: 7,605

    @TechDojo @LoopSpace I'm still running the slow version of Codea. I expect the fixed version will result in a speed increase of about 3 times. I tried doing a more instant timing and found that the time varied a lot from frame to frame. That's why I used an average time over the whole run, it's easier to get a fixed value. I use the total number of draw cycles divided by the total of DeltaTime.
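
    The averaging described here, total draw cycles divided by total DeltaTime, could look something like this. It is a sketch rather than the actual frame rate code used for the figures in this thread, and FrameStats is a hypothetical name.

    ```lua
    -- Hypothetical running-average frame counter.
    local FrameStats = { frames = 0, seconds = 0 }

    -- Call once per frame with the time since the previous frame
    -- (DeltaTime in Codea).
    function FrameStats.tick(dt)
        FrameStats.frames = FrameStats.frames + 1
        FrameStats.seconds = FrameStats.seconds + dt
    end

    -- Average frames per second over the whole run so far.
    function FrameStats.average()
        if FrameStats.seconds == 0 then return 0 end
        return FrameStats.frames / FrameStats.seconds
    end
    ```

    In Codea, FrameStats.tick(DeltaTime) would go at the top of draw().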

  • Posts: 688

    Hmm - as a lot of those triangles would be passing over each other as they move, I wonder if the GPU is doing some clever stuff to ignore overdrawn pixels, so that the actual draw time fluctuates. Alternatively it might be down to the rotation (matrix calculation). It might be interesting to stop the rotation and use a fixed angle to see if that makes a difference.

    @dave1707 - I actually really noticed the slowdown for the first time yesterday, I was playing with an old fractal landscape demo, where most of the code is done in setup to actually generate the mesh and then each frame it simply repositions the camera. What I noticed is that the initial startup time took a lot longer (so much I'd initially thought the demo had crashed on the new build) but when actually running the difference was minimal (if anything). So I guess any timings on processor intensive operations should be ignored until the new version is released.

    @LoopSpace - thanks for sharing the shader code, I'm still trying to get my head around your perspective correct shader :)

  • edited December 2014 Posts: 430

    @dave1707 I ran West's code on my iPad and got the same figures as you did. So the speed-up from passing to shaders is entirely down to passing to shaders and not to being on different betas. Also, while average fps gives a reasonable overview, looking at variation can be important too. I looked at the time taken per frame and saw that it jumped a lot, so looked at the minimum and maximum time over the last ten frames as well. That's where I saw that it varies from half to double the average.

    @TechDojo Which is the "perspective correct shader" you're talking about?
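
    For anyone wanting to repeat the minimum and maximum measurement over the last ten frames, a small ring buffer is enough. This is a sketch of one way to do it, not the code used for the figures above; FrameWindow is a hypothetical name.

    ```lua
    -- Hypothetical tracker for the spread of recent frame times.
    local FrameWindow = {}
    FrameWindow.__index = FrameWindow

    function FrameWindow.new(size)
        return setmetatable({ size = size, times = {}, slot = 1 }, FrameWindow)
    end

    -- Record one frame time, overwriting the oldest entry once full.
    function FrameWindow:add(dt)
        self.times[self.slot] = dt
        self.slot = self.slot % self.size + 1
    end

    -- Smallest and largest frame time currently in the window.
    -- With no samples yet this returns math.huge, 0.
    function FrameWindow:range()
        local lo, hi = math.huge, 0
        for _, dt in ipairs(self.times) do
            if dt < lo then lo = dt end
            if dt > hi then hi = dt end
        end
        return lo, hi
    end
    ```

    Creating it with FrameWindow.new(10) and calling add(DeltaTime) each frame gives the ten-frame window described above.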

  • Ignatz Mod
    edited December 2014 Posts: 5,396

    @TechDojo - In my 3D work I've noticed (the obvious) that the more pixels need colouring, the slower it is, so I wondered if that had an effect here because the screen is so crowded.

    I tried simply restricting the images to 1/4 of the screen, so 3/4 was blank, and it had no effect on speed whatsoever, which surprised me, because another thing I learned from 3D is that OpenGL is extremely efficient at culling unseen vertices, and when you restrict screen space, that should mean fewer visible vertices.

  • Posts: 688

    @LoopSpace - it was this one http://loopspace.mathforge.org/HowDidIDoThat/Codea/Gradient/

    @Ignatz - From my readings of OpenGL I think it can detect (possibly through the use of a z buffer) if the pixels have already been drawn and then not draw them again - I remember something about rendering semi-transparent triangles and making sure that they are drawn in the correct Z order. This would obviously be beneficial if the triangles were pre-sorted in Z.

    I worked for a company many years ago that created an arcade board that was very good at rasterising spans of pixels for objects across each scanline, ensuring that there was no overdraw. It was very fast and particularly good at scaling sprites (à la Afterburner and Outrun), but semi-transparency then was a real issue (I don't think it was supported - but then it was 1993 :) ).
