Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Possibly better (for some cases) remove(...) function

edited April 1 in Code Sharing Posts: 1,187

I read a long and well-argued rant against lua’s implementation of table.remove(...).

Based on various suggestions from Stack Overflow, I’ve come up with an alternate version that seems to work well. I don’t know how it compares speed-wise, but it does at least seem to work for both arrays and key-value tables.

It relies on this method for checking whether a table is an array or not:


function isarray(tableT) --has to be a table in the first place of course if type(tableT) ~= "table" then return false end --not sure exactly what this does but piFace wrote it and it catches most cases all by itself local piFaceTest = #tableT > 0 and next(tableT, #tableT) == nil if piFaceTest == false then return false end --every numerical key except the last must have a key one greater for k,v in ipairs(tableT) do if tonumber(k) ~= nil and k ~= #tableT then if tableT[k+1] == nil then return false end end end --otherwise we probably got ourselves an array return true end

And then, using that, this function removes values from any kind of table without having to know their index:


function remove(targetTable, removeMe) --make a flag for scooting down all the elements an array local shouldMoveDown = false ; --loop over elements for i=1, #targetTable do --check for thing to remove if (targetTable[i] == removeMe) then --check if this is an array if isarray(targetTable) then --if it is, flip shouldMoveDown shouldMoveDown = true else --otherwise overwrite current element with last element targetTable[i] = targetTable[#targetTable] --and nil out last element and done! targetTable[#targetTable] = nil break end end --check if shouldMoveDown (which means it's confirmed as array) if shouldMoveDown then --check if we're not at the end if i ~= #targetTable then --if not, copy the next value over this one targetTable[i] = targetTable[i+1] else --if so, delete the last value targetTable[#targetTable] = nil end end end return targetTable, removeMe; end

I’m totally willing to concede I may have missed some cases, and I don’t know if these things will even be useful to anyone else. But apparently there’s lots of reasons to hate table.remove(...), so this could be an alternative.

Tagged:

Comments

  • edited April 1 Posts: 144

    This is a good implementation but maybe a bit heavy? I already shared it in the other thread but I’ve made this simple Queue data structure that lets you add items to them and remove by name while also providing an array for sequencing-


    Queue = class() function Queue:init() self.queue = {} self.map = {} end function Queue:enqueue(name, value) self.queue[#self.queue + 1] = value self.map[name] = #self.queue self.map[#self.queue] = name end function Queue:unqueue(name) local index = self.map[name] for i = index, #self.queue do self.map[i] = self.map[i + 1] if self.map[i] then self.map[self.map[i]] = i end self.queue[i] = self.queue[i + 1] end self.map[name] = nil end

    Usage

    myQueue = Queue()
    
    myQueue:enqueue(‘one’, ‘anything’)
    
    myQueue.queue[1] — anything
    
    myQueue.map[‘one’] — 1
    myQueue.map[1] — one
    
    myQueue.queue[myQueue.map[‘one’]] — anything
    myQueue.queue[myQueue.map[myQueue.map[1]]] — anything
    
    myQueue:unqueue(‘one’)
    
    myQueue.queue[1] — nil
    myQueue.map[‘one’] — nil
    myQueue.map[1] — nil
    
    

    I guess in my case you need to have a unique name (which can just be #myQueue.queue+1), while with yours it will work with any table. But keeping track of table array positions can get hairy so I find having a name identifier works best for me.

    But in your way you’re also searching by value which can be a problem:

     --check for thing to remove
            if (targetTable[i] == removeMe) then
    

    What if two different keys have the same removeMe value? How do you know you’re getting rid of the correct one? Also does this kind of equality check work on functions stored as the values or deeply nested tables?

  • Posts: 1,187
    Good questions.

    I think my code doesn’t improve at all on the out-of-the-box equivalency features of lua/Codea, but it should at least be at parity with them.

    It’s not really made to deal with tables that contain any given value more than once.
  • Posts: 1,187
    @skar your Queue class looks neat and helpful.

    I don’t fully understand metatables, but if I wanted to take this kind of functionality and make it a metatable, so I could add it to any normal lua table at will, would that work?
  • Posts: 144

    I don’t really understand metatables yet either. My experience is in JS programming and Lua is really similar. From what I can gather currently, metatables are very similar to js prototypes so in theory I think you are right, you should be able to add methods to the table metatable to handle the queue and unqueue

  • dave1707dave1707 Mod
    Posts: 9,287

    @UberGoober Why do you hate table.remove(). In your above code, you’re checking for an array. Lua doesn’t have arrays, just tables or strings. In your code where you find a match, you move the last table value to the current position and nil out the last table position. I don’t see any code to check if the last table value would have matched the removeMe value. It would then be skipped. It looks like you’re doing a lot of work but not guaranteeing that everything that should be removed is being removed. I find that using table.remove does exactly what it should.

  • Posts: 144

    From what I’ve read online, table.remove is slow and inefficient

  • dave1707dave1707 Mod
    Posts: 9,287

    @skar I took @UberGoober code above, stripped it down and tried it on a 1,000,000 item table. It was a lot faster than the table.remove, but it still had the problem of not checking for a match when moving the last item to the current.

  • Posts: 1,061

    when i want a table that supports remove, i generally use a hash-style table, which does remove in O(1) time. i've started wrapping tables with special properties in a small object that supports whatever special stuff i need.

  • edited April 1 Posts: 1,187

    Wait I misunderstood you @dave1707

  • Posts: 1,187

    @RonJeffries of course that’s great for a lot of cases, but to apply that trick when you have a table that has to keep a consistent sort order requires keeping two different tables, one for the sort order and one for the hash, and cross-referencing them, I think.

    I think that’s actually a pretty common approach for lua pros, but it makes my head hurt so I’m glad to avoid it when I can.

  • dave1707dave1707 Mod
    Posts: 9,287

    @UberGoober Your code has a “break” command in it which means you’ll exit your code after the first match. If a table contains more than 1 item to be removed, you’ll have to call it again. You’ll have to call it as many times as needed until no more items are removed.

    If the “break” command is removed, then you’ll check the whole table in one pass, but if there happens to be an entry to be removed at the end of the table, you’ll replace a matched entry with another entry that should be removed and you’ll miss it because you’ll go on to the next item before rechecking the new current entry.

    Just a quick example. Fill a table with all the same values. Then try to remove all the entries that have that value.

    Your code will work if a table only has 1 entry to be removed at a time, but that’s not always the case.

  • Posts: 1,187

    @dave1707 I think this now catches that “last element” bug. It actually avoids the swapping altogether—it should have been nilling the value in the first place.


    function remove(targetTable, removeMe) --check if this is an array if isarray(targetTable) then --flag for when a table needs to squish in to fill cleared space local shouldMoveDown = false --iterate over table in order for i = 1, #targetTable do --check if the value is found if targetTable[i] == removeMe then --if so, set flag to start collapsing the table to write over it shouldMoveDown = true end --if collapsing needs to happen... if shouldMoveDown then --check if we're not at the end if i ~= #targetTable then --if not, copy the next value over this one targetTable[i] = targetTable[i+1] else --if so, delete the last value targetTable[#targetTable] = nil end end end else --loop over elements for k, v in pairs(targetTable) do --check for thing to remove if (v == removeMe) then --if found, nil it targetTable[k] = nil break end end end return targetTable, removeMe; end

    As for removing only one value, that’s intended behavior. Modifying a table even once while iterating through it is asking for trouble, if I recall correctly, so I have to break and stop the iteration immediately after doing the removing.

  • dave1707dave1707 Mod
    Posts: 9,287

    @UberGoober It doesn’t matter if you remove only one match or you get rid of the “break” and you go thru the whole table and remove all the matches. The table still exists full size with nil entries. Using the “pairs” for loop ignores the nil entries, but using a [ ] for loop will read those nil entries. So depending on what you do with the table, you can end up with a table that increases in size and has a lot of nil entries in it.

    By putting nil in the table, your not modifying the table size or altering the position of unread entries. So you can go thru the whole table at once. The downside is a table filled with nil.

  • Posts: 1,061

    i think #targettable will not work if there are embedded nils

  • Posts: 1,061

    i think i'd use table.remove if i needed remove. it probably works.

  • dave1707dave1707 Mod
    edited April 2 Posts: 9,287

    @RonJeffries The # still works if there are nil values in the table. The # doesn’t work on keyed tables. I use table.remove also, but I always use it going thru a table in reverse order.

  • Posts: 1,187

    @RonJeffries did you test that theory? :)

    I have done tests with sparse arrays and, at least in the conditions I thought to test, it handles nil values fine.

    @dave1707 you know more than I do about all this stuff, but I just know from long hours spent searching for bugs that modifying something as you’re iterating through it is a bad idea in general—it might be totally fine if you know what you’re doing to the extent that you seem to, but I’ve personally lost too many hours to that beast to want to go near its cave again.

  • Posts: 1,187

    @RonJeffries table.remove() requires you to know the index of removal, and of course on hash tables there isn’t an index so you can’t use it at all.

    Personally I’d prefer a simple command that works on every kind of table and only requires that I know the thing I want gone and the table I want it gone from—though I’ll admit I’m scared to use what I’ve done here in my actual code because I’m not 100% sure of it.

    Btw here’s where I read the long rant against table.remove(): https://stackoverflow.com/questions/12394841/safely-remove-items-from-an-array-table-while-iterating/66768266#66768266 (it’s a huge answer written by a guy named Mitch McCabers, just scroll down and you can’t miss it).

  • edited April 2 Posts: 144

    I tested the table methods vs my queue and the queue is way slower :(


    -- Test local mathRandom = math.random local socket = require('socket') local time = socket.gettime local limit = 1000000 local remove = 1000 function setup() startTableFillTime = time() tableA = fillTable() endTableFillTime = time() tableFillTime = endTableFillTime - startTableFillTime startQueueFillTime = time() queueA = fillQueue() endQueueFillTime = time() queueFillTime = endQueueFillTime - startQueueFillTime print('tableFillTime: '..tableFillTime) print('queueFillTime: '..queueFillTime) end function draw() background(40, 40, 50) end function touched(touch) if touch.state == ENDED then removeFromTable() removeFromQueue() end end function fillTable() local t = {} for i = 1, limit do table.insert(t, mathRandom(1, limit)) end return t end function removeFromTable() local uniqueRandoms = getUniqueRandoms(1, limit, remove) local start = time() for k, v in pairs(uniqueRandoms) do table.remove(tableA, v) end local endT = time() print('removeFromTable: '..(endT - start)) end function fillQueue() local q = Queue() for i = 1, limit do q:enqueue(i, mathRandom(1, limit)) end return q end function removeFromQueue() local uniqueRandoms = getUniqueRandoms(1, limit, remove) local start = time() for k, v in pairs(uniqueRandoms) do queueA:unqueue(v) end local endT = time() print('removeFromQueue: '..(endT - start)) end function getUniqueRandoms(from, to, amount) local results = {} function recurRandom(f, t) local n = math.random (f, t) if results[n] then recurRandom(f, t) else results[n]=n end end for i = 1, (amount or 1) do recurRandom(from, to) end return results end Queue = class() function Queue:init() self.queue = {} self.map = {} end function Queue:enqueue(name, func) self.queue[#self.queue + 1] = func self.map[name] = #self.queue self.map[#self.queue] = name end function Queue:unqueue(name) local index = self.map[name] for i = index, #self.queue do self.map[i] = self.map[i + 1] if self.map[i] then self.map[self.map[i]] = i end self.queue[i] = self.queue[i + 1] end self.map[name] = nil end

    Maybe table.remove is much better now in Lua 5.4 so the old assumption of it being slow is wrong?
    This is the SO answer I had come across, not sure if it’s the same one @UberGoober saw: https://stackoverflow.com/a/53038524

  • Posts: 1,187

    @skar that’s the one. How did you link directly to that answer and not the whole question?

  • edited April 2 Posts: 144

    The problem with table.remove that I see is that we have to keep track of what is in what position.

    For my usage I want an array to loop through numerically, and then dynamically remove an item from the list, while everything else shifts down, still able to keep track of what is at which index, so I can remove or add anything else at any moment.

    @UberGoober ‘s solution would actually work well for me because my values are functions, and each function is unique even if it’s structurally identical, so I wouldn’t need to worry about breaking out of the loop too early. A better question is when would you as a programmer be unsure if your table is an array or a hashmap? It should always be pretty obvious which one you’re working with

    side note - @UberGoober the share button for each answer is at the bottom of the answer on the left hand side opposite the answerer’s name

  • dave1707dave1707 Mod
    Posts: 9,287

    @UberGoober When you use a “for” loop to iterate thru a table and you use table.remove(), when a matched item is removed, all the items above it are moved down 1 position to fill the void. The “for” index is increased by 1 and the next item is checked. The problem is table.remove() doesn’t check the item that was moved into the empty position, it’s skipped, because the “for” loop moved on to the next position. To eliminate that problem, you iterate thru the table in reverse order, from the end to the beginning. So it doesn’t matter if the items are moved down to fill the position of the deleted item, you’ve already checked all of them. And because you’re going down the table, nothing below the position you’re at have moved.

  • Posts: 1,061

    if i recall the lua spec, it does not say when a table will be in sequence form, and when not.

    The length operator applied on a table returns a border in that table. A border in a table t is any natural number that satisfies the following condition:

    (border == 0 or t[border] ~= nil) and t[border + 1] == nil
    

    In words, a border is any (natural) index present in the table that is followed by an absent index (or zero, when index 1 is absent).

    it goes on to say:

    When t is a sequence, #t returns its only border, which corresponds to the intuitive notion of the length of the sequence. When t is not a sequence, #t can return any of its borders. (The exact one depends on details of the internal representation of the table, which in turn can depend on how the table was populated and the memory addresses of its non-numeric keys.)

  • edited April 3 Posts: 1,187
    I’m not sure but this seems to say the opposite of what you’re suggesting it says.

    It reads to me that if the table is “a sequence” then #table is consistently the length of that sequence, and if it’s not, it’s unpredictable.

    Values with sequential numerical keys behave like a standard array and their length can be measured reliably with #, and if not, not.
  • Posts: 1,061

    An array with a nil in the middle is not a sequence. The rule is saying that if there is a hole in an array, Lua specifically says that # can return any bound, which may or may not be the last one. In short, if you nil an element of an array, # is no longer guaranteed to tell us what we think it should.

  • Posts: 1,061

    Here's a test:

    -- RJ 20200911
    -- Delete these and replace with your own.
    
    function testArraysWithNils()
        CodeaUnit.detailed = true
    
        _:describe("# may not be doing what you expect", function()
    
            _:before(function()
                ary = {}
                for i = 1,100 do
                    table.insert(ary, i)
                end
            end)
    
            _:after(function()
            end)
    
            _:test("# works", function()
                _:expect(#ary).is(100)
                ary[50] = nil
                _:expect(#ary).is(99) -- fails with 100
                local count = 0
                for i,x in ipairs(ary) do
                    count = count + 1
                end
                _:expect(count).is(99) -- fails with 49!!
                count = 0
                for i,x in pairs(ary) do
                    count = count + 1
                end
                _:expect(count).is(99) -- passes
                local deletes = {14, 54, 67, 34, 19, 81, 55, 64, 75, 93}
                for i,x in ipairs(deletes) do
                    ary[x] = nil
                end
                _:expect(#ary).is(100- #deletes - 1) -- fails with 100
                ary[135]=135
                _:expect(#ary).is(100 - #deletes) -- fails with 63!!
            end)
    
        end)
    end
    

    The last one is interesting. Before that assignment of 135, there are 89 non-nil elements in the array, the initial 100, minus the one at 50, minus the ten from the deletes loop. After the assignment into 135, there are 89 non-nils. # reports 63.

    You can't trust # if there are nils in the table.

  • Posts: 1,187

    @RonJeffries I don’t know where you think you and I are saying different things. You seem to be arguing against something I don’t think I asserted. I’m sure I could be wrong in either this assertion or that one, but if so, can we just say you aren’t telling me anything I intended to disagree with?

  • Posts: 1,187

    @RonJeffries also tests don’t really say anything about my function’s validity when they aren’t testing my function, no?

  • Posts: 1,061

    i'm not arguing against anyone or anything. i'm pointing out a fact, namely that # works in a literally undefined fashion when there are nils left in an array. i've not tried testing your function. if i wanted to remove things from a table, i'd most likely use a hash-style, since then remove is O(1), i.e. very fast.

    your function, as revised, may be valid. i don't know, and don't have an opinion. would you like me to study it and/or test it?

  • Posts: 1,187
    @RonJeffries, I understand the point you’re making, I just don’t know who you’re thinking needs to hear it.

    I’d be honored if you wanted to poke holes in my code, but I know you’re a busy man and I’d hate to delay your Dung work. :)

    Honesty, I’m totally okay if you have no interest.
  • Posts: 1,061

    Looking at this:

      if shouldMoveDown then
                    --check if we're not at the end
                    if i ~= #targetTable then
                        --if not, copy the next value over this one
                        targetTable[i] = targetTable[i+1]
                    else
                        --if so, delete the last value
                        targetTable[#targetTable] = nil
                    end 
                end
    

    it seems to me that the if/else isn't needed. in the else, i is #targetTable, and therefore targetTable[i+1] is nil.

    i have not tested that hypothesis.

  • edited April 4 Posts: 1,187
    I think the point is not to make i + 1 nil but to make the last element of the ordered table nil, since everything else has been moved down one index.
  • Posts: 1,061

    point is, since i+1 is past the end, it will be nil, setting i == last to nil. i've not tested it but i think the i = i+1 statement works at the end w/o the special check. i could be wrong, but don't think i am.

  • edited April 4 Posts: 1,187

    You’re right I’m wrong you’re big I’m small you’re smart I’m dumb.

    —paraphrased from “Matilda”

    I commented out the “then” part and it still passes all my tests.

    Nicely spotted.

    Nice to have those tests!

Sign In or Register to comment.