# Experiments with vec2 userdata

edited August 2012 Posts: 489

I rewrote some code to use Codea's `vec2` userdata type for its vectors, and it seemed to run much slower as a result. That is not what I had expected. The code below explores that further:

```
--
-- Codea's vec2 userdata
--
function setup()
    local n = 100000
    local d
    local v1 = vec2(1, 2)
    local v2 = vec2(4, 5)
    local v1x = v1.x
    local v1y = v1.y
    local v2x = v2.x
    local v2y = v2.y
    local tb1 = {x=1, y=2}
    local tb2 = {x=4, y=5}

    print("Vectors - minus and len")
    t1 = os.clock()
    d = 0
    for i = 1, n do
        local v3 = v2 - v1
        d = d + v3:len()
    end
    dt1 = os.clock() - t1
    print("Result:"..d)
    print(dt1)
    print()

    print("Vectors - partial")
    t2 = os.clock()
    d = 0
    for i = 1, n do
        local v3x = v2.x - v1.x
        local v3y = v2.y - v1.y
        d = d + math.sqrt(v3x*v3x + v3y*v3y)
    end
    dt2 = os.clock() - t2
    print("Result:"..d)
    print(dt2)
    print("Saving (%):", (1 - dt2/dt1)*100)
    print()

    print("Vectors - dist")
    t3 = os.clock()
    d = 0
    for i = 1, n do
        d = d + v2:dist(v1)
    end
    dt3 = os.clock() - t3
    print("Result:"..d)
    print(dt3)
    print("Saving (%):", (1 - dt3/dt1)*100)
    print()

    print("Tables")
    t4 = os.clock()
    d = 0
    for i = 1, n do
        local v3x = tb2.x - tb1.x
        local v3y = tb2.y - tb1.y
        d = d + math.sqrt(v3x*v3x + v3y*v3y)
    end
    dt4 = os.clock() - t4
    print("Result:"..d)
    print(dt4)
    print("Saving (%):", (1 - dt4/dt1)*100)
    print()

    print("Pure number types")
    t5 = os.clock()
    d = 0
    for i = 1, n do
        local v3x = v2x - v1x
        local v3y = v2y - v1y
        d = d + math.sqrt(v3x*v3x + v3y*v3y)
    end
    dt5 = os.clock() - t5
    print("Result:"..d)
    print(dt5)
    print("Saving (%):", (1 - dt5/dt1)*100)
end

function draw()
    background(0)
end
```

On my iPad2, this gives the following output:

```
Vectors - minus and len
Result:424760
0.560913

Vectors - partial
Result:424760
0.409302
Saving (%): 27.0294

Vectors - dist
Result:424760
0.261353
Saving (%): 53.4059

Tables
Result:424760
0.111877
Saving (%): 80.0544

Pure number types
Result:424760
0.0709839
Saving (%): 87.3449
```

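It seems that `vec2` comes at a price in speed: in this test, the plain-table version ran about 80% faster than the `vec2` subtract-and-`len` version, and plain numbers about 87% faster.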

• Posts: 580

Thanks for doing this, I've wondered about the performance implications of using vec2. Since Lua allows multiple return values, I wonder if it would be better to have a set of functions that take and return vectors by their individual components instead. It would probably be a lot less convenient though.
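As a rough illustration of that idea, here is a minimal plain-Lua sketch. The helper names `vsub` and `vlen` are made up for this example and are not part of Codea; they pass vectors around as separate components, so no `vec2` or table is created.

```lua
-- Hypothetical component-wise helpers: no vec2 userdata or tables are created.
local sqrt = math.sqrt

-- Subtract (bx, by) from (ax, ay); returns both components of the difference.
local function vsub(ax, ay, bx, by)
    return ax - bx, ay - by
end

-- Length of the vector (x, y).
local function vlen(x, y)
    return sqrt(x*x + y*y)
end

-- vsub is the last argument, so both of its return values are passed to vlen.
local d = vlen(vsub(4, 5, 1, 2))
print(d)   -- length of (3, 3), roughly 4.2426
```

The inconvenience shows up quickly: every function needs twice as many parameters, and a multiple-return call only expands fully when it is the last argument in the list.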

• Posts: 121

That's really interesting. I want to run some tests too, because I was pretty sure (and, wrongly, never checked) that vec2 math would be faster than Lua math on generic numbers, tables, etc., partly because of some discussions on this forum. Something that could probably make this calculation faster with vec2 would be a way to avoid creating a new vec2 object each time (which I fear is the real cause of the performance problem), such as methods that apply transformations (rotate, translate, etc.) directly to the same vec2 or to a vec2 passed as a parameter. @Simeon, what do you think about @mpilgrem's results?
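To make the "no new object per operation" idea concrete, here is a small sketch in plain Lua. It uses ordinary tables as a stand-in for `vec2`, and `subInto` is a hypothetical helper, not an existing Codea method.

```lua
-- Hypothetical helper: writes a - b into an existing vector 'out'
-- instead of allocating a new one. Plain tables stand in for vec2 here.
local function subInto(out, a, b)
    out.x = a.x - b.x
    out.y = a.y - b.y
    return out
end

local v1 = {x = 1, y = 2}
local v2 = {x = 4, y = 5}
local tmp = {x = 0, y = 0}   -- allocated once, reused on every iteration
local d = 0
for i = 1, 1000 do
    subInto(tmp, v2, v1)
    d = d + math.sqrt(tmp.x*tmp.x + tmp.y*tmp.y)
end
print(d)
```

Whether this would help with the real userdata type depends on where the time actually goes, which this sketch does not measure.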

• edited September 2012 Posts: 5,399

Those are very interesting results. I suspect there may be a lot of overhead when constructing a new userdata type, as well as calling out to C. So for the types of simple calculations you're performing, the overhead outweighs the benefits.

This is the source code for our vec2 implementation (from the Codea Runtime Library): https://github.com/TwoLivesLeft/Codea-Runtime/blob/master/CodeaTemplate/LuaLibs/vec2.c

Perhaps performance would be better if we re-wrote this as a pure-Lua library?
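For a sense of what a pure-Lua version might look like, here is a minimal sketch built on a metatable. The name `Vec2` and everything in it are assumptions for illustration; it is not the Codea implementation and covers only the operations used in the benchmark above.

```lua
-- A minimal pure-Lua 2D vector, as a sketch only. The name Vec2 and its
-- methods are hypothetical; this is not Codea's implementation.
local Vec2 = {}
Vec2.__index = Vec2

local function new(x, y)
    return setmetatable({x = x, y = y}, Vec2)
end

-- b - a creates a new table on every subtraction.
Vec2.__sub = function(a, b)
    return new(a.x - b.x, a.y - b.y)
end

function Vec2:len()
    return math.sqrt(self.x*self.x + self.y*self.y)
end

function Vec2:dist(other)
    local dx, dy = self.x - other.x, self.y - other.y
    return math.sqrt(dx*dx + dy*dy)
end

local a, b = new(1, 2), new(4, 5)
print((b - a):len(), b:dist(a))
```

Note that the subtraction still allocates a table each time, so this sketch alone says nothing about whether it would beat the C version; it would need to be timed in the same way as the benchmark.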

• Posts: 5,399

You can vastly improve the performance of the `vec2` benchmarks by locally caching the functions. The biggest slowdown is lookups on the vec2 members.

```
    print("Vectors - minus and len")
    t1 = os.clock()
    d = 0
    local len = v2.len
    local v3 = nil
    for i = 1, n do
        v3 = v2 - v1
        d = d + len(v3)
    end
    dt1 = os.clock() - t1
    print("Result:"..d)
    print(dt1)
    print()

    print("Vectors - dist")
    t3 = os.clock()
    d = 0
    local dist = v2.dist
    for i = 1, n do
        d = d + dist(v2, v1)
    end
    dt3 = os.clock() - t3
    print("Result:"..d)
    print(dt3)
    print("Saving (%):", (1 - dt3/dt1)*100)
    print()
```

This gives me a saving of ~76% on that particular test.

Caching the function is not ideal, but for tight loops it might be necessary for good performance.

• edited September 2012 Posts: 5,399

This appears to be due to Lua's `luaL_checkudata` call, which validates the vec2 type for safety before attempting to perform the operation.

It's quite a slow call. I'm going to try to find a way to work around this while maintaining a safety check.

Edit: I am able to speed up the built-in vectors so that they are faster than the "Pure number" and table solutions. However, Codea could then potentially be crashed by passing in an incorrect type (for example, passing a vec2 into a vec4 length function). I'm unsure whether it would be worth sacrificing stability for speed.

• Posts: 3,297

@Simeon maybe add a global setting function vec2check(boolean), set to true by default? It would be true while writing and debugging, and set to false when the game is ready?

• Posts: 5,399

I have an experimental fix that is still safe to use.

It's still always going to be faster to cache the method calls in locals prior to entering a tight loop, though. But that's just the way Lua is.
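The general Lua point can be shown without Codea at all. In the sketch below, a plain table named `point` (an assumption for this example) stands in for a `vec2`: `point:len()` is shorthand for `point.len(point)`, so the `len` field is looked up on every call, whereas a local holds the function directly.

```lua
-- obj:method(...) is shorthand for obj.method(obj, ...), so the field
-- 'len' is looked up again on every call. A local avoids that lookup.
local point = {x = 3, y = 4}
function point:len()
    return math.sqrt(self.x*self.x + self.y*self.y)
end

local n = 100000
local total1, total2 = 0, 0

for i = 1, n do
    total1 = total1 + point:len()   -- looks up 'len' each iteration
end

local len = point.len               -- looked up once, before the loop
for i = 1, n do
    total2 = total2 + len(point)
end

print(total1 == total2)             -- both loops compute the same total
```

For a userdata such as `vec2`, the lookup goes through the type's metatable rather than a plain table field, so the cost per lookup may well differ from this example; wrapping each loop in `os.clock()` calls, as in the benchmark above, is the way to find out.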