Make custom Lua libraries? - Secure your source code
Warning - If you use Corona SDK, please do not waste your time reading further: this article does not apply to Corona SDK, as it does not have the capability to do advanced stuff. If you are still reading, then you are presumably using a more powerful framework that lets you do a lot of cool things. This has been tested with Gideros and not with Moai, but you are welcome to try it there and provide feedback.
The typical scenario
You are a developer who writes a wonderful library and offers a host of features to its users. The user gets the library in source code format (plain text) and all of your IP is gone. You want to ship updates, but you are worried sick about opening up your cool routines to everyone who buys your code and then plagiarizes it or derives from it (one of the reasons why I had to pull PropertyBag). You might have done some cool things that you did not want to share, but unless you distributed the whole set of files, you couldn't ship your library. Some Lua-based frameworks wall off their own code from users but leave their developers in the lurch, forced to distribute all of their code in plain text. Gideros thankfully is not like that, and this article uses the features available there.
Know thy tools
Since we shall work with Lua, you need to know what Lua is all about. The first thing to know is that Lua compiles everything to bytecode, and that bytecode is what the VM executes. So what we see as source code is actually compiled first and then run. The other thing is that Lua bytecode is very temperamental: bytecode compiled for one kind of system will not run on another. This does not mean that if I compile it on Mac A it will not run on Mac B; it means that the header of the compiled chunk (endianness and the sizes of int, size_t, Instruction and number) has to match the Lua VM on the target machine. The mobile devices run 32-bit Lua VMs whereas the desktops (not 100% sure about the PC) run 64-bit ones.
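To see what the VM is actually comparing, you can decode the header yourself. This is only a minimal sketch, assuming the Lua 5.1 chunk layout (the same 12-byte global header that appears in the Code Dumps section); read_header is a made-up helper name, not part of Lua or Gideros.

```lua
-- Hypothetical helper: decode the 12-byte global header of a Lua 5.1 chunk.
-- Assumes the Lua 5.1 layout; other Lua versions use a different header.
local function read_header(chunk)
  if chunk:sub(1, 4) ~= "\27Lua" then
    return nil, "not a precompiled Lua chunk"
  end
  if chunk:byte(5) ~= 0x51 then
    return nil, "not a Lua 5.1 chunk"
  end
  return {
    little_endian      = chunk:byte(7) == 1,
    sizeof_int         = chunk:byte(8),
    sizeof_size_t      = chunk:byte(9),   -- typically 8 on a 64-bit build, 4 on a 32-bit build
    sizeof_instruction = chunk:byte(10),
    sizeof_number      = chunk:byte(11),
  }
end

-- The header bytes of the 64-bit chunk dumped in the Code Dumps section.
local sample = string.char(0x1B, 0x4C, 0x75, 0x61, 0x51, 0, 1, 4, 8, 4, 8, 0)
local h = assert(read_header(sample))
print(h.sizeof_size_t * 8 .. "-bit size_t")  -- prints "64-bit size_t"
```

Feeding it the first bytes of a file produced by a 32-bit luac should report a size_t of 4 instead, which is exactly the mismatch that makes the chunk unusable on the desktop.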
If you are a Corona user still reading, then try running a project with just the line
    require("none")
and you will see that it spews errors saying that there is no such file as none.lua, nor an in-archive none.lu or none.blu.
Though Corona works with compiled files, it does not allow developers to include their own compiled files, nor does it allow loadstring or dofile; these are overwritten by closures (if you dump the _G table, you will see that).
So when using Gideros, the thing to note is that on the desktop and in the simulator you need the 64-bit version of the bytecode, and on the device (if you export to Xcode for building) you need the 32-bit version.
Now let us test this. First create a file called req.lua and write the following code in it:
    name="GiderosStudio bytecode test"
Save it, then create another file called main.lua and write the following code:
    require("req")
    print(name)
If you run the app, you will see that the console displays the text "GiderosStudio bytecode test". This is what it should do; after all, this is how you normally write the code. Now let us compile it and use the compiled version. For this you need luac, the Lua compiler. There is one included with GiderosStudio, but that is a 32-bit version, and what we need here is a 64-bit version. Most likely you will have to download the Lua 5.1.4 source and compile it for both 32-bit and 64-bit, so that you can play with it while you test and learn Lua.
So, assuming that you have been able to compile the 64-bit version of Lua and are in possession of both lua and luac (64-bit), we just run these commands in a terminal window (note that the $ is the prompt that you see; it is not to be typed with the command):
    $ mv req.lua file.lua
    $ luac -o req.lua file.lua
On the first line we renamed our req.lua text file to file.lua so that it does not interfere with the project. Next we compiled file.lua to req.lua, which is already referenced in the project, so we do not have to do much; if we run it again, we will get the text. If you feel that there is some form of caching involved, you can go to the terminal and run
    $ mv req.lua req.lu
and if you run the program now, you will see that it complains about a missing req.lua file. If you revert by using
    $ mv req.lu req.lua
and run your program, it will work once again.
One small thing: if you have compiled using the 32-bit version of luac, the desktop player will complain about a bad header in the precompiled chunk.
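You can catch that mismatch before the player does. Lua 5.1 rejects a chunk when its 12-byte header differs from the one the VM itself would write, so comparing a file's header against a chunk dumped by the running VM approximates the same check. A minimal sketch under that assumption; matches_this_vm is a made-up name, and the "other" chunk is simulated by patching one byte rather than produced by a real 32-bit luac.

```lua
-- Hypothetical helper: does a precompiled chunk match the VM running this code?
-- Assumes the Lua 5.1 rule that the first 12 bytes must equal the VM's own header.
local HEADER_SIZE = 12

local function matches_this_vm(chunk)
  local own = string.dump(function() end)
  return chunk:sub(1, HEADER_SIZE) == own:sub(1, HEADER_SIZE)
end

-- A chunk produced by this VM always matches.
local good = string.dump(function() return 1 end)
print(matches_this_vm(good))  -- true

-- Simulate a build from the other luac by changing byte 9 (size_t in Lua 5.1).
local other = good:byte(9) == 8 and 4 or 8
local bad = good:sub(1, 8) .. string.char(other) .. good:sub(10)
print(matches_this_vm(bad))   -- false
```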
In our test scenario, if you do not have, or could not manage to compile or download, the 64-bit version of luac, then you can use Lua itself to compile for you.
Start a new Lua project and type the following code into it:
    -- compile the source string to a function, then dump its bytecode
    local l = loadstring('name="GiderosStudio bytecode test"')
    local s = string.dump(l)
    print(s) -- raw bytes, mostly unreadable in the console
    -- "wb" writes the bytes untouched on platforms that distinguish text and binary mode
    local _file, _err = io.open("/Users/ozapps/Desktop/req.lua", "wb")
    assert(_file, _err)
    _file:write(s)
    _file:close()
Please note that you will have to change ozapps in the path to your own username.
Now copy this req.lua, which should have been created on your desktop, to your project.
When you want to build for the device, this compiled version will not work, as it is a 64-bit version and the mobile device requires a 32-bit version of the compiled file.
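The same trick extends from a hard-coded string to a whole source file. Here is a minimal sketch, assuming loadfile and io are available in your environment, as loadstring and io.open are in the snippet above; compile_file is a made-up helper name.

```lua
-- Hypothetical helper: compile a Lua source file to bytecode with the running VM.
-- The output matches whatever VM runs this script (64-bit on the desktop player).
local function compile_file(src_path, out_path)
  local chunk, load_err = loadfile(src_path)    -- parses and compiles, does not run
  if not chunk then
    return nil, load_err
  end
  local out, open_err = io.open(out_path, "wb") -- binary mode keeps the bytes intact
  if not out then
    return nil, open_err
  end
  out:write(string.dump(chunk))
  out:close()
  return true
end

-- Example call, using the file names from this article:
-- assert(compile_file("file.lua", "req.lua"))
```

Because the bytecode comes from the VM that runs the script, the result carries the same 64-bit limitation described above.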
Use case Scenarios
If you ask how all of that is applicable, well, what can I say. Maybe you are still reading this article despite the warning above ;)
The first few things that come to my mind are:
1. You can provide libraries the way TextCandy or ParticleCandy were provided, but as bytecode instead of source code (text), so the IP is not sitting in plain sight.
2. You can provide a 64-bit version for users to try in their simulator, and if they like it and want to use it in their projects on a device, they can purchase the 32-bit bytecode version to include in their projects. So you have a wonderful try-before-you-buy.
3. By shipping bytecode, you can avoid unnecessary changes that some user could make to your code.
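If you ship both builds of a library together rather than selling them separately, the project can pick whichever one the current VM accepts. This is a minimal sketch, assuming loadfile is available and that a mismatched chunk is rejected with an error (the "bad header" case); load_first_compatible and the file names in the comment are made up for illustration.

```lua
-- Hypothetical loader: try each prebuilt variant and return the first one this VM accepts.
local function load_first_compatible(paths)
  local errors = {}
  for _, path in ipairs(paths) do
    local chunk, err = loadfile(path)  -- returns nil plus a message on a bad header
    if chunk then
      return chunk, path
    end
    errors[#errors + 1] = tostring(err)
  end
  return nil, table.concat(errors, "\n")
end

-- Example call with made-up file names:
-- local chunk = assert(load_first_compatible({ "req_64.lua", "req_32.lua" }))
-- chunk()
```

Trying the files in order keeps the project code identical on the desktop and on the device; only the list of candidates changes.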
Just like Java or C#, bytecode is not 100% safe, but the average user cannot read it like they can a source/text file. There are other ways to help avoid decompilation. I had suggested to one user who was creating a library for Corona SDK that he should obfuscate his code to safeguard his source; he did follow that advice, but only to an extent. The funny thing is that when you decompile bytecode, the result is in a form similar to obfuscated code, so it would not be hard for a trained professional to read that code.
If you were looking for a way to distribute your code and not have it in plain sight, then this article could be one of the solutions that you can use. And yes, plain sight as in Lua source code. If you ask whether anyone actually does that: yes, Diner Dash and the other apps in that series that use the Playground SDK have all of their Lua files in plain text. You can download the free version of Diner Dash, and if you know how to have a peek, you can see that the Lua files are all in plain text.
If you would be interested in such articles or tools, please say so in the comments. I am working on a few tools that will help you safeguard your code from the casual roving user.
Code Dumps
Here's what the compiled req.lua file looks like:
    0000                      ** global header start **
    0000  1B4C7561            header signature: "\27Lua"
    0004  51                  version (major:minor hex digits)
    0005  00                  format (0=official)
    0006  01                  endianness (1=little endian)
    0007  04                  size of int (bytes)
    0008  08                  size of size_t (bytes)
    0009  04                  size of Instruction (bytes)
    000A  08                  size of number (bytes)
    000B  00                  integral (1=integral)
                              ** global header end **
    000C                      ** function [0] definition (level 1)
                              ** start of function **
    000C  2300000000000000    string size (35)
    0014  6E616D653D224769+   "name=\"Gi"
    001C  6465726F73537475+   "derosStu"
    0024  64696F2062797465+   "dio byte"
    002C  636F646520746573+   "code tes"
    0034  742200              "t\"\0"
                              source name: name=\"GiderosStudio bytecode test\"
    0037  00000000            line defined (0)
    003B  00000000            last line defined (0)
    003F  00                  nups (0)
    0040  00                  numparams (0)
    0041  02                  is_vararg (2)
    0042  02                  maxstacksize (2)
                              * code:
    0043  03000000            sizecode (3)
    0047  01400000            [1] loadk 0 1 ; "GiderosStudio bytecode test"
    004B  07000000            [2] setglobal 0 0 ; name
    004F  1E008000            [3] return 0 1
                              * constants:
    0053  02000000            sizek (2)
    0057  04                  const type 4
    0058  0500000000000000    string size (5)
    0060  6E616D6500          "name\0"
                              const [0]: "name"
    0065  04                  const type 4
    0066  1C00000000000000    string size (28)
    006E  47696465726F7353+   "GiderosS"
    0076  747564696F206279+   "tudio by"
    007E  7465636F64652074+   "tecode t"
    0086  65737400            "est\0"
                              const [1]: "GiderosStudio bytecode test"
                              * functions:
    008A  00000000            sizep (0)
                              * lines:
    008E  03000000            sizelineinfo (3)
                              [pc] (line)
    0092  01000000            [1] (1)
    0096  01000000            [2] (1)
    009A  01000000            [3] (1)
                              * locals:
    009E  00000000            sizelocvars (0)
                              * upvalues:
    00A2  00000000            sizeupvalues (0)
                              ** end of function **
    00A6                      ** end of chunk **
and the disassembly would look like:
    ; Name:
    ; Defined at line: 0
    ; #Upvalues: 0
    ; #Parameters: 0
    ; Is_vararg: 2
    ; Max Stack Size: 2
    1 [-]: LOADK R0 K1 ; R0 := "GiderosStudio bytecode test"
    2 [-]: SETGLOBAL R0 K0 ; name := R0
    3 [-]: RETURN R0 1 ; return
and the decompiled source code:
    name = "GiderosStudio bytecode test"
as we expected it to be.
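If you want to eyeball the raw bytes of your own chunk without a separate tool, a few lines of Lua will do. A minimal sketch; to_hex is a made-up helper name, and the expected signature assumes a stock Lua VM, whose chunks start with "\27Lua".

```lua
-- Hypothetical helper: render a string as space-separated hex bytes.
local function to_hex(bytes)
  local out = {}
  for i = 1, #bytes do
    out[i] = string.format("%02X", bytes:byte(i))
  end
  return table.concat(out, " ")
end

-- loadstring exists in Lua 5.1; later versions fold it into load.
local compile = loadstring or load
local chunk = string.dump(compile('name="GiderosStudio bytecode test"'))
print(to_hex(chunk:sub(1, 4)))  -- prints "1B 4C 75 61", the "\27Lua" signature
```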
ADDED NOTE:
This article has had a mixed reaction. There are those with the "holier than thou" attitude who claim that it is easy to get the source code of any bytecode with off-the-shelf tools, and some others who have taken issue with the debug data not being removed from the bytecode. To answer both of these points:
1. Decompilation - Yes, decompilation is possible. If you have read the above article carefully, I did mention that, just like Java or C#, the bytecode can be decompiled; I have not said at any point that it cannot. The point here is the difference between supplying a plain-text source file that can be read straight off a text editor and supplying bytecode that requires a little effort on the part of the person wanting to read the code.
2. Debugging information - In the example I ran luac without the -s option, which strips the debug information (luac -s -o req.lua file.lua would strip it). That was to keep things simple rather than complicate them by trying to explain what debugging information is and how it is relevant. Secondly, the people who have commented seem to be biased; they did not really care to read the article and have just gone, Hmm, Err. NO!! The whole point of this article was that the code, rather than being distributed in plain text, can be distributed in bytecode format, and that the desktop version uses 64-bit bytecode while the mobile device version requires 32-bit.
I can understand that you could be angry because you do not get to share the love and use this functionality. You can stop being angry, jump on to Gideros and give it a try; it does not hurt, unless your entire identity is based on another SDK.
So, developers in general, here is just a small preview of what can and cannot be done. If you take a standard Objective-C app, you can find the entire list of APIs that were used in that app. If you are dedicated enough, you can also run gdb, get to that point, patch the code and continue running. The point of this: even assembly code is not safe; someone will find a way to hack into it. But the number of people who can do that drops significantly as you go from those who can read a text file to those who can read bytecode.
Very impressive indeed, it's posts like this from talented devs like you which are persuading me to move from Corona to Gideros.
Good post... it's time to move. ;-)
Ciao
Marco
Really enjoyed the read. Thanks :)