Since Ruud holds the secret to how the tokenizing works, he is the best one to say just how much of the original could be restored. I have run some experiments to see whether var names are tokenized. I expected them to be, to really optimize the code, but when I use longer var names the tokenized file gets larger, so the names appear to be stored as written. I think var names should also be converted to tokens for best performance.

Comments do not seem to make it through tokenization, since changing the amount of comments does not change the resulting file size.
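To make the reasoning behind those file-size experiments concrete, here is a toy sketch in Python. It is not how KiX actually tokenizes (only Ruud knows that); the keyword table, the one-byte token values and the `intern_names` switch are all made up for illustration. It only shows why a tokenizer that drops comments but stores var names verbatim would behave the way I observed.

```python
# Toy tokenizer, NOT the real KiX format. All token values are hypothetical.
KEYWORDS = {"IF": b"\x01", "ENDIF": b"\x02", "WHILE": b"\x03", "LOOP": b"\x04"}


def toy_tokenize(source, intern_names=False):
    """Return a byte string for `source`.

    Keywords become one-byte tokens and ';' comments are dropped.
    With intern_names=False, every other word (var names included) is
    stored verbatim. With intern_names=True, each '$' var name is
    replaced by a marker byte plus a one-byte index (toy limit: 256 names).
    """
    out = bytearray()
    names = {}
    for line in source.splitlines():
        # Drop everything after ';' (this toy ignores ';' inside strings).
        code = line.split(";", 1)[0]
        for word in code.split():
            key = word.upper()
            if key in KEYWORDS:
                out += KEYWORDS[key]
            elif intern_names and word.startswith("$"):
                index = names.setdefault(key, len(names))
                out += b"\xff" + bytes([index])
            else:
                out += word.encode("ascii") + b"\x00"
    return bytes(out)


short = "IF $a\n$a = 1\nENDIF"
longer = "IF $averylongname\n$averylongname = 1\nENDIF"
commented = "; a comment\nIF $a ; trailing comment\n$a = 1\nENDIF"

for label, text in (("short", short), ("longer", longer), ("commented", commented)):
    print(label, len(toy_tokenize(text)), len(toy_tokenize(text, intern_names=True)))
```

In this toy, the commented script comes out the same size as the uncommented one, and the longer var name only costs extra bytes when names are stored verbatim. That matches what I saw, but it is only consistent with the observation, not proof of what the real tokenizer does.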

I can see where some companies may try to ban the use of KiX if we have obfuscation without reversal. Very often it is not mainstream programmers who code scripts but sysadmins. I work for divisional IT and constantly run into red tape from corporate IT, which wants DivIT to be nothing more than PC monkeys, yet CorpIT cannot provide the level of service that my users need. There are a lot of companies that ban KiX because it is viewed as ShareWare without full support. Add to that the argument that sysadmins, who may not follow rigid version and change control procedures, could build obfuscated code, fail to protect the source, and so create a vast legacy of unsupportable scripts.

There are a lot of people pinning their hopes on obfuscation as a panacea, a kind of encryption that not only protects their code as proprietary but also gives them a secure(?) way to delegate admin rights through their scripts. It is not secure now and probably never will be, at least not as long as the script runs in the security context of the user.

If the emphasis is more on simple obfuscation, Brian's suggestion to use compression after the tokenization might suffice. Still, it is only a matter of time before someone cracks either the simple or the more complex obfuscation and publishes it on the web.
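A quick sketch of why compression only counts as simple obfuscation. This uses Python's standard `zlib` as a stand-in, since I do not know what compression Brian has in mind, and the `tokenized` bytes are a made-up placeholder rather than real KiX output. The point is that decompression needs no key at all.

```python
import zlib

# Placeholder for a tokenized script; the content is hypothetical.
tokenized = b"\x01$user\x00=\x00@USERID\x00\x02" * 20

# Compressing hides the bytes from a casual look in a text editor...
packed = zlib.compress(tokenized)

# ...but anyone who recognizes the format gets the original back, no secret needed.
restored = zlib.decompress(packed)

print(len(tokenized), len(packed), restored == tokenized)
```

So compression would stop the casual viewer, but the first person to identify the format can undo it with a stock library call.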

I just can't think of how one can have obfuscation and the guaranteed ability to reverse it to clear text without a backdoor password that would soon enough be in the public domain. About the only thing I can think of is to embed the domain name and SID in the script and to only allow a Domain Admin to use the backdoor password. While the domain name could be duplicated, the SID would not be the same.
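Roughly what I have in mind, sketched in Python. Everything here is hypothetical: the function name, the parameters and the example SID values are made up, and nothing like this exists in KiX today. It only shows the shape of the check, namely that the backdoor is honored only when the caller is a Domain Admin and both the domain name and the SID match what was embedded in the script.

```python
def may_reveal_source(embedded_domain, embedded_sid,
                      current_domain, current_sid, is_domain_admin):
    """Hypothetical gate for the backdoor password.

    Domain names are compared case-insensitively; the SID must match
    exactly, which is what defeats a look-alike domain with the same name.
    """
    same_domain = embedded_domain.lower() == current_domain.lower()
    same_sid = embedded_sid == current_sid
    return is_domain_admin and same_domain and same_sid


# Made-up values for illustration only.
EMBEDDED = ("CORP", "S-1-5-21-1111111111-2222222222-3333333333")

# Genuine domain, Domain Admin: allowed.
print(may_reveal_source(*EMBEDDED, "corp", EMBEDDED[1], True))

# Cloned domain name with a different SID: refused.
print(may_reveal_source(*EMBEDDED, "CORP", "S-1-5-21-4444444444-5555555555-6666666666", True))
```

Of course, the check itself would still have to live somewhere the user cannot patch around it, which brings us back to the security context problem.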


Edited by Les (2004-01-02 04:51 AM)
_________________________
Give a man a fish and he will be back for more. Slap him with a fish and he will go away forever.