Static analysis of JS to find malicious obfuscation

tofumatt
Hi there SpiderMonkey folks!

I’m tofumatt, and I'm working with Stuart Colville on the new add-ons validator, which is written in JS.

One of the things we’d like to improve in this validator is the ability to detect rule bypassing via code obfuscation. For example, mozIndexedDB is a deprecated identifier and that is easy to find with a custom ESLint rule. But if someone types:


var badDB = 'm';
badDB += 'oz';
badDB = badDB + 'IndexedDB';
var myDeprecatedDB = window[badDB];


Neither the existing validator nor our AST-based identifier scans (using ESLint/Esprima) catch it.

Are there any tools (especially JS ones!) that can be used to at least detect this kind of obfuscation? Without it, the validator remains more of an advisory/helpful tool than something we could use to automate security validation.

Apologies if this is the wrong list; didn’t know exactly who to turn to for this (I’ve also asked static analysis and Security folks). If I should check with someone specific, please let me know.

Cheers,

- tofumatt
_______________________________________________
dev-tech-js-engine mailing list
[hidden email]
https://lists.mozilla.org/listinfo/dev-tech-js-engine
Re: Static analysis of JS to find malicious obfuscation

Benjamin Bouvier-2
Hi tofumatt,

As far as I can tell, this can't be done with a static analyzer, because
what you need is precisely to match strings that are constructed
dynamically.

Any form of static pattern matching for these rules (e.g. looking for +=
where either operand is a string, or for string addition) could be
defeated and would thus be useless, as there are probably infinitely many
ways to create these strings (one could use an array containing the entire
alphabet and access it at different indexes to reconstruct strings; or use
String.fromCharCode; or, more generally, use any bijective function that
converts a string into something else).
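To make the point above concrete, here is a small sketch of four of these constructions. The variable names are made up for illustration, and the final comparison is only there to show that every variant yields the same property name; a real obfuscator would obviously not include it.

```javascript
// 1. Plain concatenation, as in the original post.
const viaConcat = 'm' + 'oz' + 'IndexedDB';

// 2. Indexing into an array that contains the entire alphabet.
const alphabet = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'.split('');
const viaIndexes = [12, 14, 25, 34, 13, 3, 4, 23, 4, 3, 29, 27]
  .map((i) => alphabet[i])
  .join('');

// 3. Character codes, so no letter of the name appears in the source.
const viaCharCodes = String.fromCharCode(
  109, 111, 122, 73, 110, 100, 101, 120, 101, 100, 68, 66);

// 4. Any reversible transform works; here the string is simply stored reversed.
const viaReverse = 'BDdexednIzom'.split('').reverse().join('');

// All four evaluate to the same name, yet none of them contains it literally.
const allSame = [viaConcat, viaIndexes, viaCharCodes, viaReverse]
  .every((name) => name === 'mozIndexedDB');
```

A pattern matcher would need a separate rule for each of these, and the list is open-ended.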

Really, what you want to do is to evaluate the code in a JS environment.
But then the problem of mocking all Web APIs isn't trivial to solve, and
you'd have a hard time making sure that all code paths are actually
taken. You could use a Proxy on the global object (namely, the window), add
a trap on "has" or "get" and check that the requested property doesn't
belong to a list of blacklisted APIs, but I am not even sure this would be
enough.
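A minimal sketch of the Proxy idea, under several assumptions: guardGlobal, DENIED and fakeWindow are hypothetical names, and the stand-in object replaces the real window, which would need full Web API mocks in practice.

```javascript
// Hypothetical deny-list; the entries are for illustration only.
const DENIED = new Set(['mozIndexedDB', 'eval']);

// Wrap an object so that reads of, and `in` checks for, denied names are reported.
function guardGlobal(target, denied, report) {
  return new Proxy(target, {
    get(obj, key, receiver) {
      if (denied.has(key)) report({ trap: 'get', key });
      return Reflect.get(obj, key, receiver);
    },
    has(obj, key) {
      if (denied.has(key)) report({ trap: 'has', key });
      return Reflect.has(obj, key);
    },
  });
}

// Stand-in for `window`; a real harness would have to mock far more than this.
const fakeWindow = { mozIndexedDB: {}, document: {} };
const hits = [];
const guarded = guardGlobal(fakeWindow, DENIED, (hit) => hits.push(hit));

// The obfuscated access from the original post, run against the stand-in.
let badDB = 'm';
badDB += 'oz';
badDB = badDB + 'IndexedDB';
const myDeprecatedDB = guarded[badDB];       // reported by the "get" trap
const probed = 'mozIndexedDB' in guarded;    // reported by the "has" trap
```

The traps only fire for code that actually runs, so this does nothing about the code-path coverage problem mentioned above.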

There have been a few projects aiming at tainting user-provided strings, to
mark them as potentially dangerous in several places (and forbid their use
in these places). As far as I can tell, these projects have been put on
hold, as they add a lot of complexity to the engine. Nicolas Pierron will
probably be able to tell more about that and whether it applies to this use
case.

Hope this helps a bit!
Cheers,
Benjamin


Re: Static analysis of JS to find malicious obfuscation

David Bruant-5
On 08/10/2015 13:18, Benjamin Bouvier wrote:
> Hi tofumatt,
>
> As far as I can tell, this can't be done with a static analyzer, as your
> need is specifically to be able to dynamically match constructed strings.
Agreed. Security properties of this sort cannot be established via
ESLint-style static analysis.
It's probably possible via something like TypeScript. I feel TypeScript
is capable of catching aliases of the global object (or at least of telling
you where the analysis loses track) and thus, patterns like
globalAlias[dynamicString] could be detected and forbidden in addons.

> Really, what you want to do is to evaluate the code in a JS environment.
> But then the problem of mocking all Web APIs isn't trivial to solve. And
> then you'd have a hard time making sure that all code paths are actually
> taken. You could use a Proxy on the global object (namely, the window), add
> a trap on "has" or "get" and make sure that the target doesn't belong to a
> list of black-listed APIs, but I am not even sure this would be enough.
One step further is sandboxing addon JS Caja-style (Caja:
https://developers.google.com/caja/ ). This makes it possible to prevent
authors from using objects (or more accurately "capabilities") that you
don't want them to have access to (like mozIndexedDB).
This sort of sandboxing is possible today and is being made easier with
proxies and, soon, the loader API.
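To illustrate the capability idea only: the sketch below is not how Caja itself works, and it is not a secure sandbox (for instance, a sloppy-mode function called without a receiver still gets the real global object as `this`). It assumes a hypothetical helper, runWithCapabilities, that evaluates source text against a Proxy-backed scope so that only explicitly granted names resolve.

```javascript
// Run `source` so that free identifiers resolve only to explicitly granted values.
function runWithCapabilities(source, granted) {
  const scope = new Proxy(granted, {
    // Claim every name, so lookups never fall through to the real global scope.
    has() { return true; },
    get(target, key) {
      // `with` consults Symbol.unscopables on the scope object; report none.
      if (key === Symbol.unscopables) return undefined;
      if (!Object.prototype.hasOwnProperty.call(target, key)) {
        throw new ReferenceError('capability not granted: ' + String(key));
      }
      return target[key];
    },
  });
  // Code created by the Function constructor is non-strict, so `with` is allowed.
  const run = new Function('scope', 'with (scope) { ' + source + ' }');
  return run(scope);
}

// Only `add` is granted; `window` is not.
const granted = { add: (a, b) => a + b };
const sum = runWithCapabilities('return add(1, 2);', granted);

let denied = null;
try {
  runWithCapabilities("return window['moz' + 'IndexedDB'];", granted);
} catch (err) {
  denied = err.message;
}
```

Because the obfuscated name is never reachable without first obtaining `window`, the string construction itself stops mattering; what matters is which objects the code was handed.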

Even if you don't use Caja, I'm sure the Caja folks would be happy to hear
about such an initiative and might be able to provide guidance. I can too.


>> I’m tofumatt, I'm working with Stuart Colville on the new add-ons
>> validator, written in JS.
Where can I read more about this initiative, and which bug should I follow?

Thanks,

David
Re: Static analysis of JS to find malicious obfuscation

tofumatt
Regarding sandboxing add-on JS: do you mean at runtime in the browser? We’re working solely on a validator; the current one doesn’t identify these kinds of exploits and relies on manual review to catch them. I’d love to automate that, as I think a machine could do it better and certainly faster.

Sorry, I can’t quite tell how Caja works, as parts of their site (especially the demos) are a bit broken.

To be clear: we aren’t looking to prevent users from using these APIs at runtime (if a user wants to create their own add-on that uses eval(), that’s fine; it just wouldn’t be allowed on AMO), but rather to identify their usage. It seems like Caja might let me do that, but I’m not sure, as their site/docs seem to be partially broken and I can’t read any of the examples.

If you want to follow our progress the project is on GitHub at https://github.com/mozilla/addons-validator; this specific issue is at https://github.com/mozilla/addons-validator/issues/46.

There’s a tracking bug at https://bugzilla.mozilla.org/show_bug.cgi?id=1212829.

- tofumatt

Re: Static analysis of JS to find malicious obfuscation

Nicolas B. Pierron
In reply to this post by tofumatt
Hi tofumatt,

On 10/08/2015 10:32 AM, tofumatt wrote:

> One of the things we’d like to improve in this validator is the ability to detect rule bypassing via code obfuscation. For example, mozIndexedDB is a deprecated identifier and that is easy to find with a custom ESLint rule. But if someone types:
>
>
> var badDB = 'm';
> badDB += 'oz';
> badDB = badDB + 'IndexedDB';
> var myDeprecatedDB = window[badDB];
>
>
> The existing validator and our scans for an identifier with AST (using ESLint/ESPrima) don’t catch it.

I think your validator should not attempt to check the results of any
evaluation.  JavaScript is a Turing-complete language, and I am sure we can
find more imaginative ways to work around such a system (1).

I think the validator should look for a list of functions / syntax which
might be used to build such patterns.

Namely, you want to prevent every source of workaround:
  - aliasing of the global.
  - using subscript (bracket) notation to access computed property names on
    the global.
  - executing computed code. (new Function / eval) (2)
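The three checks in the list above could be sketched as a walk over an ESTree-shaped AST (the format parsers such as Esprima produce). Everything named here is an assumption for illustration: findSuspicious, GLOBAL_NAMES and the finding labels are made up, the AST is written by hand so the example needs no parser, and the alias check only covers the simplest `var g = window` form.

```javascript
// Identifiers treated as the global object; an assumed, incomplete list.
const GLOBAL_NAMES = new Set(['window', 'self', 'globalThis']);

function isGlobalRef(node) {
  return node.type === 'Identifier' && GLOBAL_NAMES.has(node.name);
}

// Collect labels for constructs that could hide access to a forbidden name.
function findSuspicious(node, findings = []) {
  if (!node || typeof node.type !== 'string') return findings;

  // Aliasing of the global: var g = window;
  if (node.type === 'VariableDeclarator' && node.init && isGlobalRef(node.init)) {
    findings.push('global-alias');
  }
  // Computed property access on the global with a non-literal key: window[x]
  if (node.type === 'MemberExpression' && node.computed &&
      isGlobalRef(node.object) && node.property.type !== 'Literal') {
    findings.push('computed-global-access');
  }
  // Executing computed code: eval(...) and new Function(...)
  if (node.type === 'CallExpression' && node.callee.type === 'Identifier' &&
      node.callee.name === 'eval') {
    findings.push('eval');
  }
  if (node.type === 'NewExpression' && node.callee.type === 'Identifier' &&
      node.callee.name === 'Function') {
    findings.push('new-function');
  }

  // Recurse into every child node and every array of child nodes.
  for (const value of Object.values(node)) {
    if (Array.isArray(value)) value.forEach((child) => findSuspicious(child, findings));
    else if (value && typeof value === 'object') findSuspicious(value, findings);
  }
  return findings;
}

// Hand-written AST for: var myDeprecatedDB = window[badDB];
const ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'myDeprecatedDB' },
      init: {
        type: 'MemberExpression',
        computed: true,
        object: { type: 'Identifier', name: 'window' },
        property: { type: 'Identifier', name: 'badDB' },
      },
    }],
  }],
};

const findings = findSuspicious(ast); // ['computed-global-access']
```

Note that this flags the construct rather than the string: it never tries to work out what badDB evaluates to, which is the point of the list above. An access like window['mozIndexedDB'] with a literal key is deliberately left to an ordinary deny-list check.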

Unfortunately, I think there are many cases where these functions might be
useful in practice, such as generating asm.js modules based on incremental
patches.  This is why I agree that having a sandboxed environment might
make more sense.

Basically, if the validator cannot validate the code because it contains
suspicious constructs, I do not see why we could not warn the developer of
the addon that validation failed, and that as a result the addon will run
in a degraded environment.

> Are there any tools (especially JS ones!) that can be used to at least detect this kind of obfuscation? Without it the validator remains more an advisory/helpful tool than something we could use to automate security validation.

Static analysis is a difficult problem even for static languages; otherwise
we would have no more crashes in Firefox.

The only way you can make this easier for yourself is to restrict the
language so much that static analysis becomes trivial; otherwise I would
not recommend relying on it as a proof of security.

What kind of security issues are you trying to prevent with the validator?

(1)
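   // Assumes an md5(string) helper returning a hex digest; none is built in.
   function bruteForce(alph, hash, len) {
     while (true) {
       // Build a random candidate of the requested length from the alphabet.
       var res = "";
       for (var i = 0; i < len; i++)
         res += alph[Math.floor(Math.random() * alph.length)];
       // Stop once the candidate hashes to the target digest.
       if (md5(res) === hash)
         return res;
     }
   }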

   var myDeprecatedDB = window[bruteForce(
     "ImDnoBdzex", "b66b856833f34a841b51c7207dbc601f", 12)];

(2) As far as I know, all addons that extend tabs use toSource() and eval
to replace the functions.

--
Nicolas B. Pierron