Posted by Signature Analyst | Posted in Analysis, CLSID, Client-side Exploits, Signatures, Tao of Signature Writing | Posted on 23-02-2010
I am glad that Joel has posted corrections to my previous blog post. You can read them here: http://blog.joelesler.net/2010/02/writing-snort-rules-correctly.html. Thank you, Joel.
I made many errors in the previous version of the article. One of the errors I noticed was “offset:0”. It should have been “distance:0”, which means that the search for the second content starts right after the end of the previous content match. This makes the signature contents relative to each other. “offset:0” would not make sense here, because offset is counted from the start of the payload rather than from the previous match.
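To make the difference concrete, here is a minimal Python sketch. It is not how Snort implements content matching; the helper names and the sample payload are made up for illustration. It only mimics the idea that an offset-style search is anchored to the start of the payload, while a distance-style search is anchored to the end of the previous match.

```python
def find_absolute(payload: bytes, content: bytes, offset: int = 0) -> int:
    """Offset-style search: start `offset` bytes from the beginning of the payload."""
    return payload.find(content, offset)


def find_relative(payload: bytes, content: bytes, prev_end: int, distance: int = 0) -> int:
    """Distance-style search: start `distance` bytes past the end of the previous match."""
    return payload.find(content, prev_end + distance)


# Hypothetical payload: a stray "clsid" appears before the <OBJECT tag.
payload = b"clsid junk <OBJECT classid='clsid:A105'>"
first, second = b"<OBJECT", b"clsid"

first_pos = find_absolute(payload, first)
first_end = first_pos + len(first)

absolute_hit = find_absolute(payload, second, 0)             # 0: the stray "clsid" before the tag
relative_hit = find_relative(payload, second, first_end, 0)  # 28: the "clsid" inside the tag
```

With the absolute search, the second content is satisfied by text that comes before the first content, which is not what the signature intends.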
Joel also mentioned:
The author explains that “Client-side exploits generally flow from server to client”. Okay, correct in this instance, but not always, so let me explain:
Flow has four direction operators you can specify:
- to_server
- from_server
- to_client
- from_client
What I hear from people is that they think of “server” as that 2U thing back in the server room (hence the name), and of the client as being “you”. But that’s not how Snort thinks about it. Snort thinks about client and server in terms of “who initiated the conversation”. So, at the beginning of a TCP conversation there is a 3-way handshake: SYN, SYN-ACK, ACK.
- CLIENT -> SYN -> SERVER
- CLIENT <- SYN, ACK <- SERVER
- CLIENT -> ACK -> SERVER
The client is whoever initiated the conversation; the server is whoever is responding. So, in this case, since we are attempting to catch a web browser accessing a webpage and downloading a webpage which contains this CLSID, the flow would be to_client (or from_server). Correct. However, what if someone downloaded a PDF, and upon opening it the PDF went and grabbed something off the internet? This is a client-side exploit; however, the flow would be reversed. So, the author is correct in saying that “Client-side exploits are generally…” I wanted to explain to make sure no one was confused. The “established” keyword means that the session is established, so it begins on the 3rd part of the 3-way handshake.
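The “who initiated the conversation” idea can be sketched in a few lines of Python. This is only an illustration, not Snort's stream code; the function names, the tuple layout and the addresses are assumptions made up for the example.

```python
def identify_roles(packets):
    """Return (client, server): the client is whoever sent the initial SYN (without ACK)."""
    for src, dst, flags in packets:
        if "SYN" in flags and "ACK" not in flags:
            return src, dst
    return None


def flow_direction(packet, client, server):
    """Name the direction of one packet using the flow option names."""
    src, dst, _flags = packet
    if src == client and dst == server:
        return "to_server"  # same direction as from_client
    if src == server and dst == client:
        return "to_client"  # same direction as from_server
    return None


# Hypothetical browser (10.0.0.5) talking to a web server (203.0.113.10).
handshake = [
    ("10.0.0.5", "203.0.113.10", {"SYN"}),
    ("203.0.113.10", "10.0.0.5", {"SYN", "ACK"}),
    ("10.0.0.5", "203.0.113.10", {"ACK"}),
]
client, server = identify_roles(handshake)

# The page carrying the CLSID travels from the responder back to the initiator.
response = ("203.0.113.10", "10.0.0.5", {"ACK", "PSH"})
direction = flow_direction(response, client, server)  # "to_client"
```

The roles depend only on who sent the first SYN, not on which machine sits in the server room.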
What I meant was that, in the case of the exploit we considered, the traffic flows to the client. Client-side exploits generally do not flow to the server, and exploits that have the flow set to the server are generally targeting the server. Since to_client and from_server are equivalent in Snort, either one would have triggered on the same traffic; my aim in the example was to inspect all incoming traffic to the client. That said, the explanation Joel has given is amazing.
Joel’s PCRE part:
/<OBJECT\s+[^>]*classid\s*=\s*[\x22\x27]?\s*clsid\s*\x3a\s*\x7B?\s*A105BD70-BF56-4D10-BC91-41C88321F47C/si
(I am going to put double quotes around the things we are trying to match that are explicit, the quotes don’t actually exist in the regular expression unless specified)
So we are looking for “<OBJECT”
Then a whitespace (\s). That’s what “\s” is. (It says ‘followed by strings’ in the above quoted paragraph). Whitespace is a tab (0x09), a space (0x20), a new line character or line feed (0x0A), or a carriage return (0x0D). The “+” sign after the “\s” means ‘the item directly preceding it, as many times as possible, but there must be at least 1’. So there must be 1 or more “\s” there.
Then you see this “[^>]”, which the author says that we are positively looking for. The thing about character classes “[ ]” is, they allow you to do some nifty things: range matching ([0-9]), multiple matches, [abc] (this will look for either an a, b, or c, for one character), and you can also do negative matches, or “lack of” matches. The way you specify a negative match within a character class is to use the caret as the first character of the class. So “[^>]” means ‘the next character after any amount of positively matched “\s” cannot be a “>”’. Directly after that is a “*” character. The “*” is similar to a “+”, but the difference is that while a “+” means you must have at least 1 match of the preceding item (in this case the negative character class), the “*” means you don’t have to have a positive match. It means “0 or more”.
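The behavior of “\s+” and “[^>]*” can be checked with a small Python sketch. Python's re module stands in for PCRE here, which is an assumption on my part (the two engines agree on these constructs), and the sample strings are invented.

```python
import re

# Only the opening part of the pattern: tag name, mandatory whitespace,
# then any run of characters that are not ">" before "classid".
tag_prefix = re.compile(r"<OBJECT\s+[^>]*classid", re.I)


def prefix_matches(text: str) -> bool:
    return tag_prefix.search(text) is not None


samples = {
    "attributes_first": '<object id="x" width="1" classid="...">',  # [^>]* consumes other attributes
    "newline_and_tab": "<OBJECT\n\tclassid=...",                    # \s+ takes a run; [^>]* takes zero chars
    "no_whitespace": "<OBJECTclassid=...",                          # \s+ needs at least one
    "tag_closed_early": "<OBJECT id='x'> classid",                  # [^>]* cannot cross the ">"
}
results = {name: prefix_matches(text) for name, text in samples.items()}
```

The first two samples match and the last two do not, which is the “at least 1” versus “0 or more” difference described above.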
Following that we have a “classid\s*=\s*” match. So look for classid(maybeaspacehere,it’soptional)=(maybeanotherspacehere)
Then there is a “[\x22\x27]”. In regular expressions, if you want to specify a hex character you have to write “\x” before the hex. So, you might see a space specified like this: 0x20. You might see it specified in URL encoding like this: %20. In regular expressions, it would be “\x20”. There are two characters within the character class: 0x22 is the hex for a double quote (") and 0x27 is the hex for a single tick (').
Since this is a run of the mill character class match (not a range or something more complex), this means that the next character that the “[\x22\x27]” pattern match is looking for is either a ' or a ". Notice the “?” after the character class? That’s an optional quantifier. So without going into a long book about lazy and greedy (which, by the way, if you are interested, I suggest checking out the book “Mastering Regular Expressions” by Jeffrey Friedl, it’s the bible), the “?” basically means “the item that is directly in front of the “?” is optional”. So, when all put together, the match is either a ' or a " or nothing at all.
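A quick way to see the optional quote in action is the following Python sketch. Again, Python's re module is used in place of PCRE as an assumption, and the sample strings are made up.

```python
import re

# The quote class followed by "?" accepts one double quote, one single quote, or nothing.
quote_optional = re.compile(r"classid\s*=\s*[\x22\x27]?\s*clsid", re.I)

samples = [
    'classid="clsid',   # double quote (0x22)
    "classid='clsid",   # single quote (0x27)
    "classid=clsid",    # no quote at all
    'classid=""clsid',  # two quotes: "?" allows at most one
]
quote_results = {s: quote_optional.search(s) is not None for s in samples}
```

The first three samples match; the last one does not, because “?” never matches more than one occurrence.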
Then we have (maybesomewhitespacehere)clsid(maybesomemorewhitespacehere):(maybesomemorewhitespacehere){(optionally)(maybesomemorewhitespacehere)A105BD70-BF56-4D10-BC91-41C88321F47C.
Notice that I translated “\x3a” and “\x7B” (the latter of which has the “?” behind it, so it’s optional) above.
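Putting the pieces together, the whole expression can be tried outside of Snort. The sketch below assumes Python's re module behaves like PCRE for this pattern; the function name and the HTML samples are hypothetical.

```python
import re

CLSID_PATTERN = re.compile(
    r"<OBJECT\s+[^>]*classid\s*=\s*[\x22\x27]?\s*clsid\s*\x3a\s*\x7B?\s*"
    r"A105BD70-BF56-4D10-BC91-41C88321F47C",
    re.S | re.I,  # counterparts of the /si modifiers
)


def contains_clsid(html: str) -> bool:
    """Return True if the page text contains an <OBJECT> tag with the target CLSID."""
    return CLSID_PATTERN.search(html) is not None


plain = '<object classid="clsid:A105BD70-BF56-4D10-BC91-41C88321F47C"></object>'
spaced = "<OBJECT id=o\n classid = 'CLSID : {a105bd70-bf56-4d10-bc91-41c88321f47c}'>"
other = '<object classid="clsid:00000000-0000-0000-0000-000000000000"></object>'

hits = [contains_clsid(plain), contains_clsid(spaced), contains_clsid(other)]
```

The second sample exercises the optional whitespace, the single quote, the optional “{” and the case insensitivity all at once.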
Then the modifiers of the whole Regular Expression at the end are “/si”.
“s” means “include new lines in the dot metacharacter”. However, there are no “.” metacharacters in the regular expression, so that was probably put there by habit (and good practice), and the “i” means “anything within the regular expression treat with case insensitivity” similar to the “nocase;” keyword in Snort’s regular rule language.
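The effect of the two modifiers can be shown with Python's re.S and re.I flags, which I am assuming here to be the counterparts of PCRE's “s” and “i”. The strings are invented for the example.

```python
import re

# "s": the dot also matches a newline.
dot_default = re.search(r"a.b", "a\nb") is not None        # False: "." stops at a newline
dot_with_s = re.search(r"a.b", "a\nb", re.S) is not None   # True: "." now matches the newline

# "i": letters match regardless of case.
case_default = re.search(r"clsid", "CLSID") is not None        # False
case_with_i = re.search(r"clsid", "CLSID", re.I) is not None   # True

# A pattern without any "." is unaffected by "s": \s and [^>] already match newlines.
no_dot = r"<OBJECT\s+[^>]*classid"
text = "<OBJECT\nid=o\nclassid"
same_without_s = re.search(no_dot, text) is not None         # True
same_with_s = re.search(no_dot, text, re.S) is not None      # True
```

This agrees with the remark that “s” makes no difference to this particular expression.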
My bad! I never explained it completely. Also, I noticed that he mentioned the following:
Following that we have a “classid\s*=\s*” match. So look for classid(maybeaspacehere,it’soptional)=(maybeanotherspacehere)
where I had described it as multiple strings instead of spaces. This is a really good write-up on PCRE. I lost continuity when writing the post, and I will make sure to go this deep in future versions. Joel’s write-up on PCRE has covered it all completely. Thanks again, man. It is great to have someone review my work and show me what I am doing wrong, so that I can correct myself next time.



