Signature generation, analysis, and testing form a complex art that requires many parameters to be coordinated. “Tao” is “the ultimate principle of the universe”, and here it means that signature writing requires an analyst to understand the principles of the signature universe. Each signature format has its own structural design and its own components for pattern matching, so that an exploit or malicious code can be kept from entering the network. A signature writer therefore has to know the signature universe as a whole and how it interacts with other universes. This might look like the multiverse theory seen in science fiction movies. A multiverse is defined in terms of parallel universes:
“Parallel universe or alternative reality is a self-contained separate reality coexisting with one’s own. A specific group of parallel universes is called a multiverse, although this term can also be used to describe the possible parallel universes that constitute physical reality.”
We are not trying to mix fact with fiction; we only want to help you learn faster by keeping things interesting. Emerging Threats classifies its signature set by splitting it into several domains, but most of these domains are inter-related because they sit within the same universe. We are not going to discuss the Emerging Threats rulebase in this article. Instead, we are going to look at the mindset required to become a signature writer in general. We are not saying that you can become a signature writer if and only if you have this mindset, only that having it gives you an extra advantage while learning.
If you would like to write signatures and consider yourself a layman in the SigUniverse, start by learning what kind of result you are aiming for. The result is a signature that, when enabled, performs one of the following checks:
- Anomaly-based
- Decoy-based
- Malware-based
- Exploit-based
- Policy-based
- and more signature triggers…
“Trigger” here means that the signature fires when one of the above is observed. For example, if monitored traffic does not conform to the RFC-based description of a protocol, it deviates from the norm, and that deviation causes an anomaly signature to trigger. Hence, you should always set a boundary, or rather construct a perimeter, around the input you derive signatures from. That is, you must stay focused on the purpose of the signature, the data set you are going to analyze to generate the signature, the data set you are going to test the signature against, and any other parameters or components that could help you in the process.
The first rule of thumb is KISS: Keep It Simple [Stupid]. Whether you fail to keep it simple or keep it too simple, the result is poor either way. If you keep a signature TOO simple, you will get many false-positive hits. If you complicate it so that it looks only for one very specific thing, while the attack is also possible through many other vectors, the result will be false negatives. Remember, too, that simple does not mean small in number or easy to generate, and that a good signature does not have to look complex or be complicated to create. What we are aiming for is an equilibrium, as shown in the following image:
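To make the idea of an anomaly trigger concrete, here is a minimal Python sketch. It assumes a deliberately simplified view of an HTTP/1.x request line (a known method, a target without whitespace, and a version token) and is not a full implementation of the HTTP RFCs. The function name `is_anomalous_request_line` and the method list are illustrative assumptions, not part of any real signature engine.

```python
import re

# Simplified, illustrative subset of HTTP methods (an assumption of this
# sketch); a real check would take the list from the spec and local policy.
KNOWN_METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE", "OPTIONS", "TRACE", "CONNECT"}

# method SP request-target SP HTTP-version, with exactly one space between parts
REQUEST_LINE = re.compile(r"^([A-Z]+) (\S+) HTTP/(\d)\.(\d)$")


def is_anomalous_request_line(line: str) -> bool:
    """Return True when the request line deviates from the simplified norm."""
    match = REQUEST_LINE.match(line)
    if match is None:
        return True  # does not have the expected three-part shape
    method, _target, major, _minor = match.groups()
    if method not in KNOWN_METHODS:
        return True  # unknown method token
    return major != "1"  # this sketch only models HTTP/1.x


if __name__ == "__main__":
    for sample in ("GET /index.php HTTP/1.1", "GET  /index.php", "FOO / HTTP/1.1"):
        verdict = "anomaly" if is_anomalous_request_line(sample) else "normal"
        print(f"{sample!r} -> {verdict}")
```

The perimeter mentioned above shows up here as the explicit definition of "normal": anything outside that definition triggers, so the quality of the check depends entirely on how carefully the norm is drawn.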

NOTE: This is not real research output of the triggered signatures of any particular device. It is only a sample graph to help you understand the equilibrium between false positives and false negatives. In reality, the equilibrium does not appear as a straight line as shown above. It is more of a regional value, as shown below:

One must understand that there is no single perfect signature that is the only solution to a given problem. There are several ways of arriving at a signature that comes close to the end results you would like to obtain. But to arrive at close-enough results, you should make sure that the problem set you have chosen maps to the TRUE source of the problem. Let us look at this through two scenarios and analyze your steps.
Scenario 1.
You are a security analyst for an enterprise. Your enterprise uses Joomla for its main corporate website. You learn that a vulnerability in Joomla has recently been disclosed. How would you proceed?
Review of the Scenario:
In this scenario, you are provided with the focus “Joomla”. The following image shows the way you would be looking at the attack vectors:

Once you know the attack vectors, you would start looking for a couple of ways to determine the next step. If you are a vulnerability specialist and a reverse engineer, you might try reversing Joomla to find the list of vulnerabilities in it. Automated tools give results only to some extent, and manual analysis is time-consuming. Hence, if you are the only analyst for your company and you are not a reverse engineer, you would start looking for resources that give you a better picture of what the real focus should be. If you came across the Exploit-DB portal from the Offensive Security group, you would notice the most recent posting (under the Web exploits section) about the Joomla Component com_kunena Blind SQL Injection Vulnerability, and you would wonder whether you are vulnerable to it. You might want to understand the exploit vector and how the attack happens, check which privileges are required to perform it, check for patches, and so on. But before all that, you might want to write a simple signature that inspects all packets coming through the end-point device.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[~]>> TITLE: Joomla (com_kunena) BLIND SQL Injection Vulnerability
[~]>> LANGUAGE: PHP
[~]>> DORK: N/A
[~]>> RESEARCHER: B-HUNT3|2
[~]>> CONTACT: bhunt3r[at_no_spam]gmail[dot_no_spam]com
[~]>> TESTED ON: LocalHost
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[~]>> DESCRIPTION: Input var do is vulnerable to SQL Code Injection
[~]>> AFFECTED VERSIONS: Confirmed in 1.5.9 but probably other versions also
[~]>> RISK: Medium/High
[~]>> IMPACT: Execute Arbitrary SQL queries
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[~]>> PROOF OF CONCEPT:
[~]>> http://[HOST]/[JOOMLA_PATH]/index.php?option=com_kunena&Itemid=86&
func=announcement&do=[SQL]
[~]>> {RETURN TRUE::RETURN FALSE} ---> VIEW TIME RESPONSE ||| HIGH: TRUE
||| LOW: FALSE
[~]>> http://server/[JOOMLA_PATH]/index.php?option=com_kunena&Itemid=86
&func=announcement&do=show', link='0wn3d', task='0wn3d' WHERE userid=62 AND
1=if(substring(@@version,1,1)=5,benchmark(999999,md5(@@version)),1)/*
[~]>> http://server/[JOOMLA_PATH]/index.php?option=com_kunena&Itemid=86&
func=announcement&do=show', link='0wn3d', task='0wn3d' WHERE userid=62 AND
1=if(substring(@@version,1,1)=4,benchmark(999999,md5(@@version)),1)/*
[~]>> Note: There are more affected vars.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Now that you have the exploit in hand, you could think of a pretty simple signature based on the above data. If you write a signature with just “/index.php?” as the file-frame, you will get a hit for every single access to every directory on the website, since index.php is very common. Hence, keeping it that simple is not a good idea here. On the other hand, you do not want to do an exact match of the example given above either, say “http://server/[JOOMLA_PATH]/index.php?option=com_kunena&Itemid=86&func=announcement&do=show', link='0wn3d', task='0wn3d' WHERE userid=62 AND 1=if(substring(@@version,1,1)=4,benchmark(999999,md5(@@version)),1)/*”, because any small variation of the attack would slip past it and produce false negatives. This is where you would start thinking about the equilibrium.
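The trade-off can be sketched in a few lines of Python. The three patterns below are illustrative assumptions, not rules from any real rulebase: one that is too broad, one that matches a single proof-of-concept string exactly, and one that aims for the middle by requiring the `com_kunena` component plus a quote character in the `do` parameter. The `/joomla` path stands in for `[JOOMLA_PATH]`, and the benign sample requests are hypothetical. A real signature would also have to handle URL encoding and other evasions, which this sketch ignores.

```python
import re

# Hypothetical install path standing in for [JOOMLA_PATH] in the advisory.
BASE = "/joomla/index.php?option=com_kunena&Itemid=86&func=announcement&do="
INJECTION = ("show', link='0wn3d', task='0wn3d' WHERE userid=62 AND "
             "1=if(substring(@@version,1,1)=4,benchmark(999999,md5(@@version)),1)/*")
POC_V4 = BASE + INJECTION
POC_V5 = POC_V4.replace("=4,", "=5,")  # the advisory's other variant

SIGNATURES = {
    # Matches every request to any index.php that carries a query string.
    "too_broad": re.compile(re.escape("/index.php?")),
    # Matches only the one literal proof-of-concept string.
    "too_exact": re.compile(re.escape(POC_V4)),
    # Requires the vulnerable component and a quote inside the 'do' value.
    "balanced": re.compile(
        r"/index\.php\?"
        r"(?=(?:.*&)?option=com_kunena(?:&|$))"
        r"(?=(?:.*&)?do=[^&]*')"
    ),
}

# (request URI, is it really an attack?)
SAMPLES = [
    ("/joomla/index.php?option=com_content&view=article&id=1", False),
    (BASE + "show", False),  # legitimate use of the component
    (POC_V4, True),
    (POC_V5, True),
]


def evaluate(pattern, samples):
    """Return (false_positives, false_negatives) for one pattern."""
    fp = sum(1 for uri, attack in samples if pattern.search(uri) and not attack)
    fn = sum(1 for uri, attack in samples if not pattern.search(uri) and attack)
    return fp, fn


if __name__ == "__main__":
    for name, pattern in SIGNATURES.items():
        fp, fn = evaluate(pattern, SAMPLES)
        print(f"{name}: false positives={fp}, false negatives={fn}")
```

On these four samples, the broad pattern flags both benign requests, the exact pattern misses the second proof-of-concept variant, and the middle pattern flags only the two attacks. With a different or larger test set the numbers would shift, which is why the data set you test against matters as much as the pattern itself.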
Scenario 2.
In this scenario, you do not know which tools are used in your corporation. You do not know what servers are running in the DMZ, which custom applications run on each of them, and so on. In this case, how would you focus?
In such a case, you would not know what you are protecting or what you are fighting against. Something like what you see below:
NOTE: This scenario is not entirely realistic. What we are suggesting, though, is that it is possible for someone not to know every application running on a server. In such a case, you would not know whether you had already protected it with a signature enabled for something else or had left the loophole open. Either way, you are working without a clear focus on what is happening, and this could lead to bad end results.
In this post, we aimed to show that “focus” is the most important factor when researching data to find the subset from which you will generate a signature to protect your network. We will continue the series by talking more about the different types of signatures and how you can generate the subset of data that is most accurate for generating them.
Thank you for choosing our blog! We hope that this was helpful…