Many antivirus programs use signature-based malware detection. Here is how signatures are created for ClamAV. I can understand how they create a signature when the whole file is malware, but I don't understand how they find malware when it sits in the body of a file, since the hash would be different. Does anybody know?
My answer is not specific to ClamAV; instead I've answered in a general sense. Maybe this is helpful for you.
First of all, a virus signature is not necessarily a hash value of a file. A signature is usually a string of bits found in a file, although a hash value could also be used as a signature.
Suppose, for example, that a virus contains the string of bits 0x23956a58bd910345. We can consider this string to be a signature of the virus, and we can search for this signature in the files on a system. However, even if we find the signature, we can’t be certain that we’ve found the virus, since other innocent files could contain the same string of bits.
It's interesting to note that if the bits in files were random, the chance of such a false match at any given position would be negligible, at 1/2^64. reference
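To make the idea concrete, here is a minimal sketch of scanning for a byte-string signature anywhere inside a file's contents. It is not how ClamAV or any particular product does it; the function name `find_signature` and the sample "host" bytes are made up for illustration. It only shows why the position of the malicious bytes in the file doesn't matter, unlike a whole-file hash.

```python
def find_signature(data: bytes, signature: bytes) -> list[int]:
    """Return every offset at which `signature` occurs in `data`."""
    offsets = []
    start = data.find(signature)
    while start != -1:
        offsets.append(start)
        # Continue searching one byte further so overlapping hits are found too.
        start = data.find(signature, start + 1)
    return offsets


# The example signature from the text, converted to raw bytes.
SIGNATURE = bytes.fromhex("23956a58bd910345")

# Hypothetical data: a harmless 32-byte "host" and a copy with the
# signature embedded in the middle of its body.
host = b"MZ" + b"\x00" * 30
infected = host[:16] + SIGNATURE + host[16:]

print(find_signature(host, SIGNATURE))      # []
print(find_signature(infected, SIGNATURE))  # [16]
```

A hit only means the bytes are present; as noted above, an innocent file could contain the same string, so a match is evidence rather than proof.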
There are many ways to generate signatures and/or features for malware detection. Learn more here.
There are also other ways to detect viruses:
1 Anomaly detection, also known as behaviour analysis, tracks the activities of an executable, such as:
- Modified or created files
- Registry modification
- Which DLLs were loaded before execution
- Accessed virtual memory
- Created processes
- Network connections opened and the packets transmitted
- Storage areas the malware accessed, services and kernel drivers it installed, and other information.
reference
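As a rough illustration of the idea, the sketch below compares the activities recorded for an executable against a baseline of activities considered normal and reports anything unseen. Everything here is an assumption made for the example: the `Event` record, the event kind names, and the sample data are hypothetical, and real behaviour analysis collects these events from the operating system or a sandbox and uses far richer models than a set lookup.

```python
from typing import NamedTuple


class Event(NamedTuple):
    kind: str    # hypothetical labels, e.g. "file_write", "registry_set", "net_connect"
    target: str  # file path, registry key, DLL name, or host:port


def unusual_events(baseline: set[Event], observed: list[Event]) -> list[Event]:
    """Return observed events that never appear in the baseline, in first-seen order."""
    seen: set[Event] = set()
    flagged = []
    for event in observed:
        if event not in baseline and event not in seen:
            seen.add(event)
            flagged.append(event)
    return flagged


# Hypothetical baseline: what this program normally does.
baseline = {
    Event("dll_load", "kernel32.dll"),
    Event("file_write", "report.txt"),
}

# Hypothetical trace from one run of the same program.
observed = [
    Event("dll_load", "kernel32.dll"),
    Event("registry_set", "HKCU/Software/Microsoft/Windows/CurrentVersion/Run"),
    Event("net_connect", "203.0.113.5:4444"),
    Event("net_connect", "203.0.113.5:4444"),
]

flagged = unusual_events(baseline, observed)
for event in flagged:
    print(event.kind, event.target)
```

Here the registry write and the network connection are flagged because they are not part of the baseline. How the baseline is built largely determines how many false alarms such a check raises.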
2 Change detection - a file that unexpectedly changes may indicate an infection.
How can we detect changes? Hash functions are useful in this regard. Suppose we compute hashes of all files on a system and securely store these hash values. Then, at regular intervals, we can recompute the hashes and compare the new values with the previously stored values. If a file has changed in one or more bit positions — as it might in the case of a virus infection — we'll find that the newly computed hash does not match the previously computed hash value.
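The procedure above can be sketched in a few lines. This is only an illustration under some assumptions: SHA-256 is my choice of hash function, the helper names (`hash_file`, `snapshot`, `changed_files`) are made up, and the baseline is kept in memory, so the "securely store these hash values" part is not addressed at all.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root: Path) -> dict[str, str]:
    """Map each file under `root` (by relative path) to its hash."""
    return {
        str(p.relative_to(root)): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and list modified, added, and removed files."""
    return {
        "modified": sorted(n for n in old.keys() & new.keys() if old[n] != new[n]),
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
    }


# Demo on a temporary directory with made-up files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.bin").write_bytes(b"original contents")
    (root / "b.bin").write_bytes(b"untouched")
    before = snapshot(root)

    # Simulate an infection that appends a byte to one file and drops a new one.
    (root / "a.bin").write_bytes(b"original contents" + b"\x90")
    (root / "c.bin").write_bytes(b"new file")
    after = snapshot(root)

    report = changed_files(before, after)

print(report)  # {'modified': ['a.bin'], 'added': ['c.bin'], 'removed': []}
```

Note that the check says nothing about why a file changed; a legitimate update produces exactly the same report as an infection.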
There are many disadvantages to change detection. Files on a system often change due to normal system functions rather than malicious behaviour. As a result, change detection is likely to yield many false positives, which places a heavy burden on users and administrators. If a virus is inserted into a file that changes often, it will likely slip through a change detection regimen. [reference: Mark Stamp's book, Information Security]
And you are right to think that a hash mechanism is a weak method for detection.
In my research work, I compared and classified more than 2000 real viruses using 14 antivirus tools, and I found that ClamAV is very bad at detecting viruses! Here is the link for a paper describing MOMENTUM.