Working on a React/Django app. I have files being uploaded by users through the React front-end that end up in the Django/DRF back-end. We have antivirus (AV) running on the server constantly, but we want to add stream scanning before it is written to disk.
It is a bit over my head as how to set it up. Here are a few sources I am looking at.
How do you virus scan a file being uploaded to your java webapp as it streams?
Although accepted best answer describes it being "... quite easy" to setup, I'm struggling.
I apparently need to cat testfile | clamscan -
per the post and the corresponding documentation:
How do you virus scan a file being uploaded to your java webapp as it streams?
So if my back-end looks like the following:
class SaveDocumentAPIView(APIView):
permission_classes = [IsAuthenticated]
def post(self, request, *args, **kwargs):
# this is for handling the files we do want
# it writes the files to disk and writes them to the database
for f in request.FILES.getlist('file'):
max_id = Uploads.objects.all().aggregate(Max('id'))
if max_id['id__max'] == None:
max_id = 1
else:
max_id = max_id['id__max'] + 1
data = {
'user_id': request.user.id,
'sur_id': kwargs.get('sur_id'),
'co': User.objects.get(id=request.user.id).co,
'date_uploaded': datetime.datetime.now(),
'size': f.size
}
filename = str(data['co']) + '_' + \
str(data['sur_id']) + '_' + \
str(max_id) + '_' + \
f.name
data['doc_path'] = filename
self.save_file(f, filename)
serializer = SaveDocumentSerializer(data=data)
if serializer.is_valid(raise_exception=True):
serializer.save()
return Response(status=HTTP_200_OK)
# Handling the document
def save_file(self, file, filename):
with open('fileupload/' + filename, 'wb+') as destination:
for chunk in file.chunks():
destination.write(chunk)
I think I need to add something to the save_file
method like:
for chunk in file.chunks():
# run bash comman from python
cat chunk | clamscan -
if passes_clamscan:
destination.write(chunk)
return HttpResponse('It passed')
else:
return HttpResponse('Virus detected')
So my issues are:
1) How to run the Bash from Python?
2) How to receive a result response from the scan so that it can be sent back to the user and other things can be done with the response on the back-end? (Like creating logic to send the user and the admin an email that their file had a virus).
I have been toying with this, but not much luck.
Running Bash commands in Python
Furthermore, there are Github repos out there that claim to marry Clamav with Django pretty well, but they either haven't been updated in years or the existing documentation is pretty bad. See the following:
https://github.com/vstoykov/django-clamd
Ok, got this working with clamd. I modified my SaveDocumentAPIView
to the following. This scans the files before they are written to disk and prevents them from being written if they infected. Still allows uninfected files through, so the user doesn't have to re-upload them.
class SaveDocumentAPIView(APIView):
permission_classes = [IsAuthenticated]
def post(self, request, *args, **kwargs):
# create array for files if infected
infected_files = []
# setup unix socket to scan stream
cd = clamd.ClamdUnixSocket()
# this is for handling the files we do want
# it writes the files to disk and writes them to the database
for f in request.FILES.getlist('file'):
# scan stream
scan_results = cd.instream(f)
if (scan_results['stream'][0] == 'OK'):
# start to create the file name
max_id = Uploads.objects.all().aggregate(Max('id'))
if max_id['id__max'] == None:
max_id = 1
else:
max_id = max_id['id__max'] + 1
data = {
'user_id': request.user.id,
'sur_id': kwargs.get('sur_id'),
'co': User.objects.get(id=request.user.id).co,
'date_uploaded': datetime.datetime.now(),
'size': f.size
}
filename = str(data['co']) + '_' + \
str(data['sur_id']) + '_' + \
str(max_id) + '_' + \
f.name
data['doc_path'] = filename
self.save_file(f, filename)
serializer = SaveDocumentSerializer(data=data)
if serializer.is_valid(raise_exception=True):
serializer.save()
elif (scan_results['stream'][0] == 'FOUND'):
send_mail(
'Virus Found in Submitted File',
'The user %s %s with email %s has submitted the following file ' \
'flagged as containing a virus: \n\n %s' % \
(
user_obj.first_name,
user_obj.last_name,
user_obj.email,
f.name
),
'The Company <no-reply@company.com>',
['admin@company.com']
)
infected_files.append(f.name)
return Response({'filename': infected_files}, status=HTTP_200_OK)
# Handling the document
def save_file(self, file, filename):
with open('fileupload/' + filename, 'wb+') as destination:
for chunk in file.chunks():
destination.write(chunk)