Django: FileField with ContentType and File Size Validation

25 comments

24th September 2010 in Coding Tags: django, programming, python, security, upload

If you are looking for a Django "FileField" to use in your model with MAX_UPLOAD_SIZE and CONTENT_TYPE restrictions, this post is for you.

Imagine, for example, that you need to allow uploading a PDF or Zip file in the Django admin, but you want to be sure your client will upload only these two file types and no others.
You might also need the same functionality in the frontend, so why not use "ModelForm" and have just one validation for both applications (admin and frontend)?

With this custom file field you can configure the maximum file size for each FileField independently of Django's default behavior.

You can also consider this post a reference on "How to add a custom file field with custom behavior to your Django application".

ContentTypeRestrictedFileField

Let's consider an example of an existing application named "app".

Create a new file in the application's directory, call it "extra.py" (you can give it a better name if you can think of one) and paste this code:

from django.db.models import FileField
from django.forms import forms
from django.template.defaultfilters import filesizeformat
from django.utils.translation import ugettext_lazy as _

class ContentTypeRestrictedFileField(FileField):
    """
    Same as FileField, but you can specify:
        * content_types - list containing allowed content_types. Example: ['application/pdf', 'image/jpeg']
        * max_upload_size - a number indicating the maximum file size allowed for upload.
            2.5MB - 2621440
            5MB - 5242880
            10MB - 10485760
            20MB - 20971520
            50MB - 52428800
            100MB - 104857600
            250MB - 262144000
            500MB - 524288000
    """
    def __init__(self, *args, **kwargs):
        self.content_types = kwargs.pop("content_types")
        self.max_upload_size = kwargs.pop("max_upload_size")

        super(ContentTypeRestrictedFileField, self).__init__(*args, **kwargs)

    def clean(self, *args, **kwargs):        
        data = super(ContentTypeRestrictedFileField, self).clean(*args, **kwargs)
        
        file = data.file
        try:
            content_type = file.content_type
            if content_type in self.content_types:
                if file._size > self.max_upload_size:
                    raise forms.ValidationError(_('Please keep filesize under %s. Current filesize %s') % (filesizeformat(self.max_upload_size), filesizeformat(file._size)))
            else:
                raise forms.ValidationError(_('Filetype not supported.'))
        except AttributeError:
            pass        
            
        return data
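
The decision made inside clean() can also be sketched as a plain function, independent of Django, which makes the two checks easy to try out in isolation. The name check_upload and its "return a message or None" convention are assumptions for illustration only; they are not part of the field above.

```python
def check_upload(content_type, size, allowed_types, max_size):
    """Return an error message, or None when the upload is acceptable.

    Mirrors the order of checks in ContentTypeRestrictedFileField.clean():
    first the declared content type, then the size in bytes.
    """
    if content_type not in allowed_types:
        return 'Filetype not supported.'
    if size > max_size:
        return 'Please keep filesize under %d bytes. Current filesize %d bytes' % (max_size, size)
    return None


allowed = ['application/pdf', 'application/zip']
limit = 5 * 1024 * 1024  # 5MB expressed in bytes

print(check_upload('application/pdf', 1024, allowed, limit))  # None
print(check_upload('image/png', 1024, allowed, limit))        # Filetype not supported.
```

Note that a file of an unsupported type is rejected before its size is even looked at, exactly as in the field's clean() method.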

Adding the Custom FileField to your model

# models.py in yourproject/app/

from django.db import models
from app.extra import ContentTypeRestrictedFileField

class MyApp(models.Model):
    """ My application """
    name = models.CharField(max_length=100)
    description = models.CharField(max_length=250)
    file = ContentTypeRestrictedFileField(
        upload_to='pdf',
        content_types=['application/pdf', 'application/zip'],
        max_upload_size=5242880
    )
    created = models.DateTimeField('created', auto_now_add=True)
    modified = models.DateTimeField('modified', auto_now=True)

    def __unicode__(self):
        return self.name

Be sure to use a number for the max_upload_size value; if you use a string, it won't work.
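
Since the value has to be a number of bytes, it can be less error-prone to compute it than to type it out by hand. A minimal sketch, assuming a hypothetical helper named megabytes (it is not part of Django or of the field above):

```python
def megabytes(n):
    """Convert a size in megabytes (1MB = 1024 * 1024 bytes) to whole bytes."""
    return int(n * 1024 * 1024)


print(megabytes(5))    # 5242880
print(megabytes(2.5))  # 2621440
```

With such a helper the model field could be declared with max_upload_size=megabytes(5) instead of max_upload_size=5242880.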

If you're replacing an existing FileField you won't need to re-sync your database, but if you are adding this field to your model for the first time you will need to re-sync it:

$ python manage.py syncdb

Concern About Security

Although this method restricts the declared content type, that value is supplied by the client, so a malicious attacker could still rename a script (or anything else) so that it appears to be a supported file type.

To be sure the file being uploaded is really what it claims to be, you have to read it and check it. I have even heard of people running a virus check on uploads, which wouldn't be a bad idea either.
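
One way to "read it and check it" is to compare the first bytes of the upload against the known signature of each allowed format. This is a minimal sketch using only the standard library; SIGNATURES and sniff_content_type are hypothetical names, and a matching signature only shows that the file starts like a PDF or a ZIP archive, not that its contents are safe.

```python
import io

# Leading "magic" bytes for the two formats allowed in the model example.
SIGNATURES = {
    'application/pdf': (b'%PDF-',),
    'application/zip': (b'PK\x03\x04', b'PK\x05\x06'),  # regular and empty archives
}


def sniff_content_type(fileobj):
    """Return the content type whose signature matches the start of fileobj, or None."""
    position = fileobj.tell()
    header = fileobj.read(8)
    fileobj.seek(position)  # leave the stream where we found it
    for content_type, prefixes in SIGNATURES.items():
        if header.startswith(prefixes):
            return content_type
    return None


print(sniff_content_type(io.BytesIO(b'%PDF-1.4\n...')))      # application/pdf
print(sniff_content_type(io.BytesIO(b'#!/bin/sh\necho hi')))  # None
```

The sniffed type could then be compared with the content type the browser declared, rejecting the upload when the two disagree.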

Also Using FILE_UPLOAD_MAX_MEMORY_SIZE

Even with this custom file field, a user can upload a huge file to the web server, which will be stored in the temporary directory before being rejected.

The system returns the validation error message and deletes the file only after the upload is complete, which means that any user could stress the server by uploading a huge file (1 GB, for example).

Fortunately this would never happen, because Django has a FILE_UPLOAD_MAX_MEMORY_SIZE setting that defaults to 2.5 megabytes.

The difference is that when trying to upload a file that exceeds FILE_UPLOAD_MAX_MEMORY_SIZE the server will just interrupt the connection without returning a nice validation error, which could be potentially confusing for your users.

My suggestion is to set FILE_UPLOAD_MAX_MEMORY_SIZE in your settings.py file so it allows files slightly larger than the max_upload_size of your ContentTypeRestrictedFileField.

So if you're setting 30 MB as the maximum in your model field, set FILE_UPLOAD_MAX_MEMORY_SIZE to 45 MB.
This way the system will block users who try to upload files that are much larger than the limit, while still providing a nice validation error to those who exceed the limit by only a few megabytes.

Comments

  1.

    pyrou said:

    ( on 24th of September 2010 at 18:05 )

    Use this snippet ;) It will let you define max_upload_size='5MB'.


    class filesize(str):
        SUFFIXES = ['KB','MB','GB','TB','PB','EB','ZB','YB']
        def bytesize(self):
            try:
                # the last two characters select the power of 1024
                exp = self.SUFFIXES.index(self.__str__()[-2:])
                return int(float(self.__str__()[:-2])*1024**(exp+1))
            except ValueError:
                # no recognised suffix: treat the value as a plain byte count
                return int(self)

    print(filesize('2.5MB').bytesize()) # 2621440
    print(filesize(1024).bytesize()) # 1024
    print(filesize('1024KB').bytesize()) # 1048576

  2.

    Federico Capoano said:

    ( on 25th of September 2010 at 17:18 )

    Thanks pyrou for the useful snippet, is this on djangosnippet.org?

  3.

    Byron said:

    ( on 2nd of October 2010 at 00:16 )

    What do I have to do to get it to work with South migrations?

  4.

    Federico Capoano said:

    ( on 2nd of October 2010 at 00:48 )

    I don't have a clue, I have never used South. But in theory this field behaves in the same way as a normal Django FileField, so if you can integrate that with South, then this custom field should work too.

  5.

    Byron said:

    ( on 2nd of October 2010 at 00:50 )

    I solved my problem by changing some lines...

    def __init__(self, content_types=None, max_upload_size=None, **kwargs):
        self.content_types = content_types
        self.max_upload_size = max_upload_size
        super(ContentTypeRestrictedFileField, self).__init__(**kwargs)

    And adding the specific rule for South at the end of the file:

    from south.modelsinspector import add_introspection_rules
    add_introspection_rules([], ["^app\.extra\.ContentTypeRestrictedFileField"])

  6.

    Federico Capoano said:

    ( on 2nd of October 2010 at 00:54 )

    Very good, thanks! I'm sure it will be useful for other people coming from search engines.

  7.

    Byron said:

    ( on 2nd of October 2010 at 02:34 )

    Yes, the field works great. Here are some content types that I used.

    content_types=['application/pdf', 'application/zip', 'application/vnd.ms-powerpoint', 'application/vnd.ms-excel', 'application/msword', 'application/vnd.oasis.opendocument.text', 'application/vnd.oasis.opendocument.spreadsheet', 'application/vnd.oasis.opendocument.presentation']

  8.

    Federico Capoano said:

    ( on 2nd of October 2010 at 11:48 )

    Remember, though, that to be really 100% sure your users are uploading those content types you have to read the files and check them, but this is not necessary if you're using that field in the admin only.

  9.

    Richard Henry said:

    ( on 26th of October 2010 at 00:26 )

    FYI, interrupting the connection isn't the behaviour of the FILE_UPLOAD_MAX_MEMORY_SIZE setting.

    From the docs: "The maximum size (in bytes) that an upload will be before it gets streamed to the file system."

    After an uploaded file hits the maximum upload size, Django will start to stream it to the tmp directory on disk rather than into RAM. It just means that uploading a 1GB file won't consume anywhere up to 1GB of memory during the upload. Django will still allow an upload of any size.

    (Try it out by setting the value to 1KB and uploading a 1MB file.)

    Some more useful settings here: http://docs.djangoproject.com/en/dev/topics/htt...

    If you actually want to put a hard limit on upload sizes (in such a way that the connection is dropped), do it at the web server level. In Apache, use the LimitRequestBody directive. In Nginx, you will want to set client_max_body_size.

    Here's how to do that in Nginx: http://wiki.nginx.org/NginxHttpCoreModule#clien...

    Then set an appropriate error page for "Request Entity Too Large" (413) errors, as this will be shown if the limit is hit.

    In summary:
    If you're allowing 500KB file uploads, then say so in the help_text and set that as the max_upload_size via something like this field. Then configure your webserver to close the connection on requests larger than 2MB, and show the 413 error page.

    Thanks for writing the field, with a couple of minor tweaks I'm going to be using it in production.

  10.

    Federico Capoano said:

    ( on 26th of October 2010 at 09:54 )

    Thanks Richard for adding this info.
    If you add useful improvements and you share them on the web let us know.

  11.

    JagoanLeks said:

    ( on 3rd of November 2010 at 22:29 )

    Sorry, I am still a beginner at this.

    Could you teach me?

    Usually I just download a complete template/layout.

    T__T

  12.

    Bill said:

    ( on 13th of June 2011 at 17:03 )

    Thank you very much, you saved me a day.

    Thank you.

  13.

    Federico Capoano said:

    ( on 13th of June 2011 at 17:23 )

    @Bill: You are very welcome. Glad it helped you.

  14.

    Bill said:

    ( on 22nd of June 2011 at 04:37 )

    Hi Federico, I just found that some JPG files cannot be uploaded when using your snippet, but only in IE8.

    I used "image/jpeg" to check the content type.
    I am still looking for a workaround. It seems IE8 fails to send the expected Content-Type header for the file?

  15.

    Bill said:

    ( on 22nd of June 2011 at 05:03 )

    I just found out that IE8 reports JPG files as 'image/pjpeg' rather than 'image/jpeg'.

    After adding this content type to your code, it works perfectly.

    Thank you.

  16.

    Federico Capoano said:

    ( on 22nd of June 2011 at 10:01 )

    @Bill: yes, IE reports these images as pjpeg, which I think stands for progressive JPEG. I remember going crazy trying to resolve this issue once. Classic headaches caused by IE!

  17.

    Data Center India said:

    ( on 21st of September 2011 at 09:44 )

    I have really enjoyed reading your blog posts. Any way I'll be subscribing to your feed and I hope you post again soon.

  18.

    angelo said:

    ( on 21st of November 2011 at 18:19 )

    Thanks! It helped me a lot!

  19.

    JonatasCD said:

    ( on 28th of March 2012 at 20:41 )

    hey.

    And what if I need to add a check of the file name charset? What would it look like?

    cheers

  20.

    JonatasCD said:

    ( on 4th of April 2012 at 21:12 )

    answering myself:

    I used slugify to deal with filenames ;)

    thanks to stack overflow

  21.

    kofi said:

    ( on 15th of September 2012 at 20:16 )

    Hi, I did as written but it isn't working and I keep running into errors. I would like to know if the code from extra.py could be placed in views.py without any hassle!

  22.

    Diana said:

    ( on 14th of October 2012 at 05:41 )

    Hi!! The code works fine, but I can't upload .docx files! Using 'application/msword' only works with .doc files. Any ideas, please?

  23.

    Federico Capoano said:

    ( on 18th of October 2012 at 10:38 )

    @Diana: take a look here:
    http://stackoverflow.com/questions/4212861/what...

  24.

    Pradeep Kumar said:

    ( on 4th of July 2013 at 09:06 )

    How can I use this in forms?

  25.

    Kenneth said:

    ( on 1st of August 2013 at 22:27 )

    Neat blog! Is your theme custom made or did you download it from somewhere?
    A design like yours with a few simple adjustements would really make my
    blog stand out. Please let me know where you got your theme.
    With thanks

Comments are closed.
