Learning Django Middleware by exploring CSRF.

However experienced people assume I am with Django, I'm one of those guys who has never seriously thought about how Django does specific things; I just rely on "it works". If I don't want to hassle with Django generics and mixins, I simply write my own, or I spend five seconds of my development time searching the docs (thanks for the wonderful examples).

Recently, I had a software engineer interview, and one question that hit me was: "You've worked with Django, but have you written your own CSRF (Cross-Site Request Forgery) validator?" I had written one once in Node when using EJS, but I had never thought about how Django does it... and that's exactly what I answered.

CSRF

Cross-Site Request Forgery is an attack where another website gets a user's browser to submit a request, such as a form, to my website without the user intending it.

Imagine you have a form that lets a user delete their account by sending a POST request to /deleteprofile.

Now, a third-party website can build the same form as yours, rank better in SEO and run an ad campaign on Facebook, so that visitors end up on a form which instead submits a deletion request to your backend.

This is a serious issue.

[Image from PortSwigger]

You need to validate that the request you've received originated from your own site and that you authorized it to be performed.

Read More About CSRF: portswigger.net/web-security/csrf

One of the general solutions to this issue is to use a CSRF token. Your backend issues the token, as a cookie, to the web page that is responsible for submitting the form.

The web page then sends you the form data together with the CSRF token, and you validate that the submitted token is the same as the one your backend issued.

This cuts off attackers trying to send requests to your system from another site, or trying to trick your users into submitting data unknowingly, because a page on another site cannot read the token your backend issued.
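To make the idea concrete, here is a minimal sketch of issuing and checking such a token in plain Python. It is not how Django does it; the function names issue_csrf_token and is_request_allowed are made up for this example, and I'm assuming the cookie value and the submitted form value have already been pulled out of the request.

```python
import hmac
import secrets


def issue_csrf_token() -> str:
    """Create an unguessable token for the backend to set as a cookie."""
    return secrets.token_urlsafe(32)


def is_request_allowed(cookie_token, submitted_token) -> bool:
    """Accept only when the form echoes back the token from the cookie."""
    if not cookie_token or not submitted_token:
        return False
    # compare_digest takes the same time wherever the values differ,
    # so the comparison itself does not leak information.
    return hmac.compare_digest(cookie_token.encode(), submitted_token.encode())


token = issue_csrf_token()
print(is_request_allowed(token, token))    # legitimate form: True
print(is_request_allowed(token, "guess"))  # forged request, wrong token: False
print(is_request_allowed(token, None))     # forged request, no token: False
```

The attacker's page can make the browser send the cookie, but it cannot read the cookie's value, so it has nothing valid to put in the form field.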

How does Django handle security?

In case you didn't know, Django comes with pretty much all the middleware required to secure your backend. If you want to see it in practice, just open the settings.py file in a recent Django project and find the list called MIDDLEWARE.

#other code....
MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware', # this is the one
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]
#other code...

CsrfViewMiddleware

Let's go to Django's GitHub repository and locate the source of this middleware. As the dotted path in MIDDLEWARE suggests, it lives in django/middleware/csrf.py. Open File in GitHub

A little context on Django middleware.

In Django, a middleware is just a request interceptor that has some hooks associated with it.

If you have zero experience, have a read here.

Some of the hooks are mentioned below...

  • process_request() is called on each request, before Django decides which view to execute.
  • process_view() is called just before Django calls the view.
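The order of these two hooks can be shown with a toy model. This is only a sketch: LoggingMiddleware and handle are hypothetical names, and handle is a stand-in I wrote for illustration, not Django's real request handler. The hook signatures follow the ones used by the CSRF middleware.

```python
class LoggingMiddleware:
    """Hypothetical middleware that records when each hook runs."""

    def __init__(self, log):
        self.log = log

    def process_request(self, request):
        self.log.append("process_request")
        return None  # None means "carry on"

    def process_view(self, request, callback, callback_args, callback_kwargs):
        self.log.append("process_view for " + callback.__name__)
        return None


def handle(request, middleware, view):
    """Toy stand-in for the request cycle, for illustration only."""
    response = middleware.process_request(request)
    if response is not None:
        return response  # the middleware answered; the view never runs
    response = middleware.process_view(request, view, (), {})
    if response is not None:
        return response
    return view(request)


def delete_profile(request):
    return "profile deleted"


log = []
result = handle({"method": "POST"}, LoggingMiddleware(log), delete_profile)
print(log)     # ['process_request', 'process_view for delete_profile']
print(result)  # profile deleted
```

If either hook returned a response instead of None, the view would be skipped, which is how a middleware can reject a request.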

Now you're ready for this journey.

In Django, a CSRF token has two parts: a mask, followed by the secret scrambled with that mask. Each part contains 32 characters consisting of letters and digits.

import string

CSRF_SECRET_LENGTH = 32
CSRF_TOKEN_LENGTH = 2 * CSRF_SECRET_LENGTH
CSRF_ALLOWED_CHARS = string.ascii_letters + string.digits
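To see why the token is twice as long as the secret, here is a rough sketch of masking, unmasking and comparing. It is modelled on the idea rather than copied from Django's source: random_string, mask_secret, unmask_token and tokens_match are my own names, and I'm assuming a simple per-character shift through the allowed alphabet.

```python
import hmac
import secrets
import string

SECRET_LENGTH = 32
ALLOWED_CHARS = string.ascii_letters + string.digits


def random_string(length=SECRET_LENGTH):
    """Pick random characters from the allowed alphabet."""
    return "".join(secrets.choice(ALLOWED_CHARS) for _ in range(length))


def mask_secret(secret):
    """Return mask + (secret shifted by mask), 64 characters in total."""
    mask = random_string()
    n = len(ALLOWED_CHARS)
    shifted = "".join(
        ALLOWED_CHARS[(ALLOWED_CHARS.index(s) + ALLOWED_CHARS.index(m)) % n]
        for s, m in zip(secret, mask)
    )
    return mask + shifted


def unmask_token(token):
    """Split off the mask and undo the shift to recover the secret."""
    mask, shifted = token[:SECRET_LENGTH], token[SECRET_LENGTH:]
    n = len(ALLOWED_CHARS)
    return "".join(
        ALLOWED_CHARS[(ALLOWED_CHARS.index(c) - ALLOWED_CHARS.index(m)) % n]
        for c, m in zip(shifted, mask)
    )


def tokens_match(token_a, token_b):
    """Two tokens match when they hide the same secret."""
    return hmac.compare_digest(unmask_token(token_a), unmask_token(token_b))


secret = random_string()
first, second = mask_secret(secret), mask_secret(secret)
print(len(first))                   # 64
print(tokens_match(first, second))  # True
```

Because a fresh mask is drawn each time, the same secret can be sent as a different-looking token on every response while still comparing equal.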

There are a couple of helper functions defined there which create a token, mask a token, unmask a token and compare tokens. The middleware also needs a way to read the token back from the request, and for this purpose it defines _get_token.

def _get_token(self, request):
    if settings.CSRF_USE_SESSIONS:
        # Session-based lookup; not important for this context.
        pass
    else:
        try:
            cookie_token = request.COOKIES[settings.CSRF_COOKIE_NAME]
        except KeyError:
            # No CSRF cookie was sent with this request.
            return None
        csrf_token = _sanitize_token(cookie_token)
        if csrf_token != cookie_token:
            # The cookie token had to be replaced, so the cookie must be reset.
            request.csrf_cookie_needs_reset = True
        return csrf_token

This method extracts the token from the cookies using the CSRF_COOKIE_NAME as defined in the settings.

Now, let's go into the lifecycle.

def process_request(self, request):
    csrf_token = self._get_token(request)
    if csrf_token is not None:
        # Use same token next time.
        request.META['CSRF_COOKIE'] = csrf_token

This hook utilizes the above helper method to extract the token and save it in request.META.

Now let's get into process_view.

if getattr(request, 'csrf_processing_done', False):
    return None

This skips the checks if CSRF processing has already been done for this request.

if getattr(callback, 'csrf_exempt', False):
    return None

This skips the checks if the view is marked as csrf_exempt.
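The getattr check works because exemption is just an attribute on the view function. Here is a small sketch of that mechanism; mark_csrf_exempt is a hypothetical decorator I wrote for this example, standing in for the real csrf_exempt decorator.

```python
import functools


def mark_csrf_exempt(view):
    """Hypothetical decorator: tag a view so the middleware skips it."""
    @functools.wraps(view)
    def wrapped(*args, **kwargs):
        return view(*args, **kwargs)
    wrapped.csrf_exempt = True
    return wrapped


@mark_csrf_exempt
def webhook(request):
    return "ok"


def delete_profile(request):
    return "deleted"


for callback in (webhook, delete_profile):
    # Same lookup as in process_view: a missing attribute means "not exempt".
    skip = getattr(callback, "csrf_exempt", False)
    print(callback.__name__, "skips CSRF checks:", skip)
```

Using getattr with a default of False means ordinary views need no special marking to be protected.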

if request.method not in ('GET', 'HEAD', 'OPTIONS', 'TRACE'):
    # Django's test client sets this flag to switch the checks off.
    if getattr(request, '_dont_enforce_csrf_checks', False):
        return self._accept(request)

The CSRF token check is skipped for GET, HEAD, OPTIONS and TRACE, so it only runs for methods such as POST, PUT, PATCH and DELETE. Also, it's worth noting that Django doesn't accept POST requests without a CSRF token by default.

Next, for requests made over HTTPS, the middleware checks whether the Referer header (HTTP_REFERER in request.META) is valid.
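A Referer check of this kind could be sketched like this. It is a simplified illustration under my own assumptions, not Django's actual logic: referer_is_acceptable is a made-up name, and I'm assuming a strict policy where only an HTTPS referer from exactly the same host is accepted.

```python
from urllib.parse import urlparse


def referer_is_acceptable(referer, request_host):
    """Hypothetical strict policy: HTTPS referer from the same host only."""
    if not referer:
        return False  # no Referer header at all
    parsed = urlparse(referer)
    if parsed.scheme != "https" or not parsed.netloc:
        return False  # malformed or insecure referer
    return parsed.netloc == request_host


print(referer_is_acceptable("https://example.com/profile", "example.com"))  # True
print(referer_is_acceptable("https://evil.example/form", "example.com"))    # False
print(referer_is_acceptable(None, "example.com"))                           # False
```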

request_csrf_token = ""
if request.method == "POST":
    try:
        request_csrf_token = request.POST.get('csrfmiddlewaretoken', '')
    except OSError:
        pass

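Now we have the token the user sent in the request. If the form field is empty, Django falls back to the X-CSRFToken request header (the name comes from settings.CSRF_HEADER_NAME), which is how non-POST and AJAX requests supply the token.

Django then compares the submitted token with the issued token.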

request_csrf_token = _sanitize_token(request_csrf_token)
if not _compare_masked_tokens(request_csrf_token, csrf_token):
    return self._reject(request, REASON_BAD_TOKEN)
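Putting the steps of process_view together, the order of decisions looks roughly like the sketch below. It is deliberately simplified and the name csrf_decision is my own: it leaves out the Referer check and compares the tokens directly, whereas Django unmasks both tokens before comparing them.

```python
import hmac

SAFE_METHODS = ("GET", "HEAD", "OPTIONS", "TRACE")


def csrf_decision(method, view_is_exempt, cookie_token, submitted_token):
    """Simplified ordering of the checks; returns "accept" or "reject"."""
    if view_is_exempt:
        return "accept"
    if method in SAFE_METHODS:
        return "accept"
    if not cookie_token:
        return "reject"  # no token was ever issued to this browser
    if not submitted_token:
        return "reject"  # the request did not echo any token back
    same = hmac.compare_digest(cookie_token.encode(), submitted_token.encode())
    return "accept" if same else "reject"


print(csrf_decision("GET", False, None, None))         # accept
print(csrf_decision("POST", False, "abc123", "abc123"))  # accept
print(csrf_decision("POST", False, "abc123", ""))        # reject
print(csrf_decision("POST", True, None, None))           # accept
```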

Now, I guess this also explains Django middleware to you. A middleware essentially captures a request midway and does something with it, and it's up to the middleware whether to forward the request to the next middleware or view, or just drop it.