It’s always fun to automate the boring stuff in Python. When it comes to web automation, we all know the importance of the requests and beautifulsoup4 libraries: requests lets you send HTTP/1.1 requests extremely easily, and Beautiful Soup makes it easy to scrape information from web pages.
Everything was working fine until I needed to make a POST request to register a new user. The first challenge was handling the CSRF token in the Django POST request. I managed to retrieve the CSRF token from the Set-Cookie header of Django's response.
import requests

# Fetch the registration page so Django sets the CSRF cookie
r = requests.get('https://example.com/user')
# The token arrives in the Set-Cookie response header, e.g. "csrftoken=...; Path=/"
cookie = r.headers['Set-Cookie']
csrf = cookie.split(';')[0].split('=')[1]
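To make that parsing concrete, here is the same split applied to a sample Set-Cookie value (the token value below is made up for illustration):

```python
# A sample Set-Cookie header as Django might send it (token value is made up)
cookie = 'csrftoken=Iln8vLn2rkGQTW0I; expires=Mon, 01 Jan 2024 00:00:00 GMT; Path=/'

# Take the first "name=value" pair before the ';', then the value after '='
csrf = cookie.split(';')[0].split('=')[1]
print(csrf)  # Iln8vLn2rkGQTW0I
```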
Now I provided the CSRF token in my POST request and hoped for a 200 response code from the server. Instead I got a 403 (Forbidden) response, along with a message asking me to send the cookie data, the Referer header, and so on in the request headers.
I dived into the documentation of the requests library and found the option to use session objects. A Session object allows you to persist certain parameters across requests, and it also persists cookies across all requests made from the Session instance.
Moreover, you can manually add cookies to your session. This solved my issue, and I was able to make requests successfully with 200 response codes.
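As a small illustration of that cookie persistence (a sketch using a manually set cookie and a made-up token value rather than a live server):

```python
import requests

s = requests.Session()
# Manually add a cookie; every later request made through `s` will send it
s.cookies.set('csrftoken', 'dummy-token')
print(s.cookies.get('csrftoken'))  # dummy-token
```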
import requests
# Creating session object
s = requests.Session()
url = 'https://www.example.com/user/register/'
r_get = s.get(url)
# Extract the CSRF token from the Set-Cookie header of the registration page
csrf = r_get.headers['Set-Cookie'].split(';')[0].split('=')[1]
username = 'random_user'
email = 'example@mail.com'
password = 'my_password'
# Setting credentials
cred = {
    'csrfmiddlewaretoken': csrf,
    'email': email,
    'username': username,
    'password1': password,
    'password2': password,
}
# Setting headers (Django's CSRF check over HTTPS requires a matching Referer)
headers = {
    'Referer': 'https://www.example.com/user/register/',
    'User-Agent': 'Mozilla/5.0',
}
# Making post request
r_post = s.post(url, data=cred, headers=headers)
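One thing the snippet glosses over is checking the outcome. A hedged sketch of how the result can be inspected (using a locally constructed Response object in place of r_post, so it runs without a server):

```python
import requests

# Simulate what the server might return; in the snippet above this would be r_post
resp = requests.models.Response()
resp.status_code = 403

print(resp.ok)  # False: 403 means the CSRF/Referer checks were not satisfied
try:
    resp.raise_for_status()  # raises HTTPError for 4xx/5xx responses
except requests.HTTPError as err:
    print('Registration failed:', err)
```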
Conclusion
Whenever you need to access a website with a secured environment, use session objects from the requests library.
Moreover, if you’re making several requests to the same host, the underlying TCP connection will be reused, which can result in a significant performance increase. A Session object also has all the methods of the main Requests API.
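For example (a small sketch with an assumed User-Agent value), a Session can be used as a context manager, exposes the same verbs as the top-level API, and sends headers set once on it with every request it makes:

```python
import requests

with requests.Session() as s:
    # Same methods as the top-level requests API
    print(callable(s.get), callable(s.post))  # True True
    # Set once; sent on every request made through this session
    s.headers.update({'User-Agent': 'Mozilla/5.0'})
    print(s.headers['User-Agent'])  # Mozilla/5.0
```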
Automation is always fun, but it’s not always easy to achieve; I struggled a lot because of a simple CSRF security check.
I hope this is helpful to those of you facing the same issue. Let me know whether it helped; it keeps me motivated to share my experience. I also encourage you to share your own experiences, as they may be helpful to others.