Scraping Data from Twitter: Towards Data Science

The Information Technology Department organised a webinar on “Scraping Data from Twitter”.

Social networks such as Twitter and Facebook are huge data sources for any data science project. Gathering data from various sources and preparing it is the first, essential step in any such project. To help participants prepare their own datasets for data science projects, the Research and Consultancy Committee, in collaboration with the Staff Development Committee, organised a webinar on data scraping from Twitter on 17 March at 12 noon. The webinar was designed for data science enthusiasts who wish to start data science projects from scratch using Python. It was delivered by Mr Mangesh Wanjari, a staff member of the Information Technology Department.

The speaker started the session by highlighting the importance of social media in data science and the impact of social network analysis. He then covered the steps involved in carrying out a social network analysis.

Next, the speaker critically reviewed the various data sources and shed light on the importance of data scraping, among other data extraction techniques. Afterwards, Mr Mangesh discussed why Twitter is used more commonly than other social media channels.

The speaker then introduced Tweepy, a Python package for accessing the Twitter API, and explained how to obtain a Twitter developer account.

After this informative introduction, the speaker gave a hands-on demonstration on the Google Colab platform. The demonstration covered many concepts, including:

  • How to install the required packages
  • How to access files from Google Drive or a local drive on Google Colab
  • How to connect to the Twitter API
  • How to scrape tweets based on keywords
  • How to store the extracted data in a pandas DataFrame for further analysis
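
The steps listed above can be sketched roughly as follows. This is not the code shown in the webinar; it is a minimal illustration that assumes Tweepy 4.x, its tweepy.Client interface and a bearer token from a Twitter developer account. The function names, the chosen columns and the sample tweets are hypothetical. The offline part at the bottom uses stand-in objects so the data-shaping step can be tried without credentials.

```python
# Assumed setup in a Colab cell:  !pip install tweepy pandas
# Assumed Drive access in Colab:  from google.colab import drive; drive.mount("/content/drive")
from types import SimpleNamespace


def fetch_tweets(bearer_token, keyword, limit=10):
    """Search recent tweets for a keyword (needs network access and valid credentials)."""
    import tweepy  # imported lazily so the offline part below runs without Tweepy

    client = tweepy.Client(bearer_token=bearer_token)
    response = client.search_recent_tweets(
        query=keyword,
        max_results=limit,  # the endpoint accepts values from 10 to 100
        tweet_fields=["created_at"],
    )
    return response.data or []  # data is None when nothing matches


def tweets_to_rows(tweets):
    """Flatten tweet objects into plain dictionaries, one per tweet."""
    return [
        {"id": t.id, "created_at": t.created_at, "text": t.text}
        for t in tweets
    ]


def rows_to_dataframe(rows):
    """Store the extracted rows in a pandas DataFrame for further analysis."""
    import pandas as pd  # imported lazily, as above

    return pd.DataFrame(rows, columns=["id", "created_at", "text"])


# Hypothetical stand-ins for tweet objects, used only to show the data shaping offline.
sample_tweets = [
    SimpleNamespace(id=1, created_at="2022-03-17", text="Learning data science"),
    SimpleNamespace(id=2, created_at="2022-03-17", text="Scraping tweets with Python"),
]
rows = tweets_to_rows(sample_tweets)
```

With real credentials, the same flow would be rows_to_dataframe(tweets_to_rows(fetch_tweets(token, "data science"))), where token is the bearer token. Keeping the network call separate from the data shaping makes the latter easy to check on its own.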

At the end, the speaker set aside some time for a question-and-answer session. The participants received the content very positively.

Feedback was collected from the participants on a 5-point scale; the overall rating was 4.7.

 

Last modified on Monday, 18 April 2022 04:23
Monday, 18 April 2022 00:00 Written by  IT Department In IT