Saturday, 21 July 2018

Predicting future of Pakistan through social media: A Battle between PTI, 


Dr Mubarik Ali, 
Sustainable Development Advances

Social media has exploded as a category of online discourse where ordinary people can create content, share it, bookmark it, and publish it at a prodigious rate. Examples include Facebook, MySpace, Twitter, Digg, and YouTube. Two popular social media websites, Twitter and Facebook, demonstrate its explosive growth and profound influence. Both are among the 10 most-visited websites in the world according to Alexa rankings. According to a recent estimate, more than 2,000 million users check their Facebook account every month, while Twitter has more than 300 million active users. Furthermore, YouTube is claimed to have more than 1,500 million users and LinkedIn more than 260 million. Social media is rapidly changing public discourse in society thanks to its simplicity, ease of use, speed, and reach. Interest in social media, from individuals (especially youth) and businesses alike, is rising worldwide, and it is setting trends and agendas in topics that range from the environment and politics to technology and the entertainment industry.

As more people post and tweet about their likes and dislikes, social media data can be construed as a form of collective wisdom. This collective intelligence, if extracted and analyzed properly, can lead to useful predictions of certain human-related events. Such prediction has great benefits in many realms, such as finance (predicting real-world outcomes), product marketing (consumer insights), and politics (predicting election outcomes, as done by Nate Silver for US elections). Other researchers have also claimed that social media based predictions outperform some traditional techniques such as surveys and opinion polls. In this regard, our social media prediction for the Pakistan election should be more accurate than the recent television surveys and polls.

Pakistan is an emerging country with an inchoate software industry. According to one estimate, about 50 million people in Pakistan use social media, most of them young people between 18 and 35 years of age. The enormous amount of information about Pakistani politics that propagates through large user communities presents an interesting opportunity to harness that data in a form that allows us to predict the future of a political party. We also build models to aggregate the opinions of the collective population and gain useful insights into their behavior, while predicting future trends for a specific political party. The success of fivethirtyeight, a website that made predictions of the US elections based on social media data, further strengthens our case for predicting the Pakistan election through social media.

The 2018 election in Pakistan is the first for which most of the youth have registered to vote; hence, using their opinions about different political parties, one can infer the success or failure of these parties.

Figure 1: Percentage increase in the number of social media (Facebook) users in Pakistan from June 2013 to June 2018. There has been a tremendous increase in social media users (Source:

In this article, we used the Brandwatch social media monitoring software. We crawled data from major websites including Facebook, Twitter, YouTube, and other open social media forums (more than 100 websites in total) from Jan 2018 to June 2018. We used state-of-the-art sentiment analysis algorithms. Sentiment is the overall emotion of a user towards a subject: it can be positive, showing a positive mood or tone; negative, showing a negative mood or tone; or neutral, where the user's tone is not biased towards either emotion.

Volume Based Prediction

Volume is the total number of mentions (e.g. a text snippet representing a description, comment, suggestion, etc.) written by social media users about a particular political party. Volume plays a major role in predicting events; for example, the revenue a movie will make at the box office can be estimated before release from the total number of users talking about, or mentions of, that movie on the Web.
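As a rough illustration of the volume measure described above, the following minimal sketch counts mentions per party and converts the counts into shares. The sample mentions and the function names (volume_by_party, volume_share) are hypothetical and invented for illustration; they are not the Brandwatch data or tooling used in this article.

```python
from collections import Counter

# Hypothetical mentions: each is a (party, text) pair. The data is
# invented for illustration and is not taken from the article's dataset.
mentions = [
    ("PTI", "PTI rally draws a large crowd"),
    ("PMLN", "PMLN announces its manifesto"),
    ("PTI", "Imran Khan addresses PTI supporters"),
    ("PPP", "PPP holds a press conference"),
    ("PTI", "PTI campaign reaches Punjab"),
]

def volume_by_party(mentions):
    """Count how many mentions are associated with each party."""
    return Counter(party for party, _text in mentions)

def volume_share(volumes):
    """Convert raw counts into each party's share of the total volume."""
    total = sum(volumes.values())
    return {party: count / total for party, count in volumes.items()}

volumes = volume_by_party(mentions)
shares = volume_share(volumes)

# Parties ordered from most to least mentioned.
ranking = [party for party, _count in volumes.most_common()]
print(ranking[0], volumes[ranking[0]], shares[ranking[0]])
```

A volume-based predictor in this spirit simply ranks parties by their share of mentions; it says nothing about whether those mentions are favourable, which is why sentiment is analysed separately.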

The next figure shows the overall volume for the main political parties in Pakistan. There are 242 thousand mentions in total. We observe that PTI is the most talked-about party on social media and that people are talking about PTI continuously, as highlighted by the spikes. Political anchors called it a huge success for PTI and a step forward for its election campaign.

* All dates have been hidden to avoid any controversy.

Figure 1: A snapshot of the overall volume of the main political parties in Pakistan from Jan 2018 to Apr 2018. There were about 242 thousand mentions during the three-month period (Apr 2018 to July 2018). 42% of mentions have negative sentiment, whereas only 16% have positive sentiment (see the footnote about sentiment).

Figure 2: Volume of the mentions coming from Punjab (Pakistan). The figure has the same pattern as depicted by Figure 1.

Figure 3: Comparing the volume of different parties on social media for the last three months.

Sentiment Based Prediction

The second analysis is sentiment-based. Sentiment is the emotional tone of a sentence towards a specific target. For example, if our target is “PTI”, then some sample sentences and their sentiments are given in the following table.

Table 1: How sentiment works. Sentiments can be positive, negative, or neutral, and they depend on the target. For example, the tone of the sentence “PTI is making better progress than PMLN” is positive for PTI and negative for PMLN. We are mainly interested in the collection of mentions having positive or negative sentiment.

Sentence | Sentiment
PTI is making better progress than PMLN | Positive for PTI
Nawaz Sharif has been disqualified by Supreme Court | Positive for PTI
PTI is run by master of U-turn (Imran Khan) | Negative for PTI
PTI is made by Imran Khan | Neutral for PTI
PTI is making better progress than PMLN | Negative for PMLN

Sentiment plays a major role in stock market prediction. For example, researchers have used public sentiment to predict market trends (Forex market). Our sentiment predictor is very simple: we map the sentiment of people towards a political party to the success or failure of that party. The analysis can be summarized in the following main points:

Figure 4: Net sentiment of all mentions related to a specific political party. PTI's positive sentiment appears to fade over time (becoming more negative); however, its sentiment improves from Apr 2018 to July 2018 and becomes better than PMLN's towards the end of July 2018. PPP has the worst sentiment, which is consistently negative over the period. The sentiment of PMLN is improving over July 2018, which can produce a spike within a week.

1-    The general sentiment of the Pakistani nation about all political parties is skewed towards negative. It shows the nation's general despair over all political parties: people believe that none of the active parties can bring real change to Pakistan.

2-    PTI has sentiment that is neutral rather than negative, showing that the nation trusts PTI more than the other parties. In other words, people want to give PTI a chance.

3-    The sentiment of PMLN and PPP is negative, indicating that the Pakistani nation does not have any positive expectations from these parties. The sentiment of PPP is worse than that of the rest.
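The article does not define how net sentiment is computed. One common convention, assumed here purely for illustration, is the number of positive mentions minus the number of negative mentions, divided by all mentions for that party. The labelled data and the net_sentiment function below are hypothetical and are not the sentiment algorithm used in this analysis.

```python
from collections import Counter

# Hypothetical labelled mentions: (target party, sentiment label).
# Labels depend on the target, as in Table 1. The data is invented.
labelled = [
    ("PTI", "positive"), ("PTI", "neutral"), ("PTI", "negative"),
    ("PTI", "positive"),
    ("PMLN", "negative"), ("PMLN", "negative"), ("PMLN", "positive"),
    ("PPP", "negative"), ("PPP", "negative"),
]

def net_sentiment(labelled, party):
    """Assumed definition: (positive - negative) / total mentions.

    Returns a value in [-1, 1]; 0.0 if the party has no mentions.
    """
    counts = Counter(label for target, label in labelled if target == party)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["positive"] - counts["negative"]) / total

for party in ("PTI", "PMLN", "PPP"):
    print(party, round(net_sentiment(labelled, party), 3))
```

Under this convention, a party whose mentions are mostly neutral scores near zero, while a party with consistently negative mentions scores near -1.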

The bottom line of this article is that we can predict the future of Pakistan through social media using simple prediction algorithms. We used two simple prediction algorithms, volume based and sentiment based. The chief points of our analysis can be summarized as follows:

1-    PTI has built good momentum in recent weeks and, if the momentum continues, PTI has a greater chance of sweeping the elections according to our prediction algorithms.

2-    PMLN is in close competition with PTI and can be considered the second most favored political party.

3-    PPP is the least favored party in the eyes of the Pakistani nation and has less chance of winning.

We will continue this analysis on a bi-weekly basis. We will predict the probability with which a political party will win in the major provinces of Pakistan.

Special thanks to Brandwatch for providing us with a valuable tool to analyze the data.

[1] Sentiment analysis is a well-studied problem in linguistics and machine learning. It is a classification problem, where a given text needs to be labeled as Positive, Negative, or Neutral based on its tone and context about a specific subject.

Wednesday, 13 July 2016

Electricity Crisis in Pakistan and Solutions

Though the Govt is investing massively in infrastructure (e.g. the Quaid-i-Azam solar park, worth 1,000 MW), there are other issues and solutions we are ignoring. Major solar energy companies (NRG Clean Power, SunPower by Positive Energy Solar, Boston Solar, etc.) should be consulted under a public-private partnership, and solar should be deployed to individual houses. Believe me, Pakistan's God-gifted geography has so much potential that, once solar and wind plants are widely deployed, we can overcome this 10,000 MW shortfall in 2 years, in contrast to hydroelectric power (which can take a decade).
We need to overcome these issues:
1- Affordable: We should consider the rate at which electricity is being produced. The current production cost is around PKR 14/unit and the customer is paying around PKR 11/unit. There are huge losses in terms of subsidy, transmission losses (infrastructure breakdowns, old-fashioned transmission methods used by K-Electric), and theft losses (around 22% is lost due to theft).
Solution: Clearly, we must focus on these issues rather than merely producing more power. The subsidy policy might be revised, as some mafia is exploiting it rather than it helping the poor.
2- Collection of Bills: Recent reports show that about 80-90% of bills are being recovered. This is another loss we are facing. There are reports saying that electricity losses due to theft and non-collection (and some other issues, e.g. the difference between the unit production price and the unit payment by the customer, which we might call subsidy) account for roughly 40%.
Solution: We should develop a better mechanism for bill payment and tracking.
3- Don't subsidise: The energy being produced is very expensive compared to neighbouring regions (on average PKR 14/unit). The customer pays around PKR 11/unit. The loss is being subsidised (rest assured, it is a "LOSS").
Solution: We need to revise subsidy policies and shift our focus from "oil-based power" plants to renewable and cheaper sources (say solar, gas, etc.). We are producing 37% of our energy from oil (in contrast to the rest of the world, where it is 6%), which needs to be switched to coal/solar.
4- Erroneous Shortfall: As there might be a considerable number of people who are not connected to the grid, we might not have a precise view of the energy shortfall (which at present is the difference between demand and supply). Moreover, we can't even assume that those who are connected to the grid will use a specific amount of energy (they might consume more).
Reports say Pakistan needs around 15,000 to 20,000 MW of electricity; however, it is currently able to produce only about 11,500 MW, hence there is a shortfall of roughly 3,500 to 8,500 MW.
Solution: Better surveys are needed for estimation.
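The arithmetic behind the shortfall and the per-unit gap can be checked with a short sketch. All figures are the ones quoted in this post (demand, supply, production cost, and customer price); the variable names are only illustrative.

```python
# Figures quoted in the post.
demand_mw = (15_000, 20_000)   # estimated demand range, in MW
supply_mw = 11_500             # current production, in MW
cost_per_unit = 14             # production cost, PKR per unit
price_per_unit = 11            # price paid by the customer, PKR per unit

# Shortfall is demand minus supply, at both ends of the demand range.
shortfall_mw = tuple(d - supply_mw for d in demand_mw)

# Gap between cost and price, which the post describes as subsidy.
gap_per_unit = cost_per_unit - price_per_unit
gap_fraction = gap_per_unit / cost_per_unit

print(shortfall_mw)            # (3500, 8500)
print(gap_per_unit)            # 3
print(round(gap_fraction, 3))  # 0.214
```

On these numbers, roughly a fifth of the production cost of each unit is not recovered from the customer, before counting theft and non-collection.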
We can get solar energy as long as the sun is there, and it is totally free of cost. A central station should be created so that, in summer, the individual plants can send extra energy to remote locations, resulting in "free/cheap energy". On a serious note, you don't need to construct massive, debatable water dams (say Kalabagh etc.) or conventional hydel and coal plants to overcome the energy crisis. The Planning Commission is doing GREAT (especially in CPEC energy projects), but a comprehensive approach needs to be followed: even controlling losses can reduce the crisis significantly.

Tuesday, 4 March 2014

Musi Ali's Journey to PhD

If you believe in yourself and work hard enough nothing is impossible!

Dr. Mustansar Ali Ghazanfar is a young researcher in the field of software and IT (machine learning and data mining). He is an inspirational example for young Pakistani students living in remote villages with poor schooling and backgrounds.

His primary education was in a remote area of Sialkot (Govt. primary school Zafarwal, Narowal), where the schools do not even have basic facilities such as buildings, furniture, fans, clean water, white/black boards, toilets, etc. He and his fellows used to sit on a mat (a thick hard cover that is used to store goods for transport) rather than on chairs, regardless of the weather. Sometimes the only shelter they had in bad weather was trees, which provided cooling during summer and whose small branches served as a source of light and heat. Furthermore, the teaching standards were really poor: the course was in Urdu, the teachers had no formal training (such as communication skills, the art of teaching, etc.), and, even worse, the teachers used to keep their students busy with the teachers' own home chores. After repeatedly coming top in his school, he was admitted to Govt. High school Zafarwal, Narowal, where the teaching conditions were again not satisfactory (if not worse). He completed his schooling while securing the highest marks in his cohort from 6th to 10th grade.

At the age of 16, having no vision, he was worried about his further education. Luckily his elder brother, who was a student at Rawalpindi Medical College, suggested that he apply to FG Sir Syed College Rawalpindi, one of the prestigious colleges in the Federal board. He applied for admission, and the college readily accepted him on seeing his credentials. “Those were the toughest days of my life”, he recalls. “I had to cycle more than 10 km every day to college in the morning and then to a tuition center in the afternoon, regardless of the weather. At times I did not have enough time to eat my lunch; however, this served as a source of massive inspiration. Furthermore, as there was a sudden transition in the medium of teaching from Urdu to English, I had to work extra hard (sometimes more than 15 hours a day) to compete with my fellow students, who had excellent schooling backgrounds, e.g. Beaconhouse, Roots, etc., compared with mine (coming from an Urdu medium school in a remote village). I worked hard and I was among the top 1% of students in my college during 1999-2001.”

After his FSc, he was offered admission (straight on open merit) to UET-Lahore and UET-Taxila in Electrical Engineering; however, he chose Software Engineering, knowing its future growth. He repeatedly secured first position and received the department topper's scholarship during his BSc Engineering at UET-Taxila. He received a gold medal from the UET-Taxila software engineering department for securing the highest grades (90%) in the department in 2006. He also received a large cash prize from the Chairman of Nescom, Dr Samar Mubarik, for his achievements.

After his graduation, he served at the Ministry of Defence as an Assistant Manager for a short while, where he got the opportunity to work on the software part of Pak-Sat IR (Pakistan's first-ever indigenous satellite). He also had a chance to have an informal discussion with Dr. Riaz Suddle (Director Suparco Lahore) about satellite communication.

He was awarded a full scholarship by HEC, Pakistan, for his higher studies in fall 2007. He was admitted to the Electronics and Computer Science (ECS) department of the University of Southampton, one of the best ECS departments in the UK, which has deep collaboration with MIT, USA. The department is among the top 3 in electronics and computer science according to the Guardian University Guide, and the University of Southampton is consistently ranked in the top 51-100 universities for Computer Science in the QS World Rankings. He had the opportunity to have discussions with reputed faculty members of Southampton, e.g. Sir Tim Berners-Lee, the inventor of the World Wide Web (WWW). He worked under the supervision of Prof. Nick Jennings, one of the best researchers in Europe in Artificial Intelligence, with an h-index of 97 (the second top non-American according to Palsberg). His MSc dissertation received the highest grade in his class and was published at a well-known conference.

He was awarded a PhD scholarship by the University of Southampton in fall 2008 and worked on the Instant Knowledge project, a research project funded by Vodafone that aimed at providing personalized recommendations to mobile users. During his PhD, he published more than 10 international journal and conference papers. One of his journal papers was published in Information Sciences, which is among the top journals of the world (impact factor > 3). One of his papers won the best student paper award in Rome, Italy, in 2011. He has presented his work across Europe, Eastern countries, and the USA (twice). He passed his PhD viva at the age of 27 at the University of Southampton, UK. During his PhD, he also worked at Brandwatch, a multinational firm working on data mining and machine learning, where he improved the accuracy of their language detection framework by 10%. He served as a mentor for MSc Engineering students during his stay at the University of Southampton, providing career counseling, advising international students on homesickness, and assisting them in their daily business.

He was approached by various Web giants, such as Microsoft and Google, and by other multinational companies worldwide; however, he preferred to serve Pakistan’s inchoate software industry. He is currently serving as an Assistant Professor at UET-Taxila, where he is supervising various MSc and BSc Engineering students. He is an active blogger and regularly writes for newspapers. He also served as a volunteer at the British Red Cross during his stay in the United Kingdom. He is an active sportsperson, taking part in all sorts of sports, including lawn tennis, cricket, soccer, swimming, running, and gym training. His detailed professional profile can be found at .


·    Best Student Paper award at the IADIS European conference on data mining, July 2011, Italy
·    Got PhD studentship from MobileVCE
·    Achieved scholarship from HEC for doing MSc in U.K.
·    Top in Software Deptt., University of Southampton, 2007/08 session.
·    MSc Dissertation (proposing a solution for coalition formation among multi-agent systems) got highest grades in the department.
·    Gold Medalist of UET-Taxila in Software Deptt. (Overall 4 yr top)
·    Achieved scholarships repeatedly from UET Taxila (topper scholarship).
·    Achieved grand cash prize from Chairman NESCOM (Dr. Samar Mubarik) Pakistan, Apr. 2007
·    Achieved scholarship from PITB, Oct 2005


·      Data Mining and Machine Learning: Recommender Systems, Collaborative Filtering, Information Retrieval, Kernel based learning, Classification techniques, Dimensionality reduction techniques for handling large scale data, Clustering, Google PageRank
·      Forecasting, Personalization, Prediction
·      Quantitative Analysis, Quant research and development, Hedge fund
·      Software Development, Object Oriented Programming
