Graduate student: 吳威霖 (Wu, Wei-Lin)
Thesis title: 從空品物聯網探討空氣品質對於社群媒體風向之影響
A Study on Impacts of Air Quality on Social Media Polarity
Advisor: 陳伶志 (Chen, Ling-Jyh)
Degree: Master's
Department: 資訊工程學系 (Department of Computer Science and Information Engineering)
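Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: Chinese
Number of pages: 43
Chinese keywords: 空氣品質, 社群媒體, 文字探勘, 情緒分析, 詞向量, 網路聲量
English keywords: Air Quality, Social Media, Text Mining, Sentiment Analysis, Distributed Vector, Public Internet Sentiment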
DOI URL: http://doi.org/10.6345/NTNU201900427
Thesis type: Academic thesis
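  • Air pollution raises the risk of respiratory disease and death, and it is an environmental issue that the whole world must now confront. Most people still learn about air quality from the instrument data of the Environmental Protection Administration, but those instruments are meant to monitor long-term air quality trends over large areas, and most of the values they provide are hourly averages, so the data is not timely enough for people to take early precautions against sudden air pollution. With social media now widespread, however, the sentiment of a post reflects what its author feels most directly at that moment, and that sentiment is also influenced by how good or bad the air quality is. This thesis therefore proposes a method for analyzing the sentiment of article titles, which classifies the sentiment of social media articles and identifies its relationship with air quality. The study uses air quality data from the Environmental Protection Administration and article data from PTT (批踢踢實業坊). A sentiment classification model divides articles into positive and negative with an accuracy of up to 85%, and the numbers of positive and negative articles are given a preliminary analysis against air quality values. The results show a correlation between air quality and public sentiment. The experiments also reveal the public's habitual expressions and attitudes toward particular words, which can serve as a reference for PM2.5-related research and policy making.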

    Studies show that air pollution increases the risk of respiratory disease and death. It is an important environmental issue that causes great concern around the world. Today, people mostly obtain air quality information from the data of the Environmental Protection Administration. Because that data is intended for monitoring large-scale, long-term air quality trends and is reported as average values, it is not timely enough for people to take precautions against sudden air pollution events. Meanwhile, social networks have become part of people's daily lives, and the sentiment of the articles people publish can accurately reflect public perception, which may be affected by air quality. This research proposes a method for analyzing and classifying the emotions of articles according to their title and content, and for relating those emotions to air quality. Using the classification model, the relationship between air quality and public internet sentiment is determined from the volume of internet posts and their emotion ratio. With data from the Environmental Protection Administration and PTT (the largest bulletin board system in Taiwan), the proposed model classifies article titles as positive or negative with 85% accuracy. The results show that air quality is related to people's emotions. People's habitual expressions and their sentiment toward particular words can also be observed. This work can therefore contribute to PM2.5-related research and aid the policy-making process.
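    The abstract relates post volume and emotion ratio to air quality but does not spell out the computation. The following is a minimal sketch of one way such a comparison could be made, assuming a daily negative-post ratio and a plain Pearson correlation; the function names and the toy numbers are hypothetical and are not taken from the thesis.

```python
from math import sqrt


def negative_ratio(positive_count, negative_count):
    """Share of negative posts among all classified posts (0.0 if none)."""
    total = positive_count + negative_count
    return negative_count / total if total else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative data, NOT results from the thesis:
# (daily AQI, positive post count, negative post count)
toy_days = [
    (40, 80, 20),
    (60, 70, 30),
    (90, 60, 40),
    (120, 50, 50),
    (150, 40, 60),
]

aqi_values = [aqi for aqi, _, _ in toy_days]
ratios = [negative_ratio(pos, neg) for _, pos, neg in toy_days]
r = pearson(aqi_values, ratios)
print(f"correlation between AQI and negative-post ratio: {r:.3f}")
```

    A coefficient near 1 on real data would indicate that the share of negative posts rises with AQI. Whether the thesis uses this statistic or a different one is not stated in the abstract.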

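    List of Figures
    List of Tables
    Chapter 1 Introduction
    Chapter 2 Related Work
        2.1 Self-Similarity
        2.2 Time Series Decomposition
        2.3 Air Quality Events
    Chapter 3 Methodology
        3.1 Data Sources and Preprocessing
            3.1.1 Data Sources
            3.1.2 Data Preprocessing
            3.1.3 Data Classification
        3.2 Keyword Extraction
            3.2.1 TFIDF (Term Frequency, Inverse Document Frequency)
            3.2.2 Delta TFIDF
            3.2.3 LLR (Log-Likelihood Ratio)
        3.3 Word2Vec Model
        3.4 Feature Vector Transformation
            3.4.1 Keyword Labeling
            3.4.2 Similar-Word Substitution
        3.5 Weighted Average
        3.6 Classifier Training
    Chapter 4 Experimental Results
        4.1 Experimental Setup
        4.2 Model Comparison
        4.3 Air Quality AQI and Social Media
        4.4 Changes in Social Media Volume
            4.4.1 Habitual Expressions
            4.4.2 Changes in Volume on Air Quality Issues
            4.4.3 Changes in Volume by Region
            4.4.4 Changes in Volume for Political Figures
    Chapter 5 Conclusion and Future Work
    References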

