
Analyzing Trump's Inaugural Address with Python

 

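For background, here is an article in which IBM Watson runs text analysis on the inaugural addresses of past US presidents: http://36kr.com/p/5062661.html

At noon on January 20, 2017, Donald Trump was sworn in in Washington, D.C., formally becoming the 45th President of the United States and completing the jump from real-estate businessman straight to president. Today we run a text analysis on his address and see whether anything interesting turns up.

First, I found the transcript and video of Trump's inaugural address on CNN: https://edition.cnn.com/2017/01/20/politics/trump-inaugural-address/index.html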

 

The code is as follows:

 

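speech_text = '''text of the speech'''  # paste the full transcript here

speech = speech_text.lower().split()  # lowercase the text, then split it into words on whitespace

dic = {}  # empty dict that will map each word to its count
for word in speech:
    if word in dic:
        dic[word] += 1
    else:
        dic[word] = 1

dic  # display the dict; the next steps process it. It looks like this (only part is copied here)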

 

{'"how': 1,
 '--': 4,
 '17:17': 1,
 '2017,': 1,
 '20th': 1,
 'a': 15,
 'about': 2,
 'accept': 1,
 'across': 5,
 'action': 1,
 'action.': 1,
 'address': 1,
 'administration': 1,
 'affairs,': 1,
 'again,': 1}
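The dict above counts 'action' and 'action.' as two different words, because split() only breaks on whitespace and leaves punctuation attached. Below is a minimal sketch of one way around that, assuming a regular-expression tokenizer is acceptable. The function name count_words and the sample sentence are made up for illustration and are not part of the original code.

```python
import re
from collections import Counter

def count_words(text):
    """Count lowercase word tokens, ignoring surrounding punctuation."""
    # The optional '... group keeps contractions such as "we've" in one piece
    tokens = re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())
    return Counter(tokens)

# Hypothetical sample sentence, not taken from the transcript
sample = "America first. We will make America great again, and again."
counts = count_words(sample)
print(counts["america"], counts["again"])
```

One trade-off of this pattern: hyphenated words such as rusted-out are split into two tokens, so it may need adjusting depending on what should count as a word.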

import operator

swd = sorted(dic.items(), key=operator.itemgetter(1), reverse=True)  # sort the (word, count) pairs by count, largest first

swd  # display swd, sorted from the largest count to the smallest (only part is copied here for brevity):

[('and', 73),
 ('the', 71),
 ('of', 48),
 ('our', 48),
 ('we', 45),
 ('will', 40),
 ('to', 37),
 ('is', 21),
 ('a', 15),
 ('for', 15),
 ('are', 14),
 ('in', 14),
 ('but', 13),
 ('all', 12),
 ('from', 12),
 ('be', 12),
 ('their', 11),
 ('american', 11),
 ('your', 11),
 ('not', 10),
 ('america', 9),
 ('this', 9),
 ('it', 9),
 ('that', 8),
 ('again.', 8),
 ('with', 8),
 ('every', 7),
 ('one', 7),
 ('you', 7),
 ('people', 6),
 ('great', 6),
 ('country', 6),
 ('on', 6),
 ('has', 6),
 ('back', 6),
 ('while', 6),
 ('by', 6),
 ('no', 6),
 ('new', 6),
 ('same', 6),
 ('president', 5),
 ('they', 5),
 ('have', 5),
 ('across', 5),
 ('right', 5),
 ('never', 5),
 ('at', 5),
 ('make', 5),
 ('you.', 4),
 ('america,', 4),
 ('world', 4),
 ('been', 4),
 ('today', 4),
 ('or', 4),
 ('--', 4),
 ('everyone', 4),
 ('which', 4),
 ('as', 4),
 ('nation', 4),
 ('other', 4),
 ('bring', 4),
 ('now', 3),
 ('its', 3),
 ('people.', 3),
 ('together,', 3),
 ('these', 3),
 ('too', 3),
 ("nation's", 3),
 ('factories', 3),
 ('protected', 3),
 ('there', 3),
 ('here', 3),
 ('america.', 3),
 ('whether', 3),
 ('millions', 3),
 ('many', 3),
 ('an', 3),
 ('so', 3),
 ('i', 3),
 ("we've", 3),
 ('foreign', 3),
 ('countries', 3),
 ('must', 3),
 ('let', 3),
 ('do', 3),
 ('when', 3),
 ('heart', 3),
 ('entire', 2),
 ('americans,', 2),
 ('thank', 2),
 ('citizens', 2),
 ('national', 2),
 ('face', 2),
 ('get', 2),
 ('done.', 2),
 ('obama', 2),
 ('very', 2),
 ('because', 2),
 ('transferring', 2),
 ('power', 2),
 ('party', 2),
 ('small', 2),
 ('government', 2),
 ('share', 2),
 ('wealth.', 2),
 ('politicians', 2),
 ('jobs', 2),
 ('country.', 2),
 ('capital,', 2),
 ('land.', 2),
 ('moment', 2),
 ('belongs', 2),
 ('united', 2),
 ('states', 2),
 ('day', 2),
 ('forgotten', 2),
 ('men', 2),
 ('women', 2),
 ('now.', 2),
 ('movement', 2),
 ('before.', 2),
 ('safe', 2),
 ('good', 2),
 ('like', 2),
]
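Words with equal counts, such as 'of' and 'our' at 48, come out in whatever order the dict happened to hold them, because the sort key only looks at the count. If a reproducible order is wanted, one option is to break ties alphabetically. This is an assumption about what would be useful here, not something the original code does, and the small dic below is stand-in data copied from the listing above.

```python
# Stand-in data: a few (word, count) pairs copied from the listing above
dic = {'our': 48, 'and': 73, 'of': 48, 'the': 71}

# Negating the count sorts it in descending order,
# while the word itself sorts in ascending (alphabetical) order
swd = sorted(dic.items(), key=lambda kv: (-kv[1], kv[0]))
print(swd)
```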

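# As you can see, many of these words carry no useful information and need to be removed, so we load a list of stop words.

import nltk

from nltk.corpus import stopwords

# The corpus name is lowercase; if the corpus is missing, run nltk.download('stopwords') once
stop_words = stopwords.words('english')

stop_words  # shows which stop words were loaded; again, only part is pasted here for brevity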

['i',
 'me',
 'my',
 'myself',
 'we',
 'our',
 'ours',
 'ourselves',
 'you',
 "you're",
 "you've",
 "you'll",
 "you'd",
 'your',
 'yours',
 'yourself',
 'yourselves',
 'he',
 'him',
 'his',
 'himself',
 'she',
 "she's",
 'her',
 'hers',
 'herself',
 'it',
 "it's",
 'its',
 'itself',
 'they',
 'them',
 'their']

for k, v in swd:  # print only the entries of swd that are not stop words
    if k not in stop_words:
        print(k, v)

The resulting word counts are as follows:


american 11
america 9
again. 8
every 7
one 7
people 6
great 6
country 6
back 6
new 6
president 5
across 5
right 5
never 5
make 5
you. 4
america, 4
world 4
today 4
-- 4
everyone 4
nation 4
bring 4
people. 3
together, 3
nation's 3
factories 3
protected 3
america. 3
whether 3
millions 3
many 3
we've 3
foreign 3
countries 3
must 3
let 3
heart 3
entire 2
americans, 2
thank 2
citizens 2
national 2
face 2
get 2
done. 2
obama 2
transferring 2
power 2
party 2
small 2
government 2
share 2
wealth. 2
politicians 2
jobs 2
country. 2
capital, 2
land. 2
moment 2
belongs 2
united 2
states 2
day 2
forgotten 2
men 2
women 2
now. 2
movement 2
before. 2
safe 2
good 2
like 2
stops 2
dreams 2
glorious 2
destiny. 2
oath 2
allegiance 2
borders 2
made 2
left 2
even 2
workers 2
first. 2
jobs. 2
fight 2
breath 2
winning 2
seek 2
nations 2
life 2
old 2
loyalty 2
always 2
talk 2
time 2
god 2
bless 2
donald 1
trump's 1
inaugural 1
address 1
17:17 1
chief 1
justice 1
roberts, 1
carter, 1
clinton, 1
bush, 1
obama, 1
fellow 1
world: 1
we, 1
joined 1
effort 1
rebuild 1
restore 1
promise 1
determine 1
course 1
years 1
come. 1
challenges. 1
confront 1
hardships. 1
job 1
four 1
years, 1
gather 1
steps 1
carry 1
orderly 1
peaceful 1
transfer 1
power, 1
grateful 1
first 1
lady 1
michelle 1
gracious 1
aid 1
throughout 1
transition. 1
magnificent. 1
today's 1
ceremony, 1
however, 1
special 1
meaning. 1
merely 1
administration 1
another, 1
another 1
washington, 1
d.c. 1
giving 1
you, 1
words 1
past: 1
inauguration 1
speech 1
library 1
long, 1
group 1
capital 1
reaped 1
rewards 1
borne 1
cost. 1
washington 1
flourished 1
prospered 1
left, 1
closed. 1
establishment 1
itself, 1
victories 1
victories; 1
triumphs 1
triumphs; 1
celebrated 1
little 1
celebrate 1
struggling 1
families 1
changes 1
starting 1
here, 1
now, 1
moment: 1
gathered 1
watching 1
day. 1
celebration. 1
this, 1
truly 1
matters 1
controls 1
government, 1
controlled 1
january 1
20th 1
2017, 1
remembered 1
became 1
rulers 1
longer. 1
listening 1
came 1
tens 1
become 1
part 1
historic 1
likes 1
seen 1
center 1
crucial 1
conviction: 1
exists 1
serve 1
citizens. 1
americans 1
want 1
schools 1
children, 1
neighborhoods 1
families, 1
themselves. 1
reasonable 1
demands 1
righteous 1
public. 1
citizens, 1
different 1
reality 1
exists: 1
mothers 1
children 1
trapped 1
poverty 1
inner 1
cities; 1
rusted-out 1
scattered 1
tombstones 1
landscape 1
nation; 1
education 1
system 1
flush 1
cash, 1
leaves 1
young 1
beautiful 1
students 1
deprived 1
knowledge; 1
crime 1
gangs 1
drugs 1
stolen 1
lives 1
robbed 1
much 1
unrealized 1
potential. 1
carnage 1
pain 1
pain. 1
dreams; 1
success 1
success. 1
heart, 1
home, 1
office 1
take 1
americans. 1
decades, 1
enriched 1
industry 1
expense 1
industry; 1
subsidized 1
armies 1
allowing 1
sad 1
depletion 1
military; 1
defended 1
refusing 1
defend 1
own; 1
spent 1
trillions 1
dollars 1
overseas 1
america's 1
infrastructure 1
fallen 1
disrepair 1
decay. 1
rich 1
wealth, 1
strength, 1
confidence 1
disappeared 1
horizon. 1
one, 1
shuttered 1
shores, 1
thought 1
upon 1
behind. 1
wealth 1
middle 1
class 1
ripped 1
homes 1
redistributed 1
world. 1
past. 1
looking 1
future. 1
assembled 1
issuing 1
decree 1
heard 1
city, 1
hall 1
power. 1
forward, 1
vision 1
govern 1
on, 1
going 1
decision 1
trade, 1
taxes, 1
immigration, 1
affairs, 1
benefit 1
families. 1
protect 1
ravages 1
making 1
products, 1
stealing 1
companies, 1
destroying 1
protection 1
lead 1
prosperity 1
strength. 1
body 1
never, 1
ever 1
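In the output above, 'america', 'america,' and 'america.' are still listed separately, and a leftover token like '--' slips through the stop-word filter. Below is a rough sketch of merging such variants after the fact, assuming that stripping leading and trailing punctuation is good enough. The swd list is stand-in data copied from the listings above, and the tiny stop_words set is hand-written for illustration, whereas the article uses NLTK's English list.

```python
import string
from collections import Counter

# Stand-in data copied from the listings above, not the full result
swd = [('we', 45), ('america', 9), ('again.', 8),
       ('america,', 4), ('--', 4), ('america.', 3)]

# Hand-written stop words for illustration only
stop_words = {'we', 'the', 'and'}

merged = Counter()
for word, count in swd:
    cleaned = word.strip(string.punctuation)  # drops leading/trailing punctuation only
    if cleaned and cleaned not in stop_words:  # skips empty leftovers such as '--'
        merged[cleaned] += count

for word, count in merged.most_common():
    print(word, count)
```

Holding the stop words in a set rather than a list keeps each membership test cheap, which starts to matter once the word list grows.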

