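Text Preprocessing
Text is a kind of sequence data: an article can be viewed as a sequence of characters or of words. Preprocessing text data usually takes four steps:
- Read in the text
- Tokenize it
- Build a vocabulary that maps each token to a unique index
- Convert the text from a sequence of tokens into a sequence of indices, ready to be fed to a model
Reading the Text
Dataset: the English novel The Time Machine by H. G. Wells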
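import collections
import re
def read_time_machine():
    # Open the text file, stored in the same directory as this code, in read-only mode
    with open('timemachine','r') as f:
        # Strip each line, lowercase it, and replace every run of non-letters with one space
        lines=[re.sub('[^a-z]+',' ',line.strip().lower()) for line in f]
    return lines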
lines=read_time_machine()
print('# sentences %d'%len(lines))
# sentences 3222
lines[0]
'home kesci work timemachine txt ustar kesci users the time machine by h g wells '
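The cleaning step can be tried without the data file. The sketch below applies the same regular expression to a few in-memory lines; the sample text and the helper name `clean_lines` are illustrative assumptions, not part of the original code. It also shows that a blank line turns into an empty string, which matters later during tokenization.

```python
import re

def clean_lines(raw_lines):
    """Lowercase each line and replace every run of non-letters with one space."""
    return [re.sub('[^a-z]+', ' ', line.strip().lower()) for line in raw_lines]

# Hypothetical sample standing in for the first lines of the novel.
sample = [
    "The Time Machine, by H. G. Wells [1898]\n",
    "\n",
    "  I\n",
]

cleaned = clean_lines(sample)
print(cleaned)
# ['the time machine by h g wells ', '', 'i']
```

Note the trailing space in the first result: the final run of non-letters (` [1898]`) is replaced by a space after `strip()` has already run.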
Key points
1. .strip()
str.strip([chars]): removes leading and trailing characters that appear in chars; when chars is omitted, whitespace is removed. For example:
st=" wang ge "
st=st.strip()
print("lao"+st+"!")
laowang ge!
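If chars is given, it is treated as a set of characters rather than as a prefix or suffix: strip removes characters from both ends for as long as they belong to the set and stops at the first character that does not. Characters in the middle of the string are never touched, and whitespace is no longer stripped unless it is part of chars.
st=st.strip('l,o,e')
print(st)  # the character set is 'l', ',', 'o', 'e'; only the trailing 'e' is removed, because 'w' at the front and 'g' before the 'e' are not in the set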
wang g
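A short sketch of how the chars argument behaves, using made-up strings. It illustrates that chars is a set of characters, that stripping stops at the first character outside the set, and that `lstrip` and `rstrip` work on one end only.

```python
s = "xxhello worldxyx"

both_ends = s.strip("xy")    # strip 'x' and 'y' from both ends
left_only = s.lstrip("x")    # strip 'x' from the left end only
right_only = s.rstrip("xy")  # strip 'x' and 'y' from the right end only

print(both_ends)   # hello world
print(left_only)   # hello worldxyx
print(right_only)  # xxhello world

# Characters inside the string are never removed, even if they are in the set.
inner = "level".strip("l")
print(inner)       # eve

# Once chars is given, whitespace is kept unless it is listed explicitly.
padded = "  wang ge".strip("e")
print(repr(padded))  # '  wang g'
```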
2. .lower()
str.lower(): converts the uppercase letters in a string to lowercase. For example:
st="ABCDe"
print(st.lower())
abcde
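3. re.sub()
Replaces substrings of a string; the substrings to replace are selected with a regular expression.
re.sub(pattern, repl, string, count=0, flags=0)
pattern: the regular expression pattern to match;
repl: the replacement (either a string or a function that receives the match object);
string: the string to be processed;
count: the maximum number of replacements; the default, 0, replaces every match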
import re
st='aswdefrgTHHGFsxgshfubRFGG,df gtb 5 $*/ghm'
print(re.sub('[^a-z]+','换',st.strip().lower()))  # '[^a-z]+' matches each run of characters that are not lowercase letters; every such run, however long, is replaced by a single '换' ("replace")
aswdefrgthhgfsxgshfubrfgg换df换gtb换ghm
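The `count` and `repl` parameters can be illustrated with a small sketch. The input string and the helper `shout` are made up for this example; only `re.sub` itself comes from the text above.

```python
import re

lowered = "Time-Machine, 1895!".lower()   # 'time-machine, 1895!'

# Default count=0: every run of non-letters becomes one space.
all_replaced = re.sub('[^a-z]+', ' ', lowered)

# count=1: only the first match is replaced.
first_only = re.sub('[^a-z]+', ' ', lowered, count=1)

# repl may be a function; it receives the match object and returns the replacement.
def shout(match):
    return match.group(0).upper()

upper_words = re.sub('[a-z]+', shout, lowered)

print(repr(all_replaced))  # 'time machine '
print(repr(first_only))    # 'time machine, 1895!'
print(repr(upper_words))   # 'TIME-MACHINE, 1895!'
```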
Tokenization
Each sentence is tokenized, that is, split into a sequence of tokens (here, words or characters).
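lines[0]='the time machine by h g wells '  # overwrite the first line, whose original value (shown above) starts with file-path residue
def tokenize(sentences,token='word'):
    """Split sentences into sequences of words or characters."""
    if token=='word':  # word-level tokenization
        return [sentence.split(' ') for sentence in sentences]  # split each sentence on single spaces; each call returns a list
    elif token=='char':  # character-level tokenization
        return [list(sentence) for sentence in sentences]  # a 2-D list
    else:
        print('ERROR: unknown token type '+token)  # note: the function then returns None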
tokens=tokenize(lines)
tokens[0:2]
[['the', 'time', 'machine', 'by', 'h', 'g', 'wells', ''], ['']]
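The empty strings in the output above come from `split(' ')`: a trailing space yields a final `''`, and an empty line yields `['']`. A possible variant, sketched below under the assumption that empty tokens are not wanted, uses `split()` with no argument, which splits on any whitespace run and never returns empty strings. The name `tokenize_ws` and the choice to raise `ValueError` for an unknown type are illustrative, not part of the original code.

```python
def tokenize_ws(sentences, token='word'):
    """Variant of tokenize that drops empty tokens and rejects unknown types."""
    if token == 'word':
        # split() without arguments ignores leading, trailing and repeated whitespace
        return [sentence.split() for sentence in sentences]
    if token == 'char':
        return [list(sentence) for sentence in sentences]
    raise ValueError('unknown token type: ' + token)

sample_lines = ['the time machine by h g wells ', '']

with_empty = [s.split(' ') for s in sample_lines]
without_empty = tokenize_ws(sample_lines)
chars = tokenize_ws(['ab c'], token='char')

print(with_empty)     # [['the', 'time', 'machine', 'by', 'h', 'g', 'wells', ''], ['']]
print(without_empty)  # [['the', 'time', 'machine', 'by', 'h', 'g', 'wells'], []]
print(chars)          # [['a', 'b', ' ', 'c']]
```

Whether to keep the empty token is a design choice: with `split(' ')` the string `''` ends up in the vocabulary as an ordinary entry, as the frequency counts later in this section show.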
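Building the Vocabulary
Models work with numbers, not strings, so the tokens have to be converted to integers. To do that we first build a vocabulary that maps each token to a unique index.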
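Before the full class, the core idea can be sketched on a toy corpus: keep a list from index to token, invert it into a dict from token to index, and fall back to a reserved `<unk>` index for unseen tokens. The corpus and the helper names `encode` and `decode` are assumptions made for illustration only.

```python
# Toy tokenized corpus (hypothetical).
sentences = [['the', 'time', 'machine'], ['the', 'time', 'traveller']]

# Index 0 is reserved for unknown tokens.
idx_to_token = ['<unk>']
for sentence in sentences:
    for tok in sentence:
        if tok not in idx_to_token:
            idx_to_token.append(tok)

# Invert the list to get the token -> index mapping.
token_to_idx = {tok: idx for idx, tok in enumerate(idx_to_token)}

def encode(tokens):
    """Map tokens to indices, using 0 (<unk>) for tokens not in the vocabulary."""
    return [token_to_idx.get(tok, 0) for tok in tokens]

def decode(indices):
    """Map indices back to tokens."""
    return [idx_to_token[i] for i in indices]

encoded = encode(['the', 'time', 'sphinx'])
decoded = decode(encoded)
print(encoded)  # [1, 2, 0]
print(decoded)  # ['the', 'time', '<unk>']
```

Encoding is lossy for out-of-vocabulary tokens: `'sphinx'` comes back as `'<unk>'`.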
class Vocab(object):
    def __init__(self,tokens,min_freq=0,use_special_tokens=False):  # min_freq: frequency threshold; tokens that occur fewer times are dropped
        counter=count_corpus(tokens)  # defined below; counts token frequencies
        self.token_freqs=list(counter.items())  # ......(1)
        self.idx_to_token=[]
        if use_special_tokens:
            # Special tokens: <pad> pads sequences that are too short, <bos> marks the beginning of a sentence, <eos> marks its end, <unk> stands for out-of-vocabulary tokens
            self.pad,self.bos,self.eos,self.unk=(0,1,2,3)
            self.idx_to_token+=['<pad>','<bos>','<eos>','<unk>']
        else:
            self.unk=0  # the <unk> token for out-of-vocabulary words is always needed
            self.idx_to_token+=['<unk>']
        self.idx_to_token+=[token for token,freq in self.token_freqs
                            if freq>=min_freq and token not in self.idx_to_token]  # uses (1): build the index -> token mapping
        self.token_to_idx=dict()
        for idx,token in enumerate(self.idx_to_token):
            self.token_to_idx[token]=idx  # build the token -> index mapping
    def __len__(self):  # size of the vocabulary; optional
        return len(self.idx_to_token)
    def __getitem__(self,tokens):  # token -> index
        if not isinstance(tokens,(list,tuple)):  # a single token rather than a list or tuple
            return self.token_to_idx.get(tokens,self.unk)  # look up one token; if it is not in token_to_idx, return the index of <unk>
        return [self.__getitem__(token) for token in tokens]  # several tokens: apply the single-token lookup recursively
    def to_tokens(self,indices):  # index -> token
        if not isinstance(indices,(list,tuple)):
            return self.idx_to_token[indices]
        return [self.idx_to_token[index] for index in indices]
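def count_corpus(sentences):
    tokens=[tk for st in sentences for tk in st]  # sentences is a 2-D list; the nested loops flatten it to 1-D
    return collections.Counter(tokens)  # a Counter (a dict subclass) mapping each token to its number of occurrences; duplicates are merged in the process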
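The counting step is easy to check by hand on a toy corpus. The sketch below repeats the flatten-and-count logic under the hypothetical name `count_tokens` so that it runs on its own, then shows `most_common` and a frequency filter of the kind `min_freq` applies. The corpus is made up; the tie ordering relies on Python 3.7 or later, where equal counts keep first-seen order.

```python
import collections

def count_tokens(sentences):
    """Flatten a list of token lists and count each token."""
    return collections.Counter(tk for st in sentences for tk in st)

# Toy corpus (hypothetical); the last sentence mimics an empty line.
toy = [['the', 'time', 'machine'], ['the', 'time', 'traveller'], ['']]
toy_counter = count_tokens(toy)

top_two = toy_counter.most_common(2)   # highest counts first
missing = toy_counter['sphinx']        # a Counter returns 0 for unseen keys
kept = [tok for tok, freq in toy_counter.items() if freq >= 2]

print(top_two)          # [('the', 2), ('time', 2)]
print(missing)          # 0
print(kept)             # ['the', 'time']
print(toy_counter[''])  # 1
```

The last line shows that the empty string is counted like any other token.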
counter=count_corpus(tokens)  # inspect the actual result
print(counter)
Counter({'the': 2261, '': 1284, 'i': 1267, 'and': 1245, 'of': 1155, 'a': 816, 'to': 695, 'was': 552, 'in': 541, 'that': 443, 'my': 440, 'it': 437, 'had': 354, 'me': 281, 'as': 270, 'at': 243, 'for': 221, 'with': 216, 'but': 204, 'time': 200, 'were': 158, 'this': 152, 'you': 137, 'on': 137, 'then': 134, 'his': 129, 'there': 127, 'he': 123, 'have': 122, 'they': 122, 'from': 122, 'one': 120, 'all': 118, 'not': 114, 'into': 114, 'upon': 113, 'little': 113, 'so': 112, 'is': 106, 'came': 105, 'by': 103, 'some': 94, 'be': 93, 'no': 92, 'could': 92, 'their': 91, 'said': 89, 'saw': 88, 'down': 87, 'them': 86, 'machine': 85, 'which': 85, 'very': 85, 'or': 84, 'an': 84, 'we': 82, 'now': 79, 'what': 77, 'been': 75, 'these': 74, 'like': 74, 'her': 74, 'out': 73, 'seemed': 72, 'up': 71, 'man': 70, 'about': 70, 's': 70, 'its': 69, 'thing': 66, 'again': 62, 'traveller': 61, 'would': 60, 'more': 59, 'white': 59, 'our': 57, 'thought': 57, 'felt': 57, 'when': 55, 'over': 54, 'weena': 54, 'still': 53, 'world': 52, 'myself': 51, 'even': 50, 'must': 49, 'through': 49, 'if': 49, 'hand': 49, 'went': 49, 'first': 49, 'are': 48, 'before': 48, 'last': 47, 'towards': 47, 'only': 46, 'people': 46, 'she': 46, 'morlocks': 46, 'see': 45, 'too': 45, 'found': 44, 'how': 43, 'here': 43, 'light': 43, 'great': 42, 'under': 42, 'did': 41, 'him': 40, 'any': 40, 'began': 40, 'back': 40, 'night': 39, 'face': 38, 'way': 38, 'will': 37, 'after': 37, 'another': 37, 'well': 37, 'same': 37, 'think': 36, 'other': 36, 'away': 36, 'round': 36, 'made': 36, 'day': 36, 'us': 35, 'eyes': 35, 'mind': 35, 'might': 35, 'perhaps': 35, 'than': 34, 'put': 34, 'things': 34, 'long': 34, 'looked': 34, 'own': 34, 'may': 33, 'among': 33, 'sky': 33, 'such': 32, 'against': 32, 'took': 32, 'strange': 32, 'yet': 32, 'moment': 31, 'where': 31, 'sun': 31, 'fire': 30, 'black': 30, 'come': 30, 'know': 29, 'off': 29, 'new': 28, 'two': 28, 'old': 28, 'enough': 28, 'hands': 28, 'presently': 28, 'most': 27, 't': 27, 'dark': 27, 'darkness': 
27, 'once': 27, 'red': 26, 'who': 26, 'left': 26, 'green': 26, 'grew': 26, 'place': 26, 'hill': 26, 'psychologist': 25, 'space': 25, 'end': 25, 'got': 25, 'three': 24, 'looking': 24, 'medical': 24, 'stood': 24, 'fear': 24, 'almost': 23, 'much': 23, 'should': 23, 'above': 23, 'air': 23, 'head': 23, 'sat': 22, 'do': 22, 'has': 22, 'can': 22, 'tried': 22, 'far': 22, 'seen': 22, 'minute': 22, 'suddenly': 22, 'across': 22, 'sphinx': 22, 'soon': 21, 'along': 21, 'side': 21, 'get': 21, 'earth': 21, 'future': 21, 'part': 21, 'turned': 21, 'human': 21, 'indeed': 20, 'heard': 20, 'say': 20, 'kind': 20, 'gone': 20, 'never': 20, 'until': 20, 'make': 20, 'look': 20, 'while': 20, 'half': 20, 'editor': 20, 'feeling': 20, 'though': 19, 'go': 19, 'room': 19, 'laboratory': 19, 'already': 19, 'moon': 19, 'bronze': 19, 'flowers': 19, 'match': 19, 'matches': 19, 'gallery': 19, 'rather': 18, 'cannot': 18, 'feet': 18, 'certain': 18, 'those': 18, 'filby': 17, 'ground': 17, 'years': 17, 'just': 17, 'table': 17, 'lever': 17, 'big': 17, 'ran': 17, 'door': 17, 'soft': 16, 'mere': 16, 'don': 16, 'however': 16, 'between': 16, 'why': 16, 'fell': 16, 'morning': 16, 'something': 16, 'going': 16, 'behind': 16, 'past': 16, 'followed': 16, 'lay': 16, 'life': 16, 'less': 16, 'struck': 16, 'lit': 16, 'four': 15, 'dimensions': 15, 'each': 15, 'whole': 15, 'bright': 15, 'knew': 15, 'story': 15, 'running': 15, 'creatures': 15, 'palace': 15, 'find': 15, 'wood': 15, 'being': 14, 'large': 14, 'without': 14, 'right': 14, 'men': 14, 'odd': 14, 'good': 14, 'told': 14, 'next': 14, 'feel': 14, 'ever': 14, 'bushes': 14, 'coming': 14, 'age': 14, 'passed': 13, 'dinner': 13, 'lamp': 13, 'clear': 13, 'rose': 13, 'move': 13, 'tell': 13, 'small': 13, 'set': 13, 'understand': 13, 'silent': 13, 'times': 13, 'hundred': 13, 'doubt': 13, 'suppose': 13, 'dust': 13, 'nothing': 13, 'full': 13, 'every': 13, 'days': 13, 'dim': 13, 'open': 13, 'huge': 13, 'hall': 13, 'sea': 13, 'course': 12, 'clearly': 12, 'others': 12, 'am': 12, 
'vanished': 12, 'since': 12, 'travelling': 12, 'remember': 12, 'gave': 12, 'sound': 12, 'happened': 12, 'trees': 12, 'creature': 12, 'vast': 12, 'beautiful': 12, 'pocket': 12, 'nature': 12, 'grey': 11, 'line': 11, 'really': 11, 'because': 11, 'simply': 11, 'figure': 11, 'eight': 11, 'done': 11, 'save': 11, 'possibly': 11, 'absolutely': 11, 'several': 11, 'conditions': 11, 'second': 11, 'sense': 11, 'queer': 11, 'evening': 11, 'hesitated': 11, 'glass': 11, 'sudden': 11, 'stars': 11, 'growing': 11, 'pedestal': 11, 'appeared': 11, 'thick': 11, 'altogether': 11, 'further': 11, 'judged': 11, 'blackness': 11, 'upper': 11, 'camphor': 11, 'slower': 11, 'pale': 10, 'caught': 10, 'instance': 10, 'wrong': 10, 'idea': 10, 'always': 10, 'quite': 10, 'work': 10, 'exactly': 10, 'animal': 10, 'better': 10, 'let': 10, 'walked': 10, 'larger': 10, 'm': 10, 'believe': 10, 'travelled': 10, 'bars': 10, 'watch': 10, 'either': 10, 'stopped': 10, 'sleep': 10, 'peculiar': 10, 'house': 10, 'humanity': 10, 'lawn': 10, 'creeping': 10, 'imagine': 10, 'memory': 10, 'broken': 10, 'strong': 10, 'determined': 10, 'ruins': 10, 'few': 10, 'comfort': 10, 'porcelain': 10, 'arms': 10, 'iron': 10, 'wells': 9, 'real': 9, 'nor': 9, 'explain': 9, 'ago': 9, 'moving': 9, 'travel': 9, 'interest': 9, 'arm': 9, 'shoulder': 9, 'bar': 9, 'laughed': 9, 'stared': 9, 'moved': 9, 'below': 9, 'point': 9, 'none': 9, 'met': 9, 'standing': 9, 'journalist': 9, 'opened': 9, 'brown': 9, 'faint': 9, 'question': 9, 'rest': 9, 'apparently': 9, 'blue': 9, 'buildings': 9, 'hung': 9, 'stone': 9, 'lost': 9, 'fancied': 9, 'entered': 9, 'near': 9, 'necessity': 9, 'alone': 9, 'social': 9, 'perfect': 9, 'doors': 9, 'shone': 8, 'follow': 8, 'length': 8, 'became': 8, 'fact': 8, 'young': 8, 'dimension': 8, 'surface': 8, 'telling': 8, 'forward': 8, 'your': 8, 'model': 8, 'together': 8, 'motion': 8, 'faster': 8, 'itself': 8, 'beside': 8, 'take': 8, 'eye': 8, 'wanted': 8, 'cut': 8, 'faces': 8, 'meat': 8, 'thinking': 8, 'noticed': 8, 
'assured': 8, 'change': 8, 'horrible': 8, 'swiftly': 8, 'unknown': 8, 'everything': 8, 'drove': 8, 'instead': 8, 'race': 8, 'remote': 8, 'figures': 8, 'pretty': 8, 'thousand': 8, 'children': 8, 'building': 8, 'floor': 8, 'many': 8, 'least': 8, 'view': 8, 'abundant': 8, 'security': 8, 'strength': 8, 'physical': 8, 'needs': 8, 'triumph': 8, 'decay': 8, 'cold': 8, 'sleeping': 8, 'daylight': 8, 'within': 8, 'taken': 8, 'hastily': 8, 'glare': 8, 'shaft': 8, 'eloi': 8, 'south': 8, 'box': 8, 'forest': 8, 'shall': 7, 'mean': 7, 'need': 7, 'natural': 7, 'except': 7, 'trace': 7, 'high': 7, 'certainly': 7, 'hard': 7, 'different': 7, 'sure': 7, 'present': 7, 'become': 7, 'stop': 7, 'account': 7, 'attention': 7, 'case': 7, 'leave': 7, 'slowly': 7, 'held': 7, 'mechanism': 7, 'want': 7, 'saddle': 7, 'nearly': 7, 'plain': 7, 'spoke': 7, 'laughing': 7, 'sounds': 7, 'shadows': 7, 'puzzled': 7, 'somehow': 7, 'comes': 7, 'mouth': 7, 'colour': 7, 'intense': 7, 'pushed': 7, 'seem': 7, 'rare': 7, 'kept': 7, 'rising': 7, 'worn': 7, 'shadow': 7, 'short': 7, 'ears': 7, 'fast': 7, 'splendid': 7, 'shivered': 7, 'longer': 7, 'absolute': 7, 'beyond': 7, 'tree': 7, 'touched': 7, 'breathing': 7, 'violently': 7, 'fancy': 7, 'levers': 7, 'neck': 7, 'intellectual': 7, 'loose': 7, 'number': 7, 'foot': 7, 'windows': 7, 'metal': 7, 'seated': 7, 'fruit': 7, 'tired': 7, 'river': 7, 'valley': 7, 'living': 7, 'sunset': 7, 'during': 7, 'intelligence': 7, 'grow': 7, 'pleasant': 7, 'trying': 7, 'narrow': 7, 'water': 7, 'beach': 7, 'returned': 7, 'fallen': 7, 'underground': 7, 'thousands': 7, 'motionless': 7, 'dream': 7, 'speak': 6, 'matter': 6, 'burned': 6, 'silver': 6, 'carefully': 6, 'hair': 6, 'existence': 6, 'body': 6, 'wait': 6, 'fourth': 6, 'direction': 6, 'making': 6, 'words': 6, 'evidently': 6, 'getting': 6, 'means': 6, 'hope': 6, 'reason': 6, 'vague': 6, 'suggested': 6, 'cried': 6, 'trick': 6, 'ivory': 6, 'front': 6, 'drew': 6, 'chair': 6, 'incredible': 6, 'pointed': 6, 'forth': 6, 'wind': 6, 'bare': 
6, 'impression': 6, 'flickering': 6, 'dance': 6, 'showed': 6, 'confusion': 6, 'late': 6, 'paper': 6, 'seven': 6, 'quiet': 6, 'hear': 6, 'warm': 6, 'blood': 6, 'waiting': 6, 'hot': 6, 'home': 6, 'started': 6, 'reached': 6, 'smoking': 6, 'lived': 6, 'thud': 6, 'garden': 6, 'afraid': 6, 'convey': 6, 'unpleasant': 6, 'fall': 6, 'twilight': 6, 'possible': 6, 'hail': 6, 'turf': 6, 'smoke': 6, 'nearer': 6, 'straight': 6, 'reminded': 6, 'sight': 6, 'confidence': 6, 'danger': 6, 'hitherto': 6, 'import': 6, 'corner': 6, 'watching': 6, 'slow': 6, 'distance': 6, 'slope': 6, 'plants': 6, 'disappeared': 6, 'close': 6, 'signs': 6, 'truth': 6, 'animals': 6, 'fate': 6, 'problem': 6, 'sometimes': 6, 'slept': 6, 'panels': 6, 'use': 6, 'stir': 6, 'mystery': 6, 'presence': 6, 'machinery': 6, 'theory': 6, 'heat': 6, 'species': 6, 'familiar': 6, 'convenient': 5, 'paradox': 5, 'accepted': 5, 'anything': 5, 'thickness': 5, 'object': 5, 'tendency': 5, 'beginning': 5, 'slight': 5, 'difference': 5, 'foolish': 5, 'hold': 5, 'state': 5, 'lips': 5, 'pause': 5, 'gently': 5, 'recognized': 5, 'smiled': 5, 'velocity': 5, 'himself': 5, 'wild': 5, 'deep': 5, 'wonder': 5, 'metallic': 5, 'clock': 5, 'explanation': 5, 'tables': 5, 'also': 5, 'brass': 5, 'low': 5, 'watched': 5, 'pressed': 5, 'pass': 5, 'waste': 5, 'changed': 5, 'breath': 5, 'flame': 5, 'ghost': 5, 'pipe': 5, 'interval': 5, 'thursday': 5, 'simple': 5, 'common': 5, 'led': 5, 'corridor': 5, 'complete': 5, 'perfectly': 5, 'holding': 5, 'perceived': 5, 'll': 5, 'spirit': 5, 'surprise': 5, 'coat': 5, 'doorway': 5, 'word': 5, 'brighter': 5, 'till': 5, 'both': 5, 'remained': 5, 'startled': 5, 'true': 5, 'ceased': 5, 'ten': 5, 'falling': 5, 'hazy': 5, 'sensations': 5, 'helpless': 5, 'suggestion': 5, 'early': 5, 'fair': 5, 'civilization': 5, 'occurred': 5, 'resolved': 5, 'incontinently': 5, 'shape': 5, 'carried': 5, 'hour': 5, 'inhuman': 5, 'distinct': 5, 'wall': 5, 'clad': 5, 'rich': 5, 'voices': 5, 'heads': 5, 'beauty': 5, 'sweet': 5, 'ease': 5, 
'flinging': 5, 'forgotten': 5, 'staggered': 5, 'knowledge': 5, 'naturally': 5, 'general': 5, 'deserted': 5, 'mass': 5, 'heaps': 5, 'fruits': 5, 'lower': 5, 'nevertheless': 5, 'spite': 5, 'cattle': 5, 'later': 5, 'language': 5, 'discovered': 5, 'cries': 5, 'calm': 5, 'crest': 5, 'planet': 5, 'masses': 5, 'experience': 5, 'form': 5, 'yellow': 5, 'horizon': 5, 'west': 5, 'steadily': 5, 'hither': 5, 'thither': 5, 'golden': 5, 'restless': 5, 'energy': 5, 'weak': 5, 'pain': 5, 'north': 5, 'covered': 5, 'whose': 5, 'circumstances': 5, 'probably': 5, 'horror': 5, 'land': 5, 'path': 5, 'reflection': 5, 'safe': 5, 'afternoon': 5, 'best': 5, 'dreaded': 5, 'nights': 5, 'clambering': 5, 'fingers': 5, 'edge': 5, 'habit': 5, 'museum': 5, 'burning': 5, 'killing': 5, 'eastward': 5, 'grass': 5, 'brightly': 4, 'flashed': 4, 'free': 4, 'geometry': 4, 'expect': 4, 'begin': 4, 'admit': 4, 'having': 4, 'does': 4, 'proceeded': 4, 'call': 4, 'latter': 4, 'cigar': 4, 'provincial': 4, 'mayor': 4, 'particularly': 4, 'manner': 4, 'curious': 4, 'proper': 4, 'weather': 4, 'movement': 4, 'freedom': 4, 'passing': 4, 'miles': 4, 'difficulty': 4, 'savage': 4, 'show': 4, 'laughter': 4, 'weary': 4, 'smiling': 4, 'passage': 4, 'scarcely': 4, 'substance': 4, 'legs': 4, 'singularly': 4, 'seat': 4, 'turning': 4, 'blown': 4, 'indistinct': 4, 'damned': 4, 'journey': 4, 'visible': 4, 'serious': 4, 'remarked': 4, 'asked': 4, 'plausible': 4, 'morrow': 4, 'broad': 4, 'parts': 4, 'bench': 4, 'sheets': 4, 'touch': 4, 'frame': 4, 'five': 4, 've': 4, 'besides': 4, 'previous': 4, 'absence': 4, 'week': 4, 'facing': 4, 'smile': 4, 'remembered': 4, 'brought': 4, 'doing': 4, 'recover': 4, 'resumed': 4, 'curiosity': 4, 'friend': 4, 'business': 4, 'clothes': 4, 'easy': 4, 'salt': 4, 'give': 4, 'displayed': 4, 'lying': 4, 'afterwards': 4, 'voice': 4, 'rail': 4, 'putting': 4, 'machines': 4, 'skull': 4, 'starting': 4, 'stopping': 4, 'sensation': 4, 'teeth': 4, 'extreme': 4, 'position': 4, 'eddying': 4, 'murmur': 4, 
'headlong': 4, 'pace': 4, 'painful': 4, 'glimpse': 4, 'circling': 4, 'spread': 4, 'rise': 4, 'dials': 4, 'year': 4, 'flung': 4, 'scarce': 4, 'dread': 4, 'possession': 4, 'built': 4, 'risk': 4, 'profound': 4, 'blow': 4, 'inevitable': 4, 'purple': 4, 'dancing': 4, 'innumerable': 4, 'colossal': 4, 'sides': 4, 'greatly': 4, 'passion': 4, 'foul': 4, 'tall': 4, 'shining': 4, 'frenzy': 4, 'emerged': 4, 'frail': 4, 'oddly': 4, 'harsh': 4, 'features': 4, 'effort': 4, 'gesture': 4, 'abruptly': 4, 'art': 4, 'fro': 4, 'delicate': 4, 'crowd': 4, 'grotesque': 4, 'ways': 4, 'places': 4, 'learn': 4, 'mine': 4, 'meaning': 4, 'name': 4, 'speedily': 4, 'setting': 4, 'thames': 4, 'mile': 4, 'help': 4, 'ruinous': 4, 'heap': 4, 'walls': 4, 'realized': 4, 'houses': 4, 'force': 4, 'population': 4, 'balanced': 4, 'rarely': 4, 'secure': 4, 'palaces': 4, 'mankind': 4, 'leaving': 4, 'fight': 4, 'balance': 4, 'gradually': 4, 'current': 4, 'attained': 4, 'struggle': 4, 'increasing': 4, 'sunlight': 4, 'east': 4, 'folly': 4, 'reach': 4, 'empty': 4, 'slipped': 4, 'child': 4, 'blundering': 4, 'terror': 4, 'crept': 4, 'loss': 4, 'groping': 4, 'fairly': 4, 'failed': 4, 'ill': 4, 'arrival': 4, 'block': 4, 'flakes': 4, 'rain': 4, 'peering': 4, 'london': 4, 'central': 4, 'confess': 4, 'fashion': 4, 'poor': 4, 'heart': 4, 'carry': 4, 'trouble': 4, 'merely': 4, 'return': 4, 'dawn': 4, 'ghosts': 4, 'blaze': 4, 'ruin': 4, 'sideways': 4, 'beneath': 4, 'ages': 4, 'live': 4, 'flight': 4, 'preserved': 4, 'westward': 4, 'descent': 4, 'star': 4, 'order': 4, 'examined': 4, 'strike': 4, 'pattering': 4, 'understood': 4, 'refuge': 4, 'heel': 4, 'shoes': 4, 'approached': 4, 'mace': 4, 'morlock': 4, 'sticks': 4, 'flames': 4, 'crawling': 4, 'usually': 3, 'animated': 3, 'points': 3, 'ideas': 3, 'mathematical': 3, 'breadth': 3, 'cube': 3, 'solid': 3, 'flesh': 3, 'unreal': 3, 'efforts': 3, 'remarkable': 3, 'continued': 3, 'meant': 3, 'society': 3, 'yes': 3, 'twenty': 3, 'scientific': 3, 'finger': 3, 'staring': 3, 'freely': 
3, 'backward': 3, 'mental': 3, 'discovery': 3, 'vividly': 3, 'minded': 3, 'able': 3, 'drift': 3, 'turn': 3, 'verification': 3, 'battle': 3, 'discover': 3, 'theories': 3, 'experiment': 3, 'pockets': 3, 'unless': 3, 'scattered': 3, 'placed': 3, 'dozen': 3, 'candles': 3, 'mantel': 3, 'illuminated': 3, 'apparatus': 3, 'twinkling': 3, 'appearance': 3, 'vanish': 3, 'recovered': 3, 'cheerfully': 3, 'seriously': 3, 'bit': 3, 'appreciate': 3, 'therewith': 3, 'taking': 3, 'incredulous': 3, 'rock': 3, 'quartz': 3, 'solemnly': 3, 'suspected': 3, 'subtle': 3, 'easily': 3, 'whom': 3, 'friday': 3, 'considerable': 3, 'guests': 3, 'd': 3, 'known': 3, 'daily': 3, 'doctor': 3, 'rang': 3, 'bell': 3, 'blank': 3, 'noise': 3, 'dusty': 3, 'faded': 3, 'chin': 3, 'expression': 3, 'drawn': 3, 'suffering': 3, 'dazzled': 3, 'flickered': 3, 'glance': 3, 'dull': 3, 'hoped': 3, 'tattered': 3, 'closed': 3, 'gathering': 3, 'read': 3, 'interpretation': 3, 'else': 3, 'conversation': 3, 'raised': 3, 'quietly': 3, 'eat': 3, 'uncomfortable': 3, 'dare': 3, 'devoted': 3, 'appetite': 3, 'clumsy': 3, 'plates': 3, 'o': 3, 'agreed': 3, 'inadequacy': 3, 'circle': 3, 'lighted': 3, 'knees': 3, 'downward': 3, 'actual': 3, 'workshop': 3, 'cracked': 3, 'bent': 3, 'expected': 3, 'noted': 3, 'gripped': 3, 'seeing': 3, 'fainter': 3, 'excessively': 3, 'imminent': 3, 'hopping': 3, 'leaping': 3, 'dashed': 3, 'succession': 3, 'wonderful': 3, 'luminous': 3, 'band': 3, 'landscape': 3, 'vapour': 3, 'swayed': 3, 'start': 3, 'swaying': 3, 'futurity': 3, 'fresh': 3, 'possibility': 3, 'finding': 3, 'occupied': 3, 'explosion': 3, 'sitting': 3, 'surrounded': 3, 'beating': 3, 'cloud': 3, 'wet': 3, 'distinctly': 3, 'marble': 3, 'chanced': 3, 'disease': 3, 'dreadful': 3, 'shapes': 3, 'dimly': 3, 'seized': 3, 'shafts': 3, 'thunderstorm': 3, 'swept': 3, 'garments': 3, 'retreat': 3, 'courage': 3, 'circular': 3, 'robes': 3, 'approaching': 3, 'graceful': 3, 'indescribably': 3, 'used': 3, 'following': 3, 'tongue': 3, 'exquisite': 3, 
'shook': 3, 'step': 3, 'nine': 3, 'pink': 3, 'type': 3, 'sharp': 3, 'cheek': 3, 'thin': 3, 'lack': 3, 'hardly': 3, 'limbs': 3, 'rushed': 3, 'received': 3, 'blossom': 3, 'profoundly': 3, 'roof': 3, 'partially': 3, 'generations': 3, 'effect': 3, 'extremely': 3, 'dining': 3, 'material': 3, 'extinction': 3, 'particular': 3, 'distant': 3, 'gestures': 3, 'amount': 3, 'score': 3, 'inclined': 3, 'fatigued': 3, 'examining': 3, 'ended': 3, 'friendly': 3, 'summit': 3, 'date': 3, 'granite': 3, 'leaves': 3, 'remains': 3, 'structure': 3, 'destined': 3, 'sexes': 3, 'woman': 3, 'family': 3, 'violence': 3, 'cupola': 3, 'powers': 3, 'corroded': 3, 'pinkish': 3, 'rust': 3, 'moss': 3, 'cast': 3, 'horizontal': 3, 'agriculture': 3, 'wane': 3, 'engaged': 3, 'outcome': 3, 'process': 3, 'science': 3, 'everywhere': 3, 'effected': 3, 'toil': 3, 'increase': 3, 'self': 3, 'dangers': 3, 'refined': 3, 'wasting': 3, 'equipped': 3, 'die': 3, 'mastered': 3, 'numbers': 3, 'chill': 3, 'descend': 3, 'tangle': 3, 'throat': 3, 'strides': 3, 'removed': 3, 'minutes': 3, 'aloud': 3, 'stirring': 3, 'fears': 3, 'hidden': 3, 'exact': 3, 'fist': 3, 'twigs': 3, 'noises': 3, 'moonlight': 3, 'despair': 3, 'wore': 3, 'impossible': 3, 'touching': 3, 'freshness': 3, 'hiding': 3, 'wondering': 3, 'exhausted': 3, 'excitement': 3, 'careful': 3, 'examination': 3, 'keep': 3, 'slopes': 3, 'hasty': 3, 'guesses': 3, 'gates': 3, 'missed': 3, 'hills': 3, 'flaring': 3, 'often': 3, 'sees': 3, 'subterranean': 3, 'ventilation': 3, 'conclusion': 3, 'learned': 3, 'imagination': 3, 'inaccessible': 3, 'amid': 3, 'tale': 3, 'movements': 3, 'company': 3, 'gap': 3, 'appliances': 3, 'specimens': 3, 'bathing': 3, 'pillars': 3, 'clue': 3, 'presented': 3, 'drifting': 3, 'rescue': 3, 'centre': 3, 'delight': 3, 'desolate': 3, 'rate': 3, 'problems': 3, 'childish': 3, 'affection': 3, 'passionate': 3, 'enter': 3, 'colourless': 3, 'sombre': 3, 'eastern': 3, 'hotter': 3, 'assume': 3, 'outside': 3, 'spots': 3, 'halted': 3, 'steadfastly': 3, 
'obscurity': 3, 'dropped': 3, 'monster': 3, 'bleached': 3, 'worlders': 3, 'largely': 3, 'widening': 3, 'increased': 3, 'deeper': 3, 'due': 3, 'closing': 3, 'country': 3, 'adapted': 3, 'aristocracy': 3, 'size': 3, 'called': 3, 'burst': 3, 'boldly': 3, 'character': 3, 'push': 3, 'hooks': 3, 'feared': 3, 'clung': 3, 'projecting': 3, 'disk': 3, 'tunnel': 3, 'lie': 3, 'throb': 3, 'dimness': 3, 'smell': 3, 'run': 3, 'shouted': 3, 'escape': 3, 'grasped': 3, 'stayed': 3, 'blinking': 3, 'thirty': 3, 'pit': 3, 'mechanical': 3, 'ones': 3, 'wandered': 3, 'judge': 3, 'breeze': 3, 'stillness': 3, 'darkling': 3, 'branches': 3, 'weapon': 3, 'vestiges': 3, 'surprised': 3, 'array': 3, 'sloping': 3, 'cases': 3, 'tight': 3, 'traces': 3, 'pieces': 3, 'sorry': 3, 'smashed': 3, 'brutes': 3, 'millions': 3, 'crowbar': 3, 'stems': 3, 'beat': 3, 'pulled': 3, 'reddish': 3, 'hillock': 3, 'awake': 3, 'sweeping': 3, 'rocks': 3, 'crab': 3, 'expounding': 2, 'flushed': 2, 'chairs': 2, 'caressed': 2, 'atmosphere': 2, 'marking': 2, 'lean': 2, 'forefinger': 2, 'taught': 2, 'person': 2, 'ask': 2, 'reasonable': 2, 'neither': 2, 'exist': 2, 'directions': 2, 'incline': 2, 'overlook': 2, 'planes': 2, 'former': 2, 'consciousness': 2, 'moves': 2, 'lives': 2, 'spasmodic': 2, 'overlooked': 2, 'talk': 2, 'spoken': 2, 'reference': 2, 'angles': 2, 'philosophical': 2, 'month': 2, 'flat': 2, 'represent': 2, 'dimensional': 2, 'transitory': 2, 'results': 2, 'fifteen': 2, 'seventeen': 2, 'yesterday': 2, 'upward': 2, 'mercury': 2, 'generally': 2, 'therefore': 2, 'regarded': 2, 'gravitation': 2, 'balloons': 2, 'vertical': 2, 'easier': 2, 'sir': 2, 'uniform': 2, 'grave': 2, 'fifty': 2, 'instant': 2, 'jump': 2, 'staying': 2, 'civilized': 2, 'ultimately': 2, 'contented': 2, 'experimental': 2, 'verify': 2, 'ancestors': 2, 'greek': 2, 'ahead': 2, 'talked': 2, 'brain': 2, 'faintly': 2, 'finished': 2, 'collapsed': 2, 'glittering': 2, 'framework': 2, 'transparent': 2, 'crystalline': 2, 'explicit': 2, 'brilliantly': 2, 
'nearest': 2, 'played': 2, 'resting': 2, 'plan': 2, 'notice': 2, 'askew': 2, 'disappear': 2, 'satisfy': 2, 'trickery': 2, 'interminable': 2, 'voyage': 2, 'jumped': 2, 'swung': 2, 'reminiscence': 2, 'tobacco': 2, 'jar': 2, 'stooping': 2, 'lighting': 2, 'helped': 2, 'indicated': 2, 'objections': 2, 'presentation': 2, 'reassured': 2, 'spinning': 2, 'beheld': 2, 'nickel': 2, 'filed': 2, 'drawings': 2, 'explore': 2, 'believed': 2, 'clever': 2, 'shown': 2, 'explained': 2, 'mistake': 2, 'judgment': 2, 'china': 2, 'possibilities': 2, 'utter': 2, 'similar': 2, 'laid': 2, 'candle': 2, 'constant': 2, 'naming': 2, 'host': 2, 'note': 2, 'lead': 2, 'seems': 2, 'pity': 2, 'shy': 2, 'speculation': 2, 'ingenious': 2, 'wider': 2, 'cry': 2, 'heavens': 2, 'amazing': 2, 'smeared': 2, 'disordered': 2, 'ghastly': 2, 'healed': 2, 'haggard': 2, 'silence': 2, 'expecting': 2, 'painfully': 2, 'wine': 2, 'filled': 2, 'champagne': 2, 'comfortable': 2, 'mutton': 2, 'starving': 2, 'staircase': 2, 'lameness': 2, 'pair': 2, 'stained': 2, 'behaviour': 2, 'limping': 2, 'completely': 2, 'servants': 2, 'plate': 2, 'fork': 2, 'suit': 2, 'modest': 2, 'meeting': 2, 'frankly': 2, 'rolling': 2, 'heaping': 2, 'saying': 2, 'dressed': 2, 'treat': 2, 'stick': 2, 'won': 2, 'convulsively': 2, 'questions': 2, 'usual': 2, 'sheer': 2, 'argue': 2, 'refrain': 2, 'interruptions': 2, 'badly': 2, 'bed': 2, 'writing': 2, 'ink': 2, 'express': 2, 'quality': 2, 'truly': 2, 'tap': 2, 'suicide': 2, 'immediately': 2, 'nightmare': 2, 'intellect': 2, 'mrs': 2, 'watchett': 2, 'rocket': 2, 'supposed': 2, 'destroyed': 2, 'merged': 2, 'greyness': 2, 'brilliant': 2, 'arch': 2, 'fluctuating': 2, 'stands': 2, 'changing': 2, 'dreams': 2, 'speed': 2, 'raced': 2, 'belt': 2, 'solstice': 2, 'snow': 2, 'brief': 2, 'spring': 2, 'unable': 2, 'confused': 2, 'madness': 2, 'series': 2, 'impressions': 2, 'rudimentary': 2, 'appear': 2, 'architecture': 2, 'mist': 2, 'richer': 2, 'flow': 2, 'slipping': 2, 'molecule': 2, 'whatever': 2, 'intimate': 2, 
'contact': 2, 'reaction': 2, 'reaching': 2, 'result': 2, 'prolonged': 2, 'gust': 2, 'forthwith': 2, 'fool': 2, 'thunder': 2, 'hissing': 2, 'rhododendron': 2, 'carved': 2, 'rhododendrons': 2, 'downpour': 2, 'describe': 2, 'columns': 2, 'thinner': 2, 'birch': 2, 'wings': 2, 'verdigris': 2, 'curtain': 2, 'promise': 2, 'crouching': 2, 'grown': 2, 'readjust': 2, 'smote': 2, 'aside': 2, 'whirled': 2, 'attitude': 2, 'mount': 2, 'curiously': 2, 'opening': 2, 'group': 2, 'directed': 2, 'shoulders': 2, 'leading': 2, 'sandals': 2, 'distinguish': 2, 'fragile': 2, 'bearing': 2, 'sign': 2, 'pointing': 2, 'tentacles': 2, 'warn': 2, 'prettiness': 2, 'faintest': 2, 'cooing': 2, 'hesitating': 2, 'anticipated': 2, 'disappointment': 2, 'nodded': 2, 'vivid': 2, 'carrying': 2, 'melodious': 2, 'smothered': 2, 'countless': 2, 'astonishment': 2, 'fretted': 2, 'confident': 2, 'irresistible': 2, 'mysterious': 2, 'variegated': 2, 'phoenician': 2, 'decorations': 2, 'nineteenth': 2, 'century': 2, 'speech': 2, 'blocks': 2, 'slabs': 2, 'polished': 2, 'orange': 2, 'cushions': 2, 'themselves': 2, 'stalks': 2, 'openings': 2, 'surveyed': 2, 'leisure': 2, 'pattern': 2, 'curtains': 2, 'couple': 2, 'eating': 2, 'sheep': 2, 'delightful': 2, 'perceive': 2, 'attempt': 2, 'grasp': 2, 'chatter': 2, 'amidst': 2, 'substantives': 2, 'doses': 2, 'indolent': 2, 'eager': 2, 'beginnings': 2, 'portal': 2, 'sunlit': 2, 'continually': 2, 'laugh': 2, 'scene': 2, 'glow': 2, 'entirely': 2, 'condition': 2, 'aluminium': 2, 'incapable': 2, 'derelict': 2, 'stranger': 2, 'rested': 2, 'greenery': 2, 'english': 2, 'heels': 2, 'flash': 2, 'costume': 2, 'plainly': 2, 'resemblance': 2, 'institution': 2, 'necessities': 2, 'evil': 2, 'efficient': 2, 'musing': 2, 'attracted': 2, 'existing': 2, 'speculations': 2, 'top': 2, 'walking': 2, 'adventure': 2, 'recognize': 2, 'rests': 2, 'gold': 2, 'evidences': 2, 'shaped': 2, 'realize': 2, 'consequence': 2, 'logical': 2, 'premium': 2, 'makes': 2, 'deliberately': 2, 'field': 2, 'operations': 
2, 'destroy': 2, 'weed': 2, 'wholesome': 2, 'greater': 2, 'improve': 2, 'breeding': 2, 'intelligent': 2, 'co': 2, 'adjustment': 2, 'fungi': 2, 'medicine': 2, 'diseases': 2, 'stay': 2, 'affected': 2, 'clothed': 2, 'advertisement': 2, 'guessed': 2, 'inevitably': 2, 'survive': 2, 'patience': 2, 'therein': 2, 'fierce': 2, 'jealousy': 2, 'tenderness': 2, 'devotion': 2, 'conquest': 2, 'weakness': 2, 'necessary': 2, 'survival': 2, 'love': 2, 'power': 2, 'war': 2, 'solitary': 2, 'beasts': 2, 'require': 2, 'harmony': 2, 'flourish': 2, 'artistic': 2, 'died': 2, 'keen': 2, 'grindstone': 2, 'devised': 2, 'diminished': 2, 'owl': 2, 'leprous': 2, 'instinctively': 2, 'answered': 2, 'moonlit': 2, 'worst': 2, 'furiously': 2, 'clutching': 2, 'dismay': 2, 'shelter': 2, 'method': 2, 'clenched': 2, 'knuckles': 2, 'anguish': 2, 'breaking': 2, 'bawling': 2, 'shaking': 2, 'sorely': 2, 'frightened': 2, 'revive': 2, 'unexpected': 2, 'crying': 2, 'god': 2, 'fatigue': 2, 'wretchedness': 2, 'misery': 2, 'woke': 2, 'overnight': 2, 'patient': 2, 'tools': 2, 'desire': 2, 'emotion': 2, 'wasted': 2, 'futile': 2, 'jest': 2, 'impulse': 2, 'blind': 2, 'perplexity': 2, 'struggled': 2, 'overturned': 2, 'footprints': 2, 'hollow': 2, 'care': 2, 'inside': 2, 'dragging': 2, 'hammering': 2, 'gusty': 2, 'vigil': 2, 'hours': 2, 'hopeless': 2, 'spent': 2, 'complicated': 2, 'trap': 2, 'expense': 2, 'avoided': 2, 'concern': 2, 'progress': 2, 'addition': 2, 'explorations': 2, 'terms': 2, 'sentences': 2, 'undulating': 2, 'serenity': 2, 'protected': 2, 'gleam': 2, 'steady': 2, 'threw': 2, 'scrap': 2, 'fluttering': 2, 'towers': 2, 'flicker': 2, 'scorched': 2, 'system': 2, 'difficult': 2, 'contained': 2, 'negro': 2, 'railway': 2, 'wide': 2, 'sensible': 2, 'unseen': 2, 'contributed': 2, 'automatic': 2, 'organization': 2, 'crematoria': 2, 'range': 2, 'remark': 2, 'satisfaction': 2, 'decadent': 2, 'difficulties': 2, 'complex': 2, 'vestige': 2, 'lacked': 2, 'inscription': 2, 'excellent': 2, 'third': 2, 'shallow': 2, 
'main': 2, 'rubbing': 2, 'gratitude': 2, 'exploration': 2, 'chiefly': 2, 'calling': 2, 'distress': 2, 'gathered': 2, 'tumult': 2, 'apprehension': 2, 'lesson': 2, 'troubled': 2, 'awakened': 2, 'palps': 2, 'greyish': 2, 'dying': 2, 'pallor': 2, 'inky': 2, 'scanned': 2, 'twice': 2, 'ape': 2, 'notion': 2, 'generation': 2, 'hence': 2, 'indefinite': 2, 'unfamiliar': 2, 'renewed': 2, 'inner': 2, 'seeking': 2, 'masonry': 2, 'glaring': 2, 'advanced': 2, 'pile': 2, 'whether': 2, 'pillar': 2, 'retreated': 2, 'spider': 2, 'sole': 2, 'descendants': 2, 'obscene': 2, 'nocturnal': 2, 'suspect': 2, 'wondered': 2, 'solution': 2, 'sport': 2, 'pursued': 2, 'distressed': 2, 'revolution': 2, 'sliding': 2, 'ventilating': 2, 'hint': 2, 'vaguely': 2, 'caves': 2, 'witness': 2, 'evident': 2, 'sunshine': 2, 'universal': 2, 'artificial': 2, 'underworld': 2, 'splitting': 2, 'proceeding': 2, 'gradual': 2, 'industry': 2, 'exclusive': 2, 'education': 2, 'gulf': 2, 'higher': 2, 'class': 2, 'pleasure': 2, 'labour': 2, 'pay': 2, 'refused': 2, 'constituted': 2, 'dreamed': 2, 'working': 2, 'fellow': 2, 'books': 2, 'terribly': 2, 'tears': 2, 'shrinking': 2, 'quarter': 2, 'vermin': 2, 'horribly': 2, 'clamber': 2, 'exploring': 2, 'regarding': 2, 'piece': 2, 'kissing': 2, 'parapet': 2, 'climbing': 2, 'leak': 2, 'amazement': 2, 'unstable': 2, 'yards': 2, 'cramped': 2, 'weight': 2, 'quick': 2, 'aperture': 2, 'louder': 2, 'oppressive': 2, 'agony': 2, 'swinging': 2, 'trembling': 2, 'unbroken': 2, 'hum': 2, 'striking': 2, 'rayless': 2, 'glared': 2, 'stretched': 2, 'heavy': 2, 'lurking': 2, 'wriggling': 2, 'ourselves': 2, 'frightfully': 2, 'safety': 2, 'economize': 2, 'odour': 2, 'disengaged': 2, 'realization': 2, 'ignorance': 2, 'clutched': 2, 'protection': 2, 'blindness': 2, 'bewilderment': 2, 'giddy': 2, 'deadly': 2, 'nausea': 2, 'frightful': 2, 'faintness': 2, 'swam': 2, 'insensible': 2, 'discoveries': 2, 'element': 2, 'enemy': 2, 'hypothesis': 2, 'arrived': 2, 'futility': 2, 'ancient': 2, 'departed': 2, 
'impressed': 2, 'apace': 2, 'thrust': 2, 'brother': 2, 'recall': 2, 'differently': 2, 'terrors': 2, 'pinnacles': 2, 'lame': 2, 'hugely': 2, 'occasionally': 2, 'purpose': 2, 'jacket': 2, 'withered': 2, 'unlike': 2, 'dusk': 2, 'expectation': 2, 'ant': 2, 'darker': 2, 'spreading': 2, 'excitements': 2, 'glad': 2, 'constellations': 2, 'forty': 2, 'traversed': 2, 'unreasonable': 2, 'break': 2, 'feeble': 2, 'food': 2, 'rats': 2, 'instinct': 2, 'watchword': 2, 'degradation': 2, 'perforce': 2, 'contrive': 2, 'immediate': 2, 'contrivance': 2, 'persuasion': 2, 'bring': 2, 'turfy': 2, 'proved': 2, 'valves': 2, 'covering': 2, 'skeleton': 2, 'extinct': 2, 'bones': 2, 'contents': 2, 'section': 2, 'fossils': 2, 'staved': 2, 'receded': 2, 'sulphur': 2, 'aisle': 2, 'shrivelled': 2, 'puzzles': 2, 'stand': 2, 'sufficient': 2, 'suffer': 2, 'restrained': 2, 'charred': 2, 'decaying': 2, 'testified': 2, 'eagerly': 2, 'escaped': 2, 'sealed': 2, 'accordingly': 2, 'centuries': 2, 'promised': 2, 'cartridges': 2, 'waned': 2, 'consider': 2, 'woods': 2, 'calamity': 2, 'gives': 2, 'vegetation': 2, 'tongues': 2, 'curved': 2, 'accustomed': 2, 'overhead': 2, 'crackling': 2, 'fumbled': 2, 'backs': 2, 'lump': 2, 'stooped': 2, 'dry': 2, 'foliage': 2, 'dead': 2, 'breathed': 2, 'slumbrous': 2, 'clinging': 2, 'death': 2, 'monstrous': 2, 'blows': 2, 'starlight': 2, 'hawthorn': 2, 'subsiding': 2, 'awful': 2, 'moaning': 2, 'ashes': 2, 'lonely': 2, 'permanency': 2, 'meet': 2, 'forbidden': 2, 'slid': 2, 'abominable': 2, 'scramble': 2, 'fitted': 2, 'dial': 2, 'dome': 2, 'glowing': 2, 'momentary': 2, 'intensely': 2, 'eternal': 2, 'margin': 2, 'incrustation': 2, 'hillocks': 2, 'claws': 2, 'ungainly': 2, 'slime': 2, 'monsters': 2, 'duller': 2, 'million': 2, 'lifeless': 2, 'eclipse': 2, 'sick': 2, 'specimen': 2, 'isn': 2, 'squat': 2, 'lunch': 2, 'richardson': 2, 'mutual': 2, 'h': 1, 'g': 1, 'recondite': 1, 'twinkled': 1, 'radiance': 1, 'incandescent': 1, 'lights': 1, 'lilies': 1, 'bubbles': 1, 'glasses': 1, 
'patents': 1, 'embraced': 1, 'submitted': 1, 'luxurious': 1, 'roams': 1, 'gracefully': 1, 'trammels': 1, 'precision': 1, 'lazily': 1, 'admired': 1, 'earnestness': 1, 'fecundity': 1, 'controvert': 1, 'universally': 1, 'school': 1, 'founded': 1, 'misconception': 1, 'argumentative': 1, 'accept': 1, 'nil': 1, 'plane': 1, 'abstractions': 1, 'instantaneous': 1, 'pensive': 1, 'extension': 1, 'duration': 1, 'infirmity': 1, 'draw': 1, 'distinction': 1, 'happens': 1, 'intermittently': 1, 'relight': 1, 'extensively': 1, 'accession': 1, 'cheerfulness': 1, 'mathematicians': 1, 'definable': 1, 'asking': 1, 'construct': 1, 'professor': 1, 'simon': 1, 'newcomb': 1, 'york': 1, 'similarly': 1, 'models': 1, 'master': 1, 'perspective': 1, 'murmured': 1, 'knitting': 1, 'brows': 1, 'lapsed': 1, 'introspective': 1, 'repeats': 1, 'mystic': 1, 'brightening': 1, 'portrait': 1, 'sections': 1, 'representations': 1, 'dimensioned': 1, 'fixed': 1, 'unalterable': 1, 'required': 1, 'assimilation': 1, 'popular': 1, 'diagram': 1, 'record': 1, 'shows': 1, 'barometer': 1, 'surely': 1, 'traced': 1, 'conclude': 1, 'coal': 1, 'limits': 1, 'jumping': 1, 'inequalities': 1, 'dear': 1, 'existences': 1, 'immaterial': 1, 'cradle': 1, 'interrupted': 1, 'germ': 1, 'recalling': 1, 'incident': 1, 'occurrence': 1, 'absent': 1, 'six': 1, 'respect': 1, 'balloon': 1, 'accelerate': 1, 'oh': 1, 'argument': 1, 'convince': 1, 'investigations': 1, 'inkling': 1, 'exclaimed': 1, 'indifferently': 1, 'driver': 1, 'determines': 1, 'remarkably': 1, 'historian': 1, 'hastings': 1, 'attract': 1, 'tolerance': 1, 'anachronisms': 1, 'homer': 1, 'plato': 1, 'plough': 1, 'german': 1, 'scholars': 1, 'improved': 1, 'invest': 1, 'money': 1, 'accumulate': 1, 'hurry': 1, 'erected': 1, 'strictly': 1, 'communistic': 1, 'basis': 1, 'extravagant': 1, 'anyhow': 1, 'humbug': 1, 'trousers': 1, 'slippers': 1, 'shuffling': 1, 'sleight': 1, 'conjurer': 1, 'burslem': 1, 'preface': 1, 'anecdote': 1, 'delicately': 1, 'follows': 1, 'unaccountable': 1, 
'octagonal': 1, 'hearthrug': 1, 'shaded': 1, 'candlesticks': 1, 'sconces': 1, 'fireplace': 1, 'profile': 1, 'alert': 1, 'appears': 1, 'subtly': 1, 'conceived': 1, 'adroitly': 1, 'affair': 1, 'elbows': 1, 'pressing': 1, 'looks': 1, 'peered': 1, 'beautifully': 1, 'retorted': 1, 'imitated': 1, 'action': 1, 'sends': 1, 'gliding': 1, 'reverses': 1, 'represents': 1, 'press': 1, 'yourselves': 1, 'quack': 1, 'lend': 1, 'individual': 1, 'sent': 1, 'eddy': 1, 'everyone': 1, 'stupor': 1, 'fill': 1, 'earnest': 1, 'spill': 1, 'unhinged': 1, 'uncut': 1, 'inspiration': 1, 'anywhere': 1, 'presume': 1, 'impartiality': 1, 'threshold': 1, 'diluted': 1, 'psychology': 1, 'helps': 1, 'delightfully': 1, 'wheel': 1, 'bullet': 1, 'flying': 1, 'gets': 1, 'creates': 1, 'fiftieth': 1, 'hundredth': 1, 'vacant': 1, 'draughty': 1, 'silhouette': 1, 'edition': 1, 'sawn': 1, 'crystal': 1, 'twisted': 1, 'unfinished': 1, 'christmas': 1, 'aloft': 1, 'intend': 1, 'winked': 1, 'ii': 1, 'reserve': 1, 'ingenuity': 1, 'ambush': 1, 'lucid': 1, 'frankness': 1, 'scepticism': 1, 'motives': 1, 'pork': 1, 'butcher': 1, 'whim': 1, 'elements': 1, 'distrusted': 1, 'tricks': 1, 'deportment': 1, 'aware': 1, 'trusting': 1, 'reputations': 1, 'furnishing': 1, 'nursery': 1, 'egg': 1, 'shell': 1, 'potentialities': 1, 'minds': 1, 'plausibility': 1, 'practical': 1, 'incredibleness': 1, 'anachronism': 1, 'preoccupied': 1, 'discussing': 1, 'linnaean': 1, 'tubingen': 1, 'stress': 1, 'blowing': 1, 'richmond': 1, 'arriving': 1, 'assembled': 1, 'drawing': 1, 'sheet': 1, 'unavoidably': 1, 'detained': 1, 'asks': 1, 'says': 1, 'spoil': 1, 'thereupon': 1, 'attended': 1, 'aforementioned': 1, 'beard': 1, 'didn': 1, 'observation': 1, 'jocular': 1, 'volunteered': 1, 'wooden': 1, 'witnessed': 1, 'midst': 1, 'exposition': 1, 'hallo': 1, 'tableful': 1, 'plight': 1, 'dirty': 1, 'sleeves': 1, 'greyer': 1, 'dirt': 1, 'actually': 1, 'limp': 1, 'footsore': 1, 'tramps': 1, 'drained': 1, 'disturb': 1, 'faltering': 1, 'articulation': 1, 'draught': 
1, 'cheeks': 1, 'approval': 1, 'wash': 1, 'dress': 1, 'visitor': 1, 'funny': 1, 'padding': 1, 'footfall': 1, 'socks': 1, 'detested': 1, 'fuss': 1, 'wool': 1, 'eminent': 1, 'scientist': 1, 'wont': 1, 'headlines': 1, 'game': 1, 'amateur': 1, 'cadger': 1, 'upstairs': 1, 'hated': 1, 'knife': 1, 'grunt': 1, 'exclamatory': 1, 'gaps': 1, 'wonderment': 1, 'fervent': 1, 'eke': 1, 'income': 1, 'crossing': 1, 'nebuchadnezzar': 1, 'phases': 1, 'inquired': 1, 'couldn': 1, 'cover': 1, 'resorted': 1, 'caricature': 1, 'hadn': 1, 'brushes': 1, 'price': 1, 'joined': 1, 'ridicule': 1, 'joyous': 1, 'irreverent': 1, 'special': 1, 'correspondent': 1, 'reports': 1, 'shouting': 1, 'ordinary': 1, 'hilariously': 1, 'chaps': 1, 'middle': 1, 'rosebery': 1, 'lot': 1, 'reserved': 1, 'peptone': 1, 'arteries': 1, 'thanks': 1, 'nodding': 1, 'shilling': 1, 'verbatim': 1, 'fingernail': 1, 'poured': 1, 'relieve': 1, 'tension': 1, 'anecdotes': 1, 'hettie': 1, 'potter': 1, 'tramp': 1, 'smoked': 1, 'cigarette': 1, 'eyelashes': 1, 'drank': 1, 'regularity': 1, 'determination': 1, 'nervousness': 1, 'apologize': 1, 'greasy': 1, 'ringing': 1, 'adjoining': 1, 'dash': 1, 'chose': 1, 'leaning': 1, 'shan': 1, 'echoed': 1, 'keenness': 1, 'pen': 1, 'attentively': 1, 'speaker': 1, 'sincere': 1, 'intonation': 1, 'turns': 1, 'hearers': 1, 'glanced': 1, 'iii': 1, 'principles': 1, 'incomplete': 1, 'finish': 1, 'inch': 1, 'remade': 1, 'career': 1, 'screws': 1, 'drop': 1, 'oil': 1, 'rod': 1, 'holds': 1, 'pistol': 1, 'feels': 1, 'reel': 1, 'tricked': 1, 'traverse': 1, 'shoot': 1, 'dumb': 1, 'confusedness': 1, 'descended': 1, 'switchback': 1, 'anticipation': 1, 'smash': 1, 'flapping': 1, 'wing': 1, 'scaffolding': 1, 'conscious': 1, 'slowest': 1, 'snail': 1, 'crawled': 1, 'intermittent': 1, 'darknesses': 1, 'quarters': 1, 'gaining': 1, 'palpitation': 1, 'continuous': 1, 'deepness': 1, 'color': 1, 'jerking': 1, 'streak': 1, 'misty': 1, 'puffs': 1, 'melting': 1, 'flowing': 1, 'registered': 1, 'consequently': 1, 'poignant': 1, 
'hysterical': 1, 'exhilaration': 1, 'attend': 1, 'developments': 1, 'advances': 1, 'elusive': 1, 'fluctuated': 1, 'massive': 1, 'glimmer': 1, 'remain': 1, 'wintry': 1, 'intermission': 1, 'veil': 1, 'mattered': 1, 'attenuated': 1, 'interstices': 1, 'intervening': 1, 'substances': 1, 'involved': 1, 'jamming': 1, 'bringing': 1, 'atoms': 1, 'obstacle': 1, 'chemical': 1, 'unavoidable': 1, 'risks': 1, 'cheerful': 1, 'insensibly': 1, 'strangeness': 1, 'sickly': 1, 'jarring': 1, 'upset': 1, 'nerve': 1, 'petulance': 1, 'impatient': 1, 'lugged': 1, 'reeling': 1, 'clap': 1, 'stunned': 1, 'pitiless': 1, 'overset': 1, 'mauve': 1, 'blossoms': 1, 'dropping': 1, 'shower': 1, 'stones': 1, 'rebounding': 1, 'skin': 1, 'fine': 1, 'hospitality': 1, 'loomed': 1, 'indistinctly': 1, 'invisible': 1, 'winged': 1, 'vertically': 1, 'hover': 1, 'sightless': 1, 'imparted': 1, 'advance': 1, 'recede': 1, 'denser': 1, 'tore': 1, 'threadbare': 1, 'lightening': 1, 'temerity': 1, 'withdrawn': 1, 'cruelty': 1, 'manliness': 1, 'developed': 1, 'unsympathetic': 1, 'overwhelmingly': 1, 'powerful': 1, 'disgusting': 1, 'likeness': 1, 'slain': 1, 'intricate': 1, 'parapets': 1, 'wooded': 1, 'lessening': 1, 'storm': 1, 'panic': 1, 'frantically': 1, 'strove': 1, 'trailing': 1, 'summer': 1, 'shreds': 1, 'nothingness': 1, 'picked': 1, 'unmelted': 1, 'hailstones': 1, 'piled': 1, 'courses': 1, 'naked': 1, 'bird': 1, 'knowing': 1, 'hawk': 1, 'swoop': 1, 'grappled': 1, 'fiercely': 1, 'wrist': 1, 'knee': 1, 'desperate': 1, 'onset': 1, 'panting': 1, 'heavily': 1, 'recovery': 1, 'prompt': 1, 'fearfully': 1, 'pathway': 1, 'tunic': 1, 'girdled': 1, 'waist': 1, 'leather': 1, 'buskins': 1, 'noticing': 1, 'consumptive': 1, 'hectic': 1, 'regained': 1, 'iv': 1, 'liquid': 1, 'addressed': 1, 'alarming': 1, 'inspired': 1, 'gentleness': 1, 'childlike': 1, 'pins': 1, 'happily': 1, 'unscrewed': 1, 'communication': 1, 'peculiarities': 1, 'dresden': 1, 'uniformly': 1, 'curly': 1, 'mouths': 1, 'chins': 1, 'mild': 1, 'egotism': 1, 
'communicate': 1, 'speaking': 1, 'notes': 1, 'quaintly': 1, 'chequered': 1, 'astonished': 1, 'imitating': 1, 'fools': 1, 'incredibly': 1, 'level': 1, 'suspended': 1, 'vain': 1, 'rendering': 1, 'thunderclap': 1, 'withdrew': 1, 'bowed': 1, 'chain': 1, 'applause': 1, 'laughingly': 1, 'culture': 1, 'created': 1, 'someone': 1, 'plaything': 1, 'exhibited': 1, 'edifice': 1, 'anticipations': 1, 'posterity': 1, 'merriment': 1, 'entry': 1, 'portals': 1, 'yawned': 1, 'shadowy': 1, 'tangled': 1, 'neglected': 1, 'weedless': 1, 'spikes': 1, 'measuring': 1, 'waxen': 1, 'petals': 1, 'shrubs': 1, 'examine': 1, 'closely': 1, 'richly': 1, 'observe': 1, 'carving': 1, 'narrowly': 1, 'suggestions': 1, 'dingy': 1, 'garlanded': 1, 'colored': 1, 'whirl': 1, 'proportionately': 1, 'glazed': 1, 'coloured': 1, 'unglazed': 1, 'admitted': 1, 'tempered': 1, 'deeply': 1, 'channelled': 1, 'frequented': 1, 'transverse': 1, 'hypertrophied': 1, 'raspberry': 1, 'conductors': 1, 'signing': 1, 'likewise': 1, 'ceremony': 1, 'peel': 1, 'loath': 1, 'example': 1, 'thirsty': 1, 'hungry': 1, 'dilapidated': 1, 'geometrical': 1, 'fractured': 1, 'picturesque': 1, 'silky': 1, 'diet': 1, 'strict': 1, 'vegetarians': 1, 'carnal': 1, 'cravings': 1, 'frugivorous': 1, 'horses': 1, 'dogs': 1, 'ichthyosaurus': 1, 'season': 1, 'floury': 1, 'sided': 1, 'husk': 1, 'especially': 1, 'staple': 1, 'checked': 1, 'resolute': 1, 'interrogative': 1, 'conveying': 1, 'stare': 1, 'inextinguishable': 1, 'haired': 1, 'intention': 1, 'repeated': 1, 'attempts': 1, 'caused': 1, 'immense': 1, 'amusement': 1, 'schoolmaster': 1, 'persisted': 1, 'noun': 1, 'command': 1, 'demonstrative': 1, 'pronouns': 1, 'verb': 1, 'interrogations': 1, 'lessons': 1, 'hosts': 1, 'wander': 1, 'toy': 1, 'conversational': 1, 'disregard': 1, 'hunger': 1, 'satisfied': 1, 'gesticulated': 1, 'devices': 1, 'confusing': 1, 'situated': 1, 'shifted': 1, 'recorded': 1, 'splendour': 1, 'bound': 1, 'labyrinth': 1, 'precipitous': 1, 'crumpled': 1, 'pagoda': 1, 'nettles': 1, 
'wonderfully': 1, 'tinted': 1, 'stinging': 1, 'determine': 1, 'intimation': 1, 'terrace': 1, 'single': 1, 'household': 1, 'cottage': 1, 'characteristic': 1, 'communism': 1, 'hairless': 1, 'visage': 1, 'girlish': 1, 'rotundity': 1, 'limb': 1, 'differences': 1, 'texture': 1, 'mark': 1, 'alike': 1, 'miniatures': 1, 'parents': 1, 'precocious': 1, 'physically': 1, 'opinion': 1, 'softness': 1, 'differentiation': 1, 'occupations': 1, 'militant': 1, 'childbearing': 1, 'becomes': 1, 'blessing': 1, 'specialization': 1, 'disappears': 1, 'remind': 1, 'reality': 1, 'oddness': 1, 'thread': 1, 'miraculous': 1, 'griffins': 1, 'flaming': 1, 'crimson': 1, 'burnished': 1, 'steel': 1, 'dotted': 1, 'silvery': 1, 'obelisk': 1, 'hedges': 1, 'proprietary': 1, 'rights': 1, 'facet': 1, 'ruddy': 1, 'sets': 1, 'feebleness': 1, 'ameliorating': 1, 'civilizing': 1, 'climax': 1, 'united': 1, 'projects': 1, 'harvest': 1, 'sanitation': 1, 'stage': 1, 'attacked': 1, 'department': 1, 'spreads': 1, 'persistently': 1, 'horticulture': 1, 'cultivate': 1, 'favourite': 1, 'selective': 1, 'peach': 1, 'seedless': 1, 'grape': 1, 'sweeter': 1, 'flower': 1, 'breed': 1, 'ideals': 1, 'tentative': 1, 'limited': 1, 'organized': 1, 'eddies': 1, 'educated': 1, 'operating': 1, 'subjugation': 1, 'wisely': 1, 'vegetable': 1, 'leaped': 1, 'gnats': 1, 'weeds': 1, 'butterflies': 1, 'flew': 1, 'ideal': 1, 'preventive': 1, 'stamped': 1, 'evidence': 1, 'contagious': 1, 'processes': 1, 'putrefaction': 1, 'changes': 1, 'triumphs': 1, 'housed': 1, 'shelters': 1, 'gloriously': 1, 'economical': 1, 'shop': 1, 'traffic': 1, 'commerce': 1, 'constitutes': 1, 'paradise': 1, 'adaptations': 1, 'biological': 1, 'errors': 1, 'cause': 1, 'vigour': 1, 'hardship': 1, 'active': 1, 'weaker': 1, 'loyal': 1, 'alliance': 1, 'capable': 1, 'restraint': 1, 'decision': 1, 'emotions': 1, 'arise': 1, 'offspring': 1, 'parental': 1, 'justification': 1, 'support': 1, 'sentiment': 1, 'arising': 1, 'connubial': 1, 'maternity': 1, 'sorts': 1, 'unnecessary': 
1, 'survivals': 1, 'discords': 1, 'slightness': 1, 'strengthened': 1, 'belief': 1, 'energetic': 1, 'vitality': 1, 'alter': 1, 'altered': 1, 'tendencies': 1, 'desires': 1, 'source': 1, 'failure': 1, 'hindrances': 1, 'constitution': 1, 'outlet': 1, 'surgings': 1, 'purposeless': 1, 'settled': 1, 'peace': 1, 'takes': 1, 'eroticism': 1, 'languor': 1, 'impetus': 1, 'adorn': 1, 'sing': 1, 'fade': 1, 'inactivity': 1, 'hateful': 1, 'secret': 1, 'delicious': 1, 'checks': 1, 'succeeded': 1, 'stationary': 1, 'abandoned': 1, 'v': 1, 'gibbous': 1, 'overflow': 1, 'noiseless': 1, 'flitted': 1, 'chilled': 1, 'complacency': 1, 'stoutly': 1, 'conviction': 1, 'lash': 1, 'losing': 1, 'grip': 1, 'stanching': 1, 'trickle': 1, 'certainty': 1, 'excessive': 1, 'assurance': 1, 'cursed': 1, 'thereby': 1, 'faced': 1, 'towered': 1, 'mockery': 1, 'consoled': 1, 'imagining': 1, 'dismayed': 1, 'unsuspected': 1, 'intervention': 1, 'invention': 1, 'produced': 1, 'duplicate': 1, 'attachment': 1, 'prevented': 1, 'tampering': 1, 'hid': 1, 'startling': 1, 'deer': 1, 'gashed': 1, 'bleeding': 1, 'sobbing': 1, 'raving': 1, 'uneven': 1, 'malachite': 1, 'shin': 1, 'inarticulate': 1, 'splutter': 1, 'flare': 1, 'angry': 1, 'laying': 1, 'reasoning': 1, 'knocking': 1, 'stumbling': 1, 'maddened': 1, 'hopelessly': 1, 'raved': 1, 'screaming': 1, 'weeping': 1, 'sparrows': 1, 'desertion': 1, 'behoves': 1, 'materials': 1, 'cunning': 1, 'scrambled': 1, 'bathe': 1, 'stiff': 1, 'soiled': 1, 'equal': 1, 'questionings': 1, 'conveyed': 1, 'stolid': 1, 'hardest': 1, 'task': 1, 'devil': 1, 'begotten': 1, 'anger': 1, 'curbed': 1, 'advantage': 1, 'counsel': 1, 'groove': 1, 'ripped': 1, 'midway': 1, 'marks': 1, 'removal': 1, 'sloth': 1, 'closer': 1, 'highly': 1, 'decorated': 1, 'framed': 1, 'rapped': 1, 'discontinuous': 1, 'frames': 1, 'handles': 1, 'keyholes': 1, 'infer': 1, 'apple': 1, 'beckoned': 1, 'wish': 1, 'behaved': 1, 'grossly': 1, 'improper': 1, 'insult': 1, 'chap': 1, 'ashamed': 1, 'temper': 1, 'robe': 1, 
'repugnance': 1, 'beaten': 1, 'banged': 1, 'chuckle': 1, 'mistaken': 1, 'pebble': 1, 'hammered': 1, 'flattened': 1, 'coil': 1, 'powdery': 1, 'outbreaks': 1, 'furtively': 1, 'occidental': 1, 'inactive': 1, 'aimlessly': 1, 'wrecking': 1, 'sit': 1, 'puzzle': 1, 'lies': 1, 'monomania': 1, 'clues': 1, 'humour': 1, 'situation': 1, 'study': 1, 'anxiety': 1, 'although': 1, 'tolerably': 1, 'avoidance': 1, 'abstain': 1, 'pursuit': 1, 'footing': 1, 'exclusively': 1, 'composed': 1, 'concrete': 1, 'verbs': 1, 'abstract': 1, 'figurative': 1, 'simplest': 1, 'propositions': 1, 'tethered': 1, 'exuberant': 1, 'richness': 1, 'climbed': 1, 'abundance': 1, 'endlessly': 1, 'varied': 1, 'style': 1, 'clustering': 1, 'thickets': 1, 'evergreens': 1, 'laden': 1, 'ferns': 1, 'feature': 1, 'depth': 1, 'walk': 1, 'rimmed': 1, 'wrought': 1, 'shafted': 1, 'engine': 1, 'sucked': 1, 'connect': 1, 'extensive': 1, 'associate': 1, 'sanitary': 1, 'obvious': 1, 'drains': 1, 'bells': 1, 'modes': 1, 'conveyance': 1, 'conveniences': 1, 'visions': 1, 'utopias': 1, 'detail': 1, 'arrangements': 1, 'details': 1, 'obtain': 1, 'realities': 1, 'conceive': 1, 'africa': 1, 'tribe': 1, 'companies': 1, 'telephone': 1, 'telegraph': 1, 'wires': 1, 'parcels': 1, 'delivery': 1, 'postal': 1, 'orders': 1, 'willing': 1, 'untravelled': 1, 'apprehend': 1, 'sepulture': 1, 'suggestive': 1, 'tombs': 1, 'cemeteries': 1, 'somewhere': 1, 'explorings': 1, 'defeated': 1, 'aged': 1, 'infirm': 1, 'endure': 1, 'explored': 1, 'halls': 1, 'apartments': 1, 'fabrics': 1, 'renewal': 1, 'undecorated': 1, 'metalwork': 1, 'creative': 1, 'shops': 1, 'workshops': 1, 'importations': 1, 'playing': 1, 'playful': 1, 'waterless': 1, 'interpolated': 1, 'letters': 1, 'visit': 1, 'sort': 1, 'cramp': 1, 'downstream': 1, 'strongly': 1, 'moderate': 1, 'swimmer': 1, 'deficiency': 1, 'slightest': 1, 'weakly': 1, 'drowning': 1, 'hurriedly': 1, 'wading': 1, 'mite': 1, 'estimate': 1, 'returning': 1, 'garland': 1, 'display': 1, 'appreciation': 1, 'gift': 1, 
'arbour': 1, 'smiles': 1, 'friendliness': 1, 'kissed': 1, 'hers': 1, 'appropriate': 1, 'friendship': 1, 'lasted': 1, 'tire': 1, 'plaintively': 1, 'miniature': 1, 'flirtation': 1, 'expostulations': 1, 'parting': 1, 'frantic': 1, 'cling': 1, 'inflicted': 1, 'seeming': 1, 'fond': 1, 'showing': 1, 'cared': 1, 'doll': 1, 'neighbourhood': 1, 'tiny': 1, 'fearless': 1, 'oddest': 1, 'threatening': 1, 'grimaces': 1, 'observing': 1, 'droves': 1, 'blockhead': 1, 'insisted': 1, 'slumbering': 1, 'multitudes': 1, 'triumphed': 1, 'acquaintance': 1, 'including': 1, 'pillowed': 1, 'slips': 1, 'dreaming': 1, 'disagreeably': 1, 'drowned': 1, 'anemones': 1, 'chamber': 1, 'flagstones': 1, 'virtue': 1, 'sunrise': 1, 'mingled': 1, 'cheerless': 1, 'quickly': 1, 'leash': 1, 'uncertain': 1, 'doubted': 1, 'colouring': 1, 'keenly': 1, 'whence': 1, 'dated': 1, 'grant': 1, 'allen': 1, 'amused': 1, 'argued': 1, 'overcrowded': 1, 'unsatisfying': 1, 'associated': 1, 'search': 1, 'substitute': 1, 'deadlier': 1, 'cooling': 1, 'younger': 1, 'darwin': 1, 'forget': 1, 'planets': 1, 'parent': 1, 'catastrophes': 1, 'occur': 1, 'suffered': 1, 'fed': 1, 'blocked': 1, 'contrast': 1, 'brilliancy': 1, 'impenetrably': 1, 'swim': 1, 'spellbound': 1, 'instinctive': 1, 'eyeballs': 1, 'overcoming': 1, 'extent': 1, 'controlled': 1, 'darted': 1, 'blundered': 1, 'ruined': 1, 'imperfect': 1, 'flaxen': 1, 'fours': 1, 'forearms': 1, 'shudder': 1, 'forming': 1, 'ladder': 1, 'succeed': 1, 'persuading': 1, 'dawned': 1, 'differentiated': 1, 'heir': 1, 'lemur': 1, 'scheme': 1, 'related': 1, 'withal': 1, 'amorous': 1, 'male': 1, 'female': 1, 'considered': 1, 'bad': 1, 'apertures': 1, 'visibly': 1, 'interested': 1, 'amuse': 1, 'economic': 1, 'emergence': 1, 'fish': 1, 'kentucky': 1, 'capacity': 1, 'reflecting': 1, 'cat': 1, 'fumbling': 1, 'awkward': 1, 'carriage': 1, 'reinforced': 1, 'sensitiveness': 1, 'retina': 1, 'tunnelled': 1, 'enormously': 1, 'tunnellings': 1, 'habitat': 1, 'ramifications': 1, 'anticipate': 1, 
'temporary': 1, 'capitalist': 1, 'labourer': 1, 'key': 1, 'wildly': 1, 'utilize': 1, 'ornamental': 1, 'purposes': 1, 'metropolitan': 1, 'electric': 1, 'railways': 1, 'subways': 1, 'workrooms': 1, 'restaurants': 1, 'multiply': 1, 'birthright': 1, 'factories': 1, 'spending': 1, 'worker': 1, 'practically': 1, 'refinement': 1, 'rude': 1, 'portions': 1, 'prettier': 1, 'shut': 1, 'intrusion': 1, 'educational': 1, 'facilities': 1, 'temptations': 1, 'habits': 1, 'exchange': 1, 'promotion': 1, 'intermarriage': 1, 'retards': 1, 'lines': 1, 'stratification': 1, 'frequent': 1, 'haves': 1, 'pursuing': 1, 'nots': 1, 'workers': 1, 'rent': 1, 'caverns': 1, 'starve': 1, 'suffocated': 1, 'arrears': 1, 'miserable': 1, 'rebellious': 1, 'permanent': 1, 'survivors': 1, 'happy': 1, 'theirs': 1, 'etiolated': 1, 'moral': 1, 'operation': 1, 'imagined': 1, 'armed': 1, 'perfected': 1, 'industrial': 1, 'cicerone': 1, 'utopian': 1, 'supposition': 1, 'zenith': 1, 'degeneration': 1, 'dwindling': 1, 'grounders': 1, 'modification': 1, 'troublesome': 1, 'doubts': 1, 'masters': 1, 'restore': 1, 'disappointed': 1, 'answer': 1, 'topic': 1, 'unendurable': 1, 'harshly': 1, 'concerned': 1, 'banishing': 1, 'inheritance': 1, 'clapping': 1, 'vi': 1, 'manifestly': 1, 'pallid': 1, 'bodies': 1, 'worms': 1, 'zoological': 1, 'filthily': 1, 'sympathetic': 1, 'influence': 1, 'disgust': 1, 'health': 1, 'oppressed': 1, 'definite': 1, 'noiselessly': 1, 'appearances': 1, 'whitened': 1, 'lemurs': 1, 'replaced': 1, 'shirks': 1, 'duty': 1, 'penetrating': 1, 'mysteries': 1, 'companion': 1, 'appalled': 1, 'restlessness': 1, 'insecurity': 1, 'afield': 1, 'expeditions': 1, 'combe': 1, 'observed': 1, 'banstead': 1, 'largest': 1, 'facade': 1, 'oriental': 1, 'lustre': 1, 'tint': 1, 'bluish': 1, 'chinese': 1, 'aspect': 1, 'tiring': 1, 'circuit': 1, 'welcome': 1, 'caresses': 1, 'deception': 1, 'enable': 1, 'shirk': 1, 'danced': 1, 'strangely': 1, 'disconcerted': 1, 'bye': 1, 'piteous': 1, 'pull': 1, 'opposition': 1, 'nerved': 1, 
'proceed': 1, 'roughly': 1, 'agonized': 1, 'reassure': 1, 'smaller': 1, 'lighter': 1, 'acutely': 1, 'glancing': 1, 'projection': 1, 'thudding': 1, 'discomfort': 1, 'relief': 1, 'slender': 1, 'loophole': 1, 'ached': 1, 'distressing': 1, 'pumping': 1, 'roused': 1, 'snatched': 1, 'retreating': 1, 'impenetrable': 1, 'abnormally': 1, 'sensitive': 1, 'pupils': 1, 'abysmal': 1, 'fishes': 1, 'reflected': 1, 'apart': 1, 'fled': 1, 'vanishing': 1, 'gutters': 1, 'tunnels': 1, 'strangest': 1, 'unaided': 1, 'arched': 1, 'cavern': 1, 'necessarily': 1, 'spectral': 1, 'sheltered': 1, 'stuffy': 1, 'halitus': 1, 'freshly': 1, 'shed': 1, 'vista': 1, 'meal': 1, 'carnivorous': 1, 'survived': 1, 'furnish': 1, 'joint': 1, 'unmeaning': 1, 'stung': 1, 'spot': 1, 'absurd': 1, 'assumption': 1, 'infinitely': 1, 'kodak': 1, 'weapons': 1, 'endowed': 1, 'store': 1, 'astonishing': 1, 'novelty': 1, 'lank': 1, 'beings': 1, 'plucking': 1, 'clothing': 1, 'loudly': 1, 'whispering': 1, 'discordantly': 1, 'alarmed': 1, 'eking': 1, 'rustling': 1, 'hurried': 1, 'mistaking': 1, 'haul': 1, 'waved': 1, 'nauseatingly': 1, 'chinless': 1, 'lidless': 1, 'pump': 1, 'tugged': 1, 'kicking': 1, 'clutches': 1, 'wretch': 1, 'nigh': 1, 'secured': 1, 'boot': 1, 'trophy': 1, 'climb': 1, 'greatest': 1, 'keeping': 1, 'blinding': 1, 'soil': 1, 'smelt': 1, 'clean': 1, 'vii': 1, 'worse': 1, 'sustaining': 1, 'ultimate': 1, 'impeded': 1, 'simplicity': 1, 'forces': 1, 'overcome': 1, 'sickening': 1, 'malign': 1, 'loathed': 1, 'beast': 1, 'incomprehensible': 1, 'remarks': 1, 'guess': 1, 'degree': 1, 'villainy': 1, 'favoured': 1, 'resulted': 1, 'evolution': 1, 'relationship': 1, 'carolingian': 1, 'kings': 1, 'decayed': 1, 'possessed': 1, 'sufferance': 1, 'daylit': 1, 'intolerable': 1, 'inferred': 1, 'maintained': 1, 'habitual': 1, 'service': 1, 'horse': 1, 'paws': 1, 'enjoys': 1, 'organism': 1, 'reversed': 1, 'nemesis': 1, 'begun': 1, 'anew': 1, 'becoming': 1, 'reacquainted': 1, 'floated': 1, 'stirred': 1, 'meditations': 1, 'ours': 
1, 'ripe': 1, 'prime': 1, 'paralyse': 1, 'defend': 1, 'delay': 1, 'fastness': 1, 'base': 1, 'realizing': 1, 'exposed': 1, 'shuddered': 1, 'commended': 1, 'practicable': 1, 'dexterous': 1, 'climbers': 1, 'reckoned': 1, 'eighteen': 1, 'moist': 1, 'distances': 1, 'deceptively': 1, 'nail': 1, 'indoors': 1, 'silhouetted': 1, 'delighted': 1, 'desired': 1, 'darting': 1, 'pick': 1, 'concluded': 1, 'eccentric': 1, 'vase': 1, 'floral': 1, 'decoration': 1, 'utilized': 1, 'reminds': 1, 'paused': 1, 'silently': 1, 'mallows': 1, 'narrative': 1, 'hush': 1, 'wimbledon': 1, 'contrived': 1, 'stops': 1, 'senses': 1, 'preternaturally': 1, 'sharpened': 1, 'hollowness': 1, 'receive': 1, 'invasion': 1, 'burrows': 1, 'declaration': 1, 'deepened': 1, 'tightly': 1, 'waded': 1, 'opposite': 1, 'statue': 1, 'faun': 1, 'minus': 1, 'acacias': 1, 'brow': 1, 'sore': 1, 'lowered': 1, 'hide': 1, 'dense': 1, 'roots': 1, 'stumble': 1, 'boles': 1, 'decided': 1, 'asleep': 1, 'wrapped': 1, 'moonrise': 1, 'imperceptible': 1, 'lifetimes': 1, 'rearranged': 1, 'groupings': 1, 'milky': 1, 'streamer': 1, 'yore': 1, 'southward': 1, 'sirius': 1, 'scintillating': 1, 'kindly': 1, 'dwarfed': 1, 'troubles': 1, 'gravities': 1, 'terrestrial': 1, 'unfathomable': 1, 'precessional': 1, 'cycle': 1, 'pole': 1, 'describes': 1, 'revolutions': 1, 'activity': 1, 'traditions': 1, 'organizations': 1, 'nations': 1, 'languages': 1, 'literatures': 1, 'aspirations': 1, 'ancestry': 1, 'shiver': 1, 'starlike': 1, 'dismissed': 1, 'whiled': 1, 'dozed': 1, 'peaked': 1, 'overtaking': 1, 'overflowing': 1, 'swollen': 1, 'ankle': 1, 'forbidding': 1, 'wherewith': 1, 'dainty': 1, 'bottom': 1, 'pitied': 1, 'rill': 1, 'flood': 1, 'discriminating': 1, 'monkey': 1, 'prejudice': 1, 'sons': 1, 'cannibal': 1, 'torment': 1, 'fatted': 1, 'preyed': 1, 'preserve': 1, 'rigorous': 1, 'punishment': 1, 'selfishness': 1, 'content': 1, 'labours': 1, 'excuse': 1, 'fullness': 1, 'carlyle': 1, 'scorn': 1, 'wretched': 1, 'claim': 1, 'sympathy': 1, 'sharer': 1, 
'pursue': 1, 'procure': 1, 'torch': 1, 'arrange': 1, 'battering': 1, 'ram': 1, 'schemes': 1, 'chosen': 1, 'dwelling': 1, 'viii': 1, 'noon': 1, 'ragged': 1, 'estuary': 1, 'creek': 1, 'wandsworth': 1, 'battersea': 1, 'happening': 1, 'foolishly': 1, 'interpret': 1, 'customary': 1, 'tiled': 1, 'miscellaneous': 1, 'objects': 1, 'shrouded': 1, 'gaunt': 1, 'oblique': 1, 'megatherium': 1, 'barrel': 1, 'brontosaurus': 1, 'confirmed': 1, 'shelves': 1, 'clearing': 1, 'preservation': 1, 'kensington': 1, 'palaeontological': 1, 'bacteria': 1, 'ninety': 1, 'hundredths': 1, 'sureness': 1, 'slowness': 1, 'treasures': 1, 'threaded': 1, 'strings': 1, 'reeds': 1, 'instances': 1, 'bodily': 1, 'deadened': 1, 'footsteps': 1, 'urchin': 1, 'monument': 1, 'preoccupation': 1, 'deal': 1, 'palaeontology': 1, 'historical': 1, 'galleries': 1, 'library': 1, 'vastly': 1, 'interesting': 1, 'spectacle': 1, 'oldtime': 1, 'geology': 1, 'transversely': 1, 'minerals': 1, 'gunpowder': 1, 'saltpeter': 1, 'nitrates': 1, 'doubtless': 1, 'deliquesced': 1, 'train': 1, 'specialist': 1, 'mineralogy': 1, 'parallel': 1, 'history': 1, 'recognition': 1, 'blackened': 1, 'stuffed': 1, 'desiccated': 1, 'mummies': 1, 'jars': 1, 'patent': 1, 'readjustments': 1, 'proportions': 1, 'angle': 1, 'intervals': 1, 'globes': 1, 'ceiling': 1, 'originally': 1, 'artificially': 1, 'bulks': 1, 'linger': 1, 'vaguest': 1, 'solve': 1, 'sloped': 1, 'footnote': 1, 'ed': 1, 'slit': 1, 'area': 1, 'puzzling': 1, 'intent': 1, 'diminution': 1, 'apprehensions': 1, 'revived': 1, 'academic': 1, 'projected': 1, 'signal': 1, 'grasping': 1, 'whimper': 1, 'correctly': 1, 'snapped': 1, 'strain': 1, 'rejoined': 1, 'encounter': 1, 'longed': 1, 'kill': 1, 'disinclination': 1, 'slake': 1, 'thirst': 1, 'murder': 1, 'military': 1, 'chapel': 1, 'flags': 1, 'rags': 1, 'semblance': 1, 'print': 1, 'warped': 1, 'boards': 1, 'clasps': 1, 'literary': 1, 'moralized': 1, 'ambition': 1, 'keenest': 1, 'enormous': 1, 'wilderness': 1, 'rotting': 1, 'transactions': 1, 
'papers': 1, 'optics': 1, 'technical': 1, 'chemistry': 1, 'useful': 1, 'damp': 1, 'carpeting': 1, 'performed': 1, 'composite': 1, 'whistling': 1, 'leal': 1, 'cancan': 1, 'skirt': 1, 'tail': 1, 'permitted': 1, 'original': 1, 'inventive': 1, 'wear': 1, 'immemorial': 1, 'fortunate': 1, 'unlikelier': 1, 'chance': 1, 'hermetically': 1, 'paraffin': 1, 'wax': 1, 'unmistakable': 1, 'volatile': 1, 'sepia': 1, 'painting': 1, 'fossil': 1, 'belemnite': 1, 'perished': 1, 'fossilized': 1, 'throw': 1, 'inflammable': 1, 'explosives': 1, 'helpful': 1, 'elated': 1, 'rusting': 1, 'hatchet': 1, 'sword': 1, 'guns': 1, 'pistols': 1, 'rifles': 1, 'powder': 1, 'rotted': 1, 'shattered': 1, 'idols': 1, 'polynesian': 1, 'mexican': 1, 'grecian': 1, 'yielding': 1, 'wrote': 1, 'nose': 1, 'steatite': 1, 'america': 1, 'exhibits': 1, 'lignite': 1, 'fresher': 1, 'tin': 1, 'merest': 1, 'accident': 1, 'dynamite': 1, 'eureka': 1, 'joy': 1, 'selecting': 1, 'essay': 1, 'dummies': 1, 'chances': 1, 'non': 1, 'court': 1, 'turfed': 1, 'refreshed': 1, 'defences': 1, 'needed': 1, 'refrained': 1, 'forcing': 1, 'inadequate': 1, 'ix': 1, 'ere': 1, 'purposed': 1, 'pushing': 1, 'dried': 1, 'litter': 1, 'thus': 1, 'loaded': 1, 'sleepiness': 1, 'shrubby': 1, 'fearing': 1, 'singular': 1, 'impending': 1, 'served': 1, 'warning': 1, 'onward': 1, 'feverish': 1, 'irritable': 1, 'scrub': 1, 'insidious': 1, 'approach': 1, 'calculated': 1, 'safer': 1, 'abandon': 1, 'firewood': 1, 'reluctantly': 1, 'amaze': 1, 'friends': 1, 'atrocious': 1, 'temperate': 1, 'climate': 1, 'burn': 1, 'focused': 1, 'dewdrops': 1, 'tropical': 1, 'districts': 1, 'lightning': 1, 'blast': 1, 'blacken': 1, 'widespread': 1, 'smoulder': 1, 'fermentation': 1, 'decadence': 1, 'licking': 1, 'play': 1, 'herself': 1, 'struggles': 1, 'plunged': 1, 'crowded': 1, 'adjacent': 1, 'avoid': 1, 'rustle': 1, 'vessels': 1, 'grimly': 1, 'tug': 1, 'scratched': 1, 'fizzed': 1, 'prepared': 1, 'fright': 1, 'breathe': 1, 'split': 1, 'flared': 1, 'knelt': 1, 'lifted': 1, 
'fainted': 1, 'manoeuvring': 1, 'sweat': 1, 'rapidly': 1, 'build': 1, 'encamp': 1, 'bole': 1, 'collecting': 1, 'carbuncles': 1, 'forms': 1, 'blinded': 1, 'grind': 1, 'whoop': 1, 'bonfire': 1, 'casting': 1, 'choking': 1, 'smoky': 1, 'moreover': 1, 'replenishing': 1, 'exertion': 1, 'nod': 1, 'bitterness': 1, 'soul': 1, 'heaped': 1, 'web': 1, 'overpowered': 1, 'nipping': 1, 'rolled': 1, 'succulent': 1, 'giving': 1, 'bone': 1, 'exultation': 1, 'accompany': 1, 'fighting': 1, 'pitch': 1, 'battered': 1, 'incessant': 1, 'stream': 1, 'agape': 1, 'spark': 1, 'roar': 1, 'stepping': 1, 'explosive': 1, 'outflanked': 1, 'weird': 1, 'tumulus': 1, 'surmounted': 1, 'writhing': 1, 'encircling': 1, 'fence': 1, 'crippling': 1, 'moans': 1, 'helplessness': 1, 'quivering': 1, 'elude': 1, 'somewhat': 1, 'happen': 1, 'uncanny': 1, 'coiling': 1, 'uprush': 1, 'streamed': 1, 'tatters': 1, 'canopy': 1, 'belonged': 1, 'universe': 1, 'fists': 1, 'persuaded': 1, 'screamed': 1, 'thrice': 1, 'rush': 1, 'streaming': 1, 'whitening': 1, 'blackening': 1, 'stumps': 1, 'diminishing': 1, 'searched': 1, 'relieved': 1, 'massacre': 1, 'abominations': 1, 'island': 1, 'haze': 1, 'bearings': 1, 'remnant': 1, 'souls': 1, 'clearer': 1, 'tied': 1, 'limped': 1, 'pulsated': 1, 'internally': 1, 'intensest': 1, 'overwhelming': 1, 'sorrow': 1, 'fireside': 1, 'thoughts': 1, 'longing': 1, 'trouser': 1, 'leaked': 1, 'x': 1, 'viewed': 1, 'conclusions': 1, 'bitterly': 1, 'magnificent': 1, 'fertile': 1, 'banks': 1, 'gay': 1, 'saved': 1, 'stab': 1, 'blots': 1, 'cupolas': 1, 'enemies': 1, 'provided': 1, 'grieved': 1, 'committed': 1, 'hopes': 1, 'property': 1, 'wealth': 1, 'toiler': 1, 'unemployed': 1, 'unsolved': 1, 'law': 1, 'versatility': 1, 'compensation': 1, 'environment': 1, 'appeals': 1, 'useless': 1, 'partake': 1, 'variety': 1, 'drifted': 1, 'perfection': 1, 'feeding': 1, 'disjointed': 1, 'mother': 1, 'retained': 1, 'initiative': 1, 'mortal': 1, 'wit': 1, 'invent': 1, 'fatigues': 1, 'grief': 1, 'tranquil': 1, 'sleepy': 
1, 'theorizing': 1, 'dozing': 1, 'catching': 1, 'refreshing': 1, 'awoke': 1, 'sunsetting': 1, 'napping': 1, 'stretching': 1, 'grooves': 1, 'apartment': 1, 'elaborate': 1, 'preparations': 1, 'siege': 1, 'meek': 1, 'surrender': 1, 'suppressing': 1, 'inclination': 1, 'stepped': 1, 'oiled': 1, 'cleaned': 1, 'clang': 1, 'trapped': 1, 'chuckled': 1, 'gleefully': 1, 'murmuring': 1, 'calmly': 1, 'fix': 1, 'depart': 1, 'persistent': 1, 'studs': 1, 'butt': 1, 'ring': 1, 'described': 1, 'xi': 1, 'sickness': 1, 'properly': 1, 'vibrated': 1, 'unheeding': 1, 'amazed': 1, 'records': 1, 'reversing': 1, 'indicators': 1, 'seconds': 1, 'palpitating': 1, 'prodigious': 1, 'indicative': 1, 'marked': 1, 'alternations': 1, 'stretch': 1, 'brooded': 1, 'comet': 1, 'broader': 1, 'given': 1, 'glowed': 1, 'reverted': 1, 'sullen': 1, 'slowing': 1, 'tidal': 1, 'drag': 1, 'cautiously': 1, 'reverse': 1, 'scale': 1, 'outlines': 1, 'indian': 1, 'starless': 1, 'scarlet': 1, 'hull': 1, 'lichen': 1, 'perpetual': 1, 'wan': 1, 'breakers': 1, 'waves': 1, 'oily': 1, 'swell': 1, 'gentle': 1, 'broke': 1, 'lurid': 1, 'oppression': 1, 'mountaineering': 1, 'rarefied': 1, 'scream': 1, 'butterfly': 1, 'slanting': 1, 'dismal': 1, 'firmly': 1, 'yonder': 1, 'uncertainly': 1, 'antennae': 1, 'carters': 1, 'whips': 1, 'waving': 1, 'stalked': 1, 'gleaming': 1, 'corrugated': 1, 'ornamented': 1, 'bosses': 1, 'greenish': 1, 'blotched': 1, 'sinister': 1, 'apparition': 1, 'tickling': 1, 'fly': 1, 'brush': 1, 'ear': 1, 'threadlike': 1, 'qualm': 1, 'antenna': 1, 'alive': 1, 'algal': 1, 'descending': 1, 'dozens': 1, 'foliated': 1, 'desolation': 1, 'northward': 1, 'stony': 1, 'poisonous': 1, 'lichenous': 1, 'hurts': 1, 'lungs': 1, 'appalling': 1, 'earthy': 1, 'crustacea': 1, 'fascination': 1, 'ebb': 1, 'obscure': 1, 'tenth': 1, 'multitude': 1, 'crabs': 1, 'livid': 1, 'liverworts': 1, 'lichens': 1, 'flecked': 1, 'bitter': 1, 'assailed': 1, 'sable': 1, 'fringes': 1, 'ice': 1, 'expanse': 1, 'ocean': 1, 'bloody': 1, 'unfrozen': 1, 
'indefinable': 1, 'sandbank': 1, 'flopping': 1, 'bank': 1, 'deceived': 1, 'twinkle': 1, 'outline': 1, 'concavity': 1, 'bay': 1, 'curve': 1, 'aghast': 1, 'transit': 1, 'freshening': 1, 'gusts': 1, 'showering': 1, 'ripple': 1, 'whisper': 1, 'bleating': 1, 'birds': 1, 'insects': 1, 'background': 1, 'thickened': 1, 'peaks': 1, 'marrow': 1, 'overcame': 1, 'bow': 1, 'shoal': 1, 'football': 1, 'bigger': 1, 'trailed': 1, 'weltering': 1, 'fitfully': 1, 'fainting': 1, 'terrible': 1, 'sustained': 1, 'clambered': 1, 'xii': 1, 'contours': 1, 'ebbed': 1, 'flowed': 1, 'spun': 1, 'zero': 1, 'slackened': 1, 'flapped': 1, 'slowed': 1, 'inversion': 1, 'glided': 1, 'foremost': 1, 'previously': 1, 'hillyer': 1, 'shakily': 1, 'trembled': 1, 'calmer': 1, 'around': 1, 'stagnant': 1, 'begrimed': 1, 'pall': 1, 'mall': 1, 'gazette': 1, 'timepiece': 1, 'clatter': 1, 'sniffed': 1, 'washed': 1, 'dined': 1, 'adventures': 1, 'prophecy': 1, 'speculating': 1, 'destinies': 1, 'hatched': 1, 'fiction': 1, 'assertion': 1, 'stroke': 1, 'enhance': 1, 'nervously': 1, 'grate': 1, 'creak': 1, 'scrape': 1, 'carpet': 1, 'audience': 1, 'absorbed': 1, 'contemplation': 1, 'sixth': 1, 'sigh': 1, 're': 1, 'writer': 1, 'stories': 1, 'puffing': 1, 'mute': 1, 'inquiry': 1, 'scars': 1, 'gynaeceum': 1, 'leant': 1, 'hanged': 1, 'plenty': 1, 'cabs': 1, 'station': 1, 'eluded': 1, 'precious': 1, 'fit': 1, 'ugly': 1, 'ebony': 1, 'translucent': 1, 'glimmering': 1, 'smears': 1, 'bits': 1, 'awry': 1, 'damaged': 1, 'hesitation': 1, 'overwork': 1, 'shared': 1, 'cab': 1, 'gaudy': 1, 'fantastic': 1, 'credible': 1, 'sober': 1, 'substantial': 1, 'bough': 1, 'shaken': 1, 'instability': 1, 'meddle': 1, 'camera': 1, 'knapsack': 1, 'elbow': 1, 'shake': 1, 'busy': 1, 'hoax': 1, 'awfully': 1, 'magazines': 1, 'prove': 1, 'hilt': 1, 'forgive': 1, 'consented': 1, 'comprehending': 1, 'slam': 1, 'publisher': 1, 'barely': 1, 'engagement': 1, 'handle': 1, 'exclamation': 1, 'truncated': 1, 'click': 1, 'ghostly': 1, 'whirling': 1, 'phantasm': 1, 
'rubbed': 1, 'pane': 1, 'skylight': 1, 'servant': 1, 'mr': 1, 'disappointing': 1, 'photographs': 1, 'lifetime': 1, 'everybody': 1, 'knows': 1, 'epilogue': 1, 'choose': 1, 'drinking': 1, 'hairy': 1, 'savages': 1, 'unpolished': 1, 'abysses': 1, 'cretaceous': 1, 'saurians': 1, 'reptilian': 1, 'jurassic': 1, 'phrase': 1, 'wandering': 1, 'plesiosaurus': 1, 'haunted': 1, 'oolitic': 1, 'coral': 1, 'reef': 1, 'saline': 1, 'lakes': 1, 'triassic': 1, 'riddles': 1, 'wearisome': 1, 'solved': 1, 'manhood': 1, 'fragmentary': 1, 'discord': 1, 'culminating': 1, 'discussed': 1, 'cheerlessly': 1, 'advancement': 1, 'makers': 1, 'casual': 1, 'brittle': 1})
tok_f=counter.items()
type(tok_f)  # check what .items() returns: a dict_items view in which each key-value pair becomes a (key, value) tuple
dict_items
list(counter)[:2]  # compare calling list() directly with calling .items() first: the direct call gives a list of the keys only
['the', 'time']
list(tok_f)[:2]  # whereas this gives a list of (key, value) tuples, which is convenient for later steps
[('the', 2261), ('time', 200)]
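The difference between the two conversions is easier to see on a tiny corpus. The following is a minimal sketch, assuming a toy token list (`toy_tokens`) that is not part of the original data; it also shows how the (token, frequency) pairs make a frequency threshold such as `min_freq` easy to apply.

```python
import collections

# Toy corpus (an assumption for illustration, not the Time Machine data)
toy_tokens = [['the', 'time', 'machine'], ['the', 'time'], ['the']]

# Flatten the nested lists and count each token
counter = collections.Counter(tk for line in toy_tokens for tk in line)

keys_only = list(counter)      # keys only, in first-seen order
pairs = list(counter.items())  # (token, frequency) tuples

# With the pairs, filtering by a frequency threshold is a one-liner
min_freq = 2
kept = [tk for tk, freq in pairs if freq >= min_freq]

print(keys_only)
print(pairs)
print(kept)
```

Only the pair form carries the counts, which is why the vocabulary class stores `list(counter.items())` rather than `list(counter)`.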
idx_to_token=[]
idx_to_token+=['<pad>','<bos>','<eos>','<unk>']
idx_to_token
['<pad>', '<bos>', '<eos>', '<unk>']
Let's look at an example: here we build a vocabulary using The Time Machine as the corpus.
vocab=Vocab(tokens)
print(list(vocab.token_to_idx.items())[0:10])
[('<unk>', 0), ('the', 1), ('time', 2), ('machine', 3), ('by', 4), ('h', 5), ('g', 6), ('wells', 7), ('', 8), ('i', 9)]
vocab['the']
1
vocab.to_tokens(1)
'the'
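The round trip from token to index and back can be sketched without the full class. This is a simplified illustration, not the `Vocab` implementation itself: `build_vocab` is a hypothetical helper, the toy corpus is an assumption, and only the `'<unk>'`-at-index-0 layout is taken from the output above.

```python
import collections

def build_vocab(tokens, min_freq=0):
    """Hypothetical helper: map each token to an index, with '<unk>' at 0."""
    counter = collections.Counter(tk for line in tokens for tk in line)
    idx_to_token = ['<unk>']
    # Append tokens in first-seen order, skipping rare ones
    idx_to_token += [tk for tk, freq in counter.items()
                     if freq >= min_freq and tk not in idx_to_token]
    # Invert the list to get the token -> index mapping
    token_to_idx = {tk: idx for idx, tk in enumerate(idx_to_token)}
    return idx_to_token, token_to_idx

toy_tokens = [['the', 'time', 'machine'], ['the', 'time', 'traveller']]
idx_to_token, token_to_idx = build_vocab(toy_tokens)

print(token_to_idx['the'])                # token -> index
print(idx_to_token[token_to_idx['the']])  # index -> token (round trip)
```

Keeping both a list (index to token) and a dict (token to index) makes lookups in either direction constant time.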
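Converting tokens to indices
Using the vocabulary, we can convert each sentence in the original text from a sequence of words into a sequence of indices.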
for i in range(8,10):
    print('word:',tokens[i])
    print('indices:',vocab[tokens[i]])  # calls vocab.__getitem__(tokens[i])
word: ['the', 'time', 'traveller', 'for', 'so', 'it', 'will', 'be', 'convenient', 'to', 'speak', 'of', 'him', '']
indices: [1, 2, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 8]
word: ['was', 'expounding', 'a', 'recondite', 'matter', 'to', 'us', 'his', 'grey', 'eyes', 'shone', 'and']
indices: [21, 22, 23, 24, 25, 17, 26, 27, 28, 29, 30, 31]
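The conversion itself is a lookup per token, with a fallback for words that are not in the vocabulary. The following is a minimal sketch under stated assumptions: the small `token_to_idx` dictionary and the helper `to_indices` are made up for illustration, and mapping unknown words to index 0 mirrors the `'<unk>'` entry shown earlier.

```python
# Hand-written mapping (an assumption for illustration)
token_to_idx = {'<unk>': 0, 'the': 1, 'time': 2, 'traveller': 3}

def to_indices(sentence, token_to_idx, unk=0):
    """Hypothetical helper: look up each token, falling back to the unk index."""
    return [token_to_idx.get(tk, unk) for tk in sentence]

sentences = [['the', 'time', 'traveller'],
             ['the', 'space', 'traveller']]  # 'space' is out of vocabulary

corpus_indices = [to_indices(s, token_to_idx) for s in sentences]
print(corpus_indices)
```

Using `dict.get` with a default keeps the conversion from raising `KeyError` on unseen words.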
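Tokenizing with existing tools
The tokenization method introduced earlier is very simple, and it has at least the following drawbacks:
- Punctuation often carries semantic information, but our method simply discards it
- Words such as "shouldn't" and "doesn't" are handled incorrectly
- Words such as "Mr." and "Dr." are handled incorrectly
We could solve these problems by introducing more complex rules, but in practice there are existing tools that tokenize well. Here we briefly introduce two of them: spaCy and NLTK.
Here is a simple example: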
text="Mr. Chen doesn't agree with my suggestion."
spaCy
import spacy
nlp=spacy.load('en_core_web_sm')  # requires spaCy and the en_core_web_sm model to be installed
doc=nlp(text)
print([token.text for token in doc])
['Mr.', 'Chen', 'does', "n't", 'agree', 'with', 'my', 'suggestion', '.']
NLTK
from nltk.tokenize import word_tokenize
from nltk import data
data.path.append(r'E:/nltk_data')  # nltk_data must be downloaded beforehand; this path is where it is stored
print(word_tokenize(text))
['Mr.', 'Chen', 'does', "n't", 'agree', 'with', 'my', 'suggestion', '.']
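For comparison, the simple approach from earlier in this post can be applied to the same sentence. This is a sketch that reuses the `re.sub('[^a-z]+', ' ', ...)` cleaning step and the split on spaces; the variable names are chosen here for illustration.

```python
import re

text = "Mr. Chen doesn't agree with my suggestion."

# Same cleaning as read_time_machine: lowercase, replace non-letters with a space
cleaned = re.sub('[^a-z]+', ' ', text.strip().lower())

# Same word-level split as tokenize(): split on single spaces
naive_tokens = cleaned.split(' ')
print(naive_tokens)
```

The punctuation is gone, "Mr." loses its period, "doesn't" is broken into "doesn" and "t", and the trailing space leaves an empty string at the end, which are the drawbacks listed above.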
If you find any mistakes, corrections are welcome. Thank you!
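Original article: 动手学习深度学习(Pytorch版)Task 2:文本预处理 (Dive into Deep Learning, PyTorch edition, Task 2: Text Preprocessing). Please credit the source when reposting.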