How Washington DC Differs from Other Areas
In-Depth Analysis
Having lived in Washington DC for the past year, I have come to realize that the WMATA metro is the lifeline of this vibrant city. The metro network is extensive and well connected throughout the DMV area. When I first moved to the capital without a car, I often hopped on the metro to get around. I have always loved train journeys, so unsurprisingly the metro became my favorite way to explore this beautiful city. On my travels, I often notice the product placements and advertisements on metro platforms, near escalators and elevators, inside the trains, and so on. A good analysis of metro ridership data would help advertisers identify which metro stops are busiest at what times, so as to increase ad exposure. I chanced upon this free dataset and decided to dive deep into it. In this article, I'll walk you through my analysis.
Step 1: Importing necessary libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
from wordcloud import WordCloud, STOPWORDS
from nltk.corpus import stopwords
Step 2: Reading the data
Let us call our pandas dataframe ‘df_metro’; it will hold the original data.
df_metro = pd.read_csv("DC MetroData.csv")
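Right after loading, it can help to confirm that the file parsed as expected. The following is only a minimal sketch: it reads a small in-memory CSV in place of the real file, and the column names (`station`, `time_period`, `riders`) are hypothetical, not taken from the actual dataset.

```python
import io
import pandas as pd

# Hypothetical stand-in for "DC MetroData.csv"; the real columns may differ.
sample_csv = io.StringIO(
    "station,time_period,riders\n"
    "Station A,AM Peak,1200\n"
    "Station B,Midday,300\n"
)

df_sample = pd.read_csv(sample_csv)

# Column names and inferred dtypes are a quick sanity check after loading.
print(df_sample.columns.tolist())
print(df_sample.dtypes)
```

The same two checks can be run on ‘df_metro’ once the real file has been read.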
Step 3: Eyeballing the data and length of the dataframe
df_metro.head()
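Since this step also covers the length of the dataframe, here is a small sketch of how the first rows and the row count might be inspected together. It uses a toy dataframe with hypothetical columns, not the real metro data.

```python
import pandas as pd

# Toy dataframe with hypothetical columns, standing in for df_metro.
df_example = pd.DataFrame(
    {
        "station": ["Station A", "Station B", "Station C", "Station D"],
        "riders": [1200, 300, 450, 980],
    }
)

# head(n) returns the first n rows (5 by default).
preview = df_example.head(3)

# len() gives the number of rows; shape gives (rows, columns).
n_rows = len(df_example)
n_rows_from_shape, n_cols = df_example.shape

print(preview)
print(n_rows, n_cols)
```

Applied to the real data, `len(df_metro)` and `df_metro.shape` would report how many records the dataset contains.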