
The Goldilocks Zone of Analytics

 

We earthlings have it good. If Earth’s orbit were a little closer to the sun, it would be too hot; a little farther, and it would be too cold. Where Earth sits is just right, an area otherwise known as the Goldilocks Zone.


If analytics is the solar system, then data visualization is its Goldilocks Zone. On one side is reality, which is too detailed and complex to fully comprehend. On the other side is raw data, which is also difficult to understand because it is too abstract. For analytics to be useful, it must sit somewhere in the middle: not too real and not too abstract. Visualization is the key to getting it just right.


The Three-Legged Stool of Analytics

People understand what they can touch, see, hear, smell, and taste. They cannot do that with an abstract concept. People find meaning in ideas that are connected to the physical world they live in.


This is a problem for analytics, since abstraction is a fundamental aspect of it. To analyze something and draw a confident conclusion about it, a large amount of information is usually needed. That information must then be abstracted into raw data.


Once the data is set, the next step is to group, aggregate, and apply statistical methods to it, with the goal of finding meaningful patterns that can be turned into generalized principles and predictions. This too is a fundamental aspect of analytics. Every analytical process starts with data and analysis, but it cannot stop here. It is still too abstract. For an analysis to be useful, it must make its way back to the five senses.


If analytics is a three-legged stool where the first leg makes information manageable (raw data) and the second leg makes it meaningful (data science), then a third leg is needed to make it relatable. That is the role of visualization. For an analysis to stand, all three legs must work in harmony. All too often, visualization is thought of as an add-on, but there are several reasons why it should be treated as a fundamental part of every analytical project.


Seeing Is Believing

Describe something and it might be understood. Show something and not only will it be understood, but it will be understood instantly. The classic example of this is Anscombe’s Quartet, which takes four datasets with nearly identical summary statistics…

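To make the Anscombe's Quartet point concrete, here is a minimal Python sketch that computes the usual summary statistics for the four datasets. The data values are the standard published ones from Anscombe (1973), typed in by hand; the names `QUARTET`, `pearson_r`, and `summarize` are illustrative choices, not something the article itself defines. It uses only the standard library.

```python
from statistics import mean, variance

# Anscombe's quartet: the standard published values (Anscombe, 1973).
# Datasets I-III share the same x values; dataset IV has its own.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cov / (ss_x * ss_y) ** 0.5


def summarize(xs, ys):
    """Return the summary statistics usually quoted for the quartet."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "r": pearson_r(xs, ys),
    }


for name, (xs, ys) in QUARTET.items():
    s = summarize(xs, ys)
    print(
        f"{name:>3}: mean_x={s['mean_x']:.2f} var_x={s['var_x']:.2f} "
        f"mean_y={s['mean_y']:.2f} var_y={s['var_y']:.2f} r={s['r']:.2f}"
    )
```

The printed rows should be nearly indistinguishable from one another, which is the point: the numbers alone give no hint that the four datasets look completely different once plotted.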