Using Tableau Public for (Spatial and Trendline) Data Visualization
(An Early Exploration and "TMI" Musing on Data)


 

Shalin Hai-Jew, Kansas State University

Colleague to Colleague Spring Forum 2012

(Apr. 20, 2012, Hutchinson Community College)

 

With complex sets of educational data now available, it helps to be able to provide Web-accessible, interactive visualizations of those data for clearer analysis, collaborative decision-making, and public awareness.

 

complex information + visualization = improved knowability and analyzability (and improved application of that information--optimally)

 

Tableau Public is a free tool (albeit a hosted solution) that enables the uploading of complex data (in Excel and .txt formats) for intuitive presentation on the Web. Such depictions offer accessible ways of understanding interrelationships, trends over time, and predictive analytics. The Web outputs are dynamic and engaging. This is a step up from static information, though somewhat short of full learner tracking and the full-scale simulations that such data could drive. Come learn about this tool and some of its possible uses in educational data visualization and light analytics.

 

Examples from Tableau Public "Gallery"

 

 

What are the Basic Parts of a Data Visualization?

 

The Data

the raw data set (and its history)

the potential of the data into the future

the methodology of information collection

the combination of data sets (customized data integration through "rationalization" and "normalization" of data; the scrubbing of data for clean data sets for comparability), including those with null data fields

the elimination of skewing outliers or anomalies (if relevant)

 

The Visualization

x and y axes (and z)

2D and movements of elements

3D visualizations

interactive dashboards

informational labels

 

Cautions

the "metamers" (illusions) in data visualization (with "negative learning" to be avoided)

the analysis and conclusions drawn from data analysis

 

 

What are the Main Features of Data Visualizations?

 

 

How Do Visual Analytics Add Value?

 The following visuals come from two popular data visualizations that were featured in early 2012.

TheOriginsofSuperPACMoney.jpg

 

 

 

 

TechIPOs.jpg

 

PreAnalogData.jpg

General (Electronic) Data Backgrounder

 

The Nature of (Electronic) Information

Some basic tenets of electronic information follow:  

 

 

 

 

 

 

 

 

 

 

 

 


 

 

Collecting Relevant Information

 

 

Proper Research Approaches, Proper Metrics

The table below lists some of the common research methods in higher education. This is not an exhaustive listing by any means; it is included simply to spark thinking about some of the more formalized methods of research.

 

Quantitative (Research) Methods

empirical experiments

structured records reviews

surveys / questionnaires

historical research and analysis

(based on a range of statistical analysis methods)

 

Qualitative (Research) Methods

fieldwork (in a natural setting)

Delphi study

grounded theory

heuristic research

interviewing

surveys / questionnaires

case studies

portraiture

historical research and analysis

(emergent vs. pre-determined research)

 

Mixed (Research) Methods

a combination

meta-analyses (qual, quant and mixed methods)

(a mix of information gathering and statistical analysis)

 

Addressing Threats to Research and Information Validity

 

Applied Information Basics

 

Shape: Almost all information has some "shape" or pattern. Very little is formless. Very little is actually and totally random (except when it is, according to author Nassim Nicholas Taleb).

Absence / Presence: What is seen may be relevant. What is not seen in the research may be relevant.

Surveillance: Understanding a situation is about surveillance--observations and measures taken at regular intervals (or constantly). From this, an observer establishes a baseline. Against that baseline, an observer may notice anomalies.

Context for Meaning: Information is about context. Outside of context, information may lack relevance.

Framing and Interpretation: Information is about focus. It is about point-of-view. It is about assumed values. Information is about (subjective) interpretation. (This is why it's helpful to game-out various points-of-view and possible interpretations of the same set of information.) People make inferences about raw data and in so doing process it into usable information.

 

 

A Framing Animation

 

 

Predictive Analytics and Hidden Information: Predictive analytics is about trendlines. It is about human habituation and predictability of future actions based on data points. It is about hidden information or an extension of what is non-intuitively knowable. (See Charles Duhigg's "The Power of Habit," 2012.)

Modeling: All simulations are based on underlying information. Without knowing the assumptions of the model, one cannot truly assess whether the model has any value. Beyond that, a model has to track with the factual world, so there have to be points where the model's predictions may be tested against the world.
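
As a rough illustration of testing a model's predictions against the world, here is a minimal Python sketch (standard library only) that fits a simple trendline to the early part of an invented enrollment series and then checks its predictions against held-out later years. The figures and variable names are hypothetical, not drawn from any actual data set.

    # A minimal sketch: fit a trendline on early data, then test it on held-out data.
    # The enrollment figures below are invented for illustration only.
    years = [2005, 2006, 2007, 2008, 2009, 2010, 2011]
    enrollments = [1200, 1260, 1330, 1380, 1450, 1500, 1580]

    train_x, train_y = years[:5], enrollments[:5]   # data used to build the model
    test_x, test_y = years[5:], enrollments[5:]     # held-out data for testing predictions

    # Ordinary least-squares slope and intercept for the training data.
    n = len(train_x)
    mean_x = sum(train_x) / n
    mean_y = sum(train_y) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
             / sum((x - mean_x) ** 2 for x in train_x))
    intercept = mean_y - slope * mean_x

    # Compare the model's predictions against observations it never saw.
    for x, actual in zip(test_x, test_y):
        predicted = intercept + slope * x
        print(f"{x}: predicted {predicted:.0f}, actual {actual}, error {actual - predicted:+.0f}")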

Omniscience: Competition in information is about knowing something critical without others knowing. It is about knowing earlier. It is about situational awareness.

 

 

 

Following the Rules

  1. Follow the standards of the domain field.
  2. Adhere to ethical guidelines about research, particularly any research that involves humans or animals...or any sort of risk.
  3. Ensure that all participants in your research have been duly notified (through informed consent).
  4. Maintain confidentiality of all data.
  5. Store the data for the required time period.

 

Creating and Deploying Effective Surveys

Ensure that all survey instruments are phrased so as to avoid bias. (Question Design)

Phrase questions to focus on one issue at a time. Avoid double-barreled questions.

Ensure that surveys reach a sizeable random sample of a population, not an accidentally defined sub-set. (Survey Methodology)

Ensure that all survey questions used have been tested for multi-collinearity (to eliminate survey items that measure essentially the same thing). It is important to avoid conflating very similar factors. The variables being tested in a context should be as different and independent from each other as possible. If there is overlap, there will be unnecessary "noise" in the data.
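
As one rough way to screen for such overlap (a sketch, not a substitute for a formal multi-collinearity analysis), the Python snippet below computes pairwise correlations among hypothetical survey items and flags pairs that appear to measure nearly the same thing. The item names and responses are invented for illustration.

    from itertools import combinations

    # Hypothetical survey responses (each list holds per-respondent answers for one item).
    responses = {
        "item_a": [5, 4, 4, 3, 5, 2, 4],
        "item_b": [5, 4, 5, 3, 5, 2, 4],   # nearly duplicates item_a
        "item_c": [1, 3, 2, 4, 2, 5, 3],
    }

    def pearson(xs, ys):
        """Pearson correlation coefficient for two equal-length lists of numbers."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = sum((x - mx) ** 2 for x in xs) ** 0.5
        sy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (sx * sy)

    # Flag item pairs so highly correlated that they likely measure the same factor.
    for a, b in combinations(responses, 2):
        r = pearson(responses[a], responses[b])
        flag = "  <-- possible overlap" if abs(r) > 0.9 else ""
        print(f"{a} vs {b}: r = {r:+.2f}{flag}")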

 

Analysis and Decision-making

 

Not Over-Reaching in Terms of Generalizability

Do not over-generalize. Draw conclusions only in proper measure.

Do not confuse correlation with causation. (Just because two events occur in close time proximity does not mean that there is necessarily a causal relationship between them.)

Make sure there is sufficient information before making an important decision. (This is all about due diligence.)

Sometimes, researchers are observing and assessing one thing but confusing it with another. There is a risk of conflation, especially if the two compared objects are potentially close in relationship. Tests for "multi-collinearity" are an attempt to separate the measures for one factor from those for another, potentially closely related, factor.

 

 

Coding Data Methodically

Coding data appropriately

 

 

Maximizing the Data Collection to Saturation

Building a strong repository of complete and relevant information for the "triangulation" of data

 

 

Breaking "The Law of Small Numbers"

If a sample size is too small, the findings will be much more exaggerated (with more extremes and outlier effects) than for larger samples. Too often, analysts read causal relationships into observations when the "null hypothesis" cannot actually be rejected (Kahneman, 2011, p. 118).
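
A quick simulation illustrates the point. This hypothetical Python sketch flips a fair coin in samples of different sizes and counts how often the result looks "extreme" purely by chance; the sample sizes and cutoff are arbitrary choices for illustration.

    import random

    random.seed(42)

    def extreme_share(sample_size, trials=10_000):
        """Share of fair-coin samples whose heads rate falls outside the 40%-60% band."""
        extreme = 0
        for _ in range(trials):
            heads = sum(random.random() < 0.5 for _ in range(sample_size))
            rate = heads / sample_size
            if rate < 0.4 or rate > 0.6:
                extreme += 1
        return extreme / trials

    # Small samples produce far more "extreme" results than large ones, by chance alone.
    for n in (10, 50, 500):
        print(f"sample size {n:>3}: {extreme_share(n):.1%} of samples look extreme")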

 

 

Applying Logic

Avoid endogeneity. Do not assume a cause-and-effect relationship when the effect could be coming from a factor within the system instead of outside the system. Or there may be factors affecting a result that are not even considered in the modeling.

Consider all possible interpretations of the data.

 

 

Applying the Base Rate

Statistics can be helpful in aiding an analyst in understanding the probabilities in a particular circumstance. A "base rate" is the general statistical probability of a particular occurrence. A "base rate fallacy" occurs when people judge an outcome from specific (often irrelevant) information without considering the underlying probabilities (Kahneman, 2011).
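
A short, worked example (with invented numbers) shows the base rate at work: applying Bayes' rule to a hypothetical screening test, even a reasonably accurate test produces mostly false alarms when the condition it screens for is rare.

    # Hypothetical numbers chosen only to illustrate the base rate.
    base_rate = 0.01          # 1% of the population actually has the condition
    true_positive = 0.90      # the test detects the condition 90% of the time
    false_positive = 0.05     # the test wrongly flags 5% of those without it

    # Bayes' rule: P(condition | positive test)
    p_positive = base_rate * true_positive + (1 - base_rate) * false_positive
    p_condition_given_positive = (base_rate * true_positive) / p_positive

    print(f"P(positive test) = {p_positive:.3f}")
    print(f"P(condition | positive test) = {p_condition_given_positive:.1%}")
    # Roughly 15%: most positives are false alarms, because the base rate is so low.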

 

 

Avoiding the WYSIATI (What You See is All There Is) Fallacy

Kahneman (2011) talks about the fallacy of "What You See is All There Is" (WYSIATI). People do not often consider the "unknown unknowns" or unseen evidence. An analyst should consider all evidence on a wide spectrum so as not to close off an exploration too soon. Fixing too soon on a possible explanation may be misleading.

 

 

Triangulating Data

Compare various data streams to get a fuller view of an issue. The more you will be generalizing, and the higher the risk of the judgment, the more due diligence has to be done.

 

 

Information Sampling over Time

Sometimes, researchers will only do a slice-of-time sample and yet extrapolate results that might imply continuous time. It may help to sample over longer time periods for greater generalizability.

 

 

Go with the Counter Narrative

Test out other points of view and data sets. Work hard to debunk your own stance or interpretation. Dis-believe. (This exercise can either strengthen or weaken your original stance.)

 

 

Be Leery of "Cognitive Ease"

Be aware of the ease of accessing certain explanations because that ease-of-access often translates to an early explanation and certitude which may be incorrect. Kahneman (2011) talks about the risks of "cognitive ease" in analysis and decision-making. Further, he describes two cognitive systems in people. System 1 is automatic and quick and activates with "no sense of voluntary control." It is intuitive and tends towards gullibility. (p. 20) System 2 is used for complex computations. It tends towards dis-believing. This system is much more logical and analytical; it is linked with "the subjective experience of agency, choice, and concentration." (p. 21) Too often, people go with System 1 when they should be applying System 2. System 1 is easier to access than System 2, and it requires less cognitive focus.

 

 

The "Black Swan" Rebuttal

 BlackSwans.jpg

(This is an image of two black swans released through Creative Commons licensure.)

A Rebuttal Stance against Quantitative Data, Bell Curves, and Decision-making Based on Too-Little Information: Nassim Nicholas Taleb's "Black Swans"

Black Swans

"Black Swan" events are outlier events. They are rare but high-impact events that fall outside paradigms and outside bell curves (so-called "normal distribution" curves). Thinking too much within paradigms and bell curves makes black swans even more inconceivable, which leaves people unprepared for such outsized events.

 

The Explanatory Value of "Black Swans"

"A small number of Black Swans explain almost everything in our world, from the success of ideas and religions, to the dynamics of historical events, to elements of our own personal lives.   Ever since we left the Pleistocene, some ten millennia ago, the effect of these Black Swans has been increasing. It started accelerating during the industrial revolution, as the world started getting more complicated, while ordinary events, the ones we study and discuss and try to predict from reading the newspapers, have become increasingly inconsequential" (Taleb, 2010, p. xxii). The world is about "non-linearities," non-routines, randomness, and true serendipity.

 

Other Problems with Current Uses of Data

  1. People tend to underestimate the error rate of their forecasts.
  2. People do not consider "anti-knowledge" or their missing information. People should acquire a huge anti-library of works that they've never read, so they can start getting busy learning. People need to be much more dissatisfied with their basis of knowledge because people know so little, as a rule.
  3. People tend to focus on small details and not abstract meta-rules.
  4. People tend towards non-thinking. Human predecessors "spent more than a hundred million years as nonthinking mammals" (Taleb, 2010, p. xxvi). Rather, they should strive towards "erudition" and satisfying "genuine intellectual curiosity."
  5. The social reward structures do not reward people for thinking preventively against major disasters but rather reward those who come in afterwards to make changes. (He uses 9/11 as an example.)
  6. People tend to suffer the "triplet of opacity": (1) the illusion of understanding; (2) the retrospective distortion (how history seems to make so much more sense in retrospective analysis); and (3) the "overvaluation of factual information and the handicap of authoritative and learned people, particularly when they create categories--when they 'Platonify'" (Taleb, 2010, p. 8). (Taleb suggests that Plato over-simplified the world by using general themes, which undercut the world's true complexity. The "bell curve" is an oversimplification in the same sense: it is one frame on a complex reality.)
  7. Those with a shared framework of analysis will come out with generally the same (often erroneous) analyses. It's preferable to have a broad range of analysts using different conceptual and mental models...and to encourage diversities of opinion.
  8. Inductive logic is highly limiting. He gives an example of a turkey who is fed faithfully 1,000 days but on the 1001st day is slaughtered. The Black Swan is a "sucker's problem," for people who are too willing to draw conclusions too early and on too little information.
  9. There is no long run (what is called "the asymptotic property" or the extension of the present into infinity). All reality is based on the short-term. The long-term is not predictable...by its very nature.
  10. People should not pursue verification of their theories. They should pursue the debunking of their ideas. They should cull evidence to show the opposite of what they think to truly test their theories. They should strive to find evidence that would prove themselves wrong (disconfirming evidence). They should cultivate a healthy skepticism.
  11. People tend to fall into a narrative fallacy--of over-interpreting by telling simple stories to bind unrelated facts together. People tend to reduce complexity through "dimension reduction," so they can feel in better control of complexity.
  12. Human memory is selective and dynamic. It remembers information in a malleable way and is not fully trustworthy.

 

Not Bell Curves: Taleb calls the use of "Gaussian bell curves" a kind of "reductionism of the deluded." The bell curve exists in a theoretical space, not in the real world; it does not exist outside the Gaussian family. A "random walk" considers possibilities, but it has severe limits--the limits of the analytical framework.

 

 

Fractal.jpg notGenericBellCurve.png

(A fractal by Wolfgang Beyer, released through Creative Commons licensure; a generic bell curve or "normal distribution")

 

The Limits of Mathematization

"Furthermore, assuming chance has anything to do with mathematics, what little mathematization we can do in the real world does not assume the mild randomness represented by the bell curve, but rather scalable wild randomness. What can be mathematized is usually not Gaussian, but Mandelbrotian."  (Taleb, 2010, p. 128)

 

Fractals and Mandelbrotian Randomness

" Fractal Is a word Mandelbrot coined to describe the geometry of the rough and broken—from the Latin fractus, the origin of fractured.   Fractality is the repetition of geometric patterns at different scales, revealing smaller and smaller versions of themselves. Small parts resemble, to some degree, the whole."  (Taleb, 2010, p. 257)

 

Exponentially Large Scalable Effects and the Rationale for Fractals

The real world has non-linearities that are not accounted for in bell curves. Real-world challenges mean that events may have scalable, magnifying effects. Taleb writes: "Like many biological variables, life expectancy is from Mediocristan, that is, it is subjected to mild randomness. It is not scalable, since the older we get, the less likely we are to live. In a developed country a newborn female is expected to die at around 79, according to insurance tables. When she reaches her 79th birthday, her life expectancy, assuming that she is in typical health, is another 10 years. At the age of 90, she should have another 4.7 years to go. At the age of 100, 2.5 years. At the age of 119, if she miraculously lives that long, she should have about nine months left. As she lives beyond the expected date of death, the number of additional years to go decreases. This illustrates the major property of random variables related to the bell curve. The conditional expectation of additional life drops as a person gets older.

"With human projects and ventures we have another story.   These are often scalable...With scalable variables, the ones from Extremistan, you will witness the exact opposite effect.   Let's say a project is expected to terminate in 79 days, the same expectation in days as the newborn female has in years. On the 79th day, if the project is not finished, it will be expected to take another 25 days to complete. But on the 90th day, if the project is still not completed, it should have about 58 days to go. On the 100th, it should have 89 days to go. On the 119th, it should have an extra 149 days. On day 600, if the project is not done, you will be expected to need an extra 1,590 days. As you see, the longer you wait, the longer you will be expected to wait." (Taleb, 2010, p. 159).

 

 

Fractals as the Framework

"Fractals should be the default, the approximation, the framework. They do not solve the Black Swan problem and do not turn all Black Swans into predictable events, but they significantly mitigate the Black Swan problem by making such large events conceivable." (Taleb, 2010, p. 262)  

 

ASCIIBlackSwans.jpg

 

Accounting for "Fat Tails": C. Steiner (2012) suggests that it may be better to consider that there is unpredictability with human irrationalities...which may result in "fat tails." These are the ends of the bell curve, which theoretically stretch into infinity.

 

 

Really, One Cautionary Note

 

IntroductionofErroratAnyandAllPhases.jpg

 

 

References

 

Kahneman, D. (2011). Thinking, fast and slow. New York: Farrar, Straus and Giroux.

Steiner, C. (2012). Automate this: How algorithms came to rule our world. New York: Portfolio / Penguin.

Taleb, N.N. (2010/2007). The black swan: The impact of the highly improbable. New York: Random House.

 

Creating "Clean" Data Sets

 

A Classic Rectangular Data Array / Data Structure

 

The "schema" of the data (in a spreadsheet) should be defined in a clear way. Tableau Public ingests data using a classic rectangular data array (which is the typical form used in SPSS, Excel, SAS, and other programs). The rows represent unique cases. The columns represent variables. The far-left column should consist of identifiers. The absolute top row should consist of labels for the variables.

 

CleanDataSets.jpg

 

Acquiring Local Data Sets for Online Learning

 

Spatiality: The World is Mapped

The world is mapped according to a geographic coordinate system (which consists of latitude, longitude, and elevation; the first two represent horizontal position, and the last represents vertical position). Together, these three values can locate any physical point in the world.
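
As a small illustration of working with such coordinates, the Python sketch below uses the standard haversine formula to estimate the great-circle distance between two latitude/longitude points. The coordinates are approximate values for Manhattan and Hutchinson, Kansas, included only as an example.

    from math import radians, sin, cos, asin, sqrt

    def haversine_km(lat1, lon1, lat2, lon2):
        """Approximate great-circle distance between two lat/long points, in kilometers."""
        lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
        a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371 * asin(sqrt(a))   # 6371 km: mean radius of the Earth

    # Approximate (illustrative) coordinates: Manhattan, KS and Hutchinson, KS.
    print(f"{haversine_km(39.18, -96.57, 38.06, -97.93):.0f} km apart")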

 

TheConjugateGraticule.jpg

(This image of the geographic coordinates on a sphere was created by E^(nix) and released via a Creative Commons license.)

 

 

652px-Latitude_and_Longitude_of_the_Earth.svg.png

(This image of the latitude and the longitude of the earth was made by Djexplo, and it was released with a Creative Commons license.)

 

Why Does Spatiality Matter?

Red-Dot.png (You are here...)

 

 

Why Does Spatiality Matter for Online Instructors?

 

Common Data Set Initiative

K-State and Common Data Sets (Office of Planning and Analysis)

K-State 2010 - 2011 Data

 

 

Trendlines

 

 

Mock K-State Distance Education Students' Data

  1. Maintain a pristine original dataset. (Those familiar with multimedia development understand why: this ensures that nothing gets corrupted in the work.)
  2. Make a copy of the dataset for scrubbing and possible editing.
  3. Do not change the fundamentals of each record. (The datasets will be downloadable, so each record must be preserved with its original information.)
  4. Cluster like-location data. (This may be expressed as zip codes; latitude and longitude; or other ways.)
  5. Make sure that the first row (cells A1, B1, C1, and so on) lists labels for all the information in the columns below.
  6. Use =AVERAGE() in Excel to average the grades within one (zip code) area; otherwise, the grades may simply be summed. (A scripted alternative is sketched after this list.)
  7. Replace the old data if there are updates. Or, better yet, delete the old table and rework the data.
  8. Clear out the browser cache. Double-check that the data visualization makes sense.
  9. Make sure that the individual records, when viewed, make sense.
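
For step 6, a scripted alternative to Excel's =AVERAGE() is sketched below in Python (standard library only). The zip codes, grades, and field names are hypothetical and would need to match the actual mock data set.

    from collections import defaultdict

    # Hypothetical records of (zip_code, grade); real field names depend on the data set.
    records = [
        ("66502", 3.7), ("66502", 3.3), ("67501", 2.9),
        ("67501", 3.5), ("67042", 4.0),
    ]

    grades_by_zip = defaultdict(list)
    for zip_code, grade in records:
        grades_by_zip[zip_code].append(grade)

    # Average (rather than sum) the grades within each zip code area.
    for zip_code, grades in sorted(grades_by_zip.items()):
        avg = sum(grades) / len(grades)
        print(f"{zip_code}: average grade {avg:.2f} ({len(grades)} students)")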

 

The Mock Data Set (an Excel file)

 

A Spatialized Map View (Widget with Embed Text)

(A spatialized map offers one interactive visualization of data.)

 

 

The Dashboard View (iFrame with Live Links)

(A dashboard combines several visualizations of the data.)

 

 

 

Acknowledgments: Thanks to Scott Finkeldei for the mock data set from K-State.

 

 

Electricity.jpg

("Electricity")

Downloading Open-Source Data Sets

 

Data.Gov

http://www.data.gov/

 

U.S. Census Data (2010)

http://2010.census.gov/2010census/data/

 

U.S. Census Bureau

http://www.census.gov/

 

U.S. Census Bureau / The 2012 Statistical Abstract (especially its sections and .xls downloads)

http://www.census.gov/compendia/statab/

 

National Center for Education Statistics

http://nces.ed.gov/

 

 

 

The Back-end of Tableau Public

 

 

 

Basic Steps (Demonstration)

  1. Now that you've downloaded and installed the Tableau Public client on your computer, start up Tableau Public (v. 7). (Start -> Programs -> Tableau Public...)
  2. Call up the data set, and format it properly (in its original format--.csv, .xlsx, .txt, or others).
  3. Upload the data set.
  4. Manage the various "sheets" of data for different data visualizations.
  5. Combine the visualizations on the dashboard.
  6. Style it. Add the proper labels. Add colors. Organize the layout.
  7. Create an account.
  8. Publish the data visualization live to the Web. (There is no other way to save with a free account.)
  9. Drop the URL or the "embed text" into the site. Or use an iFrame. Or use a widget.

 

 

 

Creating a Data Visualization Using Tableau Public (downloadable PowerPoint version)

 

 

A Sample Data Set

 

The Kansas Board of Regents (KSBOR)

Institutional Gap Calculation

 

 

Downloading Tableau Public

 

Training Materials for Tableau Public

 

 

 

 DataASCIIArt.jpg

Other (More Complex) Data Visualizations

 

Global Demographic Trends

 

Gapminder

 

http://www.gapminder.org/

 

Dr. Hans Rosling

 

"No More Boring Data" (TED, 2007)

http://www.youtube.com/watch?v=hVimVzgtD6w

 

"New Insights on Poverty" (TED, 2007)

http://www.ted.com/talks/hans_rosling_reveals_new_insights_on_poverty.html

 

"Let my Dataset Change your Mindset" (TED, 2009)

http://www.ted.com/talks/hans_rosling_at_state.html

 

"Hans Rosling:   Asia's rise—how and when" (TED, 2009)

http://www.ted.com/talks/hans_rosling_asia_s_rise_how_and_when.html

 

"Hans Rosling on Global Population Growth" (TED, 2010)  

http://www.ted.com/talks/hans_rosling_on_global_population_growth.html

 

"Hans Rosling: The good news of the decade?"   (TED, 2010)

http://www.ted.com/talks/hans_rosling_the_good_news_of_the_decade.html

 

"Hans Rosling and the magic washing machine" (TED, 2011)

http://www.ted.com/talks/hans_rosling_and_the_magic_washing_machine.html

 

 

Geological Visualizations

 

US Geological Survey (USGS)

http://glovis.usgs.gov/ (Global Visualization)  

 

USGS Seamless Data Warehouse

http://seamless.usgs.gov/

 

USGS EarthExplorer

http://earthexplorer.usgs.gov/

 

 

Disease Distribution and Spatiality

 

WAHIDInterface.jpg

 

 

OIE: World Organization for Animal Health / Interactive Disease Distribution Maps

(World Animal Health Information Database / WAHID Interface)

http://web.oie.int/wahis/public.php?page=disease_status_map

 

 

 

Arbonet.jpg

 

 

 

CDC Arbonet Maps

http://www.cdc.gov/ncidod/dvbid/westnile/USGS_frame.html

 

 

 

Educational Knowledge Tracking

 

Khan Academy

http://www.khanacademy.org/exercisedashboard

http://www.khanacademy.org/login?continue=http%3A//www.khanacademy.org/profile

 

 

The Analysis of Large Data Sets (Terabytes of Data, Billions of Records)

 

Google BigQuery  (2010)

https://developers.google.com/bigquery/

 

Google Research Blog

http://googleresearch.blogspot.com/

 

  

Using Excel for Web Visualizations 

 

Data Wiz Blog

http://datawiz.wordpress.com/ 

 

 

 

Dr. Shalin Hai-Jew

Instructional Designer

Kansas State University

785-532-5262

shalin@k-state.edu

 

 

 

 

 

 

 

 

 

 

Note: This presentation was built using SoftChalk 7. The Flash animation was built with Adobe After Effects. Microsoft Visio was used for the annotated screenshots with the pull-outs of information. Adobe Photoshop was used for minor photo editing. A Web-based ASCII-art generator was used to create the ASCII-text art.

This was updated in May 2012.