Creating "Clean" Data Sets


A Classic Rectangular Data Array / Data Structure


The "schema" of the data (in a spreadsheet) should be clearly defined. Tableau Public ingests data as a classic rectangular data array (the typical form used in SPSS, Excel, SAS, and other programs). The rows represent unique cases, and the columns represent variables. The far-left column should consist of identifiers, and the very top row should consist of labels for the variables.
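The layout described above can be checked before uploading. The following is a minimal sketch in Python, assuming the spreadsheet has been exported as CSV text; the function name `check_rectangular` and the sample column labels are hypothetical and are not part of Tableau Public.

```python
import csv
import io


def check_rectangular(csv_text):
    """Return a list of problems found in a rectangular data array."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]

    problems = []
    header = rows[0]  # the top row should hold the variable labels
    if any(not label.strip() for label in header):
        problems.append("blank variable label in top row")
    if len(set(header)) != len(header):
        problems.append("duplicate variable labels in top row")

    seen_ids = set()
    for line_no, row in enumerate(rows[1:], start=2):
        # Every case (row) should have one value per variable (column).
        if len(row) != len(header):
            problems.append(
                f"row {line_no}: expected {len(header)} fields, found {len(row)}"
            )
        # The far-left column should hold a unique identifier.
        if row:
            if row[0] in seen_ids:
                problems.append(f"row {line_no}: duplicate identifier {row[0]!r}")
            seen_ids.add(row[0])
    return problems


# Hypothetical sample: the last row is short and reuses identifier 2.
sample = "student_id,zip_code,grade\n1,66502,3.5\n2,66506,3.0\n2,66502\n"
for problem in check_rectangular(sample):
    print(problem)
```

An empty list means the array is rectangular, the labels are unique, and the identifiers do not repeat.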




Acquiring Local Data Sets for Online Learning


Spatiality: The World is Mapped

The world is mapped according to a geographic coordinate system, which consists of latitude, longitude, and elevation. The first two represent a horizontal position, and the last represents a vertical position. Together, these three coordinates can identify any physical location in the world.



(This image of the geographic coordinates on a sphere was created by E^(nix) and released via a Creative Commons license.)




(This image of the latitude and the longitude of the earth was made by Djexplo, and it was released with a Creative Commons license.)
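As a rough illustration of the three coordinates, here is a minimal Python sketch of a location record with basic range checks. The class name `GeoPoint`, its field names, and the sample values (approximate figures for Manhattan, Kansas) are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoPoint:
    latitude: float            # degrees north of the equator, -90 to 90
    longitude: float           # degrees east of the prime meridian, -180 to 180
    elevation_m: float = 0.0   # vertical position, in metres

    def __post_init__(self):
        # Reject horizontal positions that fall outside the coordinate system.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")


# Approximate, illustrative values only.
campus = GeoPoint(latitude=39.18, longitude=-96.57, elevation_m=311.0)
print(campus)
```

Western longitudes and southern latitudes are written as negative numbers in this sketch, which is a common convention for decimal degrees.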


Why Does Spatiality Matter?

(Image: a red dot. You are here...)



Why Does Spatiality Matter for Online Instructors?


Common Data Set Initiative

K-State and Common Data Sets (Office of Planning and Analysis)

K-State 2010 - 2011 Data






Mock K-State Distance Education Students' Data

  1. Maintain a pristine original dataset. (Those familiar with multimedia development understand why. This is to ensure that nothing gets corrupted in the work.)
  2. Make a copy of the dataset for scrubbing and possible editing.
  3. Do not change the fundamentals of each record. (The datasets will be downloadable, so each record must be preserved with its original information.)
  4. Cluster like-location data. (This may be expressed as zip codes; latitude and longitude; or other ways.)
  5. Make sure that the first row (A1, B1, C1...) contains a label for the information in each column below it.
  6. Use the =AVERAGE( ) function in Excel to average the grades in one (zip code) area; otherwise, the grades appear to be summed rather than averaged.
  7. Replace the old data if there are updates. Or, better yet, delete the old table and rework the data.
  8. Clean out the browser cache. Double-check that the data visualization makes sense.
  9. Make sure that the individual records, when viewed, make sense.
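Steps 1, 2, 4, and 6 above can be sketched in Python as one possible alternative to doing the work by hand in Excel. The records, field names, and the function `average_grade_by_zip` are made up for illustration and are not taken from the K-State mock data set.

```python
import copy
from collections import defaultdict
from statistics import mean

# Hypothetical mock records; the original list is never modified.
original = [
    {"student_id": "1", "zip_code": "66502", "grade": 3.5},
    {"student_id": "2", "zip_code": "66502", "grade": 2.5},
    {"student_id": "3", "zip_code": "67401", "grade": 4.0},
]


def average_grade_by_zip(records):
    """Average the grades within each zip code, working on a copy."""
    working = copy.deepcopy(records)  # keep the pristine original untouched
    grades = defaultdict(list)
    for record in working:
        # Cluster like-location data by zip code.
        grades[record["zip_code"]].append(record["grade"])
    # Average (rather than sum) the grades in each area.
    return {zip_code: mean(values) for zip_code, values in grades.items()}


averages = average_grade_by_zip(original)
print(averages)
```

Working from a deep copy means that any scrubbing in the function cannot alter the original records.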


The Mock Data Set (an Excel file)


A Spatialized Map View (Widget with Embed Text)

(A spatialized map offers one interactive visualization of data.)




The Dashboard View (iFrame with Live Links)

(A dashboard combines several visualizations of the data.)




Acknowledgments: Thanks to Scott Finkeldei for the mock data set from K-State.