General (Electronic) Data Backgrounder
The Nature of (Electronic) Information
Some basic tenets of electronic information follow:
- Signals vs. Noise: The analytical value of data emerges from the ability to extract the "signal" from the "noise." (See Signal Detection Theory.) Paul Ormerod (in "Why Most Things Fail," published in 2006) observes: "The historical data which we have is dominated by noise rather than by signal and contains very little true information. No matter how hard we try, no matter how many statistics we collect, there are strict limits to the value of the genuine information we can extract" (p. 56). Data analysis therefore involves two steps: extracting what is relevant, and then deciding what action to take based on that information.
- An Electronic Panopticon: Electronic information is collected about *everything* (via cameras, sensors, socio-technical systems, communications systems, and others). People live in an electronic panopticon. (Marketers are said to have 4,000 data points for most consumers in the U.S. Analytical tools identify patterning in human behaviors, and this information may be thin-sliced to the individual. The targeted individuals themselves may not be aware of how they are responding to certain internal / external patterns and stimuli.) To function in the electronic world, people give away lots of private data in order to access free tools. Two of the largest information depositories in the world today are Google(TM) and Facebook(R).
- Difficult to De-identify Data Sets of Personal Information: Most electronic information cannot be practically de-identified (cleaned of specific identifiers). With a few data points, most information may be traced back to an individual, or "re-identified." (Some research says 33 data points; others say just a handful of data points.) Databases may be cross-referenced. Computers may be applied to conduct text analysis on writing to identify an author's hand. (The wide distribution of information renders much information "weak secrets.")
- No True Anonymity Online: There is no true "anonymity" on the WWW and Internet. Most contents may be tracked to "personally identifiable information" (PII) and "unique identifiers." The Web is constantly mapped. Spiders and Web crawlers and other "robots" are seeking particular information.
- Responsibility for Information Collected: Organizations are responsible for the information they collect, including for protecting it. Any information that is collected may be subject to subpoena.
- Various Uses of Information: Information will be used in intended and unintended ways, depending on the size of the audience (and their ranges of intentions). Information always reveals more than the original intent. Electronic information is malleable. It may be data-mined for more information and decision-making. Decisions are often based on empirics and in vivo ("among the living") information.
- Digital Content "Decay": Digital content ages out in about 10 years because the file formats, and the software and hardware needed to read them, become obsolete. The "slow fires" of decay affect digital contents much more quickly than paper resources, which can last hundreds of years under proper conditions.
- Information Wants to "Be Free": Hackers are trying to make information free. Those who control private data (especially data with privacy and legal implications, R&D value, or national security value) have to set up information regimes to protect the data.
- The Deep (Invisible) Web: Much of the information online is not findable with mainstream search engines, but some tools can search the Deep Web, which is said to be some 500 times the size of the Visible Web.
- Reverse Engineering Raw Data from Data Visualizations: It is hard to recover the underlying raw data from a data visualization unless the visualization is clearly labeled, the software is open-source (so that it is clear how the visualization was produced), and there are mechanisms to re-extract the data.
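
The point above about re-identification (a few data points plus cross-referenced databases) can be illustrated with a minimal sketch. Everything in it is an assumption made up for illustration: the records, the names, the field names, and the choice of ZIP code, birth year, and sex as the shared data points. It shows only the general idea of joining a "de-identified" data set to a named one; it is not a description of any particular study or tool.

```python
from collections import defaultdict

# Hypothetical, made-up records: a data set with names removed...
deidentified_health = [
    {"zip": "66502", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "66502", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "66044", "birth_year": 1982, "sex": "F", "diagnosis": "migraine"},
]

# ...and a separate, named data set (also made up) sharing a few fields.
public_roster = [
    {"name": "Alice Example", "zip": "66502", "birth_year": 1975, "sex": "F"},
    {"name": "Bob Example", "zip": "66502", "birth_year": 1975, "sex": "M"},
    {"name": "Carol Example", "zip": "66044", "birth_year": 1982, "sex": "F"},
    {"name": "Dana Example", "zip": "66044", "birth_year": 1982, "sex": "F"},
]

# Assumed shared data points used to cross-reference the two sets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def key(record):
    """Build a lookup key from the shared data points of one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymous_rows, named_rows):
    """Return {row index: (name, diagnosis)} for rows matching exactly one name."""
    index = defaultdict(list)
    for row in named_rows:
        index[key(row)].append(row["name"])

    matches = {}
    for i, row in enumerate(anonymous_rows):
        candidates = index.get(key(row), [])
        if len(candidates) == 1:  # a unique match re-identifies the row
            matches[i] = (candidates[0], row["diagnosis"])
    return matches


result = reidentify(deidentified_health, public_roster)
for row_index, (name, diagnosis) in sorted(result.items()):
    print(row_index, name, diagnosis)
```

In this made-up data, the first two "anonymous" rows each match exactly one named person, while the third matches two people and so stays ambiguous. Adding one more shared data point would typically shrink such ambiguous groups, which is why removing names alone offers weak protection.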