Thursday, 28 February 2008

DATA PROCESSING

Data processing is any computer process that converts data into information or knowledge. The processing is usually assumed to be automated and to run on a computer. Because data are most useful when well presented and actually informative, data-processing systems are often called information systems to emphasize their practicality. The two terms are nevertheless roughly synonymous, since both kinds of system perform similar conversions: data-processing systems typically turn raw data into information, and information systems likewise typically take raw data as input and produce information as output.

To better market their profession, computer programmers and systems analysts who might once have called the computer systems they produce data-processing systems, as was common during the 1970s, nowadays more often use some other term that includes the word information, such as information systems, information technology systems, or management information systems.

In the context of data processing, data are defined as numbers or characters that represent measurements of observable phenomena. A single datum is a single measurement of an observable phenomenon. Measured information is then algorithmically derived, logically deduced, or statistically calculated (or some combination of these) from multiple data, which serve as evidence. Information is defined as either a meaningful answer to a query or a meaningful stimulus that can cascade into further queries.
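The definitions above can be made concrete with a small sketch. The readings, the threshold, and the function name answer_query are all hypothetical, chosen only to show several data being turned into an answer to a query; this is one possible illustration, not a prescribed method.

```python
from statistics import mean

# Hypothetical raw data: each datum is one temperature measurement (degrees Celsius).
readings = [20.5, 21.0, 19.5, 22.0, 21.0]

def answer_query(data, threshold):
    """Derive information from data: is the average reading above the threshold?"""
    average = mean(data)  # statistical calculation over multiple data
    return {"average": average, "above_threshold": average > threshold}

# The query "was it warmer than 20 degrees on average?" receives a meaningful answer.
result = answer_query(readings, 20.0)
print(result["above_threshold"])  # True
```

No single reading answers the question; the information only appears once the data are combined.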

More generally, the term data processing can apply to any process that converts data from one format to another, although data conversion would be the more logical and correct term. From this perspective, data processing becomes the process of converting information into data and of converting data back into information. The distinction is that conversion does not require a question (query) to be answered. For example, information in the form of a string of characters forming an English sentence is first encoded: the keyboard's key-presses, represented by hardware-oriented codes, are converted into ASCII codes. In that form the computer can process each code more easily, not merely as raw, amorphous data but as a meaningful character in a natural language's set of graphemes. Finally the codes are decoded and displayed as characters, rendered in a font on the computer display. The example shows a stage-by-stage conversion: the presence and then absence of electrical conductivity as a key is pressed and released starts out as raw, substantially meaningless hardware-oriented data and becomes ever more meaningful information as the processing proceeds toward the human being.
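The keyboard example above can be sketched roughly as follows. The key-code table SCAN_TO_CHAR and both function names are invented for illustration; real keyboards use different hardware codes, and a real system would also handle modifiers, key release, and font rendering.

```python
# Hypothetical hardware-oriented key codes; real keyboards use other values.
SCAN_TO_CHAR = {0x01: "H", 0x02: "i", 0x03: "!"}

def encode_keypresses(scan_codes):
    """Convert raw key codes into ASCII codes (bytes)."""
    text = "".join(SCAN_TO_CHAR[code] for code in scan_codes)
    return text.encode("ascii")

def decode_for_display(ascii_bytes):
    """Convert ASCII codes back into characters a display could render."""
    return ascii_bytes.decode("ascii")

raw = [0x01, 0x02, 0x03]             # raw, hardware-oriented data
ascii_bytes = encode_keypresses(raw)  # intermediate, more meaningful form
shown = decode_for_display(ascii_bytes)

print(list(ascii_bytes))  # [72, 105, 33]
print(shown)              # Hi!
```

Note that no query is being answered at any stage; each step only changes the representation, which is why data conversion is the more precise term here.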

That simple pedagogical example, however, is usually described as an embedded system (for the software resident in the keyboard itself) or as (operating-)systems programming, because the information is derived from a hardware interface and may involve overt control of the hardware through that interface by an operating system. Control of hardware by a device driver manipulating ASIC or FPGA registers is typically not viewed as part of data processing proper or information systems proper, but rather as the domain of embedded systems or (operating-)systems programming. A more conventional example of the established use of the term data processing is a business that has collected numerous data concerning an aspect of its operations and must present this multitude of data in meaningful, easy-to-access form to the managers, who then use that information to increase revenue or decrease cost. That conversion and presentation of data as information is typically performed by a data-processing application.
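A minimal sketch of such a data-processing application might look like the following. The sales records, region names, and the function revenue_by_region are hypothetical; an actual business system would read its data from a database or files and produce a far richer report.

```python
from collections import defaultdict

# Hypothetical collected data: (region, sale amount in whole currency units).
sales = [
    ("north", 1200),
    ("south", 800),
    ("north", 300),
    ("east", 500),
    ("south", 200),
]

def revenue_by_region(records):
    """Summarize many individual sales into total revenue per region."""
    totals = defaultdict(int)
    for region, amount in records:
        totals[region] += amount
    # Present the largest totals first so the report is easy to read.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))

report = revenue_by_region(sales)
for region, total in report.items():
    print(f"{region:<6} {total:>6}")
```

The five raw records tell a manager little on their own; the three-line summary, ordered by revenue, is the information they would act on.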

When the domain from which the data are harvested is a science or a branch of engineering, data processing and information systems are considered too broad as terms, and the more specialized term data analysis is typically used. Data analysis focuses on the highly specialized and highly accurate algorithmic derivations and statistical calculations that are less often observed in the typical general business environment. This divergence of culture shows in the typical numerical representations used in data processing versus data analysis: data processing's measurements are typically represented by integers or by fixed-point or binary-coded decimal representations of numbers, whereas data analysis's measurements are more often represented by floating-point representations of rational numbers.
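The difference between these numerical representations can be seen in a short sketch. The amounts are made up, and Python's decimal module stands in here for the fixed-point and decimal formats used in business systems; it is not binary-coded decimal itself.

```python
from decimal import Decimal

# Binary floating point: 0.10 and 0.20 have no exact binary representation,
# so their sum is only approximately 0.30.
float_total = sum([0.10, 0.20])

# Decimal arithmetic: the amounts are stored exactly as written.
decimal_total = sum([Decimal("0.10"), Decimal("0.20")])

# Integer representation: count whole cents instead of fractional units.
cents_total = 10 + 20

print(float_total == 0.30)               # False
print(decimal_total == Decimal("0.30"))  # True
print(cents_total == 30)                 # True
```

Exact decimal or integer results matter when totals must balance to the cent, whereas floating point trades that exactness for a very wide range of magnitudes, which suits many scientific measurements.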

Practically all naturally occurring processes can be viewed as examples of data-processing systems, in which "observable" information in the form of pressure, light, and so on is converted by human observers into electrical signals in the nervous system, producing the senses we recognize as touch, sound, and vision. Even the interaction of non-living systems may be viewed in this way, as rudimentary information-processing systems. Conventional usage, however, restricts the terms data processing and information systems to the algorithmic derivations, logical deductions, and statistical calculations that recur perennially in general business environments, rather than extending them to the more expansive sense of all conversions of real-world measurements into real-world information in, say, an organic biological system or even a scientific or engineering system.
