Author: Peter Fenwick

Introduction to Computer Data Representation

Personal Book: US $49 Special Offer (PDF + Printed Copy): US $143
Printed Copy: US $119
Library Book: US $196
ISBN: 978-1-60805-883-9
eISBN: 978-1-60805-882-2 (Online)
Year of Publication: 2014

Introduction

Introduction to Computer Data Representation introduces readers to the representation of data within computers. Starting from basic principles of number representation in computers, the book covers the representation of both integer and floating point numbers, and characters or text. It comprehensively explains the main techniques of computer arithmetic and logical manipulation. The book also features chapters covering the less usual topics of basic checksums and ‘universal’ or variable length representations for integers, with additional coverage of Gray Codes, BCD codes and logarithmic representations. The description of character coding includes information on both MIME and Unicode formats. Introduction to Computer Data Representation also includes historical aspects of data representation, explaining some of the steps that developers took (and the mistakes they made) that led to the present, well-defined and accepted standards of data representation techniques. The book serves as a primer for advanced computer science graduates and a handy reference for anyone wanting to learn about numbers and data representation in computers.

Foreword

It is my great pleasure to recommend this excellent book written by my friend and colleague, Professor Peter Fenwick. During the eleven years I have known him, we have had many a discussion, often touching on topics covered here. Though this is the closest we have come to a collaboration I have little doubt that had we met earlier in our careers we would have collaborated extensively.

A major contribution of this book is to bring a historical perspective to many topics that are so widely accepted that it might not be obvious there were choices to be made. The binary representation of numbers was so obvious even in the 1940s that Burks, Goldstine and von Neumann are said to have “adopted it seemingly without discussion”. But Burks et al considered floating point representation, then argued against supporting it. Long ago I heard it claimed that von Neumann believed any mathematician ”worth his salt” should be able to specify floating point computations using only integers. In any case, floating point only came into its own in the 1980s, with the broad acceptance of the IEEE standards. Professor Fenwick shows great insight into why it took decades to get right something as basic as the representation of numbers.

A second important contribution is discussion of the introduction of redundancy to increase reliability in the presence of errors: check sums and variable-length (universal) codes. While simple check sums are frequently discussed, I know of no comparable source for a general discussion of Universal codes, an important but somewhat obscure subject.

I agree with Professor Fenwick's quote, that “everybody thinks they know” about these topics, but there are big holes, even today. Surely most of us have superficial knowledge that fails us when we really need to work through the details. This book covers a huge range of material, thoroughly and concisely. I have taught a good bit of the material, but I learned much, even in areas where I claim some expertise. The book displays a deep understanding of the many and varied requirements for digital representation of information, from the obvious integers and floating point, to Zeckendorf representations and Gray codes; from 2's complement to logarithmic arithmetic; from Elias and Levenstein codes to Rice and Golomb codes and on to Ternary Comma and Fibonacci codes.

In addition to the plethora of ways to represent numbers, it also covers representation of characters and strings. While the book will serve very well as a reference, it is also fascinating reading. Many pages are devoted to obscure topics, interesting largely because of their place in history, but outside the domain of a classic textbook on computer organization or architecture. These are perhaps the most important sections, precisely because they had to be understood and discarded to get us where we are now.

This book definitely does not qualify for the subtitle, “Data Representation for Dummies”. While it quickly surveys common forms of representation, the pace and breadth will bewilder the true novice. On occasions, it uses terms unfamiliar (at least to an American), requiring another source. Appropriately, Professor Fenwick acknowledges the role of Wikipedia, which covers rather more topics than his book, but certainly not as coherently.

The author has a wry, if somewhat subtle, sense of humour which often surfaces unexpectedly: it's a bit of a stretch, but of course the description and figure regarding Gray codes include a “grey area”!

Discussion of the roles and interaction of precision, accuracy and range is superb. Floating point representation is highly precise, so why is it dangerous for use in financial calculations? Professor Fenwick points out something that had not occurred to me: a “quite ordinary calculator” is capable of more precise arithmetic than a 32-bit [IEEE single-precision] floating point computation. That explains why the calculator “app” on my iPad has both less range, and less precision, than the HP calculator I bought 35 years ago!

A topic rarely covered so clearly is “unwarranted precision”, the process of using a precise mathematical operation to apparently increase accuracy (significant digits) of a number. Professor Fenwick points out confusion over precision created by the fact that the speed of light is so close to 300,000,000 metres per second—and the fact that scientific notation provides information about the accuracy of a value (pp. 106-107). I especially liked his discussion of the sins of the popular press, for example, by apparently increasing precision in the process of converting units: an altitude “10,000 feet”—accurate to, say, ±100 metres—becomes the apparently more precise, but inaccurate, “3 048 metres”. It is unfortunate that the general level of this book is beyond comprehension for most journalists!

In short, this is a fascinating book that will appeal to many because of its authoritative exploration of how we represent information. But it will also serve as a reference for those requiring—or simply enjoying—the ability to choose efficient representations that lead to accurate results. It's a good read, and a great book to keep handy.

James R. Goodman , United States of America Fellow IEEE,
Fellow ACM,
2013 Eckert-Mauchly Award


RELATED BOOKS

.Multi-Objective Optimization In Theory and Practice II: Metaheuristic Algorithms.
.Arduino and SCILAB based Projects.
.Arduino meets MATLAB: Interfacing, Programs and Simulink.
.Budget Optimization and Allocation: An Evolutionary Computing Based Model.