From Ada Lovelace to Deep Learning
Programming has come a long way, from the tedious, time-consuming work of instructing early computers to deep learning, where computers essentially program themselves.
- c. 780–850 – Life of Mohammed ibn-Musa al-Khwarizmi, from whose name we get the word “algorithm” (the word “algebra” comes from the title of his treatise on the subject)
- 1786 – Hessian Army engineer J. H. Müller publishes a paper describing a “Difference Engine” but cannot obtain funding to proceed
- 1822 – Charles Babbage proposes to develop such a machine and, in 1823, obtains funding from the British government. After developing an early version of such a machine he specifies a much more ambitious project, the “Analytical Engine,” which is never completed.
- 1843 – Ada King, Countess of Lovelace, writes the “first computer program.”
- 1945 – John von Neumann authors the first draft of a paper containing the first published description of the logical design of a computer using the stored-program concept.
- 1946 – ENIAC, the first working electronic computer, is announced to the public.
- 1948 – An experimental computer, the Manchester Small-Scale Experimental Machine, successfully runs a stored program.
- 1956 – John McCarthy organizes the first international conference to emphasize “artificial intelligence.”
- 1975 – The first consumer microcomputer, the Altair 8800, is introduced. Upon reading of the computer, Bill Gates and Paul Allen develop Altair BASIC to allow the Altair to run stored programs (this is the product that launched Microsoft, then called “Micro-Soft”).
- 1997 – IBM’s Deep Blue defeats World Chess Champion Garry Kasparov 3½-2½.
- 2011 – IBM’s Watson defeats Jeopardy! champions.
- 2016 – Google’s AlphaGo defeats world-class Go player Lee Se-dol 4-1.
Algorithm – "In mathematics and computer science, an algorithm is a self-contained step-by-step set of operations to be performed. Algorithms perform calculation, data processing, and/or automated reasoning tasks." – Wikipedia
We constantly hear terms such as “algorithm,” “computer program,” and, more and more, “deep learning.” Yet, while most people have some understanding of computer programs, the other terms are somewhat elusive. Normally, it’s not very important for the average person to understand technical terms, but knowing the progression from what’s known as “Ada’s Algorithm” to deep learning helps in appreciating our now rapid movement toward true “artificial intelligence.”
An algorithm, quite simply, is a rule or a method of accomplishing a task. No matter how complex computers are, they are no more than a collection of wiring and physical components. They must receive direction to accomplish whatever task or tasks are desired by the owners of the device.
So, if a computer were to be used to calculate employee payroll, the way to do this would be contained in the payroll algorithm. The algorithm would contain a number of instructions or “program steps” to properly complete its processing. (To learn about the first use of a computer to process payroll, see Milestones in Digital Computing.)
One step might be to calculate gross pay for an employee; an instruction to do this might simply be: “Gross = Hours * Rate” where * stands for multiplication.
However, that is a very simple statement that might be used only in a case where no one could work overtime as defined by state law. If there were many employees, all “on the clock” in a jurisdiction where hours over 40 had to be compensated at 1½ times the normal rate, the instruction might look like this:
“IF Hours Are More Than 40 THEN
    Regular Gross = Rate * 40
    OTGross = (Hours – 40) * 1.5 * Rate
ELSE
    Regular Gross = Rate * Hours
    OTGross = 0
Total Gross = Regular Gross + OTGross”
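The overtime rule above can be sketched in Python. This is a minimal illustration, not a real payroll routine: the function name `gross_pay` and the parameter names `threshold` and `ot_multiplier` are assumptions introduced here, with defaults taken from the 40-hour, 1½-times example in the text.

```python
def gross_pay(hours, rate, threshold=40, ot_multiplier=1.5):
    """Return (regular_gross, ot_gross, total_gross) for one hourly employee."""
    if hours > threshold:
        # Hours up to the threshold are paid at the normal rate,
        # hours beyond it at the overtime multiple of that rate.
        regular_gross = rate * threshold
        ot_gross = (hours - threshold) * ot_multiplier * rate
    else:
        regular_gross = rate * hours
        ot_gross = 0.0
    return regular_gross, ot_gross, regular_gross + ot_gross
```

For example, `gross_pay(45, 20.0)` returns `(800.0, 150.0, 950.0)`: 40 hours at the normal rate plus 5 hours at 1½ times that rate.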
This series of instructions must be carried out for each employee. In a normal firm, so must the determination of whether the employee is salaried or “on the clock” and of how much tax (if any) must be withheld for the federal, state, and city governments (based on the number of dependents and the applicable federal and state regulations).
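A rough sketch of that per-employee processing might look like the following. Everything named here (`Employee`, `gross_for`, `run_payroll`, and the single flat `withholding_rate`) is hypothetical and chosen only for illustration; real withholding depends on dependents and on federal, state, and city rules, which this sketch does not attempt to model.

```python
from dataclasses import dataclass

OVERTIME_THRESHOLD = 40    # hours per week, from the example in the text
OVERTIME_MULTIPLIER = 1.5  # overtime is paid at 1.5 times the normal rate


@dataclass
class Employee:
    name: str
    salaried: bool
    rate: float = 0.0           # hourly rate, used only for "on the clock" employees
    hours: float = 0.0          # hours worked this week
    weekly_salary: float = 0.0  # used only for salaried employees


def gross_for(emp):
    """Gross pay for one employee, branching on salaried vs. hourly."""
    if emp.salaried:
        return emp.weekly_salary
    regular = emp.rate * min(emp.hours, OVERTIME_THRESHOLD)
    overtime = max(emp.hours - OVERTIME_THRESHOLD, 0) * OVERTIME_MULTIPLIER * emp.rate
    return regular + overtime


def run_payroll(employees, withholding_rate):
    """Return a list of (name, gross, withheld, net), one entry per employee.

    withholding_rate is a hypothetical flat fraction standing in for the
    real tax rules.
    """
    results = []
    for emp in employees:
        gross = gross_for(emp)
        withheld = round(gross * withholding_rate, 2)
        results.append((emp.name, gross, withheld, round(gross - withheld, 2)))
    return results
```

Even in this toy form, every branch has to be right for every employee, which is why the instructions must be precise.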
Additionally, reports (and possibly checks) would have to be produced. All in all, something that we might consider as straightforward now may seem very complex as we get into the details – there can be no errors in the instructions; they must be precise and accurate. Even minor errors may cause large financial loss, mechanical failure and/or loss of life.
Ada to ENIAC
Ada King, Countess of Lovelace and daughter of the famed English poet Lord Byron (George Gordon), is called the “first programmer,” even the “first computer programmer,” although in 1843 there was no understanding of programming and certainly no computers. She is referred to in these terms simply because her writing in a notebook about the never-to-be-finished Analytical Engine showed an understanding of the concepts that would be important over 100 years later (see James Essinger’s “Ada’s Algorithm: How Lord Byron’s Daughter Ada Lovelace Launched The Digital Age” for the entire fascinating and somewhat tragic story). In recognition of her contributions, the U.S. Department of Defense named a programming language, developed in the 1970s, “Ada.” (To learn more about Ada, see Ada Lovelace, Enchantress of Numbers.)
By the time that the first working electronic computer, the ENIAC, was developed – during World War II but not completed until 1946 – it was well understood that the computer was no Frankenstein Monster that could “think” on its own; it had to be programmed! The original programming for the ENIAC was done on paper and thoroughly checked for logic (hence the term “desk checking”) before touching the computer. To do otherwise would be costly – computer time was considered very expensive and the wasting of it was frowned upon. In the case of the ENIAC, there could also be a great waste of human effort, as each instruction had to be entered one at a time by “throwing” mechanical switches as it was to be executed.
While the ENIAC was being developed, the famed mathematician John von Neumann had postulated the concept of a “stored program.” A program would be written, tested, “debugged” (all errors found and corrected), and stored on some medium (punched cards, paper tape, etc.). When needed, it would be loaded into the computer with the data to be processed and used (think of Microsoft Word, kept on your hard drive or, even now, “in the cloud” and only called into the computer when you want to write a letter or create a memo).
When the Altair 8800 first appeared, 30 years after the ENIAC, it was purely a hobbyist’s machine for tinkering until Altair BASIC arrived and allowed it to utilize stored programs.
For over 50 years after the ENIAC, progress in computer technology was found in making bigger, faster, and cheaper components, in communication breakthroughs (such as the internet), and in enhanced programming languages (COBOL, Fortran, BASIC, Ada, C, Forth, APL, Logo, LISP, Pascal, Java, etc.) and tools to make program development more efficient and, hopefully, more “bulletproof” (error-free).
While this mainstream of computer progress had been going on, lurking on the sidelines had been the science-fiction-sounding dream of “artificial intelligence” (a term coined by John McCarthy in the mid-1950s) – the ability to have something other than humans exhibit human intelligence, a dream going back to the mythical Golem and Mary Shelley’s Frankenstein, a dream thought to be much more possible through the advent of computer technology.
The term “artificial intelligence” has taken on many meanings since McCarthy coined it – robotics, expert systems, case-based reasoning, etc., but none seems as profound as a system that emulates human learning.
The business and scientific systems developed for the first 50+ years of electronic computing were all based on rule-based systems – “deductive reasoning,” in which we are given general principles and then we apply them to individual cases as we go along (such as the “IF-THEN-ELSE” example given above). In short, we proceed from the abstract or general to the particular or individual case.
Humans, however, also learn by the reverse of this – from the particular to the abstract or rule; this is inductive reasoning. If we go out of the house when it snows, turn right, and snow falls off a tree on our head, sooner or later we start turning left when it snows. In short, we learn and build rules based on the learning.
If we considered the human-generated algorithms to be analogous to deductive reasoning, so-called deep learning is the inductive opposite – we may set goals but then “dump into the system” thousands or millions of related facts or games or war scenarios and then, more or less, say “you figure it out.”
In short, the computer is writing the algorithm, and it’s doing it thousands of times faster than a human could, based on analyzing millions more related facts than a human could. Ada would be proud!
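To make the contrast with the hand-written payroll rule concrete, here is a small sketch of the inductive direction: the program is given only example pay records and infers the pay multipliers itself. It uses ordinary least-squares fitting, which is far simpler than deep learning and stands in for it only as an illustration. The function name `fit_pay_rule` and the sample records are invented for this example, and the 40-hour split is supplied by hand, whereas a deep network would be expected to discover such structure on its own.

```python
def fit_pay_rule(examples):
    """Infer (a, b) so that gross ~ a * regular_part + b * overtime_part.

    examples is a list of (hours, rate, gross) records. The multipliers
    are never stated; they are recovered from the data by least squares.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for hours, rate, gross in examples:
        x1 = rate * min(hours, 40)      # base-rate pay for hours up to 40
        x2 = rate * max(hours - 40, 0)  # base-rate pay for hours beyond 40
        # Accumulate the entries of the 2x2 normal equations.
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        t1 += x1 * gross
        t2 += x2 * gross
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("need varied examples, including some overtime weeks")
    # Solve the 2x2 system with Cramer's rule.
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


# Invented records of (hours, rate, gross) that happen to follow the 1.5x rule.
records = [
    (30, 20.0, 600.0),
    (40, 15.0, 600.0),
    (45, 20.0, 950.0),
    (50, 10.0, 550.0),
    (60, 12.0, 840.0),
]
a, b = fit_pay_rule(records)
```

With these records, `a` comes out at approximately 1.0 and `b` at approximately 1.5: the overtime multiplier was never written into the program but was recovered from the examples.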