AGI Is Not An Existential Risk
Forget The Turing Test - It's The Tommy Lasorda Test That Matters
It was 1988 and Tommy Lasorda had a problem. He had convinced the Los Angeles Dodgers to draft the son of his distant cousin, a first baseman. Lasorda did this as a favor to his friend. The son was drafted by the Dodgers in the 62nd round of the 1988 MLB amateur draft as the 1,390th player picked overall. The problem was getting the Dodgers to actually sign a first baseman who had only played at Miami-Dade Community College. The Dodgers finally relented when Lasorda suggested that the player switch from first base to catcher. The player was signed for a minimal amount of $15,000.
The player was sent to a training camp for catchers in the Dominican Republic, and his work ethic made him a formidable hitter. He played for the Águilas de Mexicali of the Mexican Pacific League during the 1991-1992 season. He was finally promoted to the Dodgers and played his first game on September 1, 1992. He hit his first home run 11 days later. The player hit his 427th and final home run on September 26, 2007. In between he won the 1993 NL Rookie of the Year award and was a 12-time All-Star, and he was inducted into the Baseball Hall of Fame in 2016.
The process that Tommy Lasorda executed for Mike Piazza was a fully human process. Lasorda's actions were based on his baseball experience and personal knowledge of Mike Piazza. There was no technology involved. Could technology do what Lasorda did, namely, identify a player like Mike Piazza who, on paper, had zero chance of making it to Major League Baseball? The Turing Test, named after Alan Turing, is a test of whether a technology can exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. It is one of several tests put forth to identify when technology meets the criteria for artificial general intelligence (AGI). AGI is the ability of technology to execute any intellectual task that a human being can, such as the one executed by Tommy Lasorda.
Artificial intelligence, or AI, is a very broad topic area, and most of it is in the realm of basic research. It is meant to use technology to model real-world human behaviors. One area is natural language processing (the study of how technology interacts with human languages). Another is vision, that is, whether the functionality of the eye can be recreated. The one area that is well understood is the part we call computational statistics or machine learning, which is what is referred to as AI in the business world and media. This kind of AI is referred to as a model. The model is the set of coefficients used to evaluate a prescribed mathematical function (e.g., a polynomial) when data is presented to it. The goal of an AI algorithm is to determine the optimal coefficients to make the AI model an effective predictor.
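To make the idea of "a model is a set of coefficients" concrete, here is a minimal sketch in Python. It assumes the simplest possible prescribed function, a straight line y = intercept + slope * x, and uses ordinary least squares to find the two coefficients. The function names and the data points are made up for illustration; they do not come from any real system.

```python
def fit_line(xs, ys):
    """Find the coefficients (intercept, slope) of y = intercept + slope * x
    that minimize the squared prediction error on the given data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Evaluate the prescribed function using the stored coefficients."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical observations, invented for this sketch.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

model = fit_line(xs, ys)   # the "AI model" is just these two numbers
print(model)
print(predict(model, 6.0))  # prediction for an x the model has not seen
```

Everything the model "knows" is contained in the two numbers returned by fit_line. A production model differs in the number of coefficients and the shape of the function, not in kind.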
It is important to understand that AI has nothing to do with intelligence. AI is used to automate processes using well-understood statistical methods. It does not have any kind of awareness of the output it generates. It is not sentient, and it cannot become sentient. It has significant shortcomings that limit its use in our lives. It is difficult to envision an AI being able to identify Mike Piazza with little or no data. Current AI technologies and proposed AGI models cannot do what Lasorda did, and that is the proposed new test. An AI that can pass The Lasorda Test can truly be declared an AGI.
There are three ways to think about modelling real-world behaviors. The two classic ways to approximate reality use bounded and unbounded systems. Bounded systems are processes like chess, checkers, poker, Go, and backgammon that have a finite number of outcomes. Unbounded systems are processes that have an infinite number of outcomes. Reality is best represented by unbounded systems. The third way is to create a hybrid situation where some of the system is bounded and can be described using AI. This is where we are headed as business processes become quasi-rote, so that an AI model or models can be used to run the process.
There are two important differences between bounded and unbounded systems. The first is repeatability. If two chess players play many games and play the same moves each game, they will arrive at the same outcome every time. The rules of chess are immutable. An AI that plays chess is simply searching through the possible outcomes of the game it is playing. There are something like a trillion trillion trillion trillion possible chess games. At most about a trillion games have been played since the rules of chess were finalized. Therefore, there are many more outcomes of chess that have not been seen or explored. An AI can play more than a trillion games in a short amount of time and therefore find those unrevealed outcomes. For bounded systems, an AI is just a glorified search engine.
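The "glorified search engine" point can be sketched on a much smaller bounded system than chess. The Python example below uses tic-tac-toe as a stand-in (an assumption made purely to keep the search small enough to run in a moment). It needs no data, only the rules: it enumerates every possible game and finds the result under best play by exhaustive search. The function names are illustrative.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def moves(board, player):
    """Yield every board reachable by one legal move of `player`."""
    for i, cell in enumerate(board):
        if cell == ".":
            yield board[:i] + player + board[i + 1:]


@lru_cache(maxsize=None)
def count_games(board=".........", player="X"):
    """Count every distinct sequence of moves from this position to the end."""
    if winner(board) or "." not in board:
        return 1
    nxt = "O" if player == "X" else "X"
    return sum(count_games(child, nxt) for child in moves(board, player))


@lru_cache(maxsize=None)
def best_outcome(board=".........", player="X"):
    """Result under perfect play: 1 if X wins, 0 for a draw, -1 if O wins."""
    w = winner(board)
    if w:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0
    nxt = "O" if player == "X" else "X"
    results = [best_outcome(child, nxt) for child in moves(board, player)]
    return max(results) if player == "X" else min(results)


print(count_games())   # 255168 possible games
print(best_outcome())  # 0: perfect play always ends in a draw
```

Chess is the same kind of problem with a vastly larger search space, which is why chess programs must prune and approximate rather than enumerate everything. The principle is unchanged: the rules alone define what there is to search.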
Baseball also has rules that are immutable or change slowly. Yet if two baseball teams play many games and make the same decisions each game, they will arrive at a different outcome each time. This is because no two pitches are the same, no two swings produce the same result, and a fielder cannot always field a hit ball and throw the runner out the same way. Baseball rules allow for creativity on the part of players and coaches. The total number of possible baseball games cannot be calculated.
The best AI can do for baseball is to use data from baseball games and individual performances to create analytics that attempt to optimize a baseball team for success. This goal has resulted in an explosion in the use of analytics in all sports, led by baseball. The analytics are used to predict an optimal mix of players for a given game situation. Coaches have reports to use during games to assist with decision-making. To date, no team in any sport is run solely on the output of predictive analytics.
The second important difference is that bounded systems do not need data for an AI to play the game. The rules are sufficient. Unbounded systems, like most sports and business processes, need data to create an AI model. The algorithms that create the model need a significant amount of data to determine an optimal set of coefficients. This is why analytics is a recent phenomenon. The amount of data and the computational power necessary to determine the coefficients became available only over the past 10-15 years. These algorithms have been in existence for up to 60 years, but they were of limited utility because of the lack of data and computational power.
The algorithms use traditional statistical techniques to arrive at a set of optimal coefficients. The number of coefficients can be enormous. A neural network algorithm could produce an AI model that has over 100 million coefficients when used in a production context like controlling the steering of a car or flying a drone. It can only arrive at this number of coefficients because of the sheer size of the data. Car companies, for example, have spent billions of dollars and over a decade collecting the data needed to determine the optimal set of millions of coefficients that allow a car to drive down an Interstate in a safe manner. The fact that it cost that much money and took that amount of time to create an AI model to carry out a fairly simple task should inform the reader how far we are from AI models running our economy and our lives.
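To see how the coefficient count climbs so quickly, here is a rough sketch in Python that counts the coefficients in a plain fully connected neural network. Each layer contributes one weight per input-output pair plus one bias per output. The layer sizes below are hypothetical, chosen only to show the arithmetic; they do not describe any real steering or drone model.

```python
def count_coefficients(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units per layer, input first.
    Each pair of adjacent layers contributes n_in * n_out weights
    plus n_out bias terms.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total


# Hypothetical architecture: 10,000 inputs, three hidden layers, one output.
hypothetical_layers = [10_000, 8_000, 2_000, 500, 1]
print(count_coefficients(hypothetical_layers))
```

For these made-up layer sizes the count comes to 97,011,001, close to the 100 million figure mentioned above, and every one of those numbers has to be estimated from data.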
Data is the key to the proper construction of any AI model. Data is also the reason the vast majority of AI models do not work very well. Why? The data from baseball games and the capital markets are extremely truthful. These two data sets need no work done to them to correct problems and mistakes. They are exemplars of curated data, that is, data where mistakes and problems have been removed and which faithfully represents the truth of the underlying process. Unfortunately, such highly curated data sets are in the minority of data sets used when applying AI models to business processes. The vast majority of data used to create AI models has problems and does not provide truthful information about the underlying process. That data has to go through a curation process, and the curation may or may not succeed in making the data represent the process.
The first mistake users of AI models make is that they have not defined the problem they are solving by using the AI model. Assuming the users have defined the problem correctly, the second mistake is in deciding what data and analytics are needed to successfully solve the problem. Assuming the users have done this second step correctly, a third mistake is that they do not collect the correct data, collect only a subset of what is needed, or collect data that has problems that need to be fixed. Assuming the users have done this third step correctly (i.e., curated the data), a fourth mistake occurs when they combine the data into the analytics without conducting testing to ensure the curated data provides the correct analytics over time. Assuming the users have done this correctly, the fifth mistake is disseminating the analytics to decision makers without training the decision makers on how to use them. The sixth mistake is assuming that if the analytics are appropriate today, they will still be appropriate a year from now.
We cannot expect AI to run our society because that is not what it is made to do. AI is only capable of making predictions about rote or quasi-rote processes for which copious amounts of data exist. We should not expect an AGI to be useful anytime soon because it would need to support processes for which copious amounts of data do not exist. Lasorda saw something in Piazza that no one else saw, and the data on Piazza certainly did not support Lasorda's efforts. There was no information that could have been used to develop a technology, in this case an AI model, to identify the real Mike Piazza or any future Mike Piazzas. A true AGI, defined as an AI that could find the next Mike Piazza, does not exist. Current efforts in the field of AGI do not come close to satisfying the Lasorda Test. It will be decades before that test is satisfied by an AGI that can support processes with little or no data. An AGI running rampant and destroying society is not something to fear in our lifetimes.