Machine learning, automated suspicion algorithms, and the Fourth Amendment.


At the conceptual intersection of machine learning and government data collection lie Automated Suspicion Algorithms, or ASAs, which are created by applying machine learning methods to collections of government data with the purpose of identifying individuals likely to be engaged in criminal activity. The novel promise of ASAs is that they can identify data-supported correlations between innocent conduct and criminal activity and help police prevent crime. ASAs present a novel doctrinal challenge as well, as they intrude on a step of the Fourth Amendment's individualized suspicion analysis, previously the sole province of human actors: the determination of when reasonable suspicion or probable cause can be inferred from established facts. This Article analyzes ASAs under existing Fourth Amendment doctrine for the benefit of courts that will soon be asked to deal with ASAs. In the process, this Article reveals the inadequacies of existing doctrine for handling these new technologies and proposes extrajudicial means for ensuring that ASAs are accurate and effective.

INTRODUCTION
I. MACHINE LEARNING AND ASAs
II. INDIVIDUALIZED SUSPICION, OLD ALGORITHMS, AND ASAs
 A. The Two-Step Individualized Suspicion Analysis
 B. Algorithms in the Individualized Suspicion Analysis: The Old and the New
III. THE INSUFFICIENCY OF AN ASA's PREDICTION
 A. The Collective and Constructive Knowledge Doctrines
 B. Applying the Doctrines to ASAs
IV. INCLUDING ASAs IN THE TOTALITY-OF-THE-CIRCUMSTANCES ANALYSIS
 A. Algorithms as Police Profiles
 B. Algorithms as Informants
 C. Algorithms as Drug-Sniffing Dogs
  1. The Law of Drug Dogs
  2. ASAs as Drug Dogs
  3. Conclusion
V. ASA ERRORS
CONCLUSION

INTRODUCTION

One day soon, a machine will identify likely criminal activity and, with the beep of an e-mail delivery, the buzz of an alarm, or the silent creation of a report, tell police where to find it. Already, a computer program analyzes massive quantities of securities trading data and notifies the Securities and Exchange Commission of investors who might be engaged in insider trading. (1) Computer systems connected to networks of video cameras alert police when bags are abandoned on subway platforms, (2) when people on a street corner interact multiple times in a short period, (3) or when a single individual visits multiple cars in a parking structure. (4) The federal government has field tested a device that screens individuals and predicts whether, based on physiological data, the individual intends to commit a terrorist act. (5) Researchers at Carnegie Mellon, funded by the Defense Advanced Research Projects Agency, are developing computer systems to index and analyze the text and images in online advertisements for sex services to identify likely sex traffickers and their victims. (6) While these current technologies generally follow a comprehensible logic--looking for facts that we understand to correlate with criminal conduct--technologies of the near future will analyze more data than a human being could and unearth connections that evade obvious logic. (7) In other words, soon a computer may spit out a person's name, address, and social security number along with the probability that the person is engaged in a certain criminal activity, with no further explanation. (8)

These emergent technologies arise from the intersection of two trends: the collection of massive troves of individualized data about people in the United States and the explosive growth of a field of computer science known as machine learning. (9) With respect to the former, these data come from a nearly unlimited variety of public and private sources, including video cameras, crime scene gunshot detectors, license plate readers, automatic tollbooth payment systems, and social media websites. (10) Government bodies from the municipal to the federal level are all involved in this "data vacuuming." (11) Moreover, private companies are increasingly making personal data available to governments, including to law enforcement agencies. (12) With a mixture of resignation and pessimism, this Article takes the government's past and future collection of enormous quantities of personal data as a given and instead examines the government's use of those data. (13)

Meanwhile, researchers have made colossal strides in recent years in machine learning, "the systematic study of algorithms and systems that improve their knowledge or performance with experience." (14) Machine learning is particularly useful for revealing otherwise unrecognizable patterns in complex processes underlying observable phenomena. (15) Specifically, machine learning techniques help computer systems learn about an underlying process and its patterns by creating a useful mathematical approximation of how the process works. (16) This approximation can then be applied to new data to predict future occurrences of the same phenomena. (17) For instance, machine learning methods are used to examine patient records and create algorithms that can help doctors diagnose illnesses or provide prognoses. (18)

At least on a conceptual level, machine learning and crime fighting are a perfect match. The interaction of forces that cause people to commit crimes is incomprehensibly complex. Criminologists have sought for decades to use data to understand that interaction and identify the most likely criminal offenders. (19) Statistical models that aim to identify the criminally inclined based on quantifiable personal characteristics have become influential in the contexts of pretrial release, probation, and parole. (20) Similarly, police departments have recently begun to use statistical models to predict where in their jurisdictions certain crimes are likely to occur. (21) Machine learning provides a way to go one step further and use data to identify likely criminals among the general population without the need to disentangle the Gordian knot of causal forces.

This Article addresses technologies that apply machine learning techniques to the "data hoards" available to law enforcement in order to predict individual criminality. (22) Some of these technologies are already in use or are in advanced stages of development. (23) Nascent examples are even more numerous, including: using past offender and crime scene data to create more accurate profiles of unknown offenders, (24) leveraging behavioral data to identify individuals who are attempting to conceal their true--and potentially nefarious--intent, (25) and analyzing past corporate financial statements to create algorithms that can determine from the language used in a financial statement whether the company is likely engaged in fraud. (26)

This Article refers to programs like these--programs created through machine learning processes that seek to predict individual criminality--as Automated Suspicion Algorithms, or ASAs. ASAs share three defining characteristics, as implied by the name. First, they are based on algorithms, which can be broadly defined as sequences of instructions to convert an input into an output. (27) In this case, ASAs convert data about an individual and her behavior into predictions of the likelihood that she is engaged in criminal conduct. (28) Second, ASAs assess individuals based on suspicion of criminal activity in that they engage in probabilistic predictions that rely on patterns detected in imperfect information. (29) Third, ASAs automate the process of identifying suspicious individuals from data: they comb through data for factors that correlate to criminal activity, assess the weight of each factor and how it relates to other factors, use the results to predict criminality from new data, and continuously improve their performance over time. (30) The automated creation of rules that predict criminality distinguishes ASAs from computer systems that might merely automate the application of a pre-existing police profile of criminality. (31)

Of course, from fingerprints to field testing kits to DNA matching, law enforcement has always tried to find ways to use the newest technologies. (32) As a result, attorneys, judges, and commentators are quite familiar with the role that technologies play in helping police ascertain the basic facts about a crime: the who, what, when, where, and why. A field test for cocaine, for instance, tells police whether a certain substance is contraband. A DNA match confirms that a suspect was at a crime scene. But determining these historical facts is only the first step in deciding whether individualized suspicion exists sufficient to justify a search or seizure under the Fourth Amendment. (33)

Until now, the second step in determining the existence of individualized suspicion--deciding whether the historical facts give rise to probable cause or reasonable suspicion (34)--has remained the sole province of human actors. The Supreme Court has held that determinations about the existence of probable cause and reasonable suspicion ultimately depend on reason, (35) "common sense," (36) and police experience. (37) The Court has also made clear that individualized suspicion is ultimately about "probabilities," though in the next breath we learn that probabilities "are not technical." (38) The promise of ASAs is that they can answer the individualized suspicion question by providing data-derived probabilities of whether crime is afoot; the novel problem they present is how those statistical probabilities fit in the "practical, nontechnical conception" of individualized suspicion articulated by the Supreme Court. (39)

ASAs are coming, (40) and courts will soon be asked to consider how their output should factor into the individualized suspicion analysis. (41) The initial goal of this Article is to provide courts with a framework for that analysis. (42) Yet setting out this framework teaches broader lessons about how emergent technologies interact with the Fourth Amendment. First, we learn that ASAs push the limits of the Court's current approach to the Fourth Amendment in areas that have already raised red flags among scholars. One is the ongoing metamorphosis of the collective knowledge doctrine into what some call the "constructive knowledge" doctrine. (43) The former allows knowledge to be imputed between officers, so one officer may instruct another to conduct a search or seizure without having to explain why. (44) The latter permits a search based on the aggregated knowledge of law enforcement personnel generally, even if no one officer possessed enough knowledge to make an individualized suspicion assessment. (45)

Another area of concern is the integration of statistical data in the individualized suspicion analysis, (46) which the Supreme Court recently tackled in the context of drug dogs. (47) A third area implicated by ASAs is the Supreme Court's holding that errors in police databases require the exclusion of evidence only in cases of gross negligence or systemic misconduct. (48) Taken together, these issues establish a second, overarching point: ASA accuracy cannot be regulated through the courts alone; rather, extrajudicial action is needed to ensure that ASAs are created, maintained, used, and updated accurately and effectively.

Part I of this Article provides a brief background of machine learning and how it could be applied to create ASAs. Part II sketches out the Fourth Amendment's individualized suspicion analysis, with a particular focus on the two steps articulated by the Supreme Court. Part III tackles the question of whether an ASA's prediction should be sufficient to establish individualized suspicion and concludes that it should not. Part IV discusses how ASAs should be integrated into the totality-of-the-circumstances analysis. Part V addresses how courts should handle ASA errors, and specifically when such errors should lead to exclusion of evidence. The Article concludes by pulling together lessons from the prior discussion and proposing extrajudicial means of ensuring ASA accuracy.

I. MACHINE LEARNING AND ASAs

"Machine learning" is part of a nest of concepts in theartificial intelligence arena, including "data mining,'"(49) "knowledge discovery in databases," (50) and "bigdata," (51) that are often used interchangeably and confusingly inacademia, government, and popular media. (52) For the sake of clarity,in this Article "machine learning" refers to the study ofalgorithms that analyze data in order to help computer systems becomemore accurate over time when completing a task. (53) This continuousimprovement on a given task is the "learning" referenced in"machine learning," and it differs from the more holisticconcept referred to when people speak of human learning. (54) Inparticular, machine learning does not require a computer to engage inhigher-order cognitive skills like reasoning or understanding ofabstract concepts. (55) Rather, machine learning applies inductivetechniques to often-large sets of data to "learn" rules thatare appropriate to a task. (56) In other words, the"intelligence" of a machine learning algorithm is oriented tooutcomes, not process: a "smart" algorithm reachesconsistently accurate results on the chosen task even if the algorithmdoes not "think" like a person. (57)

Machine learning methods are particularly good at helping computers look at a complex set of data and model the underlying processes that generated those data. (58) The models generated through machine learning can then be applied to new data in order to predict future outcomes. (59) One of the most common tasks to which machine learning algorithms are applied is the "classification" of "objects," a catchall concept that can include anything, including people, about which one might collect data. (60) Classification is an example of what is called "supervised" machine learning, by which an algorithm learns from data that has already been "labeled" with the target "feature." (61) Features, in turn, are the "language" that machine learning algorithms use to describe the objects within their domain. (62) The only technological limit on the kind of characteristic that can be a feature is that it must be measurable. (63) The machine learning process then creates a model based on the labeled dataset that can be used to predict the proper classification of future objects. (64)

More specifically, (65) in supervised machine learning the initial set of labeled data is typically subdivided into three parts: a "training set"; a "verification set" or "validation set"; and a "test set." (66) During the development of a model, the algorithm first learns an initial group of classification rules by analyzing the training set. (67) These rules are then applied to a validation or verification set, and the results are used to optimize the rules' parameters. (68) Finally, the optimized rules are applied to the test set, and the results establish both a "confidence" level and a "support" level for each rule. (69) The support level of a rule describes the percentage of objects in the test set to which the rule applies. (70) Rules with a low support level are less likely to be statistically significant. (71) Thus, to restrict which rules the algorithm will use and to ensure that predictions are made only on the basis of statistically significant correlations, programmers often require rules to meet a minimum support level. (72) The confidence level of a rule describes how often objects in the test set follow the rule. (73) It is, in essence, a measure of the strength of the algorithm's prediction. (74) Machine learning methods are currently used in a wide variety of classification tasks, including identification of "spam" e-mails, optimization of production processes, diagnosis of diseases, risk evaluation, image classification, and game playing. (75) Law enforcement's task of ferreting out crime is also one of classification: distinguishing the guilty from the innocent. (76) Or, more precisely in the Fourth Amendment context, the job of a police officer on the beat is to separate those who are likely criminals from those who are likely innocent. (77) Thus, the machine learning task of classification would seem to complement the police officer's objectives.
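
To make the mechanics concrete, the following sketch walks through the three-way split and the support and confidence measures just described. It uses the open-source scikit-learn library on purely synthetic data; the features, the illustrative rule, and all numbers are hypothetical, and a real system would be far more elaborate.

```python
# A minimal sketch of the train / validation / test workflow described above.
# All data, feature names, and thresholds are hypothetical illustrations.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.random((3000, 5))                                    # 3,000 "objects," each with 5 features
y = (X[:, 0] + 0.2 * rng.random(3000) > 0.7).astype(int)     # synthetic label to learn

# 1. Split the labeled data into training, validation, and test sets.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# 2. Learn rules from the training set, then pick the parameter
#    (here, tree depth) that performs best on the validation set.
best_model, best_score = None, -1.0
for depth in (1, 2, 3, 5, 8):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    score = accuracy_score(y_val, model.predict(X_val))
    if score > best_score:
        best_model, best_score = model, score

# 3. Evaluate on the held-out test set, and compute "support" and "confidence"
#    for one illustrative rule: "if feature 0 exceeds 0.7, predict the target class."
applies = X_test[:, 0] > 0.7
support = applies.mean()              # share of test objects the rule covers
confidence = y_test[applies].mean()   # how often covered objects actually have the label
print(f"test accuracy: {accuracy_score(y_test, best_model.predict(X_test)):.2f}")
print(f"rule support: {support:.2f}, rule confidence: {confidence:.2f}")
```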

Indeed, the outline of how an ASA could be created is straightforward. One would begin with historical data about people containing a variety of features that might be relevant to predicting a certain kind of criminal activity, perhaps including their immutable personal characteristics (e.g., age, gender, race, religion), demographic information (e.g., address, salary, occupation), and specific activities (e.g., presence on a certain street corner at a certain time, patterns of flights, or specifics of tax returns). (78) These data would also be labeled to indicate whether each included person was known to be engaged in the targeted criminal conduct or not. Machine learning methods would then be applied to these data to create a model that an ASA could apply to new data to predict which individuals are likely to be engaged in the targeted criminal activity. The confidence level of the model would determine the confidence level of the ASA's prediction that a given individual is engaged in criminal conduct.
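
A rough sketch of that outline in code might look like the following. The labeled table, the feature names, and the newly observed individual are all hypothetical, and the predicted probability simply stands in for the confidence level discussed above.

```python
# Hypothetical sketch of the ASA-construction outline: train on labeled historical
# records, then report a confidence level for a newly observed individual.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical labeled historical data: one row per person, with behavioral
# features and a label indicating known involvement in the targeted offense.
history = pd.DataFrame({
    "age": [22, 45, 31, 27, 52, 19],
    "late_night_corner_visits": [14, 0, 9, 2, 1, 11],
    "prior_contacts": [3, 0, 2, 0, 1, 4],
    "engaged_in_target_crime": [1, 0, 1, 0, 0, 1],   # the label
})
features = ["age", "late_night_corner_visits", "prior_contacts"]

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(history[features], history["engaged_in_target_crime"])

# Apply the model to a newly observed individual; the predicted probability
# plays the role of the ASA's confidence in its prediction of criminality.
new_person = pd.DataFrame([{"age": 29, "late_night_corner_visits": 12, "prior_contacts": 2}])
p_crime = model.predict_proba(new_person[features])[0, 1]
print(f"predicted probability of targeted criminal conduct: {p_crime:.0%}")
```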

Machine learning algorithms are not perfect, however, and mistaken predictions stem from four general sources. First, when machine learning methods are used to model complex causal systems, they necessarily rely upon approximations. (79) The causes of criminal conduct are sufficiently complex to motivate entire fields of study, but the ASA does not become a criminologist, psychologist, police officer, or sociologist. Instead, machine learning methods use patterns and correlations within the data to make a (perhaps highly educated) guess about what differentiates criminals from non-criminals. (80) Because these patterns and correlations are mere estimates of the more complex underlying phenomenon, they are inevitably inaccurate in some instances. (81) Such inaccuracies can be reduced, however, if the set of training data is large and representative. (82)

Second, inaccuracies in supervised machine learning models may come from "noise" in the training data. (83) In other words, the training data may contain information about the people described therein that is wrong. (84) For instance, a database containing the training data for an ASA targeting auto theft may list as a feature each individual's age. The database may list a particular individual as thirty years old when she was really forty years old, or perhaps list the individual as having been engaged in auto theft when she really was not. Though these kinds of noise differ in terms of their source, (85) they both can cause the machine learning process to create inaccurate models. (86) Inaccuracies resulting from noise can be mitigated by avoiding "overfitting," where machine learning methods try to match a model perfectly to the training data, and by the use of distinct test sets that were not used to train the algorithm. (87)
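
The effect of label noise and the value of a separate test set can be illustrated with synthetic data: an unconstrained model nearly memorizes the noisy training labels and appears more accurate than it really is, while a held-out test set exposes the problem and a simpler, regularized model mitigates it. A hypothetical sketch:

```python
# Sketch: noisy labels plus an over-flexible model produce overfitting that a
# held-out test set exposes. Data are synthetic; numbers are illustrative only.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.random((2000, 4))
y_true = (X[:, 0] > 0.5).astype(int)
noise = rng.random(2000) < 0.15            # 15% of labels recorded incorrectly ("noise")
y_noisy = np.where(noise, 1 - y_true, y_true)

X_train, X_test, y_train, y_test = train_test_split(X, y_noisy, test_size=0.3, random_state=1)

for depth in (None, 3):                    # None = unconstrained tree; 3 = regularized
    model = DecisionTreeClassifier(max_depth=depth, random_state=1).fit(X_train, y_train)
    print(f"max_depth={depth}: train accuracy {model.score(X_train, y_train):.2f}, "
          f"test accuracy {model.score(X_test, y_test):.2f}")
# Typically the unconstrained tree fits the noisy training labels almost perfectly
# but generalizes worse than the shallower, regularized tree does.
```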

Third, inaccuracies can arise if an algorithm's training data is not representative of all instances of the relevant event or object in the world. (88) For instance, if our ASA--meant to identify likely auto theft--is trained on data only from a single city, the ASA may be less accurate when applied nationally if auto thieves have different criminal methods in different locales. Similarly, machine learning methods typically assume that the near future will be substantially similar to the time when the sample data were collected. (89) Thus, if the methods of auto thieves change over time, perhaps in response to police action or new technologies, our auto-theft-detecting ASA must continue to learn from new data in order to remain accurate.
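
The same point about changing conditions can be shown in miniature: a model fit on older data loses accuracy once the simulated offenders' behavior shifts, and recovers only after it is retrained on newer data. The sketch below simulates the drift directly, so the numbers are purely illustrative.

```python
# Sketch: checking whether a model trained on older data still holds up on newer
# data, and retraining when it does not. The data and the drift are simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)

def make_year(n, shift):
    # Simulate a year of observations; `shift` stands in for changed offender behavior.
    X = rng.random((n, 3))
    y = (X[:, 0] + shift * X[:, 1] > 0.8).astype(int)
    return X, y

X_old, y_old = make_year(2000, shift=0.0)    # data from when the ASA was built
X_new, y_new = make_year(2000, shift=0.6)    # later data, after methods changed

model = LogisticRegression().fit(X_old, y_old)
print(f"accuracy on old data: {model.score(X_old, y_old):.2f}")
print(f"accuracy on new data: {model.score(X_new, y_new):.2f}")   # usually lower

# Continuing to learn: refit on the combined, up-to-date dataset.
model = LogisticRegression().fit(np.vstack([X_old, X_new]), np.concatenate([y_old, y_new]))
print(f"after retraining, accuracy on new data: {model.score(X_new, y_new):.2f}")
```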

Fourth, the choices made by humans throughout the machine learning process can cause inaccuracies in the final predictions of a machine learning algorithm. (90) At the outset, decisions must be made about what features of the objects in question should be used to construct the model. (91) In other words, before an ASA can be developed, a person must decide what facts might matter in determining whether certain behavior or characteristics are indicative of criminal conduct and how such facts can be described. For example, if an ASA is meant to detect suspicious bank transactions, should we look at the timing of each transaction? If so, is it the time of day that matters, the temporal distance of the transaction from other similar transactions, or some other time-related characteristic? The selection of the features to be analyzed is "absolutely crucial" to the success of the machine learning process. (92)
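
The bank-transaction example illustrates how many distinct features can be derived from a single raw fact. The sketch below, using hypothetical transaction records, encodes the same timestamp three different ways; which encoding the analyst chooses shapes what the eventual model can learn.

```python
# Sketch: three alternative "time" features an analyst might derive from the same
# raw timestamps in the bank-transaction example. All data are hypothetical.
import pandas as pd

transactions = pd.DataFrame({
    "account": ["A", "A", "A", "B", "B"],
    "timestamp": pd.to_datetime([
        "2016-03-01 02:13", "2016-03-01 02:45", "2016-03-07 14:02",
        "2016-03-02 09:30", "2016-03-02 23:55",
    ]),
    "amount": [9500, 9400, 120, 40, 9900],
})

# Candidate feature 1: time of day (hour of the transaction).
transactions["hour_of_day"] = transactions["timestamp"].dt.hour

# Candidate feature 2: temporal distance from the account's previous transaction.
transactions = transactions.sort_values(["account", "timestamp"])
transactions["minutes_since_prior"] = (
    transactions.groupby("account")["timestamp"].diff().dt.total_seconds() / 60
)

# Candidate feature 3: a coarse "overnight transaction" flag.
transactions["overnight"] = transactions["hour_of_day"].between(0, 5)

print(transactions[["account", "hour_of_day", "minutes_since_prior", "overnight"]])
```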

Next, data analysts must construct the training dataset. (93) This requires decisions about which databases to use, how to normalize data from different databases so that all the objects are described in terms of the same set of features, and whether to reject data that a given analyst believes is wrong or insignificant. (94) These decisions and others allow human assumptions about what correlations should exist in the data to color the outcome. (95) The algorithm must then be trained, a process that requires a decision about how different kinds of potential errors should be weighted. (96) For instance, an ASA programmer would need to decide whether it is worse for an innocent person to be treated as a likely criminal than for the police to ignore a person engaged in the targeted activity, and, if so, how much worse. (97) Implementing this decision will adjust the frequency with which the model predicts criminality. Finally, once an algorithm is generated, a person must answer numerous questions about its application in the field. (98) For example, how certain must a prediction be before it is reported to the police? What information will the ASA convey to the police about the prediction? Such decisions will impact a model's accuracy and operation when it is put into practice.
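
Two of these human choices, the relative weighting of errors and the certainty required before a prediction is reported, translate directly into code. In the hypothetical sketch below, the five-to-one weighting and the eighty-percent reporting threshold are arbitrary policy choices used only for illustration.

```python
# Sketch: implementing two human policy choices, the relative cost of errors and
# the certainty required before an alert is issued. The class weights and the 0.8
# reporting threshold are hypothetical policy choices, not recommendations.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.random((5000, 4))
y = (X[:, 0] + 0.3 * rng.random(5000) > 0.85).astype(int)   # rare "criminal" class

# Treat errors against innocent people (class 0) as five times worse than missed
# offenders; this shifts how readily the model predicts criminality.
model = LogisticRegression(class_weight={0: 5, 1: 1}).fit(X, y)

# Field-use decision: only report predictions whose confidence exceeds 80%.
REPORT_THRESHOLD = 0.8
probs = model.predict_proba(X)[:, 1]
alerts = probs >= REPORT_THRESHOLD
print(f"individuals flagged for police follow-up: {alerts.sum()} of {len(probs)}")
```

Raising the weight on the innocent class or the reporting threshold produces fewer alerts but more missed offenders; lowering either produces the opposite, which is precisely the trade-off the programmer must decide.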

A person also must decide whether and to what extent the machine learning algorithm will be comprehensible to humans. (99) Absent an intentional decision to the contrary, machine learning tends to create models that are so complex that they become "black boxes," where even the original programmers of the algorithm have little idea exactly how or why the generated model creates accurate predictions. (100) On the other hand, when an algorithm is interpretable, an outside observer can understand what factors the algorithm relies on to make its predictions and how much weight it gives to each factor. (101) Interpretability comes at a cost, however, as an interpretable model is necessarily simpler--and thus often less accurate--than a black box model. (102) It is certainly plausible that in the context of ASAs, society may ultimately decide to bear this cost. Yet when it comes to crime detection, the political cost of interpretability, measured in crimes unprevented and criminals uncaught, may well be quite high, thus making a black box ASA a far more attractive option.
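
The trade-off can be seen by training an interpretable model and a black box model on the same data: the small decision tree's rules can be printed and read, while the ensemble's internal logic cannot, though it is often somewhat more accurate. The sketch below uses synthetic data, so the particular accuracy gap is illustrative only.

```python
# Sketch: an interpretable model versus a "black box" on the same synthetic data.
# The accuracy gap shown here is illustrative; real gaps vary by problem.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(4)
X = rng.random((4000, 6))
y = ((X[:, 0] * X[:, 1] + 0.5 * X[:, 2] ** 2) > 0.45).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=4)

# Interpretable: a depth-2 tree whose rules a court or auditor can read directly.
tree = DecisionTreeClassifier(max_depth=2, random_state=4).fit(X_train, y_train)
print(export_text(tree, feature_names=[f"f{i}" for i in range(6)]))
print(f"interpretable tree accuracy: {tree.score(X_test, y_test):.2f}")

# Black box: hundreds of trees combined; accurate, but with no human-readable rationale.
ensemble = GradientBoostingClassifier(random_state=4).fit(X_train, y_train)
print(f"black-box ensemble accuracy: {ensemble.score(X_test, y_test):.2f}")
```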

II. INDIVIDUALIZED SUSPICION, OLD ALGORITHMS, AND ASAs

This Part lays out the existing doctrine that governs the finding of individualized suspicion to justify either a search or seizure under the Fourth Amendment. First, it articulates the two sequential steps that a police officer, magistrate, or court must undertake when determining whether probable cause or reasonable suspicion exists in a given case. Second, it establishes that ASAs play a different role in the individualized suspicion analysis than traditional algorithmic data.

A. The Two-Step Individualized Suspicion Analysis

In most circumstances, (103) the police must have individualized suspicion that a person is engaged in criminal conduct before they can search or seize that person. (104) The two prototypical levels of individualized suspicion are reasonable suspicion, which is required to conduct a limited search or brief seizure as articulated in Terry v. Ohio, (105) and probable cause, which is required for a "full-blown" arrest or more intrusive search. (106) To determine whether individualized suspicion exists, courts and police must look at "the totality of the circumstances--the whole picture." (107) The Court adopted the totality-of-the-circumstances approach in Illinois v. Gates to overturn a line of precedent that had been interpreted to limit when anonymous tips could be used to establish probable cause. (108) The Court instructed that rather than applying "[r]igid legal rules" in the individualized suspicion analysis, police and magistrates must engage in a "balanced assessment of the relative weights" of all the relevant evidence. (109) In a similar vein, the totality-of-the-circumstances approach requires police and magistrates to consider exculpatory evidence, along with any incriminating facts, in determining whether individualized suspicion exists. (110)

The totality-of-the-circumstances analysis involves two distinct, sequential steps:

 The principal components of a determination of reasonable suspicion or probable cause will be [(1)] the events which occurred leading up to the stop or search, and then [(2)] the decision whether these historical facts, viewed from the standpoint of an objectively reasonable police officer, amount to reasonable suspicion or to probable cause. (111)

The "events which occurred leading up to the stop orsearch" answer basic who, what, where, and when questions about thecrime and the suspect: Who is she? What did she do? Where is she? Whendid she engage in the relevant conduct? (112) The sources of thisinformation are as diverse as human experience would suggest: directobservation by law enforcement personnel, tips from informants, anddocumentary evidence are but a few. The question for an officer,magistrate, or court at this stage is relatively straightforward: Waslaw enforcement's information sufficiently reliable for areasonable officer to rely upon on in determining the historical facts?(113) The methods used to evaluate the reliability of a given piece ofevidence differ depending on the nature of the evidence, and theevaluation can be quite difficult. Nonetheless, courts have extensiveexperience with such questions. (114)

The second step is more complicated because it presents a mixed question of law and fact. (115) An officer, magistrate, or court must decide, given the historical facts upon which a reasonable officer would rely, "whether the facts satisfy the relevant ... constitutional standard." (116) Determinations about what behavior is adequately indicative of criminal conduct must be "practical" and "commonsense" (117) and based upon "inferences about human behavior." (118) In addition to historical facts, these inferences may be informed by "background facts" about the community at issue that are unlikely to be the subject of proof. (119) Courts are also instructed to defer to police experience and training when deciding whether individualized suspicion exists. For instance, the Court in United States v. Brignoni-Ponce recognized that "the officer is entitled to assess the facts in light of his experience" in detecting the criminal conduct at issue. (120) In United States v. Arvizu, the Court reiterated that the individualized suspicion analysis "allows officers to draw on their own experience and specialized training to make inferences from and deductions about the cumulative information available to them that 'might well elude an untrained person.'" (121) Taken together, these rulings teach that the level of suspicion arising from a given set of facts "may vary depending on what a police officer knew based on her training, experience, and familiarity with the neighborhood." (122)

The Court's guidance on the inference of suspicion from historical facts leaves numerous ambiguities unresolved. First, the Court has intentionally declined to state with numerical precision how likely criminal conduct must be to satisfy the reasonable suspicion and probable cause standards. The individualized suspicion analysis "does not deal with hard certainties, but with probabilities," (123) yet the Court has rejected any attempts to quantify the relevant probabilities. (124) Second, courts and police have little guidance on how to weigh various kinds of data in deciding whether individualized suspicion exists. Because the hard questions of suspicion involve predicting criminal conduct from noncriminal behavior, "the relevant inquiry is not whether particular conduct is 'innocent' or 'guilty,' but the degree of suspicion that attaches to particular types of noncriminal acts." (125) Yet courts rarely possess empirical data that might prove or disprove a correlation between certain conduct and criminal activity. (126) And even when they do, courts are typically untrained in how to assess that data. (127) Finally, the Supreme Court has not explained how courts should decide whether to defer to police experience in a given case and how much deference to give. (128)

B. Algorithms in the Individualized Suspicion Analysis: The Old and the New

Law enforcement officials have used the output of automated algorithms for decades. (129) Breathalyzers run on algorithms that state the amount of alcohol in an individual's blood based on the amount of alcohol in a sample of that individual's breath. (130) Radar guns send radio waves at a certain frequency in the direction of a moving automobile, measure the frequency of reflected waves that return, and calculate the speed of the automobile based on the change in frequency. (131) A DNA sample from a crime scene can be matched against stored DNA profiles using search algorithms. (132) Emerging algorithmic biometric technologies aim to enhance the ability of police to identify suspects and track their movements. (133)

These traditional technologies can be exceptionally helpful to police in establishing the "historical facts" of what happened, when it happened, and who was involved. (134) In DNA matching, algorithmic searches of databases reveal either who was at the scene of a given crime or whether a given person was at the scene of other unsolved crimes. (135) Radar guns show how fast a vehicle is moving at a given moment. (136) Newer biometric technologies can provide substantially more information about a suspect's location and movements. (137) All of these technologies help police establish facts that can be ascertained to a definable level of certainty: for example, the quantity of alcohol in a driver's bloodstream can be certain within some calibration level, (138) or the identity of DNA found at a crime scene can be determined to some statistical level of confidence. (139) And because these technologies answer questions of fact, a court can focus its analysis on the familiar question of the accuracy of the technology used to determine the fact at issue. (140)

Unlike the output of traditional technologies, the output of an ASA is directed at the mixed question of law and fact of whether the historical facts are sufficient to establish reasonable suspicion or probable cause. (141) ASAs look at data from other sources and predict the probability that an observed person with a certain set of "features" (142) is engaged in criminal conduct. (143) In providing a prediction of criminality, the ASA's examination of data overlaps with the second step in the individualized suspicion analysis. (144) As such, ASAs provide a kind of data to the Fourth Amendment analysis that serves an analytically different role than the output of traditional algorithms.

To illustrate this distinction, consider the case of People v. Nelson from the California Supreme Court. (145) In Nelson, a nineteen-year-old college student disappeared after telephoning her mother to report that her car would not start. (146) The victim's body was found two days later. (147) After more than twenty-five years, police were able to obtain a sample of a suspect's DNA and match it to DNA collected near where the victim's body was found. (148) Almost conclusively, the DNA match established the historical fact that the defendant had been at the location where the body was found close enough in time that the DNA that he left behind had not degraded or otherwise disappeared. (149)

Yet this historical fact, standing alone, does not tell police how likely it was that the defendant was guilty. Rather, to connect Nelson to the murder, more facts are needed. In Nelson, that "more" included: that the victim had been raped before she was killed, that the DNA sample was collected from semen on her body and clothing, and that the victim was seen in a car matching one owned by the defendant shortly before her death. (150) Traditionally, a human being must consider the DNA match together with those additional facts to decide that a sufficient probability of guilt existed to justify arresting Nelson for the murder. The novelty of an ASA is its potential to step into the shoes of that human being by analyzing groups of disparate facts together and drawing conclusions about the probability of an individual's guilt.

III. THE INSUFFICIENCY OF AN ASA'S PREDICTION

Say that an ASA predicts a 60% likelihood that a specific person is selling drugs on a street corner, and a police officer, upon receiving the prediction, stops the suspect, frisks him, and finds drugs. If the defendant challenges the stop and frisk, can the prosecution rely solely on the ASA's prediction, or does the Fourth Amendment require something more? The "collective knowledge" doctrine, which allows one police officer to engage in a search or seizure based on the instruction of another officer who knows facts that establish individualized suspicion, (151) provides a framework for answering this question. If the ASA's prediction is the equivalent of an officer's instruction, then under the collective knowledge doctrine an officer would be justified in acting on that prediction, standing alone. The first Section of this Part lays out the scope and operation of the collective knowledge doctrine, including how some courts have extended the doctrine to apply to constructive knowledge, and scholars' criticisms of the expanded doctrine. The second Section explores the application of the doctrine to an ASA's output. This Part makes two arguments: first, that the expanded "constructive knowledge" doctrine, as applied to ASAs, would eviscerate the individualized suspicion requirement; and second, that an ASA's prediction is not sufficient to create individualized suspicion.

A. The Collective and Constructive Knowledge Doctrines

In Whiteley v. Warden, the Court held that when one officer asks another officer to help her with the execution of a warrant, the second officer is entitled to presume that the first officer provided a magistrate with sufficient information to justify a finding of probable cause. (152) The Court expanded this rule in United States v. Hensley beyond situations involving a warrant to allow an officer to rely on a flyer or bulletin if: (1) the officer acted in "objective reliance" on the flyer or bulletin; (153) and (2) the flyer or bulletin was based on articulable facts sufficient to establish the necessary individualized suspicion. (154) Lower courts have since applied the collective knowledge rule to justify searches and seizures in a wide variety of situations in which an officer is instructed to undertake the search but is not provided information sufficient to independently find the proper level of individualized suspicion. (155)

The rationale behind the collective knowledge rule is largely pragmatic: "[E]ffective law enforcement cannot be conducted unless police officers can act on directions and information transmitted by one officer to another and ... officers ... cannot be expected to cross-examine their fellow officers about the foundation for transmitted information." (156) Requiring that the officer who engages in a search or seizure must herself have the necessary individualized suspicion would be a "crippling restriction[] on our law enforcement." (157) Instead, it is sufficient that at some point, an individual trained in making individualized suspicion determinations, whether a magistrate or a law enforcement officer, had sufficient knowledge to conclude that the individual be seized or searched. (158) Mandating that a person trained in individualized suspicion determinations find probable cause or reasonable suspicion seems to ensure that reliance on the instruction to stop is objectively reasonable. (159) In addition, an individual searched or seized pursuant to the collective knowledge doctrine has "minimal" interests at stake. (160) Because the suspect could have been seized by one officer, she loses little in the way of security or privacy when she is stopped by another officer at the instruction of the first. (161)

While the constructive knowledge doctrine applies the general idea underlying the collective knowledge doctrine of police reliance on other officers' knowledge, it does so without the same strict requirements. (162) The broadest view of the constructive knowledge doctrine, and the one most relevant here, is one where no one officer possesses facts sufficient to establish the needed individualized suspicion, but the aggregation of several officers' knowledge would meet the standard. (163)

Specifically, this version of the doctrine omits both the requirement that a single individual trained in individualized suspicion assessments evaluate the facts and the need for the knowledgeable officers to have communicated with each other. (164) Nevertheless, courts generally limit the scope of the constructive knowledge doctrine to officers who are working closely together. (165)

Academics and dissenting judges have criticized the constructive knowledge doctrine for not meaningfully enhancing law enforcement expediency, reasoning that police communication is inexpensive and increases accuracy. (166) Moreover, the constructive knowledge doctrine removes the concept of "belief" and the perspective of a "reasonable officer" from the definitions of probable cause and reasonable suspicion. (167) After all, a court cannot inquire into whether "facts and circumstances within the officer's knowledge ... are sufficient to warrant a prudent person, or one of reasonable caution, in believing ... that the suspect" is engaged in criminal conduct if no single officer knew the information and could believe something about it. (168) Finally, as massive quantities of information become readily available to law enforcement agencies through fusion centers and communication technologies, (169) a broad reading of the constructive knowledge doctrine would render the individualized suspicion requirement meaningless in most situations. (170) This threat has led one scholar to suggest that the constructive knowledge doctrine would turn the police into "something like Star Trek's Borg Collective," in that officers would be able to rely upon what is known by any other officer anywhere, at least for the purposes of providing a post hoc justification for a search or seizure. (171)

B. Applying the Doctrines to ASAs

The power of ASAs to analyze large quantities of data in making their predictions underscores the threat that the constructive knowledge doctrine poses to the Fourth Amendment's individualized suspicion requirement. (172) Where police have access to the massive troves of information contained in fusion centers, the doctrine already opens the door to "arrest first, justify later" policing. (173) But permitting police to claim constructive awareness of an ASA's predictions of criminality without any requirement that the predictions be communicated to the officer conducting a search or seizure would further encourage police to ignore individualized suspicion requirements. (174) Particularly as criminal laws have proliferated to the point that "everyone is a criminal if prosecutors look hard enough," (175) applying the constructive knowledge doctrine to ASAs could permit the police to stop anyone and later find a prediction of crime to justify the intrusion.

Applying the collective knowledge doctrine to ASAs, however, presents a less immediately discomfiting dystopia. Upon receipt of an ASA's prediction, police could search or seize a person identified by an ASA without engaging in any independent assessment of the facts to determine whether individualized suspicion exists. Reliance by the police officer would be permitted if the ASA's prediction were analogous to an instruction to arrest by an individual trained in making individualized suspicion determinations. (176) In some sense, an ASA is very well trained in making individualized suspicion determinations, as it can provide a quantifiable prediction of criminality based on the available data (e.g., there is a 60% chance that the person on a certain street corner is dealing drugs). (177) So long as we have reason to believe that the ASA is accurate, (178) the ASA's prediction is arguably analogous to an assertion of the existence of probable cause or reasonable suspicion by a person trained in making such assessments. The Court, after all, has repeatedly explained that individualized suspicion deals with probabilities, (179) and an ASA can quantify those probabilities like no technologies before it.

This analogy between an ASA and a trained person fails for two related reasons, however. First, it depends on a fundamental misunderstanding of the question that the individualized suspicion standard asks. Second, the analogy fails to appreciate differences in how humans and machines examine factual situations. To see these flaws, we must start by recalling that the probable cause and reasonable suspicion determinations require a consideration of the totality of the circumstances. (180) As its name suggests, the totality-of-the-circumstances approach demands a consideration of all evidence relevant to the question of how likely it is that the targeted individual is engaged in criminal activity, including exculpatory evidence. (181)

For an ASA's prediction to be sufficient to justify a search or seizure, it too must engage in a totality-of-the-circumstances analysis. But, at least under current technological constraints, ASAs are fundamentally incapable of doing so. As with any machine learning process, an ASA is only as good as the data its programmers choose to provide it, either in training or in real-world application. (182) This is because the data provided to an ASA constitutes the sum total of what the algorithm "knows" about the world; the ASA cannot identify new types of relevant data that are not currently contained in its dataset and then seek out those data. (183) Thus, an ASA trained on a small dataset "knows" very little, while an ASA trained on an enormously robust dataset "knows" quite a lot. (184) But even the latter ASA is limited in making its predictions to analysis of the data within its dataset, and it cannot consider other facts that might be relevant but that were not included. In contrast, human beings are always at least potentially capable of including a new piece of relevant information in an analysis. (185)

This distinction matters enormously for the capacity of an ASA to engage in a totality-of-the-circumstances analysis. The kinds of information that might be relevant to an individualized suspicion determination are infinite. (186) While an ASA may be trained with a database that contains all the facts that are most relevant in a large majority of cases, that database cannot contain all the facts that are relevant in every case. (187) As a result, an ASA cannot consider the "whole picture" regarding a person's potential criminality as required by the Fourth Amendment. (188)

To illustrate this point, imagine an ASA targeting the selling of narcotics on street corners. The ASA has access to information from a variety of inputs, such as closed-circuit cameras, license-plate readers, and facial recognition technology. Based on both historic and real-time data from these sources, it predicts when specific individuals are engaging in hand-to-hand drug transactions. One day it issues an alert predicting that an individual is more likely than not selling narcotics on a street corner. A patrol officer in uniform is dispatched to investigate and witnesses the suspect and passers-by briefly exchanging items by hand. As she approaches the suspect, the officer makes two observations. First, she notes that the suspect sees her and does not change his behavior. Second, she sees a passer-by drop an item recently received from the suspect on the ground; she picks the item up and notes that it is a flyer for a church event.

Both observed facts tend to diminish the likelihood that the suspect is engaged in criminal activity, but neither is captured in the ASA's dataset. A totality-of-the-circumstances analysis of individualized suspicion must account for these facts, however, and our ASA has failed to do so. (189) But now that we know the identified facts matter, the ASA can be programmed to incorporate them in future predictions. Yet this does not resolve the underlying problem that the ASA must consider every fact that might impact the existence of individualized suspicion. To do so the ASA must either be able to process all known information or have been programmed in advance to "know" all potentially relevant information. Neither is feasible: the former requires more processing power than is currently available, and the latter requires impossible foresight. Thus, a person trained in making individualized suspicion determinations must be the final assessor of the totality of the circumstances, including both the ASA's prediction and any other relevant available data, in order to decide whether the probable cause or reasonable suspicion standards are met. (190)

Requiring a human to assess the totality of the circumstances, however, may reduce the overall accuracy of police searches and seizures. While some of the additional evidence that a human being will consider may clearly confirm or rebut the ASA's prediction, the human officer may not be able to accurately assess the impact of other evidence on the analysis.

For instance, imagine that an ASA predicts that a specific individual, who has been going from car to car in a parking lot and then spends five minutes trying to get into one vehicle before walking away, has a 52% chance of being engaged in auto theft. If the ASA is accurate, the odds establish probable cause to arrest the suspect. (191) An officer is told of the ASA's prediction and approaches the individual near the parking lot. The officer asks him for an explanation, and the individual provides a story that innocently explains his actions. If the officer finds the story credible, that explanation would destroy the officer's probable cause. The officer could not validly arrest the suspect considering the totality of all the circumstances known to him.

Yet there are serious reasons to doubt the officer's ability to evaluate accurately the totality of the circumstances in many cases. First, studies have shown that police, like laypeople, are not good lie detectors. (192) Other cognitive roadblocks also may hinder an officer's capacity to make accurate individualized suspicion determinations. (193) For example, an officer's initial perception of a suspect's criminality may be overly influenced by the suspect's facial expression when approached by the officer or the suspect's nervous reaction if the officer appears unfriendly. (194) Once the officer forms a negative impression of the suspect, human nature makes the officer resistant to changing it. (195)

In addition to troubles with credibility determinations, facts that society may not want to be part of the analysis--like a suspect's race, religion, or national origin--influence the officer's assessment of criminality. Racial minorities, and particularly African-American males, have long been stereotyped as "violent, criminal, and dangerous." (196) These stereotypes can unconsciously impact how police assess criminality. Whites react negatively to faces displaying features typically associated with African-Americans. (197) African-Americans draw attention more quickly than Whites. (198) Observers viewing ambiguous behavior interpret the behavior differently depending on the race of the observed person. (199) Negative implicit biases are also prevalent with respect to non-White races other than African-Americans, as well as traits other than race, including religion and national origin. (200) L. Song Richardson has argued convincingly that these unconscious biases infect police assessments of individualized suspicion. (201)

Finally, incorporating the output of an ASA into the totality-of-the-circumstances analysis in an accurate and meaningful way is likely to be quite challenging. (202) For all these reasons, requiring police to be open to additional data and to include such data in their totality-of-the-circumstances analysis for each suspect will likely lead to more police errors: namely, searches and seizures of the innocent and instances of the guilty going free.

This result is certainly not ideal, but the Fourth Amendment and its individualized suspicion standards are not in place to maximize police accuracy; rather, they aim to ensure individualized justice. (203) In other words, the Fourth Amendment would not be satisfied if a police agency conducted ten searches, five on suspects who were almost certainly engaged in criminal activity and five on suspects who almost certainly were not, on the ground that on average probable cause existed. (204) Rather, probable cause must exist for each suspect. (205) Put another way, in most circumstances the Fourth Amendment entitles each suspect to an assessment of whether individualized suspicion exists based on all available facts relating to her potential guilt. (206) Recent approaches to probable cause and reasonable suspicion may have undermined the individualized nature of these standards, (207) but the requirement of a totality-of-the-circumstances analysis remains, even if that requirement means that police will make more mistakes.
