This article was originally published in The Journal of American History Volume 93, Number 1 (June, 2006): 117-46 and is reprinted here with permission.
History is a deeply individualistic craft. The singly authored work is the standard for the profession; only about 6 percent of the more than 32,000 scholarly works indexed since 2000 in this journal’s comprehensive bibliographic guide, “Recent Scholarship,” have more than one author. Works with several authors—common in the sciences—are even harder to find. Fewer than 500 (less than 2 percent) have three or more authors.1
Historical scholarship is also characterized by possessive individualism. Good professional practice (and avoiding charges of plagiarism) requires us to attribute ideas and words to specific historians—we are taught to speak of “Richard Hofstadter’s status anxiety interpretation of Progressivism.”2 And if we use more than a limited number of words from Hofstadter, we need to send a check to his estate. To mingle Hofstadter’s prose with your own and publish it would violate both copyright and professional norms.
A historical work without owners and with multiple, anonymous authors is thus almost unimaginable in our professional culture. Yet, quite remarkably, that describes the online encyclopedia known as Wikipedia, which contains 3 million articles (1 million of them in English). History is probably the category encompassing the largest number of articles. Wikipedia is entirely free. And that freedom includes not just the ability of anyone to read it (a freedom denied by the scholarly journals in, say, JSTOR, which requires an expensive institutional subscription) but also—more remarkably—their freedom to use it. You can take Wikipedia’s entry on Franklin D. Roosevelt and put it on your own Web site, you can hand out copies to your students, and you can publish it in a book—all with only one restriction: You may not impose any more restrictions on subsequent readers and users than have been imposed on you. And it has no authors in any conventional sense. Tens of thousands of people—who have not gotten even the glory of affixing their names to it—have written it collaboratively. The Roosevelt entry, for example, emerged over four years as five hundred authors made about one thousand edits. This extraordinary freedom and cooperation make Wikipedia the most important application of the principles of the free and open-source software movement to the world of cultural, rather than software, production.3
Wikipedia today: The current home page for Wikipedia reflects the scale of the project (more than 1 million English-language articles) and its multiple languages. http://wikipedia.org/ (March 8, 2006).
Despite, or perhaps because of, this open-source mode of production and distribution, Wikipedia has become astonishingly widely read and cited. More than a million people a day visit the Wikipedia site. The Alexa traffic rankings put it at number 18, well above the New York Times (50), the Library of Congress (1,175), and the venerable Encyclopedia Britannica (2,952). In a few short years, it has become perhaps the largest work of online historical writing, the most widely read work of digital history, and the most important free historical resource on the World Wide Web. It has received gushing praise (“one of the most fascinating developments of the Digital Age”; an “incredible example of open-source intellectual collaboration”) as well as sharp criticism (a “faith-based encyclopedia” and “a joke at best”). And it is almost entirely a volunteer effort; as of September 2005, it had two full-time employees. It is surely a phenomenon to which professional historians should attend.4
To that end, this article seeks to answer some basic questions about history on Wikipedia. How did it develop? How does it work? How good is the historical writing? What are the potential implications for our practice as scholars, teachers, and purveyors of the past to the general public?
Writing about Wikipedia is maddeningly difficult. Because Wikipedia is subject to constant change, much that I write about Wikipedia could be untrue by the time you read this. An additional difficulty stems from its vast scale. I cannot claim to have read the 500 million words in the entire Wikipedia, nor even the subset of articles (as many as half) that could be considered historical.5 This is only a very partial and preliminary report from an ever-changing front, but one that I argue has profound implications for our practice as historians.
Wikipedia itself rather grandly traces its roots back to “the ancient Library of Alexandria and Pergamon” and the “concept of gathering all of the world’s knowledge in a single place” as well as to “Denis Diderot and the 18th century encyclopedists.” But the more immediate origins are in a project called Nupedia launched in March 2000 by Jimmy Wales and Larry Sanger. They were not the first to think of a free Web-based encyclopedia; in the earliest days of the Web, some had talked about creating a free “Interpedia”; in 1999 Richard Stallman, a key figure in the emergence of free and open-source software, proposed gnupedia as a “Free Universal Encyclopedia and Learning Resource.” The thirty-three-year-old Wales (also known as Jimbo), who got rich as an options trader and then became an Internet entrepreneur, decided to create a free, online encyclopedia. He recruited Sanger, age thirty-one, who was finishing a Ph.D. in philosophy at the Ohio State University—whom Wales knew from their joint participation in online mailing lists and Usenet discussion groups devoted to Ayn Rand and objectivism—to become the paid editor in chief. Wales’s company Bomis, an Internet search portal and a vendor of online “erotic images” (featuring the Bomis Babe Report), picked up the tab initially.6
Early Wikipedia: Part of Wikipedia’s home page as it looked on October 25, 2001, when the online encyclopedia’s creators boasted that it had more than 14,000 articles and set 100,000 articles as their goal. It now has more than ten times that number. http://web.archive.org/web/20011025211405/http://wikipedia.com/ (March 2, 2006).
Sanger designed Nupedia to ensure that experts wrote and carefully vetted content. In part because of that extensive review, it managed to publish only about twenty articles in its first eighteen months. In early January 2001, as Sanger was trying to think of ways to make it easier for people without formal credentials to contribute to Nupedia, a computer programmer friend told him about the WikiWikiWeb software, developed by the programmer Ward Cunningham in the mid-1990s, that makes it easy to create or edit a Web page—no coding HTML (hypertext markup language) or uploading to a server needed. (Cunningham took the name from the Hawaiian word wikiwiki, meaning “quick” or “informal.”) Sanger thought that wiki users would quickly and informally create content for Nupedia that his experts would edit and approve. But the Nupedia editors viewed the experiment with suspicion; by mid-January Sanger and Wales had given it a separate name, Wikipedia, and its own domain.7
Very swiftly, Wikipedia became the tail that swallowed the dog (Nupedia). In less than a month, it had 1,000 articles; by the end of its first year, it had 20,000; by the end of its second year, it had 100,000 articles in just the English edition. (By then it had begun to spawn foreign-language editions, of which there are now 185, from Abkhazian to Klingon to Zulu, with the German edition the largest after English.) Sanger himself did not stay around to enjoy Wikipedia’s runaway growth. By late 2001 the tech boom was over, and Bomis, like most other dot-coms, was losing money and laying off employees. An effort to sell ads to pay Sanger’s salary foundered as Internet advertising tanked, and Sanger lost his job in February 2002. He continued intermittently as a volunteer but finally broke with the project in January 2003 over the project’s tolerance of problem participants and its hostility to experts.8
Since then, Wikipedia’s growth has accelerated. It had almost a half million articles by its third anniversary in January 2004; it broke the million mark just nine months later. More than fifty-five thousand people have made at least ten contributions to Wikipedia.9 Over this short history, it has also evolved a style of operation and a set of operating principles that require explanation before any discussion of history on Wikipedia.
The Wikipedia Way: How It Works
The Wikipedia “Policies and Guidelines” page links to dozens of other pages, including six pages of “General Guidelines” (for example, “Contribute what you know or are willing to learn about”); twelve of “Behavior Guidelines” (“Don’t bite the newcomers”); nineteen of “Content Guidelines” (“Check your facts”); nine of “Style Guidelines” (“Avoid one-sentence paragraphs”); and five of “Conventions” (“How to title articles”). But realizing that “they” (I employ the pronoun to refer to the collectivity of Wikipedia authors, editors, administrators, and programmers) would have no participants if authors were required to master this massive set of instructions before writing, they helpfully add, “You don’t need to read every Wikipedia policy before you contribute!” and they offer a short primer of four “key policies.”10
“Wikipedia,” they declare first, “is an encyclopedia. Its goals go no further.” Personal essays, dictionary entries, critical reviews, “propaganda or advocacy,” and “original research” are excluded. Historians may find the last exclusion surprising since we value original research above everything else, but it makes sense for a collaboratively created encyclopedia. How can the collectivity assess the validity of statements if there is no verification beyond the claim “I discovered this in my research”?11 As a result, Wikipedia (like encyclopedias in general) summarizes and reports the conventional and accepted wisdom on a topic but does not break new ground. And someone whose expertise rests on having done extensive original research on a topic gets no particular respect. That denigration of expertise contributed to Larry Sanger’s split from the project.
Collaborative history and controversy: The warning that “the neutrality of this entry is disputed” reflects the intensity—and lack of closure—of the debate (more than 300,000 words have been logged) on Wikipedia over how to present the Armenian genocide. http://en.wikipedia.org/wiki/Armenian_Genocide/ (March 8, 2006).
The second key Wikipedian injunction is to “avoid bias.” “Articles should be written from a neutral point of view [NPOV],” they insist, “representing differing views on a subject factually and objectively.” Historians who learned (or teach) the mantra that “there is no objective history” in their undergraduate history methods class will regard that advice with suspicion. But Wikipedians quickly point out that the NPOV policy (as it is incessantly referred to in Wikipedia discussions) “doesn’t assume that writing an article from a single, unbiased, objective point of view is possible.” Instead, Wikipedians say they want to describe disputes rather than to take sides in them, to characterize differing positions fairly.12
Of course, writing “without bias”—even in the circumscribed way that Wikipedia defines it—is, as Wikipedians concede, “difficult” since “all articles are edited by people” and “people are inherently biased.” But even if “neutrality” is a myth, it is a “founding myth” for Wikipedia much as “objectivity,” according to Peter Novick, is a “founding myth” for the historical profession. Wikipedia articles rarely ascend to the desired level of neutrality, but the NPOV policy provides a shared basis of discourse among Wikipedians. On the “Discussion” pages that accompany every Wikipedia article, the number one topic of debate is whether the article adheres to the NPOV. Sometimes, those debates can go on at mind-numbing length, such as the literally hundreds of pages devoted to an entry on the Armenian genocide that still carries a warning that “the neutrality of this article is disputed.”13 Wikipedia entries on such controversial topics rarely succeed in meeting founder Jimmy Wales’s goal of presenting “ideas and facts in such a fashion that both supporters and opponents can agree.” But they surprisingly often achieve “a type of writing that is agreeable to essentially rational people who may differ on particular points.” Unfortunately, that “type of writing” sometimes leads to mushy prose, exemplified by this description of the historian Daniel Pipes: “He is a controversial figure, both praised and condemned by other commentators.”14
The third “key policy” is simpler: “don’t infringe copyrights.” Just as students can easily copy Wikipedia entries and submit them as term papers, Wikipedia authors can easily post prose copied from the vast plagiarism machine of the Web. But search engines make it relatively easy to catch both forms of plagiarism, and it does not seem to be much of a problem in Wikipedia. The more profound departure comes in the next sentence: Wikipedia “is a free encyclopedia licensed under the terms of the GNU Free Documentation License” (GFDL), a counterpart to the GNU General Public License (GPL) (used in free software projects such as Linux) designed for such open content as manuals and textbooks.15
The GFDL (and GPL) deviate most surprisingly from conventional intellectual property rules by giving you the freedom to use the text however you wish. As the license states: “You may copy and distribute the Document in any medium, either commercially or noncommercially, provided . . . you add no other conditions whatsoever to those of this License.”16 The “provided” clause means that any derivative document must inherit the same freedoms offered by the original—what gnuniks call “copyleft.” You can publish a compilation of presidential biographies based on the profiles in Wikipedia; you can even rewrite half of them. But your new version must give credit to Wikipedia and allow others to reuse and refashion your revised version. In fact, multiple versions of Wikipedia content have sprouted all over the Web.
One further implication of Wikipedia’s implementation of free and open-source software principles is that its content is available to be downloaded, manipulated, and “data mined”—something not possible even with many resources (newspapers, for example) that can be read free online. Wikipedia can therefore be used for other purposes, including such question-answering services as the Center for History and New Media’s automated historical fact finder, H-Bot. Or it might provide the basis for tools that would enable you to search intelligently through quantities of undifferentiated digital text and distinguish, say, between references to John D. Rockefeller and those to his son John D. Rockefeller Jr. As Daniel J. Cohen has argued, resources such as Wikipedia “that are free to use in any way, even if they are imperfect, are more valuable than those that are gated or use-restricted, even if those resources are qualitatively better.” Your freedom both to rewrite Wikipedia entries and to manipulate them for other purposes is thus arguably more profound than your ability to read them “for free.” It is why free-software advocates say that to understand the concept of free software, you should think of “free speech” more than “free beer.”17
The fourth pillar of Wikipedia wisdom is “respect other contributors.”18 Like writing without bias, it is easier said than done. What kind of respect, for example, do you owe a contributor who defaces other contributions or attacks other contributors? How do you ensure that entries are not continually filled with slurs and vandalism when the wiki allows any person anyplace to write whatever he or she pleases in any Wikipedia entry?
Wikipedia got by initially with a minimum of rules, in part to encourage participation.
We began [recalled Sanger] with no (or few) policies in particular and said that the community would determine—through a sort of vague consensus, based on its experience working together—what the policies would be. The very first entry on a “rules to consider” page was the “Ignore All Rules” rule (to wit: “If rules make you nervous and depressed, and not desirous of participating in the wiki, then ignore them entirely and go about your business”).
Over time, however, rules proliferated. But Wikipedia acquired laws before it had police or courts. Sanger and Wales “agreed early on that, at least in the beginning, [they] should not eject anyone from the project except perhaps in the most extreme cases. . . . despite the presence of difficult characters from nearly the beginning of the project.” Sanger himself became increasingly distressed by the tolerance of “difficult people,” or “trolls,” on Wikipedia, believing they drove away “many better, more valuable contributors.” Ultimately, the trolls wore Sanger down and pushed him out of the project.19
Although Sanger lost this battle, he may have won the war. Wikipedia gradually developed elaborate mechanisms for dealing with difficult people. It evolved intricate rules by which participants could be temporarily or even permanently banned from Wikipedia for inappropriate behavior. It also set up an elaborate structure of “administrators,” “bureaucrats,” “stewards,” “developers,” and elected trustees to oversee the project.20 But the ideal remained to reach consensus—somewhat in the style of 1960s participatory democracy—rather than to impose formal discipline.
Standing over this noisy democratic polis, however, is the founder, Jimmy Wales—the “God-King,” as some call him. The “banning policy” explains how users can be banned from Wikipedia by the “arbitration committee” or by Wikipedians acting “according to appropriate community-designed policies with consensus support.” But it also adds tersely: “Jimbo Wales retains the power to ban users, and has used it.” Wales’s power rests not just on his prestige as founder but also on his place in the encyclopedia’s legal structure. The Wikimedia Foundation, which controls Wikipedia, has a five-member board: two elected members plus Wales and two of his business partners.21
All of this works surprisingly well. To be sure, Wikipedia can be a bewildering and annoying place for newcomers. One familiar complaint is that “‘fanatic,’ even ‘kooky’ contributors with idiosyncratic, out-of-mainstream, non-scientific belief systems can easily push their point of view, because nobody has the time and energy to fight them, and because they may be highly-placed in the Wikipedian bureaucracy.” Yet somehow thousands of dispersed volunteers who do not know each other have organized a massive enterprise. Consensus and democracy fail at times. The Wikipedian collectivity must temporarily “lock” controversial entries because of vandalism and “edit wars” in which articles are changed and immediately changed back, such as an effort by NYCExpat to remove any references to Father Charles Coughlin’s anti-Semitism. But other entries—even ones in which dedicated partisans such as the followers of Lyndon LaRouche battle for their point of view—remain open for anyone to edit and still present a reasonably accurate account.22
Wikipedia as History
Wikipedia has created a working community, but has it created a good historical resource? Are Wikipedians good historians? As in the old tale of the blind men and the elephant, your assessment of Wikipedia as history depends a great deal on what part you touch. It also depends, as we shall see, on how you define “history.”
American historians might look first at the Wikipedia page headed “List of United States History Articles,” which includes twelve articles surveying American history in conventional time periods and another thirty or so articles on such key topics as immigration, diplomatic history, and women’s history. Unfortunately, the blind man reporting from those nether regions would return shaking his head in annoyance. He might start by complaining that the essay on the United States from 1918 to 1945 inaccurately describes the National Industrial Recovery Act of 1933 as in part a response to the “dissident challenges” of Huey Long and Father Charles Coughlin—a curious characterization of a law enacted when Coughlin was still an enthusiastic backer of Roosevelt and Long was an official (if increasingly critical) ally. But he would be much more distressed by the essay’s incomplete, almost capricious, coverage than by the minor errors. Dozens of standard topics—the Red Scare, the Ku Klux Klan, the Harlem Renaissance, woman suffrage, the rise of radio, the emergence of industrial unionism—go unmentioned. And he would grind his teeth over the awkward prose and slack analysis (“the mood of the nation rejected Wilson’s brand of internationalism”) and the sometimes confusing structure (the paragraph on legislation passed in 1935 appears in the section on Roosevelt’s second term).23
Other entries in the United States history series are worse. The entry on women leaves out the Nineteenth Amendment but devotes a paragraph to splits in the National Organization for Women (NOW) over the defense of Valerie Solanas (who shot Andy Warhol). The 1865 to 1918 entry only briefly alludes to the Spanish-American War but devotes five paragraphs to the Philippine war, an odd reversal of the general bias in history books, which tend to ignore the latter and lavish attention on the former. The essay also plagiarizes one sentence from another online source. The 4,000-word essay on the history of U.S. immigration verges on incoherence and mentions famine-era Irish immigration only in a one-line picture caption.24
Part of the problem is that such broad synthetic writing is not easily done collaboratively. Equally important, some articles do not seem to have attracted much interest from Wikipedians. The essay on the interwar years has had only 137 edits, about one-seventh the number of interventions in the article on FDR. Participation in Wikipedia entries generally maps popular, rather than academic, interests in history. U.S. cultural history, recently one of the liveliest areas of professional history writing, is what Wikipedia calls a “stub” consisting of one banal sentence (“The cultural history of the United States is a broad topic, covering or having influence in many of the world’s cultural aspects.”). By contrast, Wikipedia offers a detailed 3,100-word article titled “Postage Stamps and Postal History of the United States,” a topic with a devoted popular following that attracts little scholarly interest.25
Biographies of historical figures offer a more favorable terrain for Wikipedia since biography is always an area of popular historical interest. Moreover, biographies offer the opportunity for more systematic comparison because the unit of analysis is clear-cut, whereas other topics can be sliced and diced in multiple ways. But even to assess the quality of biographical writing in Wikipedia requires some context. You cannot compare, for example, Wikipedia’s 5,000 words on Martin Luther King Jr. with Taylor Branch’s three-volume (2,900-page) prizewinning biography.26 But how does it stack up against other reference works?
I judged 25 Wikipedia biographies against comparable entries in Encarta, Microsoft’s well-regarded online encyclopedia (one of the few commercial encyclopedias that survive from a once-crowded marketplace), and in American National Biography Online, a high-quality specialized reference work published by Oxford University Press for the American Council of Learned Societies, written largely by professional historians, and supported by major grants. The comparison is unfair—both publications have had multimillion-dollar budgets—but it is still illuminating, and it sheds some favorable light on Wikipedia.27
In coverage Wikipedia currently lags behind the comprehensive American National Biography Online, which has 18,000 entries, but exceeds the general-interest Encarta. Of a sample of 52 people listed in American National Biography Online, Wikipedia included one-half, but Encarta only about one-fifth. The American National Biography Online profiles were also more detailed, averaging about four times as many words as those in Wikipedia. Encarta was the least detailed, with its entries for the sample only about one-quarter the length of Wikipedia’s.28 Yet what is most impressive is that Wikipedia has found unpaid volunteers to write surprisingly detailed and reliable portraits of relatively obscure historical figures—for example, 900 words on the Union general Romeyn B. Ayres.
Relying on volunteers and eschewing strong editorial control leads to widely varying article lengths in Wikipedia. It devotes 3,500 words to the science fiction writer Isaac Asimov, more than it gives to President Woodrow Wilson (3,200) but fewer than it devotes to the conspiracy theorist and perennial presidential candidate Lyndon LaRouche (5,400); American National Biography Online provides a more proportionate (from a conventional historical perspective) coverage of 1,900 words for Asimov and 7,800 for Wilson. (It ignores the still-living LaRouche.) Of course, American National Biography Online also betrays the biases of its editors in its word allocations: Would nonhistorians agree that Charles Beard deserves twice as many words as the reformer and New Deal administrator Harold Ickes?
As the attention devoted to Asimov hints, Wikipedia’s authors do not come from a cross-section of the world’s population. They are more likely to be English-speaking, males, and denizens of the Internet. Such bias has occasioned much discussion, including among Wikipedians. A page of candid self-criticism titled “Why Wikipedia Is Not So Great,” acknowledges that “geek priorities” have shaped the encyclopedia: “There are many long and well-written articles on obscure characters in science fiction/fantasy and very specialised issues in computer science, physics and math; there are stubs, or bot [machine-generated] articles, or nothing, for vast areas of art, history, literature, film, geography.” One regular contributor to Wikipedia’s history articles observed (somewhat tongue in cheek): “Wikipedia kicks Britannica’s ass when it comes to online MMP [massively multiplayer] games, trading card games, Tolkieana and Star Wars factoids!” “This is the encyclopedia that Slashdot built” goes a familiar complaint that alludes to the early promotion of Wikipedia by the Web site that bills itself as the home of “news for nerds.” The “Google effect” further encouraged participation by Web surfers. As Sanger later explained, “each time Google spidered [crawled] the website, more pages would be indexed; the greater the number of pages indexed, the more people arrived at the project; the more people involved in the project, the more pages there were to index.”29
Encyclopedia Britannica editor in chief Dale Hoiberg defensively pointed out to the Guardian that “Wikipedia authors write of things they’re interested in, and so many subjects don’t get covered; and news events get covered in great detail. The entry on Hurricane Frances is five times the length of that on Chinese art, and the entry on the British television show Coronation Street is twice as long as the article on Tony Blair.” (Wikipedians responded to this criticism defensively, making the Blair entry 50 percent longer than the one on the television show.) But the largest bias—at least in the English-language version—favors Western culture (and English-speaking nations), rather than geek or popular culture.30
Perhaps as a result, Wikipedia is surprisingly accurate in reporting names, dates, and events in U.S. history. In the 25 biographies I read closely, I found clear-cut factual errors in only 4. Most were small and inconsequential. Frederick Law Olmsted is said to have managed the Mariposa mining estate after the Civil War, rather than in 1863. And some errors simply repeat widely held but inaccurate beliefs, such as that Haym Salomon personally loaned hundreds of thousands of dollars to the American government during the Revolution and was never repaid. (In fact, the money merely passed through his bank accounts.) Both Encarta and the Encyclopedia Britannica offer up the same myth.31 The 10,000-word essay on Franklin Roosevelt was the only one with multiple errors. Again, some are small or widely accepted, such as the false claim (made by Roosevelt supporters during the 1932 election) that FDR wrote the Haitian constitution or that Roosevelt money was crucial to his first election to public office in 1910. But two are more significant—the suggestion that a switch by Al Smith’s (rather than John Nance Garner’s) delegates gave Roosevelt the 1932 nomination and the statement that the Supreme Court overruled the National Industrial Recovery Act (NIRA) in 1937, rather than 1935.
The lack of a single author or an overall editor means that Wikipedia sometimes gets things wrong in one place and right in another. The Olmsted entry has him (correctly) forming Olmsted, Vaux and Company in 1865 at the same time that he is (incorrectly) in California running Mariposa. The entry on Andrew Jackson Downing says that Olmsted and Calvert Vaux designed Central Park in 1853 even though the cross-referenced article on Vaux has them (accurately) winning the design competition in 1858.32
To find 4 entries with errors in 25 biographies may seem a source for concern, but in fact it is exceptionally difficult to get every fact correct in reference works. “People don’t realize how hard it is to nail the simplest things,” noted Lars Mahinske, a senior researcher for Britannica. I checked 10 Encarta biographies for figures that also appear in Wikipedia, and in the commercial product I found at least 3 biographies with factual mistakes. Even the carefully edited American National Biography Online, whose biographies are written by experts, contains at least one factual error in the 25 entries I examined closely, the date of Nobel Prize winner I. I. Rabi’s doctoral degree—a date that Wikipedia gets right. Indeed, Wikipedians, who are fond of pointing out that respected reference sources have mistakes, gleefully publish a page devoted to “Errors in the Encyclopedia Britannica That Have Been Corrected in Wikipedia.”33
Wikipedia, then, beats Encarta but not American National Biography Online in coverage and roughly matches Encarta in accuracy. This general conclusion is supported by studies comparing Wikipedia to other major encyclopedias. In 2004 a German computing magazine had experts compare articles in twenty-two different fields in the three leading German-language digital encyclopedias. It rated Wikipedia first with a 3.6 on a 5-point scale, placing it above Brockhaus Premium (3.3) and Encarta (3.1). The following year the British scientific magazine Nature asked experts to assess 42 science entries in Wikipedia and Encyclopedia Britannica, without telling them which articles came from which publication. The reviewers found only 8 serious errors, such as misinterpretations of major concepts—an equal number in each encyclopedia. But they also noted that Wikipedia had a slightly larger number (162 versus 123) of smaller mistakes, including “factual errors, omissions or misleading statements.” Nature concluded that “Britannica’s advantage may not be great, at least when it comes to science articles,” and that “considering how Wikipedia articles are written, that result might seem surprising.”34
Thus, the free and open-source encyclopedia Wikipedia offers a formidable challenge to the well-established and seemingly authoritative Encyclopedia Britannica as well as to Microsoft’s newer and well-regarded Encarta just as the free and open-source Linux operating system now seriously challenges Microsoft’s Windows in the server market. Not surprisingly, Encarta has been scrambling to compete—both by making its content more generally available (you can get free access by using the MSN search engine) and by inviting readers to propose edits to the content.
If the unpaid amateurs at Wikipedia have managed to outstrip an expensively produced reference work such as Encarta and provide a surprisingly comprehensive and largely accurate portrait of major and minor figures in U.S. history, professional historians need not fear that Wikipedians will quickly put them out of business. Good historical writing requires not just factual accuracy but also a command of the scholarly literature, persuasive analysis and interpretations, and clear and engaging prose. By those measures, American National Biography Online easily outdistances Wikipedia.
Compare, for example, Wikipedia‘s 7,650-word portrait of Abraham Lincoln with the 11,000-word article in American National Biography Online. Both avoid factual errors and cover almost every important episode in Lincoln’s life. But surely any reader of this journal would prefer the American National Biography Online sketch by the prominent Civil War historian James McPherson. Part of the difference lies in McPherson’s richer contextualization (such as the concise explanation of the rise of the Whig party) and his linking of Lincoln’s life to dominant themes in the historiography (such as free-labor ideology). But McPherson’s profile is distinguished even more by his artful use of quotations to capture Lincoln’s voice, by his evocative word portraits (the young Lincoln was “six feet four inches tall with a lanky, rawboned look, unruly coarse black hair, a gregarious personality, and a penchant for telling humorous stories”), and by his ability to convey a profound message in a handful of words (“The republic endured and slavery perished. That is Lincoln’s legacy.”). By contrast, Wikipedia‘s assessment is both verbose and dull: “Lincoln’s death made the President a martyr to many. Today he is perhaps America’s second most famous and beloved President after George Washington. Repeated polls of historians have ranked Lincoln as among the greatest presidents in U.S. history.”35
In addition to McPherson’s elegant prose, his profile embodies the skill and confident judgment of a seasoned historian. The same is true of many other American National Biography Online sketches—Alan Brinkley on Franklin Roosevelt or T. H. Watkins on Harold Ickes, for example. Those gems of short biographical writing combine crisp prose with concise judgments about the significance of their subjects. Even less masterly entries in American National Biography Online generally sport smoother prose than Wikipedia. And they also offer reliable bibliographic essays with the latest scholarly works. Wikipedia entries generally include references, but not always the best ones. The bibliography for Haym Salomon contains only two works, both published more than fifty years ago. Of one of those books, American National Biography Online warns that it “repeats all the myths and fabrications found in earlier accounts.”36
Of course, not all historians write as well as McPherson and Brinkley, and some of the better-written Wikipedia entries provide more engaging portraits than some sterile and routine entries in American National Biography Online. For example, the American National Biography Online sketch of the Hall of Fame pitcher Red Faber provides a plodding, almost year-by-year account, whereas Wikipedia gives a more concise overview of his career and significance. Wikipedia‘s profile of the Confederate guerrilla fighter William Clarke Quantrill arguably does a better job of detailing the controversies about his actions than American National Biography Online. Even so, it provides a typical waffling conclusion that contrasts sharply with the firm judgments in the best of the American National Biography Online essays: “Some historians,” they write, “remember him as an opportunistic, bloodthirsty outlaw, while others continue to view him as a daring soldier and local folk hero.”37
This waffling—encouraged by the npov policy—means that it is hard to discern any overall interpretive stance in Wikipedia history. One might expect—given the Randian politics of the founders and the strength of libertarian sentiments in cyberspace—a libertarian or conservative slant. But I did not find it. One can see occasional glimmers, as in the biography of Calvin Coolidge that says with apparent approval, “Coolidge was the last President of the United States who did not attempt to intervene in free markets, letting business cycles run their course.” This sentence was inserted early on by an avowed libertarian and it has survived dozens of subsequent edits. But Wikipedia also presents the socialist Eugene V. Debs in flattering terms; the only criticism is that he “underestimated the lasting power of racism.” At least one conservative blogger charges that Wikipedia is “more liberal than the liberal media.”38
If anything, the bias in Wikipedia articles favors the subject at hand. “Articles tend to be whatever-centric,” they acknowledge in one of their many self-critical commentaries. “People point out whatever is exceptional about their home province, tiny town or bizarre hobby, without noting frankly that their home province is completely unremarkable, their tiny town is not really all that special or that their bizarre hobby is, in fact, bizarre.” That localism can sometimes cause conflicts on nonlocal entries, as in the Olmsted profile, where a Wikipedian from Louisville complains on the “Discussion” page that the biography overestimates Olmsted’s work in Buffalo and ignores his work in—surprise!—Louisville.39
Moreover, the collective mode of composition in Wikipedia and the repeated invocation of the npov policy mean that it tends to avoid controversial stands of all kinds. Whereas there is much popular interest in lurid aspects of history, Wikipedia editors shy away from sensationalist interpretations (although not from discussion of controversies about such interpretations). The biography of Warren G. Harding cautiously warns of “innuendo” and “speculation” surrounding his extramarital affairs, expresses doubt about his alleged affair with Nan Britton, and insists that there is “no scientific or legal basis” for the rumors of Harding’s mixed “blood.” And while popular history leans toward conspiracy theories, Wikipedia seems more likely to debunk them. It judiciously concludes that there is “no evidence” that Roosevelt “knew all about the planned attack on Pearl Harbor but did nothing to prevent it.”40
Overall, writing is the Achilles’ heel of Wikipedia. Committees rarely write well, and Wikipedia entries often have a choppy quality that results from the stringing together of sentences or paragraphs written by different people. Some Wikipedians contribute their services as editors and polish the prose of different articles. But they seem less numerous than other types of volunteers. Few truly gifted writers volunteer for Wikipedia. Encarta, while less comprehensive than Wikipedia, generally offers better—especially, more concise—writing.
Even so, few would turn to Encarta or the Encyclopedia Britannica for good writing. Like other such works, Wikipedia employs the “encyclopedia voice,” a product, the former Encyclopedia Britannica editor Robert McHenry argued, of “a standardized process and standardized forms, and . . . a permanent editorial staff, whose members train their successors in what amounts to an apprenticeship.” It also reflects reference works’ general allergy to strongly stated opinions. More than forty years ago, Charles Van Doren, who became a senior editor at Encyclopedia Britannica after his quiz show debacle, complained that “the tone of American encyclopedias is often fiercely inhuman. It appears to be the wish of some contributors to write about living institutions as if they were pickled frogs, outstretched upon a dissecting board.” Contrast any modern encyclopedia entry with this one on John Keats by Algernon Charles Swinburne, in the (late nineteenth-century) ninth edition of the Encyclopedia Britannica: “The Ode to a Nightingale, one of the final masterpieces of human work in all time and for all ages, is immediately preceded in all editions now current by some of the most vulgar and fulsome doggerel ever whimpered by a vapid and effeminate rhymester in the sickly stage of whelphood.”41
Swinburne’s “bias” would have transgressed not only Wikipedia’s npov but also the preference of conventional, modern encyclopedias for what McHenry calls “the blandness of mere information.” Indeed, the npov mimics conventional “encyclopedia style.” “Wikipedia users,” two social scientists conclude, “appropriate norms and expectations about what an ‘encyclopedia’ should be, including norms of formality, neutrality, and consistency, from the larger culture.” As a result, they find, over time Wikipedia entries become “largely indistinguishable stylistically from [those in] the expert-created Columbia Encyclopedia.”42
Conversely, the worst-written entries are the newest and least edited. As the “Replies to Common Objections” page explains: “Wikipedia has a fair bit of well-meaning, but ill-informed and amateurish work. In fact, we welcome it—an amateurish article to be improved later is better than nothing.”43 That means you can encounter both the polished entry on Red Faber and the half-written article on women’s history. Less sophisticated readers may not know the difference.
They also may not realize when an article has been vandalized. But vandalism turns out to be less common than one would expect in a totally open system. Over a two-year period, vandals defaced the Calvin Coolidge entry only ten times—almost all with obscenities or juvenile jottings that would not have misled any visitor to the site. (The one exception changed his birth date to 1722, which was also unlikely to confuse anyone.) The median time for repairing the damage was three minutes.44 More systematic tests have found that vandalism generally has a short life on Wikipedia. The blogger Alex Halavais, graduate director for the informatics school at the University at Buffalo, inserted thirteen small errors into Wikipedia entries—including, for example, the claim that the “well-known abolitionist Frederick Douglass made Syracuse his home for four years.” To his surprise, vigilant Wikipedians removed all the mistakes within two and a half hours. Others have been more successful in slipping errors into the encyclopedia, including an invented history of Chesapeake, Virginia, describing it as a major importer of cow dung until “it collapsed in one tremendous heap,” which lasted on Wikipedia for a month.45 But vandals face formidable countermeasures that Wikipedia has evolved over time, including a “recent changes patrol” that constantly monitors changes reported on a “Recent Changes” page as well as “personal watchlists” that tell contributors whether an article of interest to them has been changed. On average, every article is on the watchlist of two accounts, and the keepers of those lists often obsessively check them several times a day. More generally, the sheer volume of edits—almost 100,000 per day—means that entries, at least popular entries, come under almost constant scrutiny.46
A portion of the revision history page for Franklin D. Roosevelt: The software for Wikipedia exhaustively records every change made to an entry and allows a visitor to the site to see each version of the article back to its origin. http://en.wikipedia.org/w/index.php?title=Franklin_D._Roosevelt&action=history (March 28, 2006).
But, as a fall 2005 controversy involving an entry on the journalist John Seigenthaler makes clear, Wikipedia’s controls and countermeasures are a work in progress, and vandalism in infrequently read entries can slip under the radar. In May 2005 Brian Chase altered the article on Seigenthaler to play a “joke” on a co-worker at Rush Delivery in Nashville, Tennessee, where Seigenthaler’s late brother had been a client. The not very humorous change suggested that Seigenthaler, who once worked for Robert Kennedy, was thought “to have been directly involved in the Kennedy assassinations of both John, and his brother, Bobby.” In September, Seigenthaler learned about the scurrilous charges and complained to Jimmy Wales, who removed them from both the active page and the page history. But, as Seigenthaler wrote in USA Today in late November, “the false, malicious ‘biography’” had “appeared under [his] name for 132 days.” Moreover, sites that mirror Wikipedia’s content such as Answers.com and Reference.com retained the falsehoods for another three weeks. The episode received wide notice, with many Wikipedia critics echoing Seigenthaler’s charge that the online encyclopedia “is a flawed and irresponsible research tool” where “volunteer vandals with poison-pen intellects” abound.47
Wikipedia‘s defenders complained, in the words of Paul Saffo, director of the Institute for the Future, that Seigenthaler “clearly doesn’t understand the culture of Wikipedia.” Saffo and others argued that Seigenthaler “should have just changed” the false statements. But Seigenthaler pointed out that the lies were online for several months before he even knew about them and that he did not want to have anything to do with the flawed enterprise. A more persuasive defense, offered by others, acknowledged the flaws, but pointed out the relative ease of correcting them. After all, malicious gossip has long surrounded public figures, but it is very hard to track down and stop. Even when it appears in print publications, which are subject (as Wikipedia is not) to libel laws, the only remedy is going to court. In the case of Wikipedia, the defamatory statements about Seigenthaler were entirely expunged. Professor Lawrence Lessig of the Stanford Law School argued that defamation is a by-product of free speech and that while “Wikipedia is not immune from that kind of maliciousness . . . it is, relative to other features of life, more easily corrected.” As Wade Roush, an editor at TechnologyReview.com wrote in his blog, “the community-editing model gives us a newfound power to create wrongs—but also to reverse wrongs.”48
Still, the episode eroded Wikipedia‘s credibility and led to efforts at damage control. Jimmy Wales announced that Wikipedia would now require users to register before creating new articles. Of course, that rule would not have stopped Brian Chase because registration will not be required simply to edit an existing entry. Moreover, registration may actually provide less accountability; you need not report even an e-mail address to register, whereas unregistered users have their Internet Protocol (ip) addresses recorded, and it was such an address that made it possible to track down Chase. And Wikipedia still lacks any mechanism for guaranteeing an entry’s accuracy at the moment when you land on the site; a vandal or even a scholar trying to test the system might have just changed the “fact” that you are seeking. Wikipedians have discussed possible solutions to this problem. For example, visitors could have the option of viewing only a version of an article that had been “patrolled,” that is, checked for random vandalism, or users could have the choice of seeing an “approved” page or one “pending” approval from a certain number of editors.49
Wikipedia already offers a limited version of that choice by allowing you to check the page’s “history.” The wiki software allows you to compare every single version of an article going back to its creation. In a widely circulated critique of Wikipedia, the former Encyclopedia Britannica editor McHenry observed that “the user who visits Wikipedia . . . is rather in the position of a visitor to a public restroom. It may be obviously dirty, so that he knows to exercise great care, or it may seem fairly clean, so that he may be lulled into a false sense of security. What he certainly does not know is who has used the facilities before him.” McHenry is right about the “publicness” of Wikipedia, but why not choose a more uplifting analogy, like the public school or the public park? Moreover, he is wrong about not knowing what came before you. The “History” page tells you not only who used the facilities (at least their usernames or ip addresses) but also precisely what they did there. Indeed, simply taking information buried on the “History” page and making it more public would enhance Wikipedia—for example, the “Article” page might say, “This article has been edited 350 times since it was created on May 5, 2002, including 30 times in the past week.” It could even add that “very active Wikipedians” (those with more than one hundred edits this month) contributed 52 percent of those edits. Such information could be automatically generated, and it would give the reader additional clues to the quality of the entry. Another possible improvement would have readers rate the quality of individual Wikipedia entries, an approach used by a number of popular Internet sites, including Amazon.com (which enjoins visitors not just to review and rate books but also to answer the question “Was this review helpful to you?”) and Slashdot (which has a complex system of “moderation” that rates the quality of posted comments). During the Seigenthaler controversy, Wales announced that Wikipedia would be adding this feature soon.50
As Roush, Lessig, and others argued amid the Seigenthaler uproar, Wikipedia‘s lack of fixity also has a more positive face—it can be updated instantly. Wikipedians like to point out that after the Indian Ocean tsunami of 2004 they added relevant entries within hours, including animations, geological information, reports on the international relief effort, and comprehensive links. Of course, the ability to capture the news of the day is of less interest to historians, but Wikipedia has also quickly captured the latest historical “news.” You had to wait until the morning of June 1, 2005, to learn from your local newspaper that W. Mark Felt had been unmasked as “Deep Throat,” but even before the evening news on May 31 you could have read about it in Wikipedia‘s article on the “Watergate scandal.” Like journalism, Wikipedia offers a first draft of history, but unlike journalism’s draft, that history is subject to continuous revision. Wikipedia‘s ease of revision not only makes it more up-to-date than a traditional encyclopedia, it also gives it (like the Web itself) a self-healing quality since defects that are criticized can be quickly remedied and alternative perspectives can be instantly added. McHenry’s critique, for example, focused on problems in the entry on Alexander Hamilton. Two days later, they were fixed.51
Why Should We Care? Implications for Historians
One reason professional historians need to pay attention to Wikipedia is that our students do. A student contributor to an online discussion about Wikipedia noted that he used the online encyclopedia to study the historical terms for a test on early romanticism in Britain. Other students routinely list it in term paper bibliographies. We should not view this prospect with undue alarm. Wikipedia for the most part gets its facts right. (The student of British culture reported that Wikipedia proved as accurate as the Encyclopedia Britannica and easier to use.) And the general panic about students’ use of Internet sources is overblown. You can find bad history in the library, and while much misinformation circulates on the Internet, it also helps to debunk myths and to correct misinformation.52
Yet, the ubiquity and ease of use of Wikipedia still pose important challenges for history teachers. Wikipedia can act as a megaphone, amplifying the (sometimes incorrect) conventional wisdom. As Wikinfo (a fork, or spin-off, from Wikipedia) explains: “A wiki with so many hundreds of thousands of pages is bound to get some things wrong. The problem is, that because Wikipedia has become the ‘aol‘ [America Online] of the library and reference world, such false information and incorrect definitions of terms become multiple incompetences, propagated to millions of potential readers world-wide.” Not only does Wikipedia propagate misinformation but so do those who appropriate its content, as they are entitled to do under the gfdl. As a result, as the blogger John Morse observed, “when you search Google for some obscure term that Wikipedia knows about, you might get two dozen results that all say the same thing—seemingly authoritative until you realize they all spread from a snapshot of Wiki—one that is now severed from the context of editability and might seem more creditable than it really is.” The Web site Answers.com, which promises to provide “quick, integrated reference answers,” relies heavily on Wikipedia for those answers. And Google, which already puts Wikipedia results high in its rankings, now sends people looking for “definitions” to Answers.com. Can you hear the sound of one hand clapping?53
Wikipedia‘s ease of use and its tendency to show up at the top of Google rankings in turn reinforce students’ propensity to latch on to the first source they encounter rather than to weigh multiple sources of information. Teachers have little more to fear from students’ starting with Wikipedia than from their starting with most other basic reference sources. They have a lot to fear if students stop there. To state the obvious: Wikipedia is an encyclopedia, and encyclopedias have intrinsic limits. Most readers of this journal have not relied heavily on encyclopedias since junior high school days. And most readers of this journal do not want their students to rely heavily on encyclopedias—digital or print, free or subscription, professionally written or amateur and collaborative—for research papers. One Wikipedia contributor noted that despite her “deep appreciation for it,” she still “roll[s her] eyes whenever students submit papers with Wikipedia as a citation.” “Any encyclopedia, of any kind,” wrote another observer, “is a horrible place to get the whole story on any subject.” Encyclopedias “give you the topline”; they are “the Reader’s Digest of deep knowledge.” Fifty years ago, the family encyclopedia provided this “rough and ready primer on some name or idea”; now that role is being played by the Internet and increasingly by Wikipedia.54
But should we blame Wikipedia for the appetite for predigested and prepared information or the tendency to believe that anything you read is true? That problem existed back in the days of the family encyclopedia. And one key solution remains the same: Spend more time teaching about the limitations of all information sources, including Wikipedia, and emphasizing the skills of critical analysis of primary and secondary sources.
Another solution is to emulate the great democratic triumph of Wikipedia—its demonstration that people are eager for free and accessible information resources. If historians believe that what is available free on the Web is low quality, then we have a responsibility to make better information sources available online. Why are so many of our scholarly journals locked away behind subscription gates? What about American National Biography Online—written by professional historians, sponsored by our scholarly societies, and supported by millions of dollars in foundation and government grants? Why is it available only to libraries that often pay thousands of dollars per year rather than to everyone on the Web as Wikipedia is? Shouldn’t professional historians join in the massive democratization of access to knowledge reflected by Wikipedia and the Web in general?55 American National Biography Online may be a significantly better historical resource than Wikipedia, but its impact is much smaller because it is available to so few people.
The limited audience for subscription-based historical resources such as American National Biography Online becomes an even larger issue when we move outside the borders of the United States and especially into poorer parts of the world, where such subscription fees pose major problems even for libraries. Moreover, in some of those places, where censorship of textbooks and other historical resources is common, the fact that Wikipedia‘s freedom means both “free beer” and “free speech” has profound implications because it allows the circulation of alternative historical voices and narratives. Some repressive governments have responded by restricting access to Wikipedia. China, for example, currently prevents its citizens from reading the English- or Chinese-language versions of Wikipedia. And it is probably not a coincidence that the first blocking of Wikipedia in China began on the fifteenth anniversary of the Tiananmen Square protests.56
Professional historians have things to learn not only from the open and democratic distribution model of Wikipedia but also from its open and democratic production model. Although Wikipedia as a product is problematic as a sole source of information, the process of creating Wikipedia fosters an appreciation of the very skills that historians try to teach. Despite Wikipedia‘s unconventionality in the production and distribution of knowledge, its epistemological approach—exemplified by the npov policy—is highly conventional, even old-fashioned. The guidelines and advice documents that Wikipedia offers its editors sound very much like the standard manuals offered in undergraduate history methods classes. Editors are enjoined, for example, to “cite the source” and to check their facts and reminded that “verifiability” is an “official policy” of Wikipedia. An article directed at those writing articles about history for Wikipedia explains (in the manner of a History 101 instructor) the difference between primary and secondary sources and also suggests helpfully that “the correct standard of material to generate encyclopedic entries about historical subjects are: 1. Peer reviewed journal articles from a journal of history; 2. Monographs written by historians (BA Hons (Hist), MA, PhD); 3. Primary sources.”57
Participants in the editing process also often learn a more complex lesson about history writing—namely that the “facts” of the past and the way those facts are arranged and reported are often highly contested. One Wikipedia guideline document reports with an air of discovery: “Although it doesn’t seem to be logical to worry about a Wikipedia article, people do battle over history and the way it is written all the time.” And such skirmishes break out all over Wikipedia. Each article contains a companion “Discussion” page, and on those pages, editors engage—often intensely—in what can only be called historiographic debate. Was Woodrow Wilson a racist? Did the New Deal resolve the problems of the Great Depression? Sometimes relatively narrow issues are debated (for example, William Jennings Bryan’s role in the passage of the Butler Act, which prohibited the teaching of evolution in Tennessee) that open up much broader issues (for example, the sources of antievolution sentiment in the 1920s).58
Was John Brown a murderer? On this portion of the discussion page for the article on John Brown, Wikipedians debate whether to use “killed” or “murdered” to refer to his actions and which word accords with the encyclopedia’s neutral-point-of-view policy. For Wikipedia’s thousands of contributors, such discussions are a form of popular historiographic debate. http://en.wikipedia.org/wiki/Talk:John_Brown_(abolitionist)/ (March 8, 2006).
Wikipedia has even developed its own form of peer review in its debates on whether articles deserve “featured article” status. Those aspiring to have their articles receive that status—given to the best 0.1 percent of articles as judged by such criteria as completeness, factual accuracy, and good writing—are encouraged to request “peer review” in order to “expose articles to closer scrutiny than they might otherwise receive.”59 Then further public debate decides whether Wikipedians agree on awarding featured article status.
Thus, those who create Wikipedia‘s articles and debate their contents are involved in an astonishingly intense and widespread process of democratic self-education. Wikipedia, observes one Wikipedia activist, “teaches both contributors and the readers. By empowering contributors to inform others, it gives them incentive to learn how to do so effectively, and how to write well and neutrally.” The classicist James O’Donnell has argued that the benefit of Wikipedia may be greater for its active participants than for its readers: “A community that finds a way to talk in this way is creating education and online discourse at a higher level.”60
My colleagues at the Center for History and New Media interviewed people who regularly contribute to history articles on Wikipedia, and a passion for self-education comes through in numerous interviews. A Canadian contributor, James Willys Rosenzweig (no relation), observed that his “involvement in Wikipedia [is] a natural fit” because “I am interested in a broad variety of subjects, and I read for pleasure in as many fields as I can.” APWoolrich, a British contributor who left school at age sixteen and became an ardent self-taught industrial archeologist, answered the question “Why do I enjoy it?” with “It beats tv any day, in my view!”61
But APWoolrich is as enthusiastic about contributing to the education of others as to his own. Wikipedia, he told us, “accords with my personal philosophy of sharing knowledge, and it links me with the rest of humanity.” He believes we have a “duty” to share knowledge “without thought of reward.” “Wikipedia is the ‘Invisible College’ concept revived for the 21st century.” A blind high school student had a different reference point. “It is almost like playing a computer game but it is actually useful because it helps someone anywhere in the world get information that is uncluttered by junk,” he told us. “I think of myself as a teacher,” said Einar Kvaran, an uncredentialed “art historian without portfolio,” who spends about six hours a day writing articles about American art and sculpture. Like bloggers and amateur Web site developers, contributors to Wikipedia enjoy the opportunity to make their work public and to contribute to building the public space of the Web.62
Should those who write history for a living join such popular history makers in writing history in Wikipedia? My own tentative answer is yes.63 If Wikipedia is becoming the family encyclopedia for the twenty-first century, historians probably have a professional obligation to make it as good as possible. And if every member of the Organization of American Historians devoted just one day to improving the entries in her or his areas of expertise, it would not only significantly raise the quality of Wikipedia, it would also enhance popular historical literacy. Historians could similarly play a role by participating in the populist peer review process that certifies contributions as featured articles.
Still, my view is tempered by the recognition that the encounter between professional historians and amateur Wikipedians is likely to be rocky at times. That seems to have been particularly true in the early days of Wikipedia. Larry Sanger reported that some of the earliest contributors were “academics and other highly-qualified people”—including two historians with Ph.D.s—who “were slowly worn down and driven away by having to deal with difficult people on the project.” “I feel that my integrity has been questioned,” the historian J. Hoffmann Kemp wrote in signing off in August 2002. “I’m too tired to play anymore.”64
Even Jimmy Wales, who has been more tolerant of “difficult people” than Sanger, complained about “an unfortunate tendency of disrespect for history as a professional discipline.” He saw the tendency reflected in historical entries that synthesize “work in a non-standard way” and “produce novel narratives and historical interpretations with citation to primary sources to back up their interpretation of events.” He noted that “some who completely understand why Wikipedia ought not create novel theories of physics by citing the results of experiments and so on and synthesizing them into something new, may fail to see how the same thing applies to history.”65
But the flip side of Wales’s respect for the historical discipline, as expressed in the ban on original research (and original interpretations), is that it seemingly limits professional historians’ role in Wikipedia. The “no original research policy” means that you cannot offer a startling new interpretation of Warren Harding based on newly uncovered sources. As a result, while Wikipedia officially “welcomes experts and academics,” it also warns that “such experts do not occupy a privileged position within Wikipedia. They should refer to themselves and their publications in the third person and write from a neutral point of view (NPOV). They must also cite publications, and may not use their unpublished knowledge as a source of information (which would be impossible to verify).”66
Even a comparison that focuses on the ban on original research understates the differences between professionals and amateurs. For one thing, historical expertise does not reside primarily in the possession of some set of obscure facts. It relies more often on a deep acquaintance with a wide variety of already published narratives and an ability to synthesize those narratives (and facts) coherently. It is considerably easier to craft a policy about “verifiability” or even “neutrality” than about “historical significance.” Professional historians might find an account accurate and fair but trivial; that is what some see as the difference between history and antiquarianism. Thus, the conflict between professionals and amateurs is not necessarily a simple one over whether people are doing good or bad history but a more complex (and more interesting) conflict about what kind of history is being done. Comparing the free Wikipedia and the costly and expensively produced American National Biography Online erects professional historical scholarship as a trans-historical and transcultural standard of history writing when we know that there are many ways of writing and talking about the past. What is particularly interesting and revealing about Wikipedia is its reflection of what we could call a “popular history poetics” that follows different rules from conventional professional scholarship.67
One noticeable difference is the affection for surprising, amusing, or curious details—something that Wikipedia shares with other forms of popular historical writing such as articles in American Heritage magazine. Consider some details that Wikipedians include in their Lincoln biography that do not make their way into McPherson’s profile: Lincoln’s sharing a birthday with Charles Darwin; his nicknames (the Rail Splitter is mentioned twice); his edict making Thanksgiving a national holiday; and the end of his bloodline with the death of Robert Beckwith in 1985. Not surprisingly, Wikipedia devotes five times as much space to Lincoln’s assassination as the longer American National Biography Online profile does.68 The same predilection for colorful details marks other portraits. We learn from the Harding biography that the socialist Norman Thomas was a paper boy for the Marion Daily Star (which Harding owned), that Harding reached the sublime degree as a Master Mason, and that Al Jolson and Mary Pickford came to Marion, Ohio, during the 1920 campaign for photo ops. It devotes two paragraphs to speculation about whether Harding had “Negro blood” and five paragraphs to his extramarital affairs. Meanwhile, key topics—domestic and foreign policies, the Sheppard-Towner Maternity and Infancy Act of 1921, immigration restriction, and naval treaties—are ignored or hurried over. We similarly learn that Woodrow Wilson belonged to Phi Kappa Psi fraternity and wrote his initials on the underside of a table in the Johns Hopkins University history department, but not about his law practice or his intellectual development at Princeton University.69
Wikipedia’s view of history is not only more anecdotal and colorful than professional history, it is also—again like much popular history—more factualist. That is reflected in the incessant arguing about NPOV, but it can also be seen in the obsession with list making. The profile of FDR leads you not just to a roll of all presidents but also to a list of every secretary of the interior, every chairman of the Democratic National Committee, every key event that happened on April 12 (when Roosevelt died), and every major birth in 1882 (when he was born). From the perspective of professional historians, the problem of Wikipedian history is not that it disregards the facts but that it elevates them above everything else and spends too much time and energy (in the manner of many collectors) on organizing those facts into categories and lists.
Finally, Wikipedian history is presentist in a slightly different way from that of professional history—where, for example, a conservative turn in the polity leads us to reevaluate conservatism in the past. Rather, Wikipedia entries often focus on topics that have ignited recent public, not just professional, controversy. The topic of Lincoln’s sexuality—not mentioned by McPherson—occupied so much of the Wikipedia biography that in December 2004 a separate 1,160-word entry was created that focuses on C. A. Tripp’s controversial, then-recent book The Intimate World of Abraham Lincoln. The entry on the Spanish-American War examines in considerable detail whether the Maine was sunk by a mine (a subject in the news as the result of a 1998 National Geographic study) but pays no attention to the important (to professional historians) arguments of Kristin L. Hoganson’s book of the same year that “gender politics” provoked the war.70
That the latest article in National Geographic rather than the latest book from Yale University Press shapes Wikipedia entries reflects the fact that Wikipedia historians operate in a different world than historians employed in universities. Although Wikipedia enjoins its authors to “cite the source,” that policy is honored mainly in the breach—unlike in academic historical journals, where authors and editors obsess over proper and full citation. Moreover, the bibliographies offered after Wikipedia entries are often incomplete or out-of-date—a cardinal sin in professional history. Yet Wikipedians are mindful of a wider community of “historians.” It is just that for them the most important community is authors of other Wikipedia entries. And every article includes literally dozens of cross-references (links) to other Wikipedia articles.
An account of Lincoln’s life that focuses on debates about his sexuality and dwells on his birth date, nicknames, and assassination is not “wrong,” but it is not the kind of brief account that a professional historian such as McPherson would write. Professional historians who enter the terrain of Wikipedia will have an easy time correcting the year when the Supreme Court invalidated the NIRA but a much harder time eliminating Lincoln’s nicknames. Wikipedians would agree with professional historians that the Supreme Court decision happened on a particular day, but they might not agree that Lincoln’s nicknames are “unimportant” or “uninteresting.” And such historians will have to decide how much of their disciplinary “authority” they are prepared to “share” in this new public space.71
Although making people we generally view as our audience into our collaborators may prove unsettling, it will also be instructive. One history doctoral student at an Ivy League institution who has contributed actively to Wikipedia explained that “I use it primarily to practice writing for a non-academic audience, and as a way to solidify my understanding of topics (nothing helps one remember things like rewriting it).” He added, “I regard my Wikipedia contributions as informal and relatively anonymous, and use a much more casual demeanor than one would use in a professional setting (that is, I often tell people they don’t know what they’re talking about).”72 If Wikipedia teaches us (and our students) to speak more clearly to the public and to say more clearly what is on our minds, it will have a positive impact on academic culture.
But a much broader question about academic culture is whether the methods and approaches that have proven so successful in Wikipedia can also affect how scholarly work is produced, shared, and debated. Wikipedia embodies an optimistic view of community and collaboration that already informs the best of the academic enterprise. The sociologist Robert K. Merton talked about “the communism of the scientific ethos,” and communal sharing is an ideal that some historians hold and that many of our practices reflect, even while alternative, more individualistic and competitive, modes also thrive.73
Can the wiki way foster the collaborative creation of historical knowledge? One promising approach would leverage the volunteer labor of amateurs and enthusiasts to advance historical understanding. Historians have, of course, benefited from the labors of amateurs and volunteers. Think of the generations of local historians who have collected, preserved, and organized historical documents subsequently mined by professional historians. But the new technology of the Internet opens up the possibility of much more massive efforts relying on what the legal scholar Yochai Benkler has called “commons-based peer production.” The “central characteristic” of such production, wrote Benkler, “is that groups of individuals successfully collaborate on large-scale projects following a diverse cluster of motivational drives and social signals, rather than either market prices or managerial commands.” “Ubiquitous computer communications networks,” he argued, have brought about “a dramatic change in the scope, scale, and efficacy of peer production.”74 The most prominent recent example of such non-market-based peer production is free and open-source software. The Internet would now grind to a halt without such free and open-source resources as the operating system Linux, the Web server software Apache, the database MySQL, and the programming language PHP.
Yet, as Benkler showed, the peer production of information is much broader than free software, and he offers Wikipedia as one notable example. Another—and one perhaps more relevant to professional historians—is the National Aeronautics and Space Administration’s (NASA) Ames Clickworkers project, which encouraged volunteers to “mark craters on maps of Mars, classify craters that have already been marked, or search the landscape of Mars for ‘honeycomb’ terrain.” In six months, more than 85,000 people visited the site and made almost 2 million entries. An analysis of the markings found that “the automatically-computed consensus of a large number of clickworkers is virtually indistinguishable from the inputs of a geologist with years of experience in identifying Mars craters.”75
Probably the closest historical equivalent to the NASA clickworkers are the legions of volunteer genealogists who have been digitizing thousands of documents. For example, volunteers working for the Church of Jesus Christ of Latter-day Saints digitized the records of the 55 million people listed in the 1880 United States census and the 1881 Canadian census and made them available for free at the church’s FamilySearch Internet Genealogy Service. Another volunteer effort, Project Gutenberg, has created an online repository of 15,000 e-texts of public domain books. Optical character recognition (OCR) software can relatively cheaply and automatically digitize print works, but it is generally only 95-99 percent accurate. To get a fully clean text is more expensive. Enter “distributed proofreaders”—a collaborative Web-based method of proofreading that breaks a work into individual pages to allow multiple proofreaders to work on the same book simultaneously. About half of the Project Gutenberg books have come out of this commons-based peer production.76
What if we organized a similar “distributed transcribers” to work on handwritten historical documents that otherwise will never be digitized? Volunteers could take their turns transcribing page images of the widely used Cameron Family Papers at the Southern Historical Collection that would be presented to them online. The same automated checking process used by Ames Clickworkers or among distributed proofreaders could be applied. A similar approach could be taken to transcribing the massive quantities of recorded sound—the Lyndon B. Johnson tapes, for example—that are enormously expensive to transcribe and cannot be rendered into text with current automated methods. Max J. Evans, the head of the National Historical Publications and Records Commission, has recently proposed something similar. He called for a corps of “volunteer data extractors” who would index and describe archival collections that are currently only minimally processed. Such an approach, he argues, would take “advantage of organized, or self-selected and anonymous users who can work at home and in remote locations.”77
The barriers to success in such a project are more social than technological. Devising the systems to present the page images or tapes online is not so difficult. It is harder to create the interest to involve volunteers in such a project. But who would have thought that 85,000 people would volunteer to look for Mars craters or that 60,000 people would write and edit entries for Wikipedia? Of course, denizens of the Internet are likely to be more excited about searching through Mars craters than through nineteenth-century women’s diaries. Still, such projects have shown the ability, as Benkler wrote, to “capitalize on an enormous pool of underutilized intelligent human creativity and willingness to engage in intellectual effort.”78
If the Internet and the notion of commons-based peer production provide intriguing opportunities for mobilizing volunteer historical enthusiasm to produce a massive digital archive, what about mobilizing and coordinating the work of professional historians in that fashion? That so much professional historical work already relies on volunteer labor—the peer review of journal articles, the staffing of conference program committees—suggests that professionals are willing to give up significant amounts of their time to advance the historical enterprise. But are they also willing to take the further step of abandoning individual credit and individual ownership of intellectual property as do Wikipedia authors?
Could we, for example, write a collaborative U.S. history textbook that would be free to all our students? After all, there is massive overlap in content and interpretation among the more than two dozen college survey textbooks. Yet the commercial publishing system mandates that every new survey text start from scratch. An open-source textbook would not only be free to everyone to read, it would also be free to everyone to write. An instructor dissatisfied with the textbook’s version of the War of 1812 could simply rewrite those pages and offer them to others to incorporate. An instructor who felt that the book neglected the story of New Mexico in the nineteenth century could write a few paragraphs that others might decide to incorporate.
This model imagines something open and anarchistic in the style of Wikipedia. Textbooks (not to mention scholarly articles) pose deeper problems of mediating conflicting interpretation than are faced by Wikipedia with its factualist emphasis. But commons-based peer production need not be so unstructured. After all, not everyone can rewrite the Linux kernel core. Everyone can contribute ideas and code, but a central committee decides what is incorporated in an official release. Similarly, PlanetMath, a free online collaborative math encyclopedia, uses an “owner-centric” authority model in contrast to Wikipedia’s “free form” approach. As one of the founders, Aaron Krowne, has explained, “there is an owner of each entry—initially the entry’s creator. Other users may suggest changes to each entry, but only the owner can apply these changes. If the owner comes to trust individual users enough, he or she can grant these specific users ‘edit’ access to the entry.” This has the potential disadvantage of discouraging open participation and requiring more commitment from some participants, but it gives a much stronger place to expertise by assuming that the “owner is the de facto expert in the topic at hand, above all others, and all others must defer.”79
Even so, the difficulties in implementing such a model for professional scholarship are obvious. How would you deal with the interpretative disputes that are at the heart of scholarly historical writing? How would we allocate credit, which is so integral to professional culture? Could you get a promotion based on having “contributed to” a collaborative project? There are no easy solutions. But it is worth noting that contributors to open-source software projects are not motivated simply by altruism. Their reputations—and hence their attractiveness as employees—are often greatly enhanced by participation in such projects. And we do reward people for collaborative professional work such as service on an editorial board. Nor are collaborative projects as free and frictionless as their greatest enthusiasts like to maintain. There are significant organizational costs—what the economists call “transaction costs”—to creating and maintaining such projects. Someone has to pay for the servers and the bandwidth and install and update the software. Wikipedia would have never gotten off the ground without the support of Wales and Bomis. More recently, it has launched fund-raising campaigns to cover its substantial and growing expenses.
Still, Wikipedia and Linux show that there are alternative models to producing encyclopedias and software than the hierarchical, commercial model represented by Bill Gates and Microsoft. And whether or not historians consider alternative models for producing their own work, they should pay closer attention to their erstwhile competitors at Wikipedia than Microsoft devoted to worrying about an obscure free and open-source operating system called Linux.
Roy Rosenzweig is the Mark and Barbara Fried Professor of History and New Media at George Mason University and director of the Center for History and New Media.
My thanks to Dan Cohen, Deborah Kaplan, and T. Mills Kelly for helpful comments on earlier versions of this essay, to Joan Fragaszy for research assistance, and to Susan Armeny for her—as always—superb editorial suggestions. Some of the research for this essay was supported by a generous grant from the Alfred P. Sloan Foundation.
Readers may contact Rosenzweig at firstname.lastname@example.org.
1 My thanks to Melissa Beaver of the Journal of American History for compiling these figures. The 32,000 works include about 7,000 dissertations, which are never coauthored, but they also include coedited books, which involve a lower level of collaboration than coauthored books or articles.
2 See Richard Hofstadter, The Age of Reform: From Bryan to F.D.R. (New York, 1955), 131–73.
3 http://en.wikipedia.org/wikistats/EN/TablesArticlesTotal.htm (Sept. 5, 2005). This count covers the period from the creation of the article on Franklin D. Roosevelt in September 2001 through July 4, 2005. See http://en.wikipedia.org/wiki/Franklin_Delano_Roosevelt. I am citing Wikipedia articles by URL and indicating the date accessed in parentheses because the articles continually change; readers can access the version I used by selecting the “history” tab and viewing the version from that date. All undated online resources were available when checked on Dec. 27, 2005.
4 Latest available numbers on visitors are for October 2004. The “official article count” for November 2005 is 2.9 million, 866,000 of them in English, according to http://en.wikipedia.org/wikistats/EN/TablesUsageVisits.htm (March 14, 2006). But the English-language home page says 1,023,303 articles. See http://en.wikipedia.org/wiki/Main_Page (March 14, 2006). Alexa rankings (available at http://www.alexa.com/) are from March 14, 2006. Information on number of employees was provided by Terry Foote (one of the employees) at a Hewlett Foundation meeting in Logan, Utah, on Sept. 27, 2005. See also Wikimedia Foundation, Budget/2005 http://wikimediafoundation.org/wiki/Budget/2005 (Oct. 23, 2005). The statements of praise are quoted in Robert McHenry, “The Faith-Based Encyclopedia,”
HTML and XHTML
XHTML is HTML written as XML.
What Is XHTML?
- XHTML stands for EXtensible HyperText Markup Language
- XHTML is almost identical to HTML
- XHTML is stricter than HTML
- XHTML is HTML defined as an XML application
- XHTML is supported by all major browsers
Many pages on the internet contain "bad" HTML.
This HTML code works fine in most browsers (even if it does not follow the HTML rules):
<title>This is bad HTML</title>
<p>This is a paragraph
Today's market consists of different browser technologies. Some browsers run on computers, and some browsers run on mobile phones or other small devices. Smaller devices often lack the resources or power to interpret "bad" markup.
XML is a markup language where documents must be marked up correctly (be "well-formed").
If you want to study XML, please read our XML tutorial.
XHTML was developed by combining the strengths of HTML and XML.
XHTML is HTML redesigned as XML.
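The idea of "well-formed" markup can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser is an acceptable stand-in for a strict XML parser; the function name is_well_formed is made up for illustration and is not part of any library.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML (hypothetical helper)."""
    try:
        # fromstring raises ParseError on unclosed or badly nested elements.
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Properly nested and closed markup is accepted.
print(is_well_formed("<p><b><i>This text is bold and italic</i></b></p>"))
# Badly nested markup is rejected.
print(is_well_formed("<p><b><i>This text is bold and italic</b></i></p>"))
# An empty element that is never closed is rejected as well.
print(is_well_formed("<p>A break: <br></p>"))
```

This only tests XML well-formedness. It does not validate a document against an XHTML DTD.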
The Most Important Differences from HTML:
- XHTML DOCTYPE is mandatory
- The xmlns attribute in <html> is mandatory
- <html>, <head>, <title>, and <body> are mandatory
- XHTML elements must be properly nested
- XHTML elements must always be closed
- XHTML elements must be in lowercase
- XHTML documents must have one root element
- Attribute names must be in lower case
- Attribute values must be quoted
- Attribute minimization is forbidden
<!DOCTYPE ....> Is Mandatory
An XHTML document must have an XHTML DOCTYPE declaration.
A complete list of all the XHTML Doctypes is found in our HTML Tags Reference.
The <html>, <head>, <title>, and <body> elements must also be present, and the xmlns attribute in <html> must specify the xml namespace for the document.
This example shows an XHTML document with a minimum of required tags:
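<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Title of document</title></head>
<body><p>some content</p></body></html>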
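The required elements described above can also be checked with a short script. This is a sketch only, assuming Python's standard xml.etree.ElementTree parser; the function name missing_required_parts and the SAMPLE document are made up for illustration. The parser does not expose the DOCTYPE, so the DOCTYPE requirement is not checked here.

```python
import xml.etree.ElementTree as ET

# The XHTML namespace that the xmlns attribute in <html> must specify.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative document (an assumption, not taken from any real page).
SAMPLE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Title of document</title></head>"
    "<body><p>some content</p></body></html>"
)


def missing_required_parts(document):
    """List the mandatory XHTML parts that are absent (hypothetical helper)."""
    root = ET.fromstring(document)
    ns = "{" + XHTML_NS + "}"  # ElementTree writes namespaced tags as {uri}name
    missing = []
    if root.tag != ns + "html":
        missing.append("html element in the XHTML namespace")
    if root.find(ns + "head") is None:
        missing.append("head")
    if root.find(ns + "head/" + ns + "title") is None:
        missing.append("title")
    if root.find(ns + "body") is None:
        missing.append("body")
    return missing


print(missing_required_parts(SAMPLE))
```

If the xmlns attribute is left out, every element falls outside the XHTML namespace, so this sketch reports all four parts as missing.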
XHTML Elements Must Be Properly Nested
In HTML, some elements can be improperly nested within each other, like this:
<b><i>This text is bold and italic</b></i>
In XHTML, all elements must be properly nested within each other, like this:
<b><i>This text is bold and italic</i></b>
XHTML Elements Must Always Be Closed
This is wrong:
<p>This is a paragraph
<p>This is another paragraph
This is correct:
<p>This is a paragraph</p>
<p>This is another paragraph</p>
Empty Elements Must Also Be Closed
This is wrong:
A break: <br>
A horizontal rule: <hr>
An image: <img src="happy.gif" alt="Happy face">
This is correct:
A break: <br />
A horizontal rule: <hr />
An image: <img src="happy.gif" alt="Happy face" />
XHTML Elements Must Be In Lower Case
This is wrong:
<P>This is a paragraph</P>
This is correct:
<p>This is a paragraph</p>
XHTML Attribute Names Must Be In Lower Case
This is wrong:
<table WIDTH="100%">
This is correct:
<table width="100%">
Attribute Values Must Be Quoted
This is wrong:
<table width=100%>
This is correct:
<table width="100%">
Attribute Minimization Is Forbidden
This is wrong:
<input type="checkbox" name="vehicle" value="car" checked />
<input type="text" name="lastname" disabled />
This is correct:
<input type="checkbox" name="vehicle" value="car" checked="checked" />
<input type="text" name="lastname" disabled="disabled" />
How to Convert from HTML to XHTML
- Add an XHTML <!DOCTYPE> to the first line of every page
- Add an xmlns attribute to the html element of every page
- Change all element names to lowercase
- Close all empty elements
- Change all attribute names to lowercase
- Quote all attribute values
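Several of the steps above are mechanical and can be sketched in code. The example below is a rough illustration, assuming Python's standard html.parser module; the class XhtmlRewriter, the function to_xhtml_syntax, and the partial VOID_ELEMENTS set are all made up for illustration. It lowercases names, quotes attribute values, expands minimized attributes, and closes empty elements. It does not add a DOCTYPE or an xmlns attribute, it does not close unclosed non-empty elements such as <p>, and it drops comments and declarations.

```python
from html import escape
from html.parser import HTMLParser

# Partial list of empty elements (an assumption; extend as needed).
VOID_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}


class XhtmlRewriter(HTMLParser):
    """Re-serialize tags using XHTML syntax rules (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _render(self, tag, attrs):
        # HTMLParser has already lowercased the tag and attribute names.
        rendered = [tag]
        for name, value in attrs:
            if value is None:
                value = name  # minimized attribute: checked -> checked="checked"
            rendered.append(f'{name}="{escape(value, quote=True)}"')
        return " ".join(rendered)

    def handle_starttag(self, tag, attrs):
        closer = " />" if tag in VOID_ELEMENTS else ">"
        self.parts.append("<" + self._render(tag, attrs) + closer)

    def handle_startendtag(self, tag, attrs):
        self.parts.append("<" + self._render(tag, attrs) + " />")

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS:  # empty elements were closed when opened
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def to_xhtml_syntax(fragment):
    parser = XhtmlRewriter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)


print(to_xhtml_syntax('<P>A break: <BR></P>'))
print(to_xhtml_syntax('<input type=checkbox NAME="vehicle" checked>'))
```

A real conversion would still need the DOCTYPE and xmlns steps done by hand or by a dedicated tool, followed by validation.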