Mining an information explosion

Updated: 2012-02-19 08:36

By Steve Lohr (The New York Times)


Welcome to the age of Big Data.

Big Data is shorthand for advancing trends in technology that open the door to a new approach to understanding the world and making decisions. There is a lot more data, all the time, growing at 50 percent a year, or more than doubling every two years, estimates IDC, a technology research firm.
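
The arithmetic behind that estimate is ordinary compound growth. The short sketch below checks it, assuming the 50 percent rate compounds once a year; the function names are illustrative and do not come from IDC.

```python
import math


def growth_factor(annual_rate: float, years: float) -> float:
    """Total multiple after `years` of growth compounded once a year."""
    return (1 + annual_rate) ** years


def doubling_time(annual_rate: float) -> float:
    """Years needed for the quantity to double at the given annual rate."""
    return math.log(2) / math.log(1 + annual_rate)


# Assumption: the 50 percent figure is a steady, annually compounded rate.
two_year_multiple = growth_factor(0.50, 2)
years_to_double = doubling_time(0.50)

print(f"Multiple after two years: {two_year_multiple:.2f}x")
print(f"Years to double: {years_to_double:.2f}")
```

At 50 percent a year the volume reaches 2.25 times its starting size after two years, which is why the estimate is "more than doubling," and the doubling point itself arrives in a little over 1.7 years.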

Data analysts help businesses make sense of an explosion of data - Web traffic and social network comments, as well as software and sensors that monitor shipments, suppliers and customers - to guide decisions, trim costs and lift sales.

A report last year by the McKinsey Global Institute, the research arm of the consulting firm, projected that the United States needs 140,000 to 190,000 more workers with "deep analytical" expertise and 1.5 million more data-literate managers. In the developed economies of Europe, the report estimates, government administrators could save more than $149 billion in operational efficiency improvements alone by using Big Data.

In fields as varied as science and sports, advertising and public health, there has been a drift toward data-driven discovery and decision-making.

"It's a revolution," says Gary King, director of the Institute for Quantitative Social Science at Harvard University. "We're really just getting under way. But the march of quantification, made possible by enormous new sources of data, will sweep through academia, business and government. There is no area that is going to be untouched."


The new megarich of Silicon Valley, at Google and Facebook, are masters at linking Web data - searches, posts and messages - with Internet advertising. A report by the World Economic Forum last month in Davos, Switzerland, declared data a new class of economic asset, like currency or gold.

It's not just that there are more streams of data, but entirely new ones. For example, there are now countless digital sensors worldwide in industrial equipment, cars, electrical meters and shipping crates. They can measure and communicate location, movement, vibration, temperature, humidity, even chemical changes in the air.

Link these communicating sensors to computing intelligence and you see the rise of what is called the Internet of Things. Improved access to information is also fueling the Big Data trend. In 2009, the federal government in Washington started Data.gov, which makes all kinds of government data accessible to the public.

The computer tools for gleaning knowledge from the Internet's vast trove are fast gaining ground. At the forefront are the rapidly advancing techniques of artificial intelligence like natural-language processing, pattern recognition and machine learning.

Those technologies can be applied in many fields. Google's search and ad business and its experimental robot cars use artificial-intelligence tricks. Both parse vast quantities of data and make decisions instantaneously.

The wealth of new data, in turn, accelerates advances in computing. Machine-learning algorithms, for example, learn from data, and the more data they see, the more the machines learn. Siri, Apple's talking, question-answering iPhone app, is a good example. With people supplying millions of questions, Siri is becoming an increasingly adept personal assistant.

Erik Brynjolfsson, an economist at the Massachusetts Institute of Technology, says that in business, economics and other fields, decisions will increasingly be based on data and analysis rather than on experience and intuition. "We can start being a lot more scientific," he observes.

Retailers analyze sales, pricing and economic, demographic and weather data to tailor product selections at particular stores and set price markdowns. Shipping companies mine data on truck delivery times and traffic patterns to fine-tune routing.

Online dating services constantly sift through their Web listings of personal characteristics, reactions and communications to improve the algorithms for matching people. Police departments use computerized mapping and analysis of variables like historical arrest patterns, paydays, sporting events, rainfall and holidays to try to predict likely crime "hot spots" and deploy officers there in advance.

In research published last year, Professor Brynjolfsson and two colleagues studied 179 large companies and found that those adopting "data-driven decision making" achieved productivity gains 5 percent to 6 percent higher than other factors could explain.

The predictive power of Big Data shows promise in fields like public health and economic forecasting. Researchers have found a spike in Google search requests for terms like "flu symptoms" and "flu treatments" a couple of weeks before there is an increase in flu patients coming to hospital emergency rooms in a region.
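
One simple way to look for that kind of early signal is to shift one weekly series against another and see which shift lines them up best. The sketch below is only an illustration of the idea, not the researchers' method: the weekly numbers are invented toy values, and the function names are hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(leading, trailing, lag):
    """Correlate leading[t] with trailing[t + lag], with lag in weeks."""
    if lag < 0 or lag >= len(leading):
        raise ValueError("lag must be between 0 and len(leading) - 1")
    if lag == 0:
        return pearson(leading, trailing)
    return pearson(leading[:-lag], trailing[lag:])


def best_lag(leading, trailing, max_lag):
    """Return the lag (0..max_lag) with the highest correlation."""
    return max(range(max_lag + 1),
               key=lambda k: lagged_correlation(leading, trailing, k))


# Toy data, made up for illustration: weekly search counts and
# emergency-room visits that echo the searches two weeks later.
searches = [10, 12, 11, 30, 55, 40, 20, 12, 10, 11]
er_visits = [5, 6, 5, 6, 5.5, 15, 27.5, 20, 10, 6]

print("Best lag in weeks:", best_lag(searches, er_visits, 4))
```

On real data the relationship would be noisier, and a strong lagged correlation alone would not show that searches predict hospital visits reliably.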

Global Pulse, a new initiative by the United Nations, wants to leverage Big Data for global development. The group will analyze social networks and text messages - using natural-language deciphering software - to help predict job losses, spending reductions or disease outbreaks in a given region. The goal is to use digital early-warning signals to guide assistance programs in advance to, for example, prevent a region from slipping back into poverty.

Big Data's models are explanatory simplifications. They are useful for understanding, but they have their limits. A model might spot a correlation and draw a statistical inference that is unfair or discriminatory, affecting the products, bank loans and health insurance a person is offered.

Yet there seems to be no turning back. "The culture has changed," says Andrew Gelman, a statistician and political scientist at Columbia University in New York. "There is this idea that numbers and statistics are interesting and fun. It's cool now."

The New York Times

(China Daily 02/19/2012 page10)