It's important to look beyond the face value of data, which can sometimes be misleading, said astrophysicist Neil deGrasse Tyson, speaking at a Collibra conference Thursday.
Sometimes, accurate data gives false information, so good data leads to wrong conclusions. That was the message from famed astrophysicist Neil deGrasse Tyson, the director of the Hayden Planetarium and also an author and TV show host.
deGrasse Tyson made his comments in a talk on data through the lens of an astrophysicist at Collibra's Data Citizens '21 event Thursday.
Scientists are increasingly able to gather data thanks to telescopes, which enable us to think about the structure of the galaxy, deGrasse Tyson said. Now in production is the Large Synoptic Survey Telescope, which will have a 3.2-gigapixel camera, process 20 terabytes of data per night and run for 10 years, he said. By the end of its life, it will have produced 500 petabytes of data.
"It's all about big data. We've always cared about data," and now we have the ability to process it with what will be one of the most powerful telescopes in the world, he said. "The universe is not getting smaller and more powerful telescopes enable us to see farther into the galaxy."
The telescope will take 15-second exposures every 20 seconds, which will create the first-ever "movie" of the universe, deGrasse Tyson said. That process will repeat every three days and help scientists map dark matter in the universe.
However, scientists must be cognizant of when the data leads them astray and produces "accurate data that's misleading," he said. deGrasse Tyson punctuated his talk with several examples of how things are not always as they appear and how data can be misinterpreted when looking at frame rates.
In one instance, he showed a video portraying a helicopter that looks as though it is gliding downward slowly but is actually going much faster. "Watch out for accurate data that is incomplete and misleads you," he warned.
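One common way a frame rate can produce accurate but misleading data is temporal aliasing: a camera that samples a repeating motion too slowly records each frame correctly, yet the sequence suggests a different speed. The sketch below is illustrative only. Whether this is the exact effect in the video shown at the talk is an assumption, and the function name and the numbers are hypothetical.

```python
def apparent_rate(true_rev_per_s: float, frames_per_s: float) -> float:
    """Rotation rate a viewer would infer from sampled frames.

    Folds the true rate into the range [-fps/2, fps/2), which is all
    that frame-by-frame sampling can distinguish. Negative values mean
    the motion appears to run backward.
    """
    half = frames_per_s / 2
    return (true_rev_per_s + half) % frames_per_s - half


# Hypothetical numbers: a rotor filmed at 24 frames per second.
print(apparent_rate(24.0, 24.0))   # rotor turns once per frame: looks frozen
print(apparent_rate(25.5, 24.0))   # looks like a slow forward turn
print(apparent_rate(23.0, 24.0))   # looks like a slow backward turn
```

Every frame in such a video is an accurate record. The misleading part is what is missing between the frames.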
Time is not always as it seems
Then there is the issue of precise data that is false. deGrasse Tyson said his favorite question is when someone asks what time it is, and a person answers that it is the year 2021. "You'll say, 'That's not what I meant.' You'll say, 'That's accurate, but it's not precise.'"
If a person answers that it is 12 hours, 24 minutes and 309.74 seconds, that seems to be really precise, "but as soon as you utter that it's not precise. It's kind of useless precision" because time changes so rapidly.
He mentioned seeing a clock in New York City's Times Square in the 1970s that claimed to be accurate to a tenth of a second, but he later found out it was off by five minutes.
The point, deGrasse Tyson said, is "you can have bad data and not know it's bad data. Then you base all subsequent decisions on bad data."
He went into detail about the discovery of some of the planets in the solar system, including issues with Newton's laws and bad coordinates.
He presented the examples as an "appeal to empathy" and urged listeners to stay open to other possibilities "because when you're on the frontier you just don't know … maybe data is good, maybe it's bad or maybe your idea of how to interpret it is bad."
Sometimes, accurate but imprecise data can hide phenomena, deGrasse Tyson said. For example, there are 86,400 seconds in a day based on 60 seconds per minute × 60 minutes per hour × 24 hours per day. This was defined for the year 1900.
But then we went to atomic time in 1972, which produced a measurement of 86,400.003 seconds in a day. While many would have thought 86,400 was precise enough, the small difference can have consequences, such as in calculating the time it takes the Earth to rotate.
If the figure for the Earth's rotation is incorrect, then a predicted solar eclipse would be in the wrong place, deGrasse Tyson said. Since 1972, 27 leap seconds have been added to produce more accurate time. "Better data got us there," he said.
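The arithmetic behind the day-length example can be sketched in a few lines. The snippet takes the figures quoted in the talk and assumes, purely for illustration, that the 0.003-second daily excess is constant. In reality the Earth's rotation varies and leap seconds are scheduled from observation, so this does not predict the actual leap-second count. The names are hypothetical.

```python
SECONDS_PER_MINUTE = 60
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24

# Nominal day length: 60 x 60 x 24 seconds.
NOMINAL_DAY_S = SECONDS_PER_MINUTE * MINUTES_PER_HOUR * HOURS_PER_DAY

# Daily excess quoted in the talk (86,400.003 - 86,400), assumed constant here.
EXCESS_PER_DAY_S = 0.003


def drift_after(days: float) -> float:
    """Seconds by which a clock counting nominal days falls out of step."""
    return days * EXCESS_PER_DAY_S


def days_until_drift(threshold_s: float) -> float:
    """Days needed for the accumulated drift to reach threshold_s seconds."""
    return threshold_s / EXCESS_PER_DAY_S


print(NOMINAL_DAY_S)                 # nominal seconds in a day
print(drift_after(365))              # drift accumulated over one year
print(days_until_drift(1.0))         # days until the clocks differ by a full second
```

Under that assumption, an error in the third decimal place adds up to a full second in less than a year, which is why the extra precision matters.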
Another interesting example he cited was when there are two accurate but conflicting sets of data. In an NFL game between the Seattle Seahawks and the Philadelphia Eagles, for example, there was a "Galilean transformation" toss where the ball went backward, deGrasse Tyson said. The quarterback pitched the ball to his running back in a backward toss. But because both players were running so fast, the Eagles called it a forward pass.
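A Galilean transformation simply adds velocities when switching reference frames, which is how the same toss can be backward for the players and forward for the field. The sketch below uses made-up speeds, not measurements from the game, and the function name is hypothetical.

```python
def velocity_in_field_frame(player_velocity: float,
                            ball_velocity_rel_player: float) -> float:
    """Galilean velocity addition along the direction of play (m/s).

    Positive values point downfield, negative values point backward.
    """
    return player_velocity + ball_velocity_rel_player


def direction(velocity: float) -> str:
    """Label a velocity as forward, backward or stationary."""
    if velocity > 0:
        return "forward"
    if velocity < 0:
        return "backward"
    return "stationary"


# Assumed values: the passer runs downfield at 8 m/s and tosses the
# ball backward at 3 m/s relative to himself.
runner_speed = 8.0
toss_rel_player = -3.0

toss_rel_field = velocity_in_field_frame(runner_speed, toss_rel_player)

print(direction(toss_rel_player))  # as the two players see it
print(direction(toss_rel_field))   # as the field sees it
```

Both readings are accurate. They differ only in the reference frame chosen to interpret the same motion.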
"Technically, it is, as seen in the reference frame on the field, but it's clearly a backward pass between the two players," deGrasse Tyson said. He opined about it on Twitter, saying that there were two authentic pieces of data—but it is all in how you interpret it.
"The arc of enlightenment in a rational, civilized world starts with data, but it's dangling and you want to understand your data so you can convert it into facts. But facts alone aren't themselves insightful."
Information gets processed and the next level is knowledge, where someone has awareness of a phenomenon for which they want to produce insights. "At the end of the factory line of data processing analysis and understanding, you want wisdom as the ultimate product," he said.
Bias in data
Tifenn Dano Kwan, CMO at Collibra, asked how data can mislead people, especially when it is accurate. That leads to trust issues, she said.
"That's a whole other cog in the wheel of what can go wrong with data," deGrasse Tyson said. Bias in facial recognition software has been documented, and there can be issues with any kind of data collected about people that may contain the bias of the programmer—or the organization funding the data.
While acknowledging that he doesn't have an "airtight answer," deGrasse Tyson advised people to experiment with data and ask themselves such questions as, "Is it capturing what the user sought?"
Or, "is it capturing something you didn't seek but should be paying attention to?" There should be a room full of people who query the data and ask what it means and whether it reveals what they expected. Another question to consider: if the same experiment is run again, will it produce the same data set?
"Get someone else to look at it," deGrasse Tyson said. "Someone else who doesn't look like you. Then you can query the data and characterize it and move forward, and get different data to supplement it. Then you inch your way forward."
He was also asked what the future may hold for the field of astrophysics.
"One of my fears is we'll acquire so much data we'll be awash in the data and can no longer justify obtaining data on anything because we still have unanalyzed data waiting for your attention," deGrasse Tyson said. "I hope that day doesn't come."