Google to Reward Facts Over Fans
LOS ANGELES — Adult search engine marketers trying to keep pace with Google’s algorithm changes may have a new hurdle to overcome as the company reportedly explores a new fact-based website ranking system.
“It must be true, I saw it on the Internet...”
This tongue-in-cheek joke is uttered in many variations, the common denominator being that the Internet is not the most trustworthy source of information, even though many users (especially younger ones, who consider comedian Jon Stewart’s “The Daily Show” a legitimate news source) rely on it as their primary (or only) source of news and other information.
To help address the glut of misinformation on the Internet by raising the discoverability of honest sites, Google is targeting truthfulness through an advanced algorithm that rewards a web page’s verity over its popularity, using a new “Knowledge-Based Trust” score.
“The software works by tapping into the Knowledge Vault, the vast store of facts that Google has pulled off the Internet. Facts the web unanimously agrees on are considered a reasonable proxy for truth,” Hal Hodson wrote for NewScientist.com. “Web pages that contain contradictory information are bumped down the rankings.”
This is a major shift from ranking a site based on the number of inbound links to it from other sites and social media channels. The old model assumes that better quality sites will be more widely linked to, and thus more relevant to targeted keyword searches. While that is an acceptable strategy in a perfect world, the process is heavily tainted by salesmen spamming the search engine with purchased links, paid plugs and other non-genuine expressions of approval intended to skew the search results page, driving visitors to lower quality information and potentially fraudulent offers.
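To see why a purely link-based model is easy to game, consider a minimal sketch in Python. It is an illustration only, not Google's ranking code: the function name, the page names and the link data are all hypothetical, and real link analysis weighs links rather than simply counting them.

```python
from collections import Counter

def rank_by_inbound_links(links):
    """Rank pages by how many distinct pages link to them (most links first)."""
    # set() drops repeated (source, target) pairs so each link counts once.
    counts = Counter(target for _source, target in set(links))
    # Sort by link count descending, then by name so ties are stable.
    return sorted(counts, key=lambda page: (-counts[page], page))

# Hypothetical link graph: (linking page, linked page).
organic = [("blog-a", "good-site"), ("blog-b", "good-site"), ("blog-c", "spam-site")]
# Hypothetical purchased links pointing at the lower quality site.
purchased = [(f"link-farm-{i}", "spam-site") for i in range(5)]

before = rank_by_inbound_links(organic)
after = rank_by_inbound_links(organic + purchased)
print(before)
print(after)
```

In this toy data the better site leads on organic links alone, but five purchased links are enough to push the other site to the top.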
If Google has its way, however, every incorrect fact will push a page further down its rankings.
According to researcher Xin Luna Dong, the quality of web sources has been traditionally evaluated using exogenous signals such as hyperlink structures.
“We propose a new approach that relies on endogenous signals, namely, the correctness of factual information provided by the source,” Dong declares, explaining that a source with few false facts would be considered trustworthy.
To test a site’s verbal verity, facts are automatically culled from each source by information extraction methods that are commonly used to construct knowledge bases.
“We propose a way to distinguish errors made in the extraction process from factual errors in the web source per se, by using joint inference in a novel multi-layer probabilistic model,” Dong stated. “We call the trustworthiness score we computed Knowledge-Based Trust (KBT).”
“On synthetic data, we show that our method can reliably compute the true trustworthiness levels of the sources. We then apply it to a database of 2.8B facts extracted from the web, and thereby estimate the trustworthiness of 119M web pages,” Dong adds, noting that “Manual evaluation of a subset of the results confirms the effectiveness of the method.”
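The core idea, that a source with few false facts earns a higher score, can be illustrated with a deliberately simplified sketch. This is not the multi-layer probabilistic model Dong describes and it ignores extraction errors entirely. It assumes facts arrive as (subject, predicate, value) triples and that a reference knowledge base is available as a plain dictionary; the function name and the sample data are hypothetical.

```python
def trust_score(page_facts, knowledge_base):
    """Return the fraction of checkable facts on a page that match the reference.

    Simplification: a fact is "correct" only if it equals the reference value
    exactly. Facts with no reference entry are skipped rather than penalized.
    """
    checked = 0
    correct = 0
    for subject, predicate, value in page_facts:
        expected = knowledge_base.get((subject, predicate))
        if expected is None:
            continue  # nothing to compare against, so this fact is not judged
        checked += 1
        if value == expected:
            correct += 1
    # A page with no checkable facts gets no score at all.
    return correct / checked if checked else None

# Hypothetical reference data and extracted page facts.
knowledge_base = {
    ("Paris", "capital_of"): "France",
    ("water", "boils_at_celsius"): "100",
}
page_facts = [
    ("Paris", "capital_of", "France"),     # agrees with the reference
    ("water", "boils_at_celsius", "90"),   # contradicts the reference
    ("Mars", "moon_count", "2"),           # not in the reference, skipped
]

score = trust_score(page_facts, knowledge_base)
print(score)
```

Here one of the two checkable facts is wrong, so the page scores 0.5. The approach described in the quotes goes further by estimating whether a mismatch comes from the page itself or from the extraction step.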
Such a shift would have a far-reaching impact on adult website operators and affiliate marketers who may have played fast and loose with the facts, whether they are aware of it or not. For example, the common use of spinning syntax (Spintax) technology to generate unique text descriptions as a strategy for boosting search engine rankings may backfire if the regurgitated text is no longer factual, even if it was initially.
One possible scenario would be a site that spins model bios, such as a cam, pay or tube site that randomizes data or uses similar (but not identical) terms, for example describing a model as having “straw colored hair” when the “official” record lists her as “blond.” Depending on how relational a fact-checking database is, the more creative site, although using factual data, may be penalized because its information does not match the official record.
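The hair color scenario can be sketched in a few lines of Python. The code expands a flat Spintax template of the common `{option|option}` form into every variant, then applies a naive exact-match check against a reference value. Both the `spin_all` helper and the exact-match check are assumptions made for illustration; how a real fact checker would treat synonyms is not known.

```python
import itertools
import re

def spin_all(template):
    """Return every variant of a flat (non-nested) Spintax template."""
    # Splitting on a capturing group keeps the option groups in the result:
    # even positions are literal text, odd positions are "a|b|c" groups.
    parts = re.split(r"\{([^{}]*)\}", template)
    choices = [part.split("|") if i % 2 else [part] for i, part in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]

# Hypothetical "official" record and spun bio text.
official_hair_color = "blond"
variants = spin_all("{blond|straw colored} hair")

# Naive check: a variant "agrees" only if it contains the official term verbatim.
mismatched = [v for v in variants if official_hair_color not in v]
print(variants)
print(mismatched)
```

Under this strict comparison the “straw colored hair” variant is flagged even though it describes the same thing, which is the risk the scenario above points to. A checker that understood synonyms would not flag it.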
An interesting side note on this is that rather than pruning duplicate content from the Internet (another laudable Google goal), the use of truth as a ranking factor is likely to increase the amount of duplicate content online.
Another problem area is in the potential squelching of dissenting or original thought.
For example, debates over abortion and when life begins, man’s influence on global climate change, or the curative qualities of cannabis could all be skewed if legitimate, relevant discourse is downgraded (and thus made less discoverable) for not following the party line or accepted norms.
Although experts agree that Google will not be implementing a truth-based solution any time soon, it is clearly on the company’s agenda for the future, a future where commonly accepted “facts” may be the only information that Internet users are exposed to. It is a scary thought for historians who recall a past when the “fact” that the world is flat was well accepted, and those expressing contrary opinions were branded as heretics and rewarded with torture and death.
In the world of the future, it may be your traffic stats that are torturous to look at, and the death of your website, the expression of your unique thoughts, that marks the ultimate suppression of creativity and dissent (unless you tell the truth as Google sees it).