A leak earlier this month comprising thousands of pages of Google’s internal documents offered an unprecedented glimpse into how the Google Search algorithm operates, according to a report in The Verge. The leak suggested that the company may not have been entirely transparent about its processes.
Google on May 30 confirmed the authenticity of these documents in a statement to the publication. “We would caution against making inaccurate assumptions about Search based on out-of-context, outdated, or incomplete information. We’ve shared extensive information about how Search works and the types of factors that our systems weigh while also working to protect the integrity of our results from manipulation,” Google spokesperson Davis Thompson said.
Notably, Google’s search algorithm is influential in determining which websites thrive and which falter. However, the specifics have remained a mystery, with journalists, researchers, and SEO professionals piecing together what they can.
Insider Revelations
In an article on May 27, SEO expert Rand Fishkin said he received 2,500 pages of these leaked documents from a source who hoped it would counteract misleading information previously shared by Google employees about their search algorithm.
Fishkin and fellow SEO expert Mike King analysed the leaked documents in a blog post titled, ‘Leak source: An anonymous source shared thousands of leaked Google search API documents with me; everyone in SEO should see them’.
Fishkin said the documents detail Google’s search API and outline the information available to its employees. The technical content is dense and more readily understood by developers and SEO experts than by general readers, according to The Verge report.
Fishkin said that while the documents don’t directly prove that Google uses all the mentioned data and signals for search rankings, they do indicate what data Google collects from websites and searchers. As Mike King noted in his analysis of the leak, this information provides indirect clues about Google’s priorities.
Key Insights from the Leak
The leaked documents cover topics such as the types of data Google collects, how it elevates certain sites on sensitive topics such as elections, and its handling of small websites. As per the report, some details in the documents contradict public statements made by Google representatives.
“‘Lied’ is harsh, but it’s the only accurate word to use here. While I don’t necessarily fault Google’s public representatives for protecting their proprietary information, I do take issue with their efforts to actively discredit people in the marketing, tech, and journalism worlds who have presented reproducible discoveries,” King said.
Google did not initially respond to The Verge’s requests for comment on these documents. Fishkin mentioned that Google has not disputed the validity of the leak but did ask for changes in the language used to describe certain events.
SEO Industry Impact
Google’s secretive algorithm has created an entire industry of marketers who meticulously follow Google’s public guidelines to optimise websites for better rankings, the Verge report said. It added that this has led to widespread criticism that Google Search results are becoming cluttered with low-quality content created to meet these guidelines. In response to these criticisms, Google often defends its position by citing its guidelines.
The leaked documents call into question the accuracy of Google’s public statements about its search operations. For example, King noted that Google has stated that it doesn’t use Chrome data for ranking pages, but the documents suggest otherwise: they mention Chrome in sections discussing how websites appear in Search.
Another debated topic is the role of E-E-A-T (experience, expertise, authoritativeness, and trustworthiness) in rankings. Although Google has claimed E-E-A-T isn’t a ranking factor, the documents show that Google tracks author data, which suggests authorship may have some bearing on rankings. However, The Verge report said Google maintains that author bylines are meant for readers, not to influence rankings.
Looking Ahead
While the leaked documents don’t provide conclusive evidence of all of Google’s practices, they offer a rare, in-depth look at its search algorithm. The ongoing US antitrust case against Google, focusing on Search, has also led to the release of internal documents, shedding more light on Google’s operations.
Google’s lack of transparency about its algorithm has led to uniformity in website content, driven by SEO marketers attempting to decode Google’s hints. In his post, Fishkin criticised publications for uncritically accepting Google’s statements, urging them to scrutinise the company’s claims more rigorously.
“Historically, some of the search industry’s loudest voices and most prolific publishers have been happy to uncritically repeat Google’s public statements. They write headlines like ‘Google says XYZ is true,’ rather than ‘Google Claims XYZ; Evidence Suggests Otherwise’. Please, do better. If this leak and the DOJ trial can create just one change, I hope this is it,” Fishkin said.
Source: Google Search’s elusive algorithm revealed in leaked documents? Details here