OpenAI has unveiled a new tool named GPTBot.
This web crawler is designed to gather data from across the internet, improving the accuracy and capabilities of AI models.
OpenAI says that granting GPTBot access to websites can play a pivotal role in refining AI models’ accuracy, expanding their capabilities, and enhancing safety measures. However, roughly 15% of the world’s top 100 websites have already opted to block GPTBot’s access.
GPTBot’s Impact and Adoption
Originality.AI has released data showing that within a fortnight of GPTBot’s documentation being published, nearly 10% of the world’s top 1,000 websites had chosen to block GPTBot from crawling their content.
Notable sites such as Amazon, Quora, wikiHow, and several international news outlets have taken measures to block GPTBot from their platforms. This raises questions about the potential accuracy and limitations of ChatGPT.
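Sites opt out using the standard robots.txt mechanism. OpenAI’s own documentation specifies the `GPTBot` user agent, so a site that wants to refuse the crawler entirely can publish rules like the following (the two-line blanket disallow shown here is the common form; a site could also restrict only specific paths):

```
User-agent: GPTBot
Disallow: /
```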
The Mechanism Behind GPTBot
GPTBot operates through a structured process that starts with identifying potential data sources. In this step, the tool crawls the web to pinpoint websites containing relevant information. Once an appropriate source is found, GPTBot extracts the relevant data from the identified website.
The collected information is then catalogued in a database and used to train AI models.
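A crawler that respects the opt-out signals discussed above must check a site’s robots.txt before fetching a page. The sketch below is a minimal, hypothetical illustration of that compliance check using Python’s standard-library `urllib.robotparser`; it is not OpenAI’s actual implementation, and the `rules` text is a sample modelled on the opt-out format OpenAI documents.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt body as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Sample rules in the form OpenAI documents for opting out of GPTBot:
rules = """\
User-agent: GPTBot
Disallow: /
"""

print(is_allowed(rules, "GPTBot", "https://example.com/article"))    # False
print(is_allowed(rules, "SomeOtherBot", "https://example.com/article"))  # True
```

A crawler would run this check per URL and skip any page where it returns `False`; agents without a matching `User-agent` group fall through to the default, which is to allow access.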
Versatility in Data Extraction
One of GPTBot’s standout attributes is its ability to extract data from an array of sources spanning text, images, and code. For textual content, GPTBot pulls information from websites, articles, books, and other documents.
Its reach also extends to image-based data, allowing it to identify objects depicted within images and read any text they contain. GPTBot can even extract code from repositories hosted on GitHub and from other code sources across the internet.
The Nexus with AI Models
OpenAI’s flagship product, ChatGPT, and similar generative AI tools are trained on data gathered from websites. Even prominent figures such as Elon Musk have pushed back: he previously intervened to halt OpenAI’s data scraping from Twitter, the social media platform now known as X.
The creation of GPTBot represents a leap forward in AI development. By capturing data from the expansive digital landscape, GPTBot is poised to usher in a new era of AI proficiency.
The decision of some top websites to bar GPTBot’s access underscores the complexities of data usage rights. As OpenAI continues its push toward AI excellence, the interplay between data, innovation, and legal considerations remains a central tension.