Kangaroo LLM Launches Massive Web Crawl to Build Australia’s First Open-Source AI Model

Kangaroo LLM

The Kangaroo LLM project today announced the launch of an extensive web crawling initiative to create Australia's first open-source artificial intelligence model. The project's custom web crawler, "Kangaroo Bot," will begin collecting data from 754,000 Australian websites on September 25th, 2024, to build the VegeMighty dataset, a comprehensive corpus of Australian English content.

With over 4.2 million registered domains in Australia, this initial phase represents a significant step towards developing an AI model that genuinely understands and represents Australian language and culture.

"This initiative marks a pivotal moment in Australia's AI journey," said Vinod Bijlani, AI Practice Leader at Hewlett Packard Enterprise (HPE) and a key partner in the Kangaroo LLM consortium. "By ethically harvesting data from 754,000 websites in this first phase, we're laying the groundwork for an AI that will not only understand Australian English but will also grasp the nuances of our diverse digital landscape. This is more than just data collection; it's about capturing the essence of Australian online communication and culture."

Key aspects of the web crawling initiative include:

  1. Extensive Scope: Targeting 754,000 Australian websites in the first phase to create a diverse and comprehensive dataset.
  2. Ethical Data Collection: Adhering to responsible web crawling practices and respecting website owners' preferences.
  3. Transparency: Commitment to publishing the full list of websites to be crawled, fostering trust and open dialogue.
  4. Data Sovereignty: All collected data will be processed and stored within Australia, ensuring compliance with national regulations.
  5. Immediate Commencement: Web crawling will begin on September 25th, 2024.

The Kangaroo LLM project is committed to responsible data collection. Website owners who wish to opt out of the Kangaroo Bot crawl can do so by adding the following to their robots.txt file:

User-agent: Kangaroo Bot
Disallow: /
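
Website owners who want to confirm that the rule behaves as intended can test it locally before publishing it. The following is a minimal sketch using Python's standard urllib.robotparser module. It assumes the crawler identifies itself with exactly the string "Kangaroo Bot" shown above; the helper name is_allowed and the example.com.au URLs are placeholders for illustration and do not come from the project.

```python
from urllib.robotparser import RobotFileParser

# Opt-out rule copied from the announcement above.
ROBOTS_TXT = """\
User-agent: Kangaroo Bot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; it only parses the text it is
    given and makes no network requests.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com.au is a placeholder domain, not a site from the crawl list.
    page = "https://example.com.au/about"
    print("Kangaroo Bot allowed:", is_allowed(ROBOTS_TXT, "Kangaroo Bot", page))
    print("Other crawler allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

With the rule in place, the check should report that Kangaroo Bot is blocked from every path while crawlers with other user-agent names are unaffected. This only shows how one standard parser interprets the rule; how Kangaroo Bot itself reads robots.txt is up to the project.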

"This extensive data collection effort is not just about creating an AI model; it's about building a foundation for Australia's AI future," Bijlani added. "We're inviting all Australians to be part of this groundbreaking journey, whether by allowing us to include their sites in our dataset or by following our progress."

The Kangaroo LLM consortium, which includes industry leaders such as Katonic, RackCorp, NextDC, Hitachi Vantara, and HPE, views this initiative as a crucial step towards establishing Australia as a leader in ethical AI development.

For more information about Kangaroo LLM, the web crawling initiative, or to check if your website is included in the crawl list, visit kangaroollm.com.au.

About Kangaroo LLM: Kangaroo LLM is a collaborative project to create Australia's first open-source large language model, specifically tailored for Australian English. Led by a consortium of leading Australian tech companies, the project aims to enhance AI sovereignty, foster innovation, and create new economic opportunities in the Australian tech sector.
