Canada needs to close the productivity gap with other countries investing in world-class supercomputers by building a “made-in-Canada” supercomputer, says a Queen’s University researcher and international expert on high-performance computing.
At least one-quarter of the $2 billion the federal government allocated in Budget 2024 for a new AI Compute Access Fund and a Canadian AI Sovereign Compute Strategy should go toward building a “sovereign, secure” supercomputer in Canada, said Dr. Ryan Grant.
“The supercomputer is a productivity machine,” Grant, who has helped design some of the world’s most powerful supercomputers, told Research Money. “It’s a wise investment of taxpayer dollars.”
Grant, who was a research associate professor at Sandia National Laboratories in New Mexico before joining Queen’s University in 2021, is an expert on high-performance networking and power management for extreme scale (“exascale,” or a billion-billion operations per second) computing systems. His research is being used in every currently announced upcoming exascale system around the world.
Grant said that having an actual sovereign compute strategy – as the nations investing in large and powerful supercomputers do – is as important for Canada as having its own Pan-Canadian AI Strategy, which it already has.
Although Canada is recognized internationally for its AI researchers, “if we’d had a Canadian sovereign compute strategy, we could be world leaders in supercomputing as well. We’d own both, we would have been doing both,” he said.
Most countries – especially those at the cutting edge of AI development – use a mix of contracted cloud computing services, supercomputers dedicated to academic research, and “on-premises” domestic exascale supercomputers for industry and national security uses, Grant said. Exascale computers are the biggest and fastest systems in the world.
Canada has five major supercomputers that form part of a national computing infrastructure for open research, coordinated by the Digital Research Alliance of Canada.
These supercomputers are housed at McGill University, Simon Fraser University, University of Victoria, University of Toronto, and University of Waterloo.
As of September 2024, Canada’s most powerful supercomputer ranks 123rd in the world, making Canada the only G7 nation without a supercomputer in the Top 50.
The country doesn’t have its own supercomputer that can serve security-sensitive government departments and organizations, or industries such as defence contractors that hold security-sensitive data they don’t want to entrust to a contracted cloud service, Grant noted.
Cloud services can play a role in providing Canada with sufficient compute power, but they have several disadvantages compared with an on-premises supercomputer, he said.
“I think that cloud services could have a role long term as a supplemental third thing that we do as a country. But to go exclusively cloud brings up all sorts of disadvantages.”
“It’s not the way to fix the road, it’s how to patch a pothole today so we will then have a pothole again in a year,” Grant added.
One of the disadvantages of cloud services is that most of them are currently operated by foreign companies, such as Amazon, Google and Microsoft in the U.S.
“They are there to profit. They are not there to make sure everything is running as efficiently as possible in the system,” or to teach people how to use the system and get their data inputted and AI models up and running, Grant said.
Contracting cloud systems operated by providers in other countries also means that money flows out of Canada, often to a foreign multinational, he noted.
One of the biggest problems with cloud services is that a company’s cloud system is set up to lock people into a specific cloud – like a cellphone contract, Grant said. “You will have to actually build your product around that lock-in.”
“And once you’re locked into that technology, extracting yourself is really, really expensive. You have to rebuild your software, essentially.”
Also, the cloud provider can choose to raise its rates whenever it wants, or to not offer the service somewhere or offer it to someone else, Grant said. “If they’re not going to turn a profit, they’re not going to do it [make their cloud service available].”
Using a cloud service also means you don’t physically own your data or control where it’s located, and you can’t prevent the cloud service from moving your data outside of Canada, he said.
If you are using data from a public system (for example, health data on Canadians) or highly security-sensitive data, such as that held by the Department of National Defence or Canada’s nuclear industry, once that data crosses the border and perhaps gets stolen, it is gone for good, he said. “To me, we’re not sovereign if we’re relying on a foreign provider, because that foreign provider can go ‘poof’ or have pressure put on them.”
Other countries, such as the U.S., U.K., Japan and EU countries, spend billions of dollars every year to have their own on-site supercomputers, because they recognize the importance and benefits of doing so, he said.
Built-in-Canada supercomputer offers multiple benefits
Compared with cloud services, Canada building and owning its own high-performance, on-premises supercomputer offers multiple advantages and benefits, Grant said. “The cloud is like renting your house. On-premises is like buying your house.”
Data owned by Canadian governments, industries and organizations stays in Canada, where you know where it is and you can take steps to ensure its security.
“In that sense, we are in charge of our own destiny,” he said. “We have the only set of keys to the ‘house.’”
Canada’s supercomputer can be designed to be flexible, so it can be used by companies and organizations with data that isn’t security-sensitive and those with data that is, Grant said. “Lots of different businesses can be using this supercomputer, on schedules, without renting a thousand GPUs in the cloud.”
The supercomputer also can be set up with different security levels to accommodate the Department of National Defence and other agencies that have highly security-sensitive data, or to be used during times of national emergencies.
Having a made-in-Canada supercomputer at a newly created national supercomputing centre would enable Canadians to develop the expertise to build and operate such systems. Canadian AI developers and other companies in Canada can learn how to use the system to demonstrate the viability of their products at small scale, then scale them up and export them.
“It leverages our existing national AI strategy and keeps people here,” Grant said. “We’ve got people that are actually employed to help business grow and use computing, rather than people who just want to make money off of those people using that computing.”
Studies around the world show supercomputers offer huge returns on investment and are job-creating and talent-attracting machines, Grant said.
Canada also needs its own supercomputer to create “digital twins” of major equipment, infrastructure and new technologies, such as new submarines or small modular nuclear reactors, to work out problems on the digital twin before the actual things are built. Digital twins also are used in military and national disaster planning.
“If we’re going to get past the technologies of today, to the technologies of tomorrow that really solve all the problems, we need digital twins and digital twins at very large scale,” Grant said.
Learning how to build world-class supercomputers at exascale is crucial, because machines at this scale come with inherent problems that must be recognized and addressed, he noted.
This includes preventing “heat bubbles” from forming in the supercomputer’s liquid cooling system and preventing sudden system and power crashes that can trigger blackouts on electricity grids, because an exascale supercomputer uses so much power.
“We’re not prepared to be solving these problems, because we just haven’t built something big enough before to know it was a problem,” Grant said.
Many countries have supercomputers that operate on a “Go-Co” model, where the government owns the supercomputer and a nonprofit or non-corporate contractor runs it.
A modified Go-Co model, where government invests at the beginning to kick start the project, and then industry and other users are financially supporting the supercomputer’s operation, could be very effective in Canada, Grant said.
Everybody that uses it pays basically only the cost of the supercomputer’s operation, he added, with a little extra tacked on as an investment in the next new supercomputer so that the system is continuously updated and state-of-the-art.
Instead of money flowing out of Canada to foreign cloud services providers, taxpayers’ dollars would stay in Canada and be invested in the country’s own supercomputer and future versions.
“It’s cheaper than if they [users] owned their own system,” Grant said. “If everybody’s pooling their money together, they get bigger scale, they get a better ‘car’ because everybody’s chipped in on it, and it’s always kept up to date.”
Implementing a modified Go-Co model also would create an organization responsible for engaging Canadians about all aspects of supercomputing, including how to save money on the next one and planning for future technologies.
Grant said Canada’s supercomputer needs to be located in a place that is close enough for different users to connect to it and travel to it when they need to, that attracts talent, and that has sufficient power to run the supercomputer.
Grant has assembled a team of supercomputing researchers at Queen’s University, at the Computing at Extreme Scale Advanced Research lab. Composed of 18 experts, with another five joining this fall, his lab has quickly become one of the largest in the world focused on supercomputing architecture, and specifically software for supercomputing networks.
Canada could build a world-class supercomputer for $500 million or, if the budget were tight, at least an initial system for $300 million that could then be expanded, Grant said.
“I think that once we build one, and we start getting it up and running and everybody sees the benefits of it, there will be more interest in starting up more than one.”
Public-private partnership could build Canada’s supercomputer
Innovation, Science and Economic Development Canada conducted a consultation process on how to spend the $2 billion to boost Canada’s sovereign AI data processing capacity.
Queen’s University, in its submission, A Roadmap for Sovereign, Secure Supercomputing and AI Exascale Infrastructure, proposes that a supercomputer project be built on campus through a public-private partnership. The project would be built in two phases, a “preview” phase and a “full system” phase.
Phase 1 involves the implementation of a preview system with 256 modern graphical processing units, or GPUs (the core of a supercomputer and critical for developing advanced AI models), to establish: infrastructure, training, processes, a business model, system architecture and performance standards.
Phase 2 expands to a full system with 2,048 modern GPUs, with scaling capabilities to meet demands and support research and industry applications.
The project aims to provide AI exascale computer cycle access to foster research and innovation, provide training and support sectors such as the life sciences, energy, export-controlled industries and government/defense that require sovereign secure compute, Queen’s submission says.
“By fostering collaboration across sectors, the project will train, attract and retain personnel in AI and AI Exascale Computers. It will support AI research across Canada by providing last-mile training (e.g., surge capacity) for AI models and serve as a testbed for assessing the feasibility of future hardware and software. In times of crisis, the Full System could be used for emergency preparedness or disaster relief.”
The Preview and Full System phases will integrate into Canada’s computing ecosystem, serving as a bridge between existing and planned academic and industry systems, the submission says.
“This integration will foster an environment for students, researchers and industry professionals to converge and mobilize innovations. The Full System will close the gap in highly secure AI exascale computing relative to the proportions of availability in the United States,” where sovereign, secure compute accounts for more than 30 percent of overall U.S. supercomputing capacity.
Queen’s says in positioning its Preview and Full systems, the university consulted with more than 70 stakeholders, including AI institutes, universities, technical experts, venture capital firms, government, government labs (in both the U.S. and Canada), multinational industries, startups/SMEs and high-performance computing vendors.
“These discussions revealed a significant gap in AI Exascale Computer hardware and software expertise within Canada, essential for managing on-premises systems or optimizing cloud-based solutions.”
Additionally, the submission says, stakeholders face challenges accessing suitable GPUs in the cloud, with availability being limited and costs prohibitively high. Some AI algorithms and models cannot be run outside of Canada for security reasons, “further emphasizing the need for domestic infrastructure.”
Because of limited, almost non-existent modern compute resources with advanced AI capabilities, Canada is slowing new product development and the adoption of AI in key industries, from life sciences, mining and advanced manufacturing to clean energy and ag-tech, the submission notes.
Queen’s submission argues that having a sovereign, secure, built-in-Canada supercomputer will have a “transformational” impact on the Canadian economy.
“Expanding AI Exascale Computer capacity would enable our investments in AI Exascale Computing and AI talent to move these skills into industries that build Canadian prosperity and bolster productivity.”
In a separate submission to the government, Universities Canada, which represents 96 institutions across Canada, also points out the need for “secure and sovereign” AI capabilities, and the limitations of contracted cloud services.
“Developing secure and sovereign artificial intelligence capabilities is crucial for Canada's future economic competitiveness and national security,” Universities Canada’s submission says. “By investing in homegrown AI infrastructure, Canada can reduce dependence of critical infrastructure and its economy on a handful of foreign technologies.”
Sovereign AI allows Canada to implement robust security and privacy safeguards, protect sensitive information, and maintain strategic advantages, the submission says. Sovereign AI also positions Canada as a leader in ethical AI development, allowing the country to shape global norms and standards to uphold Canadian values.
“As AI plays an increasingly crucial role across industries and society, it is imperative that the [federal government’s] AI Compute Access Fund considers long-term sovereignty of the infrastructure,” the submission says.
Universities Canada said it has heard from several members who’ve indicated that federal support for cloud computing credits would help address an immediate need for compute before new infrastructure is built.
However, these platforms often create proprietary dependencies and may also lead to challenges in ensuring that sensitive data respects Canadian privacy and security standards, the submission notes.
Should the fund include support in the form of cloud credits, the primary role of the AI Compute Access Fund should still be to reduce dependence on foreign compute technologies by increasing access to Canadian infrastructure, the submission says.
“Universities Canada therefore recommends that the fund prioritizes the development of infrastructure that is anchored at national facilities and public institutions to provide long-term resilience and assurances that the infrastructure will remain in Canadian hands and benefit from public accountability.”
Policies needed to commercialize and adopt AI innovation
The Council of Canadian Innovators (CCI) also recommends that the federal funding for AI infrastructure be used “to build sovereign compute capacity to protect sensitive data and critical industries.”
However, compute power – the basic equipment required for AI model development and deployment – is not the primary challenge facing most Canadian companies, Nick Schivao, the CCI’s director of federal affairs, said in a statement.
“Instead, the real obstacles lie in the commercialization of AI innovation and creating an ecosystem that encourages Canadian industry to adopt AI technologies,” he said.
To enhance Canada’s competitiveness, the federal government must focus its funding for AI computing infrastructure on domestic scaling firms, widespread adoption and pathways to move from research to commercialization, Schivao said.
One of Canada’s biggest areas for improvement is the commercialization of intellectual property, he said.
Despite a 57-percent increase in AI patent filings in 2022-2023, 75 percent of the patents generated through the Pan-Canadian Artificial Intelligence Strategy are owned by foreign firms. “This means that while Canadians are making great strides in research and invention, the economic benefits are being exported to other countries.”
To prevent further IP loss, the government must create the right conditions for scaling AI companies, focusing on IP management and commercialization, Schivao said. Canadian firms should be incentivized to retain ownership of their innovations and find pathways to bring those innovations to market.
Moreover, domestic adoption of AI technologies is critical, and public sector procurement should play a significant role here, he said.
The CCI recommends to the federal government that it: