SCAN NEWS 11/04/2025

What is an AI Factory?

AI factories are the next stage in the evolution of the datacentre. While existing datacentres store vast amounts of mission-critical corporate or public sector data, AI factories provide competitive advantage by transforming data into actionable real-time insights.

Instead of data being the product, in an AI factory intelligence is the product. AI factories achieve this by orchestrating AI projects from data preparation through training, fine-tuning and inferencing. The latter two stages are particularly important, as this is where intelligence is made.

In established AI practice, an organisation starts with its own data, develops an AI model, trains that model and uses it to run inference on new data. For example, you feed a model thousands of X-ray images from an image library, train it to detect a particular cancer, and then deploy the model in hospitals to detect that cancer in patient X-rays.

In contrast, an AI factory starts with pre-trained foundation models and applies a feedback loop known as a data flywheel to fine-tune the AI model and extract additional intelligence with each iteration. For example, an AI factory-empowered LLM will generate better text, images, audio or video every time it goes through the data flywheel than a static LLM.
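The data flywheel described above can be sketched as a simple loop: serve the model, gather feedback from its responses, and fine-tune on that feedback before the next pass. The sketch below is purely illustrative; the function names and the dictionary standing in for a model are assumptions, not a real training or serving API.

```python
# Illustrative sketch of a data flywheel loop. All functions are
# stand-ins: a real AI factory would serve a foundation model and
# update its weights, not a dictionary of counters.

def fine_tune(model: dict, feedback: list[str]) -> dict:
    # Stand-in: a real system would update model weights on the new data.
    return {"iterations": model["iterations"] + 1,
            "training_examples": model["training_examples"] + len(feedback)}

def run_inference(model: dict, queries: list[str]) -> list[str]:
    # Stand-in for serving the model to users.
    return [f"answer to {q} (v{model['iterations']})" for q in queries]

def collect_feedback(responses: list[str]) -> list[str]:
    # Stand-in: ratings, corrections or usage logs from production.
    return [r + " [rated]" for r in responses]

# Start from a pre-trained foundation model, then loop the flywheel.
model = {"iterations": 0, "training_examples": 0}
queries = ["query A", "query B"]
for _ in range(3):
    responses = run_inference(model, queries)   # deploy / serve
    feedback = collect_feedback(responses)      # capture real-world signal
    model = fine_tune(model, feedback)          # fold the signal back in

print(model)  # 3 flywheel passes, 6 feedback examples folded back in
```

The point of the loop is that inference output feeds the next round of fine-tuning, which is what distinguishes a flywheel from the train-once, deploy-once pattern described earlier.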

AI Factories Powered by NVIDIA AI Factory Processing

AI factories and their ever-improving intelligence will usher in the upcoming eras of agentic AI and physical AI, where advanced, dynamic reasoning is key to human interaction and responses. For example, an automated chatbot that, with each iteration, is better able to understand the nuances of individual customers and react appropriately.


Such fast iteration is only possible because the AI factory breaks the original data in the foundation models down into numerical tokens. To continue the theme from above, in an agentic LLM short words would be a single token, while multi-syllable words would be broken down into multiple tokens. For example, the word ‘bicycle’ would break down into two tokens: 1 for ‘bi’ and 13 for ‘cycle’, whereas the word ‘tricycle’ would break down into 56 for ‘tri’ and again 13 for ‘cycle’. The same logic applies when the word ‘cycle’ appears on its own, so it would again be represented as 13. The shared numerical value associated with ‘cycle’ helps the model understand the commonality between these words.
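The bicycle/tricycle example above can be sketched as a toy greedy subword tokeniser. The vocabulary and token IDs (1, 13, 56) are the article's illustrative values, not those of any real tokeniser, and the greedy longest-match strategy here is just one simple way to split words into subwords.

```python
# Toy subword tokeniser using the article's illustrative vocabulary.
# The IDs are examples only; real tokenisers learn vocabularies of
# tens of thousands of subwords from data.
VOCAB = {"bi": 1, "cycle": 13, "tri": 56}

def tokenise(word: str) -> list[int]:
    """Greedily split a word into the longest known subwords."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in VOCAB:
                tokens.append(VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no subword covers {word[i:]!r}")
    return tokens

print(tokenise("bicycle"))   # [1, 13]
print(tokenise("tricycle"))  # [56, 13]
print(tokenise("cycle"))     # [13]
```

Because ‘bicycle’, ‘tricycle’ and ‘cycle’ all emit the shared token 13, the model sees their common component directly in the numbers it processes.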

The faster the AI factory can process these tokens, and the more it learns per iteration of the flywheel, the faster the AI agent can respond to customers’ queries, both simple and complex, and the more relevant those responses become.

The Building Blocks of AI Factories

Thus, while the performance of a traditional datacentre is measured in metrics such as bytes per second or FLOPS, the productivity of an AI factory is measured in how quickly it produces tokens. Rapid tokenisation is only possible via a complete hardware and software stack comprising the following building blocks:

  • Powerful GPU-accelerated servers to efficiently process vast amounts of data
  • AI-optimised networking and data storage platforms designed for high throughput and low latency
  • Infrastructure management and workload orchestration software to coordinate AI workflows
  • End-to-end software stack for seamless integration across all components
  • Professional services for deployment, hosting, maintenance and governance

These distinct elements are clustered together into a reference architecture known as NVIDIA DGX SuperPOD, comprising Blackwell-based GB300 NVL72 or GB200 NVL72 GPU servers in a pre-configured rack infrastructure. These servers are complemented by high-throughput, low-latency NVIDIA Spectrum-X Ethernet or Quantum-X InfiniBand networking, alongside a choice of AI-optimised NVIDIA-certified storage and data management platforms.

The hardware is enhanced by the end-to-end NVIDIA AI Enterprise platform providing access to pre-trained frameworks, libraries and microservices, plus NVIDIA Run:ai cluster management software.

Finally, the SuperPOD infrastructure is supported by comprehensive installation, configuration, data science, management, monitoring and hosting services. It is this SuperPOD architecture that is scaled using multiple datacentre racks to create an AI factory.


Scan is an NVIDIA Elite Partner and the only UK-based NVIDIA Managed Solution Provider, making us the ideal trusted advisor for your AI factory or any other AI project. Contact our AI experts on 01204 474210, at [email protected], or fill out the web form.
