
Training Neural Networks using MetaCentrum

Posted on May 26, 2025 • 3 min read • 579 words
Jan Pospíšil   User Experience   AI   FR CESNET   ZCU  

Integrated Monitoring via Weights & Biases


Training Neural Networks using MetaCentrum with Integrated Monitoring via Weights & Biases  

💡 Note This blog post is brought to you directly by our users as part of our ongoing effort to share knowledge and real-world experience across the community. The content you’re about to read is the result of a project supported through the FR CESNET (Development Fund).

Training deep learning models is not just about raw compute – it is about observability, reproducibility, and efficiency. When you run neural network experiments at scale on MetaCentrum, keeping track of performance metrics, hyperparameters, and model versions can quickly get out of hand. That is where Weights & Biases (WandB) comes in. In this post, we will show you how to combine MetaCentrum’s HPC resources with WandB’s powerful experiment tracking to streamline your deep learning workflows.

What is Weights & Biases?  

Weights & Biases (WandB) is a platform designed to help machine learning practitioners track their experiments, visualize results, and collaborate more effectively.

It provides tools for logging hyperparameters, metrics, and artefacts, making it easier to reproduce experiments and share results with your team.

Setting Up WandB  

To get started with WandB on MetaCentrum, follow these general steps for integrating WandB into workflows and training scripts:

  1. Install WandB: First, ensure you have the WandB library installed in your Python environment. You can do this using pip:
pip install wandb
  2. Initialize WandB: In your training script, initialize WandB with your project name and any relevant configurations:
import wandb

# Simple example of initializing WandB
wandb.init(project="your_project_name", entity="your_wandb_entity")

# Alternative to the simple call above (use one or the other): initialization with a run name, tags, and hyperparameters
wandb.init(
	name="experiment_name",
	project="your_project_name",
	entity="your_wandb_entity",
	tags=["tag1", "tag2"],
	config={
		"learning_rate": 0.001,
		"batch_size": 32,
		"epochs": 10
	}
)
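  3. Log Variables: Use WandB to log metrics during training, along with anything else you want to track, such as model weights or gradients. Here is a simple example of how to log loss and accuracy:
num_epochs = 10  # keep in sync with the "epochs" value passed to wandb.init

for epoch in range(num_epochs):
	
	# Training code here ...
	
	loss = compute_loss()          # placeholder: your own loss computation
	accuracy = compute_accuracy()  # placeholder: your own evaluation
	
	# Log the metrics of this epoch to the current WandB run
	wandb.log({"loss": loss, "accuracy": accuracy})

# Mark the run as finished once training is done
wandb.finish()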
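The three steps above can be combined into one small script. The following is a minimal sketch, not the author's training code: the toy `train` function (fitting a single weight by gradient descent) and the `run_with_wandb` helper are hypothetical stand-ins for a real model, and `batch_size` is carried in the config only to mirror the example above. The dry run at the bottom collects metrics in a plain list, so it works without WandB installed.

```python
def train(config, log):
    """Toy training loop (hypothetical stand-in for a real model).

    Fits a single weight w so that w * x approximates y = 2 * x and reports
    the metrics of every epoch through the `log` callable.
    """
    data = [(float(x), 2.0 * x) for x in range(1, 6)]
    w = 0.0
    history = []
    for epoch in range(config["epochs"]):
        # Gradient of the mean squared error with respect to w
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= config["learning_rate"] * grad
        loss = sum((w * x - y) ** 2 for x, y in data) / len(data)
        metrics = {"epoch": epoch, "loss": loss, "weight": w}
        log(metrics)  # with WandB this callable would be wandb.log
        history.append(metrics)
    return history


def run_with_wandb(config, project="your_project_name"):
    """Same loop, tracked by WandB (needs `pip install wandb` and a login)."""
    import wandb  # imported here so the toy loop above runs without WandB

    run = wandb.init(project=project, config=config)
    try:
        return train(config, log=wandb.log)
    finally:
        run.finish()  # close the run even if training raises


config = {"learning_rate": 0.001, "batch_size": 32, "epochs": 10}
collected = []
history = train(config, log=collected.append)  # dry run without WandB
print(f"final loss: {history[-1]['loss']:.4f}")
```

Passing the logging function as an argument keeps the training loop independent of WandB, which makes it easy to test locally before submitting a job; calling `run_with_wandb(config)` switches to real tracking.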

Integrating WandB into MetaCentrum jobs  

By following the steps above, you have already integrated WandB initialization and logging into your training code. To run that code on MetaCentrum, you also need to make sure that your job script logs you into WandB and that the WandB run is configured to save its logs and any artefacts.
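One way to control where a run writes its local files is the `dir` argument of `wandb.init` (or the `WANDB_DIR` environment variable). The sketch below picks the job's scratch directory when it is available. It assumes that the batch system exports a `SCRATCHDIR` variable inside the job, which this post does not state, and the helper name `pick_wandb_dir` and the `wandb_logs` folder name are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def pick_wandb_dir(env=None):
    """Return a directory for WandB's local run files.

    Assumption: inside a job, SCRATCHDIR points at job-local scratch space.
    Falls back to the current working directory when it is not set.
    """
    env = os.environ if env is None else env
    base = Path(env.get("SCRATCHDIR") or Path.cwd())
    target = base / "wandb_logs"
    target.mkdir(parents=True, exist_ok=True)
    return target


# Demonstration with a temporary directory standing in for scratch space
with tempfile.TemporaryDirectory() as fake_scratch:
    chosen = pick_wandb_dir({"SCRATCHDIR": fake_scratch})
    chosen_is_dir = chosen.is_dir()
    chosen_parent_ok = chosen.parent == Path(fake_scratch)
    print(chosen)

# In the training script one could then call, for example:
#   wandb.init(project="your_project_name", dir=str(pick_wandb_dir()))
```

Scratch space is typically temporary, so any local files that have not been synced to the WandB server would need to be copied out before the job ends.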

Example Job Script  

Here is an example of how you might set up a job script to run your WandB-enabled training script on MetaCentrum:

#!/bin/bash

# Prepare your data, set up the environment, load necessary modules, etc.

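# Install WandB and log in
# (the API key grants access to your WandB account: keep the real value out of shared or versioned scripts)
API_KEY="your_wandb_api_key_here"
pip install wandb
python -m wandb login --relogin "$API_KEY"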

# Run your training script
python your_training_script.py
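Instead of writing the key into the job script, the key can also be supplied through the `WANDB_API_KEY` environment variable, which the WandB client reads on its own. The sketch below is one possible pre-flight check that stops the script early when the key is missing; the helper names `require_api_key` and `mask` are hypothetical.

```python
import os


def require_api_key(env=None):
    """Return the WandB API key from the environment or raise a clear error."""
    env = os.environ if env is None else env
    key = (env.get("WANDB_API_KEY") or "").strip()
    if not key:
        raise RuntimeError(
            "WANDB_API_KEY is not set; export it before submitting the job"
        )
    return key


def mask(key):
    """Shorten a key for log output so the full secret never reaches job logs."""
    return key[:4] + "..." if len(key) > 4 else "****"


# Demonstration with explicit dictionaries instead of the real environment
print(mask(require_api_key({"WANDB_API_KEY": "abcd1234efgh"})))
try:
    require_api_key({})
except RuntimeError as err:
    missing_message = str(err)
    print(missing_message)
```

In a training script, `wandb.login(key=require_api_key())` could then take the place of the command-line login, so the job fails before training starts rather than at the first logging call.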

Monitoring and Visualizing Results  

Once your job is running, you can monitor the progress of your experiments in real-time on the WandB dashboard. You can visualize metrics, compare runs, and analyze hyperparameter effects directly in your web browser.

After the job completes, its runs remain available in the dashboard, so you can compare them side by side and see how changes in hyperparameters affect model performance, as in the exported example image below:

[Image: example export from the WandB dashboard comparing several runs]
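The same comparison can also be done programmatically. As a rough sketch, the hypothetical helper `rank_runs` below orders run records by a metric. The sample rows are made up for illustration and are not results from this project. With WandB installed, records of this shape could be built from the public API (`wandb.Api().runs("entity/project")`, using each run's `name`, `config`, and `summary`).

```python
def rank_runs(rows, metric="accuracy", higher_is_better=True):
    """Sort run records by `metric`, best first; runs lacking the metric are dropped."""
    scored = [r for r in rows if metric in r["summary"]]
    return sorted(
        scored, key=lambda r: r["summary"][metric], reverse=higher_is_better
    )


# Made-up records for illustration only (not measured results)
rows = [
    {"name": "run-a", "config": {"learning_rate": 0.001}, "summary": {"accuracy": 0.91}},
    {"name": "run-b", "config": {"learning_rate": 0.01}, "summary": {"accuracy": 0.87}},
    {"name": "run-c", "config": {"learning_rate": 0.1}, "summary": {}},
]

for r in rank_runs(rows):
    print(r["name"], r["config"]["learning_rate"], r["summary"]["accuracy"])
```

Runs that crashed before logging the metric (like `run-c` here) are skipped rather than ranked, which avoids treating a missing value as a bad score.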

Conclusion  

Integrating Weights & Biases with MetaCentrum combines MetaCentrum’s HPC resources with WandB’s experiment tracking. With hyperparameters, metrics, and artefacts logged for every run, your experiments become easier to compare, reproduce, and share with your team.
