Comparing Different Colab GPU Offerings

Ubaada · 08-06-2024

Google Colab, a free Python notebook environment, offers GPUs and TPUs to speed up computations. Free users can access these accelerators, but with some limitations. If you pay, you get more memory and longer runtimes. Usage is tracked with compute units, which are spent based on how long you use the GPUs and TPUs.

I compared the different GPUs based on memory, training performance, and cost per hour.

Data and Calculations

| GPU | Memory (GB) | Time FP32 | Time FP16 | Colab CU/h | Perf/CU FP32 | Perf/CU FP16 | USD/h |
|---|---|---|---|---|---|---|---|
| Tesla T4 | 16 | 657 | 156 | 1.76 | 0.8648 | 3.64 | 0.176 |
| NVIDIA L4 | 24 | 211 | 76 | 4.82 | 0.9832 | 2.73 | 0.482 |
| Tesla V100-SXM2-16GB | 16 | 164 | 68 | 4.82 | 1.2651 | 3.05 | 0.482 |
| NVIDIA A100-SXM4-40GB | 40 | 81 | 27 | 11.7 | 1.0551 | 3.17 | 1.17 |

Memory: Total VRAM available for use, noted down from the Colab resource viewer.
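If you prefer to query this from a cell instead of the resource viewer, here is a minimal PyTorch sketch (it assumes the GPU is device 0):

```python
import torch

# Query the visible GPU's name and total VRAM from a Colab cell.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
```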

Time (FP32 / FP16): Time taken for an example training run when fine-tuning an LLM with the Hugging Face library, as estimated by the training progress bar. FP32 and FP16 are floating-point formats; FP16 is much faster than FP32 but has lower precision. The loss in precision is acceptable for most machine learning tasks, though FP16 can't be used for all models, especially those trained in BF16, for technical reasons. See this post on the issue.
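For reference, the FP32 and FP16 runs differ only in the mixed-precision flags passed to the Hugging Face Trainer. A minimal sketch of the relevant TrainingArguments (output directories are placeholders, and model/dataset setup is omitted):

```python
from transformers import TrainingArguments

# FP32 run: no mixed-precision flags (the default).
args_fp32 = TrainingArguments(output_dir="out-fp32")

# FP16 run: enables mixed-precision training on GPUs that support it.
args_fp16 = TrainingArguments(output_dir="out-fp16", fp16=True)

# For models pretrained in BF16, bf16=True is the safer choice
# (requires an Ampere-or-newer GPU such as the A100 or L4).
args_bf16 = TrainingArguments(output_dir="out-bf16", bf16=True)
```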

Colab CU/h: Colab Compute Units spent per hour of GPU use, shown in the sidebar of the resource viewer.

Perf/CU (FP32 / FP16): GPU performance normalised by the Compute Units it costs. Calculated as 1000 / (Colab CU/h × Time).

USD/h: What each GPU costs in US dollars per hour. Since $10 buys 100 CUs, this column is calculated as (10/100) × Compute Units per hour.
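Putting the two formulas together, the derived columns can be reproduced from the measured values with a few lines of Python (numbers copied from the table above):

```python
# name: (CU/h, time FP32, time FP16) from the table above.
gpus = {
    "Tesla T4":              (1.76, 657, 156),
    "NVIDIA L4":             (4.82, 211, 76),
    "Tesla V100-SXM2-16GB":  (4.82, 164, 68),
    "NVIDIA A100-SXM4-40GB": (11.7, 81, 27),
}

for name, (cu_per_h, t32, t16) in gpus.items():
    perf_cu_fp32 = 1000 / (cu_per_h * t32)  # Perf/CU FP32
    perf_cu_fp16 = 1000 / (cu_per_h * t16)  # Perf/CU FP16
    usd_per_h = (10 / 100) * cu_per_h       # $10 buys 100 CUs
    print(f"{name}: {perf_cu_fp32:.4f} | {perf_cu_fp16:.2f} | ${usd_per_h:.3f}/h")
```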


Graphs