gotch/ts

BENCHMARK

Convolution 2D

Ref.

  1. https://tigress-web.princeton.edu/~jdh4/PyTorchPerformanceTuningGuide_GTC2021.pdf
  2. https://github.com/soumith/convnet-benchmarks

Benchmark of the conv2d tensor operation (forward pass only):

  • input shape: [32, 64, 64, 64]
  • kernel: [64, 3, 3] (64 output channels, 3x3 window)
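
For orientation, the output shape and the multiply-accumulate count for this configuration can be worked out as sketched below. The sketch assumes 64 input channels, stride 1, no padding and no dilation, matching the PyTorch script in this README; the function names are illustrative and are not part of gotch.

```python
def conv2d_output_shape(input_shape, out_channels, kernel_size, stride=1, padding=0):
    """Output shape [N, C_out, H_out, W_out] of a 2D convolution (dilation 1)."""
    n, _, h, w = input_shape
    h_out = (h + 2 * padding - kernel_size) // stride + 1
    w_out = (w + 2 * padding - kernel_size) // stride + 1
    return (n, out_channels, h_out, w_out)


def conv2d_macs(input_shape, out_channels, kernel_size, stride=1, padding=0):
    """Multiply-accumulate operations for one forward pass (no bias, groups=1)."""
    c_in = input_shape[1]
    n, c_out, h_out, w_out = conv2d_output_shape(
        input_shape, out_channels, kernel_size, stride, padding
    )
    # Each output element sums over c_in * k * k input values.
    return n * c_out * h_out * w_out * c_in * kernel_size * kernel_size


shape = conv2d_output_shape((32, 64, 64, 64), out_channels=64, kernel_size=3)
macs = conv2d_macs((32, 64, 64, 64), out_channels=64, kernel_size=3)
print(shape)                                        # (32, 64, 62, 62)
print(f"{macs / 1e9:.2f} GMACs per forward pass")   # 4.53 GMACs per forward pass
```
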
goos: linux
goarch: amd64
pkg: github.com/sugarme/gotch/ts
cpu: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
BenchmarkConv2dCPU-8                 100          21198303 ns/op
BenchmarkConv2dCUDA-8                100           2201213 ns/op

CUDA 11.1
cuDNN 8.0.5

gotch

name          time/op
Conv2dCPU-8   21.2ms ± 0%
Conv2dCUDA-8  2.20ms ± 0%
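
The summary above is the raw `ns/op` figures converted to milliseconds. A minimal sketch of that conversion, which also derives the CPU-to-CUDA ratio from the two raw lines, could look like this. The regular expression and the `parse_bench` helper are assumptions made for illustration and are not part of the Go tooling.

```python
import re

# Raw result lines copied from the `go test -bench` output in this README.
RAW = """\
BenchmarkConv2dCPU-8                 100          21198303 ns/op
BenchmarkConv2dCUDA-8                100           2201213 ns/op
"""

# name, iteration count, nanoseconds per operation
LINE = re.compile(r"^(Benchmark\S+)\s+(\d+)\s+([\d.]+) ns/op", re.MULTILINE)


def parse_bench(text):
    """Map benchmark name to milliseconds per operation."""
    return {name: float(ns) / 1e6 for name, _iters, ns in LINE.findall(text)}


results = parse_bench(RAW)
speedup = results["BenchmarkConv2dCPU-8"] / results["BenchmarkConv2dCUDA-8"]

for name, ms in results.items():
    print(f"{name}: {ms:.2f} ms/op")
print(f"CUDA vs CPU: {speedup:.1f}x")
```
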

Python PyTorch 1.11

conv2d-CPU(x):   56.7 ms
conv2d-CUDA(x):   38.0 ms

Python benchmark code:

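import timeit

import torch

# Input batch, created once on the CPU: [batch, channels, height, width].
x = torch.randn(32, 64, 64, 64)

def conv2dCPU(x):
    # The layer is constructed (and its weights initialized) on every call,
    # so that cost is part of the measured time.
    conv1 = torch.nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=0, bias=False)
    return conv1(x)

def conv2dCUDA(x):
    # Each call copies the input to the GPU and builds the layer there, so the
    # measured time includes the host-to-device transfer and the layer setup.
    # CUDA kernels launch asynchronously and torch.cuda.synchronize() is not
    # called here.
    x = x.cuda()
    conv1 = torch.nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=0, bias=False).cuda()
    return conv1(x)

t0 = timeit.Timer(
    stmt='conv2dCPU(x)',
    setup='from __main__ import conv2dCPU',
    globals={'x': x})

t1 = timeit.Timer(
    stmt='conv2dCUDA(x)',
    setup='from __main__ import conv2dCUDA',
    globals={'x': x})

# Average over 100 calls, reported in milliseconds per call.
print(f'conv2d-CPU(x):  {t0.timeit(100) / 100 * 1e3:>5.1f} ms')
print(f'conv2d-CUDA(x):  {t1.timeit(100) / 100 * 1e3:>5.1f} ms')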
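
The script above builds the layer inside each timed call and, for CUDA, also moves the input to the device there. A possible way to time only the forward pass is to do the setup once, run a few warm-up calls, and synchronize before reading the clock. The helper below is a sketch of that pattern using only the standard library; `time_ms`, `make_state`, `run` and `sync` are hypothetical names, and the workload is a stand-in so the example runs without torch.

```python
import timeit


def time_ms(make_state, run, number=100, warmup=3, sync=None):
    """Average wall-clock time of run(state) in milliseconds.

    make_state() is called once, outside the timed region.
    sync, if given, is called before the clock is read so that
    asynchronously queued work is included in the measurement.
    """
    state = make_state()
    for _ in range(warmup):
        run(state)
    if sync is not None:
        sync()

    def timed():
        for _ in range(number):
            run(state)
        if sync is not None:
            sync()

    total = timeit.timeit(timed, number=1)
    return total / number * 1e3


# Stand-in workload; counters show how often each part is executed.
calls = {"setup": 0, "run": 0}


def make_state():
    calls["setup"] += 1
    return list(range(1000))


def run(state):
    calls["run"] += 1
    return sum(state)


elapsed = time_ms(make_state, run, number=100, warmup=3)
print(f"stand-in workload: {elapsed:.4f} ms per call")
```

For the conv2d case, `make_state` would create the layer and input on the target device, `run` would call the layer, and `sync` could be `torch.cuda.synchronize` for the CUDA measurement. Numbers obtained this way are not comparable with the figures reported above, which come from the original script.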