Overview
A torch.Tensor is a multi-dimensional matrix containing elements of a single data type. PyTorch defines 10 tensor types with CPU and GPU variants.
Tensor Creation
Tensor.__init__()
Tensor.new_tensor()
Returns a new Tensor with data as the tensor data.
The returned Tensor copies data.
dtype (torch.dtype, optional): The desired data type. Default: same as this tensor.
device (torch.device, optional): The desired device. Default: same as this tensor.
requires_grad (bool, optional): If autograd should record operations on the returned tensor. Default: False.
Tensor.new_zeros()
Returns a Tensor of the given size filled with 0. By default, the returned Tensor has the same dtype and device as this tensor.
Tensor.new_ones()
Returns a Tensor of the given size filled with 1. By default, the returned Tensor has the same dtype and device as this tensor.
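A short sketch of the creation methods above; the shapes and values are chosen purely for illustration:

```python
import torch

base = torch.tensor([[1.0, 2.0], [3.0, 4.0]])

# new_tensor copies the given data and inherits dtype/device from `base`
t = base.new_tensor([5.0, 6.0])

# new_zeros / new_ones take a size and inherit dtype/device from `base`
z = base.new_zeros(2, 3)
o = base.new_ones(2, 3)
```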
Tensor Properties
Tensor.shape
Returns the size of the tensor.
A tuple-like object of integers representing tensor dimensions.
Tensor.dtype
Returns the data type of the tensor.
Data type (torch.float32, torch.int64, etc.).
Available dtypes:
- Float: torch.float16, torch.float32, torch.float64, torch.bfloat16
- Integer: torch.int8, torch.int16, torch.int32, torch.int64
- Unsigned: torch.uint8, torch.uint16, torch.uint32, torch.uint64
- Complex: torch.complex64, torch.complex128
- Boolean: torch.bool
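A small illustration of default dtypes and conversion between them (the example values are arbitrary):

```python
import torch

x = torch.tensor([1, 2, 3])         # Python ints default to torch.int64
y = torch.tensor([1.0, 2.0, 3.0])   # Python floats default to torch.float32

# Convert between dtypes with .to()
y16 = y.to(torch.float16)
```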
Tensor.device
Returns the device where the tensor is located.
Device object (cpu, cuda:0, cuda:1, etc).
Tensor.ndim
Returns the number of dimensions of the tensor. Alias for dim().
Tensor.numel()
Returns the total number of elements in the tensor.
Tensor.requires_grad
Returns True if gradients need to be computed for this Tensor.
Whether gradient tracking is enabled.
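The properties above can be read directly off any tensor; a minimal sketch (the shape is arbitrary):

```python
import torch

t = torch.zeros(2, 3, 4)

# Layout properties
dims = t.shape             # torch.Size([2, 3, 4]), a tuple-like object
n_dims = t.ndim            # 3
count = t.numel()          # 24 elements in total

# Storage properties
kind = t.dtype             # torch.float32, the default float dtype
where = t.device           # cpu
tracked = t.requires_grad  # False unless explicitly enabled
```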
Mathematical Operations
Tensor.abs()
Computes the absolute value of each element.
Formula: out_i = |input_i|
Tensor with absolute values.
Tensor.add()
Adds other, scaled by alpha, to this tensor.
Tensor.matmul()
Matrix product of this tensor and other; supports broadcasting for batched dimensions.
Tensor.sum()
Returns the sum of all elements, or of each slice along the given dimension(s).
Tensor.mean()
Returns the mean of all elements, or of each slice along the given dimension(s).
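The operations in this section, sketched on small hand-picked matrices (b is the identity so the matmul result is easy to check):

```python
import torch

a = torch.tensor([[-1.0, 2.0], [3.0, -4.0]])
b = torch.tensor([[1.0, 0.0], [0.0, 1.0]])

abs_a = a.abs()        # element-wise |a|
s = a.add(b)           # element-wise a + b
p = a.matmul(b)        # matrix product; b is identity, so p equals a
total = a.sum()        # scalar sum of all elements: -1 + 2 + 3 - 4 = 0
avg = a.mean(dim=0)    # column means -> shape (2,)
```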
Shape Manipulation
Tensor.reshape()
Returns a tensor with the same data and number of elements but the specified shape. Returns a view when possible, otherwise a copy.
Tensor.view()
Returns a new tensor with the same data but a different shape. Never copies; requires a compatible memory layout.
Tensor.transpose()
Returns a tensor with the given two dimensions swapped.
Tensor.permute()
Returns a view with dimensions reordered according to the given permutation.
Tensor.squeeze()
Returns a tensor with all dimensions of size 1 removed.
dim (int, optional): If given, only squeeze this dimension if its size is 1.
Tensor.unsqueeze()
Returns a tensor with a dimension of size 1 inserted at the specified position.
dim (int): Index at which to insert the singleton dimension.
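A quick tour of the shape operations above; the sizes are arbitrary, chosen so each result shape is easy to verify:

```python
import torch

x = torch.arange(24)       # shape (24,)
r = x.reshape(2, 3, 4)     # same 24 elements, new shape
v = r.view(6, 4)           # view: no copy (r is contiguous here)
t = r.transpose(0, 2)      # swap dims 0 and 2 -> shape (4, 3, 2)
p = r.permute(2, 0, 1)     # reorder all dims  -> shape (4, 2, 3)
u = x.unsqueeze(0)         # insert dim at index 0 -> shape (1, 24)
sq = u.squeeze(0)          # remove that size-1 dim -> shape (24,)
```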
Device Transfer
Tensor.to()
Performs dtype and/or device conversion. Returns self if the tensor already matches.
Tensor.cuda()
Returns a copy of this tensor in CUDA memory. If it is already on the correct GPU, the tensor itself is returned.
Tensor.cpu()
Returns a copy of this tensor in CPU memory.
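A minimal sketch of moving a tensor between devices; the CUDA branch is guarded so the snippet also runs on CPU-only machines:

```python
import torch

x = torch.randn(3, 4)

# Pick a device that actually exists on this machine
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

y = x.to(device)   # no-op if x is already on `device`
z = y.cpu()        # always yields a tensor in CPU memory
```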
Gradient Operations
Tensor.requires_grad_()
Sets this tensor's requires_grad attribute in-place and returns this tensor.
Tensor.detach()
Returns a new Tensor detached from the current graph.
Detached tensor (shares storage, but no grad tracking).
Tensor.backward()
Computes the gradient of current tensor w.r.t. graph leaves.
gradient (Tensor, optional): Gradient w.r.t. the tensor. If None and the tensor is a scalar, torch.ones_like(tensor) is used.
retain_graph (bool, optional): If False, the graph used to compute the grads is freed after the call.
create_graph (bool, optional): If True, the graph of the derivative is constructed, allowing higher-order derivatives.
Tensor.grad
This attribute is None by default. It accumulates gradients during backward().
Accumulated gradients.
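A minimal autograd round-trip through the operations in this section, using a function whose gradient is easy to check by hand (d/dx of sum(x^2) is 2x):

```python
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)

y = (x * x).sum()    # scalar: x0^2 + x1^2
y.backward()         # populates x.grad with dy/dx = 2x

frozen = x.detach()  # shares storage with x, but no grad tracking
```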
Indexing and Slicing
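Tensors support NumPy-style indexing and slicing, including boolean masks; a brief illustration on an arbitrary 3x4 matrix:

```python
import torch

m = torch.arange(12).reshape(3, 4)

row = m[0]          # first row            -> shape (4,)
col = m[:, 1]       # second column        -> shape (3,)
sub = m[1:, 2:]     # bottom-right 2x2 block
masked = m[m > 5]   # boolean-mask indexing -> 1-D tensor of matching values
```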
In-place Operations
Operations with a trailing underscore (_) modify the tensor in-place:
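For example (values are illustrative):

```python
import torch

t = torch.ones(3)

t.add_(2.0)       # in-place: t becomes [3., 3., 3.]
t.mul_(0.5)       # in-place: t becomes [1.5, 1.5, 1.5]

out = t.add(1.0)  # out-of-place: t is unchanged, a new tensor is returned
```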
Best Practices
Memory Efficiency
- Use in-place operations (add_, mul_, etc.) when safe to reduce memory
- Prefer view() over reshape() when possible (avoids a copy)
- Use detach() when you don't need gradients
- Clear gradients with tensor.grad = None instead of tensor.grad.zero_() for better memory
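A sketch of the last point: setting grad to None releases the gradient storage entirely, whereas zero_() keeps the buffer allocated:

```python
import torch

w = torch.randn(4, requires_grad=True)

loss = (w * w).sum()
loss.backward()      # allocates and fills w.grad

w.grad = None        # releases the storage; the next backward() re-allocates it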
Shape Operations
- Use view(-1) to flatten tensors
- Use unsqueeze() to add broadcasting dimensions
- Use contiguous() before view() if the tensor is not contiguous
- Prefer reshape() over view() for more robust code
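The contiguous()-before-view() rule in action; after a transpose the tensor is non-contiguous, so view() alone would raise an error while reshape() copies automatically:

```python
import torch

x = torch.arange(6).reshape(2, 3)
t = x.transpose(0, 1)            # non-contiguous view of x

flat = t.contiguous().view(-1)   # contiguous() copies, making view() legal
safe = t.reshape(-1)             # reshape() handles the copy automatically
```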
Device Management
- Create tensors on the target device directly: torch.randn(3, 4, device='cuda')
- Use .to(device) for device-agnostic code
- Set non_blocking=True for async CPU→GPU transfers when possible
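A device-agnostic sketch of these practices; note that non_blocking=True only overlaps the copy with computation when the source CPU tensor is in pinned memory, and it is a harmless no-op on CPU:

```python
import torch

# Pick whatever device is available so the code runs everywhere
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Create directly on the target device rather than creating then moving
x = torch.randn(3, 4, device=device)

# Pin source memory for async CPU->GPU transfers (only meaningful with CUDA)
src = torch.randn(3, 4)
if device == 'cuda':
    src = src.pin_memory()
y = src.to(device, non_blocking=True)
```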
Related APIs
torch Module
Core PyTorch functions
Autograd API
Automatic differentiation
CUDA API
GPU operations