flopscope.
Guides

FFT Operations

Use this page to learn how to use fnp.fft operations and understand their FLOP costs.

You will learn:

  • How to use 1-D, 2-D, and N-D FFT operations and their cost formulas
  • How to choose between real and complex transforms to save FLOPs
  • How to query FFT costs before committing budget

Prerequisites

Cost model

FFT costs are based on the Cooley-Tukey radix-2 algorithm:

TransformCost FormulaExample (n=1024)
fft, ifft5nlog2n5n \cdot \lceil\log_2 n\rceil51,200
rfft, irfft5(n/2)log2n5(n/2) \cdot \lceil\log_2 n\rceil25,600
fft2, ifft25Nlog2N5N \cdot \lceil\log_2 N\rceil where N=n{1}n{2}N = n_\{1\} \cdot n_\{2\}varies
fftn, ifftn5Nlog2N5N \cdot \lceil\log_2 N\rceil where N={i}n{i}N = \prod_\{i\} n_\{i\}varies
fftfreq, rfftfreq0 (free)0
fftshift, ifftshift0 (free)0

Real-valued transforms (rfft, irfft, rfftn, irfftn) cost roughly half of their complex counterparts because they exploit conjugate symmetry.

Basic usage

import flopscope as flops
import flopscope.numpy as fnp

with flops.BudgetContext(flop_budget=1_000_000) as budget:
    # Generate a signal (costs numel(output) = 1,024 FLOPs)
    signal = fnp.random.randn(1024)

    # Forward FFT: 5 * 1024 * 10 = 51,200 FLOPs
    spectrum = fnp.fft.fft(signal)

    # Inverse FFT: same cost
    reconstructed = fnp.fft.ifft(spectrum)

    # Frequency bins (free)
    freqs = fnp.fft.fftfreq(1024)

    # Total: randn 1,024 + fft 51,200 + ifft 51,200 + fftfreq 0 = 103,424
    print(f"Total FFT cost: {budget.flops_used:,}")  # 103,424

Real vs complex transforms

When your input is real-valued (which is common in signal processing), prefer rfft over fft — it costs half as much:

import flopscope as flops
import flopscope.numpy as fnp

with flops.BudgetContext(flop_budget=1_000_000) as budget:
    signal = fnp.random.randn(1024)  # 1,024 FLOPs

    # Complex FFT: 51,200 FLOPs
    spec_complex = fnp.fft.fft(signal)

    budget_after_fft = budget.flops_used

    # Real FFT: 25,600 FLOPs
    spec_real = fnp.fft.rfft(signal)

    rfft_cost = budget.flops_used - budget_after_fft

    print(f"fft cost:  {budget_after_fft:,}")   # 52,224 (randn 1,024 + fft 51,200)
    print(f"rfft cost: {rfft_cost:,}")           # 25,600

The output of rfft has shape (n//2 + 1,) instead of (n,), since the negative frequencies are redundant for real inputs.

Multi-dimensional FFT

Use fft2 for 2-D transforms (e.g., images) and fftn for arbitrary dimensions:

import flopscope as flops
import flopscope.numpy as fnp

with flops.BudgetContext(flop_budget=10**8) as budget:
    # 2-D image (costs numel(output) = 65,536 FLOPs)
    image = fnp.random.randn(256, 256)

    # 2-D FFT
    spectrum_2d = fnp.fft.fft2(image)
    print(f"2D FFT cost: {budget.flops_used:,}")

    # N-D FFT with explicit shape
    volume = fnp.random.randn(32, 32, 32)
    spectrum_3d = fnp.fft.fftn(volume)

Windowed FFT pattern

A common signal processing pattern — window the signal before FFT to reduce spectral leakage:

import flopscope as flops
import flopscope.numpy as fnp

with flops.BudgetContext(flop_budget=1_000_000) as budget:
    signal = fnp.random.randn(1024)

    # Window function (counted — hamming costs n FLOPs)
    window = fnp.hamming(1024)

    # Apply window (counted — multiply costs n FLOPs)
    windowed = fnp.multiply(signal, window)

    # FFT (counted)
    spectrum = fnp.fft.rfft(windowed)

    print(budget.summary())

Query costs before running

from flopscope.flops import fft_cost, rfft_cost

# Check cost of a large FFT before committing budget
n = 2**20  # ~1 million points
print(f"Complex FFT: {fft_cost(n):,} FLOPs")   # 104,857,600
print(f"Real FFT:    {rfft_cost(n):,} FLOPs")   # 52,428,800

Common pitfalls

Symptom: Using fnp.fft.fft on real data when fnp.fft.rfft would suffice

Fix: rfft costs half as much. If your input is real-valued, always prefer rfft/irfft over fft/ifft.

Symptom: Unexpectedly high cost for multi-dimensional FFT

Fix: The cost scales as 5n{i}log2(n{i})5 \cdot \prod n_\{i\} \cdot \lceil\log_2(\prod n_\{i\})\rceil. A 256x256 2-D FFT processes 65,536 elements, not 256. Use fft_cost to estimate before running.

On this page